via Internshala·2d ago

Applied Scientist

Internshala

Full-time · On-site

Location: Nanakramguda · Type: Full-time · Posted: 2d ago

About the job

About Us

We are building transformer models focused on multilingual translation and custom transformer architectures. Our team works on large-scale NLP and transformer-based models with a strong focus on research, experimentation, model optimization, and production-grade AI systems.

We are looking for a highly technical Data Scientist who is passionate about developing and improving AI/ML models, working on transformer architectures, and contributing to advanced NLP research and development. This is a core engineering and research role, not a prompt engineering or API integration role.

Role Overview

As a Data Scientist, you will work on the training, fine-tuning, evaluation, and optimization of transformer-based NLP systems and LLMs. You will collaborate closely with engineering and research teams to build scalable AI models and contribute to the development of advanced multilingual AI solutions. You should be comfortable working with model training pipelines, datasets, tokenization techniques, transformer architectures, and GPU-based training environments. A strong interest in experimentation, model performance improvement, and real-world AI deployment is highly valued.

Key Responsibilities

Lead the architecture and development of transformer-based AI systems
Drive technical direction for NLP, LLM, and multilingual AI initiatives
Mentor and guide junior ML engineers and researchers
Train and fine-tune transformer models on large-scale custom datasets
Work on Seq2Seq / encoder-decoder architectures for translation and text generation
Optimize model performance using LoRA, QLoRA, PEFT, quantization, and distributed training techniques
Design tokenization pipelines using BPE / SentencePiece
Evaluate models using BLEU, perplexity, accuracy, and custom benchmarks
Collaborate with platform teams for GPU infrastructure and scalable deployments

Required Qualifications

2+ years of experience in Data Science / Machine Learning / NLP
Strong hands-on expertise in Transformers, LLMs, Seq2Seq Architectures, and Attention Mechanisms
Proven experience in model training and fine-tuning using custom datasets
Hands-on experience with Hugging Face, PyTorch, or TensorFlow
Experience with vector databases (Pinecone, Milvus) for embeddings and semantic search in translation or LLM applications
Experience with distributed training, GPU optimization, and NLP evaluation metrics
