We are looking for the top 5% of India's scientific and mathematical minds to build frontier-level evaluation data. Your goal is to create complex, text-based reasoning problems that can challenge and evaluate the capabilities of the world's most advanced, next-generation AI models (such as GPT-5.4 and Opus 4.8).
This is not a basic data-entry or standard content-writing role. It involves high-level, programmatically verifiable problem-solving designed to push the limits of artificial intelligence.
Selected intern's day-to-day responsibilities include
Create complex, PhD-level STEM problems and evaluation datasets entirely in text and mathematical formats (non-multimodal).
Design rigorous, well-defined problems with solutions that can be programmatically verified as definitively correct or incorrect.
Develop evaluation tasks that stress-test next-generation AI models and maintain an elite difficulty standard, ensuring models are challenged effectively.
Create advanced problems across domains such as mathematics, theoretical physics, chemistry, and life sciences (molecular biology/biotechnology).
Contribute to AI alignment research, model benchmarking, and automated code-based verification frameworks.
Skill(s) required
Arduino, C++ Programming, C Programming, Embedded Systems, Internet of Things (IoT), Python, Raspberry Pi, Robotics
Who can apply
Only those candidates can apply who
are available for the work from home job/internship
can start the work from home job/internship between 30th Jun'26 and 4th Aug'26
are available for a duration of 6 weeks
Mathematics (33% of project scope): Advanced theory, proofs, and algorithmic structures.
Other STEM fields (67% of project scope): Advanced theoretical physics, organic/inorganic chemistry, and life sciences (molecular biology/biotechnology).
Who Can Apply? (Strict Criteria):
PhD candidates or PhD graduates in Mathematics, Physics, Chemistry, or Biological Sciences from Tier-1 Indian institutes such as IITs, IISc, IISERs, or top central universities.
Final-year Master’s students (M.Sc./M.Tech.) from premier institutions with an exceptional academic track record in core research may also be considered.
Candidates with a deep interest in AI alignment, model benchmarking, and automated code-based verification frameworks.
Perks
Flexible work hours, 5 days a week
Additional information
Stipend Structure
Incentive pay: $10,000 - 50,000/month
What We Offer:
Fully remote work with highly flexible hours.
Premium compensation with tier-based variable pricing based on domain expertise and evaluation quality.
Opportunity to contribute directly to the training and evaluation protocols of world-class enterprise LLMs.
How to Apply:
Click "Apply Now" on Internshala.
Provide a short description of your research background or primary domain expertise.
Be prepared to share or demonstrate a brief sample of text-based, highly complex evaluation logic in your subject area during the initial vetting process.
Number of openings
200
About Vedron.ai
Delhi
Vedron is the operating layer for practical AI, partnering with frontier AI labs, enterprises, and government agencies to build and validate the next era of intelligent systems. Vedron's evaluation science platform combines model-agnostic methods with a verified network of 10,000+ domain experts across law, finance, science, medicine, engineering, public policy, and more. This expert network underpins structured, human-grounded evaluations, red team drills, and alignment workflows that meaningfully improve model performance and safety.