We are looking for the top 5% of India's scientific and mathematical minds to build frontier-level evaluation data. The goal is to create complex, text-based reasoning problems capable of challenging the world's most advanced next-generation AI models.
This is not a basic data-entry or content-writing role. It involves high-level, programmatically verifiable problem-solving designed to push the limits of artificial intelligence.
Selected intern's day-to-day responsibilities include:
Create complex, PhD-level STEM problems and evaluation datasets in text and mathematical formats (non-multimodal).
Design rigorous problems with solutions that can be strictly verified using computer code as definitively correct or incorrect.
Develop challenging evaluation tasks to stress-test next-generation AI models and maintain an elite difficulty standard.
Ensure problems meet the required difficulty benchmark, where AI models fail to solve them correctly at least 50% of the time across multiple independent attempts (Pass@8 metric).
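To illustrate the verification and difficulty requirements above, here is a minimal Python sketch. It assumes a problem whose answer is an exact rational number, and it reads the 50% benchmark as a failure rate measured over 8 independent model attempts. The function names, the sample answers, and that reading of the Pass@8 metric are illustrative assumptions, not Vedron's actual tooling.

```python
from fractions import Fraction
from typing import Sequence


def verify_answer(submitted: str, expected: Fraction) -> bool:
    """Return True only if the submitted text parses to exactly the expected value.

    Hypothetical verifier: exact rational comparison, so there is no
    tolerance to argue about and every answer is either right or wrong.
    """
    try:
        return Fraction(submitted.strip()) == expected
    except (ValueError, ZeroDivisionError):
        # Unparseable text or a zero denominator counts as an incorrect answer.
        return False


def failure_rate(attempts: Sequence[str], expected: Fraction) -> float:
    """Fraction of independent attempts that the verifier rejects."""
    if not attempts:
        raise ValueError("at least one attempt is required")
    failures = sum(not verify_answer(a, expected) for a in attempts)
    return failures / len(attempts)


def meets_difficulty_bar(
    attempts: Sequence[str],
    expected: Fraction,
    min_failure_rate: float = 0.5,  # assumed reading of the 50% benchmark
) -> bool:
    """True if the model failed on at least `min_failure_rate` of the attempts."""
    return failure_rate(attempts, expected) >= min_failure_rate


# Eight made-up model answers to a problem whose correct answer is 3/4.
sample_attempts = ["3/4", "0.75", "2/3", "1", "n/a", "5/8", "0.7", "7/9"]

if __name__ == "__main__":
    rate = failure_rate(sample_attempts, Fraction(3, 4))
    print(f"failure rate: {rate:.2f}")
    print("meets bar:", meets_difficulty_bar(sample_attempts, Fraction(3, 4)))
```

In this sketch only two of the eight sample answers are accepted, so the problem would clear the assumed 50% bar. A real problem would need a verifier suited to its own answer format.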
Skill(s) required
Mathematics
Who can apply
Only those candidates can apply who
are available for the work from home job/internship
can start the work from home job/internship between 30th Jun'26 and 4th Aug'26
are available for a duration of 2 months
have relevant skills and interests
Other requirements
Who Can Apply? (Strict Criteria)
PhD scholars or PhD candidates in Mathematics, Physics, Chemistry, or Biological Sciences from Tier-1 Indian institutes such as IITs, IISc, IISERs, and top central universities.
Final-year Master’s students (M.Sc/M.Tech) from premier institutions with an exceptional academic track record in core research may also be considered.
Candidates with a deep interest in AI alignment, model benchmarking, and automated code-based verification frameworks.
Domain Requirements:
Candidates with expertise in the following domains are encouraged to apply:
Mathematics (33% of project scope): Advanced theory, proofs, and algorithmic structures.
Other STEM Fields (67% of project scope): Advanced theoretical physics, advanced organic/inorganic chemistry, and life sciences (molecular biology/biotechnology).
Perks
Certificate
Letter of recommendation
Flexible work hours
5 days a week
Additional information
Stipend Structure
Incentive pay: $5,000 - 10,000/month
What We Offer:
Work From Home: Fully remote opportunity with highly flexible working hours.
Premium Compensation: Tier-based variable compensation based on domain expertise and evaluation quality.
Frontier Exposure: Opportunity to contribute directly to the training and evaluation protocols of world-class enterprise LLMs.
How to Apply:
Click "Apply Now" on Internshala.
Provide a short description of your research background or primary domain expertise.
Be prepared to share or demonstrate a brief sample of text-based, highly complex evaluation logic in your subject area during the initial vetting process.
Number of openings
400
About Vedron.ai
Delhi
Vedron is the operating layer for practical AI, partnering with frontier AI labs, enterprises, and government agencies to build and validate the next era of intelligent systems. Vedron's evaluation science platform combines model-agnostic methods with a verified network of 10,000+ domain experts across law, finance, science, medicine, engineering, public policy, and more. This expert network underpins structured, human-grounded evaluations, red team drills, and alignment workflows that meaningfully improve model performance and safety.