via Internshala·2d ago

AI/ML Inference Kernel Engineer

Location: Work from home
Type: Internship
Stipend: ₹10k – ₹25k/mo
Posted: 2d ago

About the work from home job/internship

About Us

We are building the cloud execution layer for Physical AI and next-generation multimodal workloads. We begin with the models powering robots, world simulators, spatial systems, and visual intelligence, and build the infrastructure that makes these compute-intensive AI workloads faster, easier, and more cost-effective to run.

The Role:

We are looking for an ML Systems & Inference Engineer to build the technical foundation of our platform. This role sits at the intersection of model serving, GPU systems, performance engineering, and cloud infrastructure.

This is not a generic backend role or a pure research role. The ideal candidate should be able to understand model code, profile runtimes, identify bottlenecks, develop optimizations, measure improvements, and ship reliable infrastructure solutions.

Selected intern's day-to-day responsibilities include

Build and optimize cloud inference pipelines for Physical AI, multimodal, generative, simulation, and world-model workloads.

Improve performance across startup time, queue time, latency, throughput, GPU utilization, reliability, and cost per output or job.

Develop platform execution components, including model packaging, warm pools, artifact and model caching, batching, queueing, scheduling, and model-aware execution policies.

Apply optimization techniques such as dynamic batching, quantized model variants, torch.compile, TensorRT, ONNX Runtime, caching, routing, and distributed execution.

Profile bottlenecks across GPU compute, memory bandwidth, CPU preprocessing, I/O, model loading, serialization, queueing, and serving overhead.

Build benchmarking and evaluation systems to measure latency, throughput, startup time, memory usage, GPU utilization, cost, reliability, and workload quality.

Convert execution telemetry into product capabilities such as performance reporting, cost visibility, configuration recommendations, and workload comparisons.

About Hubnine India Private Limited