Sr. Software Engineer - AI/ML, AWS Neuron Apps

About the job

Join the elite team behind AWS Neuron, the software stack powering AWS's next-generation AI accelerators, Inferentia and Trainium. As a Senior Software Engineer on our Machine Learning Applications team, you'll be at the forefront of deploying and optimizing some of the world's most sophisticated AI models at unprecedented scale.

Responsibilities

Pioneer distributed inference solutions for industry-leading LLMs such as GPT, Llama, and Qwen

Optimize breakthrough language and vision generative AI models

Collaborate directly with silicon architects and compiler teams to push the boundaries of AI acceleration

Drive performance benchmarking and tuning that directly impacts millions of inference calls globally

Spearhead distributed inference architecture for PyTorch and JAX using XLA

Engineer breakthrough performance optimizations for AWS Trainium and Inferentia

Develop ML tools to enhance LLM accuracy and efficiency

Transform complex tensor operations into highly optimized hardware implementations

Pioneer benchmarking methodologies that shape next-gen AI accelerator design

Qualifications

Minimum

Deep expertise in Python and ML framework internals

Strong understanding of distributed systems and ML optimization

Passion for performance tuning and system architecture

Preferred

Master's degree in computer science or equivalent

Master's degree in machine learning or equivalent

Experience with accuracy debugging and tooling, and with performance benchmarking of AI accelerators

Experience developing CUDA kernels, and with HPC, inference optimization, and tensor operations