About the job
Do you enjoy working on large-scale distributed systems and Agentic AI? The team manages the Apache Spark Upgrade Agent for Analytics services (Glue, EMR, Athena), the Remote MCP Service (SageMaker Unified Studio MCP), and the AWS Glue Python Service - Product and Runtime. The Apache Spark Upgrade Agent reduces upgrade timelines from months to weeks through conversational interfaces, automated code transformation, and rigorous data quality validation. Engineers can use natural language for upgrades while maintaining approval control over changes, which frees engineering capacity for strategic innovation. The Remote MCP service is a fully managed Model Context Protocol server that gives customers using AI assistants (Kiro, Claude, Cline) and Agents secure access to value-added Spark Troubleshooting, Code Recommendation, and Upgrade tools for Analytics services. The team also manages the Glue Python Shell product and runtime.
Responsibilities
- Collaborate with experienced cross-disciplinary Amazonians to conceive, design, and bring to market innovative Agentic AI products that give our customers a seamless transformation experience.
- Design and build innovative technologies by applying Generative AI in a large distributed computing environment and help lead fundamental changes in the industry.
- Design and code the right solutions starting with broadly defined problems.
- Work in an agile environment to deliver high-quality software applying Generative AI.
Qualifications
Minimum
- 3+ years of non-internship experience in design or architecture (design patterns, reliability, and scaling) of new and existing systems
- 3+ years of non-internship professional software development experience
- 1+ years of software development engineer or related occupational experience
- 1+ years of experience designing and developing large-scale, multi-tiered, multi-threaded, embedded, or distributed software applications, tools, systems, and services using C#, C++, Java, or Perl
- Bachelor's degree or foreign equivalent in Computer Science, Engineering, Mathematics, or a related field
- Experience programming with at least one software programming language
Preferred
- 3+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Knowledge of Machine Learning and LLM fundamentals, including transformer architecture, training/inference lifecycles, and optimization techniques
- Experience with Machine Learning and Large Language Model fundamentals, including architecture, training/inference lifecycles, and optimization of model execution