About the job
A senior technical contributor who drives end-to-end delivery of software solutions, directly contributing to and coordinating implementation and optimization across multiple teams for inference and training of machine learning models. The position involves interfacing with software and hardware engineering teams and AMD partners to plan, develop, and optimize use cases. This is an exciting opportunity to work on the cutting edge of GPU computing for machine learning.
Responsibilities
Develop and implement the overall QA strategy and frameworks for testing GPU-based software products, spanning various hardware and software configurations.
Evaluate and improve existing QA methodologies, tools, processes, and best practices, including automation tools, testing methodologies, test configuration management, and performance testing techniques.
Collaborate with software developers, program managers, QA teams, and other stakeholders to incorporate their feedback into test strategy and design.
Define cataloging methods for test plans, test suites, and test cases that cover functional and non-functional requirements.
Analyze and debug complex failure scenarios in GPU software environments, including root cause analysis and implementation of corrective actions.
Establish and monitor metrics to assess the efficiency and effectiveness of the software development process, utilizing data-driven insights to drive continuous improvement.
Provide training and mentorship to QA engineers and other stakeholders on best practices, testing methodologies, and tools used in the QA process.
Stay current with the latest trends and technologies in the Compute domain to ensure the implementation of best practices and cutting-edge testing methodologies.
Maintain awareness of industry standards and regulations, including ISO, IEEE, and other relevant standards.
Qualifications
Minimum
No minimum qualifications listed.
Preferred
Relevant experience in Machine Learning and/or GPU programming
Experience with deep learning frameworks (e.g., TensorFlow, Keras, PyTorch, Caffe, ONNX) and familiarity with CNN/LSTM model architectures
Knowledge of CPU and GPU architecture, and experience in GPGPU programming technologies
Proven experience in a SW Architect, QA Architect, or Senior Technical Engineer role
Strong knowledge of software development methodologies, tools, and processes, including test planning, test design, test execution, and defect management.
Expertise in embedded software processes, systems architecture, and GPU technologies, including programming skills in languages such as C, C++, and Python
Familiarity with various GPU hardware platforms and a wide variety of operating system variants (Linux and Windows)
Experience with automated testing tools and with Continuous Integration and Continuous Deployment (CI/CD) pipelines.
Strong analytical and problem-solving skills, with an ability to debug and resolve complex issues in software systems.
Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams and diverse stakeholders
Led or played a key role in QA teams' transformations to agile development and validation methods