A company is looking for a Generative AI Inference Engineer.
Key Responsibilities
Lead the design and development of customer-facing multi-modal ML inference systems
Collaborate with Platform and Inference teams on optimization, model tuning, and deployment of inference systems
Prototype and productionize improvements and new features for the inference platform
Required Qualifications
7+ years of experience productionizing machine learning systems, including inference pipeline development
Expertise in writing and running Python services at scale
5+ years of experience with the Python scientific stack, PyTorch, and high-performance inference frameworks
Deep understanding of diffusion model architectures and experience optimizing deep neural networks on NVIDIA GPUs
Experience with cloud orchestration systems and deployment to cloud providers such as AWS, GCP, and Azure