ZS is a place where passion changes lives. As a management consulting and technology firm focused on improving life and how we live it, we transform ideas into impact by bringing together data, science, technology and human ingenuity to deliver better outcomes for all. Here you'll work side-by-side with a powerful collective of thinkers and experts shaping life-changing solutions for patients, caregivers and consumers, worldwide. ZSers drive impact by bringing a client-first mentality to each and every engagement. We partner collaboratively with our clients to develop custom solutions and technology products that create value and deliver company results across critical areas of their business. Bring your curiosity for learning, bold ideas, courage and passion to drive life-changing impact to ZS.
What you'll do:
Lead AI Engineer in the Platforms and Products will...
We are seeking a highly motivated Applied AI Engineer with a strong foundation in Machine Learning and a deep interest in Large Language Models (LLMs) and Generative AI. This role focuses on building, optimizing, and evaluating production-grade LLM systems, including Retrieval-Augmented Generation (RAG), fine-tuning workflows, and scalable inference pipelines.
Design and implement LLM-powered applications using state-of-the-art transformer models.
Build and optimize RAG pipelines using embeddings, chunking strategies, and vector search.
Experiment with prompt engineering, structured outputs (JSON schemas/function calling), and tool-augmented LLMs (agents/workflows).
Fine-tune models using techniques such as LoRA, PEFT, and instruction tuning.
Develop and evaluate embedding models for similarity search and semantic retrieval.
Conduct LLM evaluation using automated and human-in-the-loop techniques (offline + online).
Optimize inference workflows for latency, GPU utilization, and cost efficiency (quantization, batching, caching).
Build and maintain REST API services (e.g., FastAPI) to deploy LLM/RAG endpoints, integrate with product systems, and support scalable inference.
Contribute to the integration of AI systems into production software environments (CI/CD, monitoring, reliability).
Research and prototype cutting-edge approaches in Generative AI and share learnings with the team.
What you'll bring:
A master's or bachelor's degree in Computer Science or related field from a top university
4+ years' hands-on experience in Machine Learning (ML) with production LLM systems
Strong fundamentals in machine learning, deep learning, and fine-tuning of LLMs, including:
Understanding of transformer architectures
Prompt engineering expertise
Embeddings and vector search
Experience in backend API design with FastAPI, async patterns, and rate limiting
Experience with vector databases, including:
Pinecone, Weaviate, or Chroma
Embedding storage and similarity search
Hybrid search implementations
Strong programming expertise in Python is a must, including:
Async programming (asyncio, async/await)
Type hints and Pydantic
SOLID principles and design patterns
Experience in MLOps to measure and track model performance, including:
MLflow for model tracking
Langfuse for LLM observability (strongly preferred)
Model versioning and A/B testing
Experience working with NLP and computer vision
Fluency in English
Client-first mentality
Intense work ethic
Collaborative spirit and problem-solving approach
At ZS, your growth matters. We offer a comprehensive total rewards package that supports your health and well-being, financial future, time away, and professional development. With robust skills-building programs, multiple career progression paths, internal mobility, and a deeply collaborative culture, you'll have the opportunity to do meaningful work, expand your capabilities, and thrive as part of a global community. For details on total rewards in the United States, visit ZS US office locations | Where we work | ZS.