Lead AI Engineer

ZS
7 hours ago
Full-time
On-site
ZS is a place where passion changes lives. As a management consulting and technology firm focused on improving life and how we live it, we transform ideas into impact by bringing together data, science, technology and human ingenuity to deliver better outcomes for all. Here you'll work side-by-side with a powerful collective of thinkers and experts shaping life-changing solutions for patients, caregivers and consumers, worldwide. ZSers drive impact by bringing a client-first mentality to each and every engagement. We partner collaboratively with our clients to develop custom solutions and technology products that create value and deliver company results across critical areas of their business. Bring your curiosity for learning, bold ideas, courage and passion to drive life-changing impact to ZS.

What you'll do:

Lead AI Engineer in the Platforms and Products will...

We are seeking a highly motivated Applied AI Engineer with a strong foundation in Machine Learning and a deep interest in Large Language Models (LLMs) and Generative AI. This role focuses on building, optimizing, and evaluating production-grade LLM systems, including Retrieval-Augmented Generation (RAG), fine-tuning workflows, and scalable inference pipelines.

Design and implement LLM-powered applications using state-of-the-art transformer models.
Build and optimize RAG pipelines using embeddings, chunking strategies, and vector search.
Experiment with prompt engineering, structured outputs (JSON schemas/function calling), and tool-augmented LLMs (agents/workflows).
Fine-tune models using techniques such as LoRA, PEFT, and instruction tuning.
Develop and evaluate embedding models for similarity search and semantic retrieval.
Conduct LLM evaluation using automated and human-in-the-loop techniques (offline + online).
Optimize inference workflows for latency, GPU utilization, and cost efficiency (quantization, batching, caching).
Build and maintain REST API services (FastAPI, etc.) to deploy LLM/RAG endpoints, integrate with product systems, and support scalable inference.
Contribute to integration of AI systems into production software environments (CI/CD, monitoring, reliability).
Research and prototype cutting-edge approaches in Generative AI and share learnings with the team.

What you'll bring:

A master's or bachelor's degree in Computer Science or a related field from a top university
4+ years' hands-on experience in Machine Learning (ML) with production LLM systems
Good fundamentals of machine learning, deep learning and LLM fine-tuning, including: understanding of transformer architectures; prompt engineering expertise; embeddings and vector search
Experience in backend API design with FastAPI, async patterns and rate limiting
Experience with vector databases, including: Pinecone, Weaviate, or Chroma; embedding storage and similarity search; hybrid search implementations
Strong programming expertise in Python is a must, including: async programming (asyncio, async/await); type hints and Pydantic; SOLID principles and design patterns
Experience in MLOps to measure and track model performance, including: MLflow for model tracking; Langfuse for LLM observability (strongly preferred); model versioning and A/B testing
Experience working with NLP and computer vision
Fluency in English
Client-first mentality
Intense work ethic
Collaborative spirit and problem-solving approach

At ZS, your growth matters. We offer a comprehensive total rewards package that supports your health and well-being, financial future, time away, and professional development. With robust skills-building programs, multiple career progression paths, internal mobility, and a deeply collaborative culture, you'll have the opportunity to do meaningful work, expand your capabilities, and thrive as part of a global community. For details on total rewards in the United States, visit ZS US office locations | Where we work | ZS.