Generative AI Engineer. No 3rd parties

Full-time

On-site

Raritan

No third-party candidates

Please read the following details carefully before applying.

Candidates must be direct

DO NOT RESPOND UNLESS YOU ARE DIRECT

Key Responsibilities:

High-Throughput RAG Pipeline Development:
• Design and build scalable document processing pipelines to ingest and semantically chunk large batches of documents (PDF/DOCX) from sources like Azure Blob and AWS S3.
• Integrate embedding models and tune vector databases like Milvus for high-performance, sub-100 ms k-NN retrieval.
• Implement hybrid retrieval systems using BM25 and vector search, and continually track and improve retrieval performance using metrics like MRR and recall@k.

Model Fine-Tuning & Prompt Engineering:
• Apply large language models (LLMs) and NLP techniques to solve complex problems such as named-entity recognition, question answering, and summarization.
• Build fine-tuning pipelines using frameworks like LoRA/PEFT and run hyperparameter sweeps in Azure ML.
• Author multi-step prompt chains, enforce structured JSON outputs, and use validation guards to reduce hallucinations and improve model consistency.

MLOps & Production Deployment:
• Develop and containerize agent-based microservices using frameworks like FastAPI or Azure Functions.
• Define Infrastructure as Code using Terraform/ARM and build CI/CD workflows in GitHub Actions for automated testing and canary rollouts.
• Implement robust monitoring and alerting for latency (p50/p95) and error rates using tools like Prometheus, Grafana, or Azure Monitor to ensure SLA compliance.

Performance, Cost & Standards:
• Profile API calls and implement cost-reduction strategies like batching, caching, and early-stop logits.
• Produce high-quality documentation, including architecture diagrams, sequence flows, and data schemas.
• Enforce security and compliance standards, including data encryption and PII redaction, to align with HIPAA/GxP requirements.

Required Qualifications:
• BS/MS in Computer Science, AI/ML, or a related field.
• 3+ years of experience building end-to-end LLM/RAG systems in a production environment.
• Deep Python experience, including libraries like FastAPI, pandas, and NumPy.
• Hands-on experience with LLM orchestration frameworks (LangChain, LlamaIndex), NLP libraries (HuggingFace), and OpenAI/Azure SDKs.
• Proven expertise in MLOps including CI/CD (GitHub Actions/Azure DevOps) and containerization (Docker/Kubernetes).

Preferred Qualifications (Nice-to-Haves):
• Experience working in a regulated industry such as pharmaceuticals or life sciences.
• Hands-on experience with vector databases like Milvus or Pinecone.
• Familiarity with chatbot frameworks like Rasa or Botpress.
• Experience with data-centric AI tools for validation and monitoring, such as Great Expectations or Deepchecks.
