Generative AI Engineer. No 3rd parties

C&G Consulting Services Inc
Full-time
On-site
Raritan
NO THIRD-PARTY CANDIDATES

Candidates must be direct

DO NOT RESPOND UNLESS YOU ARE DIRECT

Key Responsibilities:

High-Throughput RAG Pipeline Development:
• Design and build scalable document processing pipelines to ingest and semantically chunk large batches of documents (PDF/DOCX) from sources like Azure Blob and AWS S3.
• Integrate embedding models and tune vector databases like Milvus for high-performance, sub-100 ms k-NN retrieval.
• Implement hybrid retrieval systems using BM25 and vector search, and continually track and improve retrieval performance using metrics like MRR and recall@k.

Model Fine-Tuning & Prompt Engineering:
• Apply large language models (LLMs) and NLP techniques to solve complex problems such as named-entity recognition, question answering, and summarization.
• Build fine-tuning pipelines using frameworks like LoRA/PEFT and run hyperparameter sweeps in Azure ML.
• Author multi-step prompt chains, enforce structured JSON outputs, and use validation guards to reduce hallucinations and improve model consistency.

MLOps & Production Deployment:
• Develop and containerize agent-based microservices using frameworks like FastAPI or Azure Functions.
• Define Infrastructure as Code using Terraform/ARM and build CI/CD workflows in GitHub Actions for automated testing and canary rollouts.
• Implement robust monitoring and alerting for latency (p50/p95) and error rates using tools like Prometheus, Grafana, or Azure Monitor to ensure SLA compliance.

Performance, Cost & Standards:
• Profile API calls and implement cost-reduction strategies like batching, caching, and early-stop logits.
• Produce high-quality documentation, including architecture diagrams, sequence flows, and data schemas.
• Enforce security and compliance standards, including data encryption and PII redaction, to align with HIPAA/GxP requirements.

Required Qualifications:
• BS/MS in Computer Science, AI/ML, or a related field.
• 3+ years of experience building end-to-end LLM/RAG systems in a production environment.
• Deep Python experience, including libraries like FastAPI, pandas, and NumPy.
• Hands-on experience with LLM orchestration frameworks (LangChain, LlamaIndex), NLP libraries (HuggingFace), and OpenAI/Azure SDKs.
• Proven expertise in MLOps including CI/CD (GitHub Actions/Azure DevOps) and containerization (Docker/Kubernetes).

Preferred Qualifications (Nice-to-Haves):
• Experience working in a regulated industry such as pharmaceuticals or life sciences.
• Hands-on experience with vector databases like Milvus or Pinecone.
• Familiarity with chatbot frameworks like Rasa or Botpress.
• Experience with data-centric AI tools for validation and monitoring, such as Great Expectations or Deepchecks.