No third-party candidates.
Please read the following details carefully before applying.
Candidates must apply directly. Do not respond unless you are a direct applicant.
Key Responsibilities:
High-Throughput RAG Pipeline Development:
• Design and build scalable document processing pipelines to ingest and semantically chunk large batches of documents (PDF/DOCX) from sources like Azure Blob and AWS S3.
• Integrate embedding models and tune vector databases like Milvus for high-performance, sub-100 ms k-NN retrieval.
• Implement hybrid retrieval systems using BM25 and vector search, and continually track and improve retrieval performance using metrics like MRR and recall@k.
Model Fine-Tuning & Prompt Engineering:
• Apply large language models (LLMs) and NLP techniques to solve complex problems such as named-entity recognition, question answering, and summarization.
• Build fine-tuning pipelines using parameter-efficient methods such as LoRA (for example, via the PEFT library) and run hyperparameter sweeps in Azure ML.
• Author multi-step prompt chains, enforce structured JSON outputs, and use validation guards to reduce hallucinations and improve model consistency.
MLOps & Production Deployment:
• Develop and containerize agent-based microservices using FastAPI or Azure Functions.
• Define Infrastructure as Code using Terraform or ARM templates and build CI/CD workflows in GitHub Actions for automated testing and canary rollouts.
• Implement robust monitoring and alerting for latency (p50/p95) and error rates using tools like Prometheus, Grafana, or Azure Monitor to ensure SLA compliance.
Performance, Cost & Standards:
• Profile API calls and implement cost-reduction strategies like batching, caching, and early stopping during generation.
• Produce high-quality documentation, including architecture diagrams, sequence flows, and data schemas.
• Enforce security and compliance standards, including data encryption and PII redaction, to align with HIPAA/GxP requirements.
Required Qualifications:
• BS/MS in Computer Science, AI/ML, or a related field.
• 3+ years of experience building end-to-end LLM/RAG systems in a production environment.
• Deep Python experience, including libraries like FastAPI, pandas, and NumPy.
• Hands-on experience with LLM orchestration frameworks (LangChain, LlamaIndex), NLP libraries (Hugging Face), and OpenAI/Azure SDKs.
• Proven expertise in MLOps, including CI/CD (GitHub Actions/Azure DevOps) and containerization (Docker/Kubernetes).
Preferred Qualifications (Nice-to-Haves):
• Experience working in a regulated industry such as pharmaceuticals or life sciences.
• Hands-on experience with vector databases like Milvus or Pinecone.
• Familiarity with chatbot frameworks like Rasa or Botpress.
• Experience with data-centric AI tools for validation and monitoring, such as Great Expectations or Deepchecks.