AI Engineering Lead

Vichara Technologies, Inc.
Full-time
On-site
Ridgewood
$125,000 - $150,000 USD yearly
Compensation: USD 200,000 - USD 300,000 yearly

Company Description

Vichara is a products and services firm focused on financial services. It is headquartered in NY and builds systems for some of the largest investment banks and hedge funds in the world.

Job Description

Key Responsibilities

Architect, design, and lead multi-agent LLM systems using LangGraph, LangChain, and Promptfoo for prompt lifecycle management and benchmarking.

Build Retrieval-Augmented Generation (RAG) pipelines leveraging hybrid vector search (dense + keyword) using LanceDB, Pinecone, or Elasticsearch.

Define system workflows for summarization, query routing, retrieval, and response generation, ensuring minimal latency and high precision.

Develop RAG evaluation frameworks combining retrieval precision/recall, hallucination detection, and latency metrics, aligned with analyst and business use cases.

Integrate GPT-4o, PaLM 2, and open-weight models (LLaMA, Mistral) for task-specific contextual Q&A.

Fine-tune transformer models (BERT, SentenceTransformers) for document classification, summarization, and sentiment analysis.

Manage prompt routing and variant testing using Promptfoo or equivalent tools.

Implement multi-agent architectures with modular flows, enabling task-specific agents for summarization, retrieval, classification, and reasoning.

Design fallback and recovery behaviors to ensure robustness in production.

Employ LangGraph for parallel and stateful agent orchestration, error recovery, and deterministic flow control.

Architect ingestion pipelines for structured and unstructured data, including financial statements, filings, and PDF documents.

Leverage MongoDB for metadata storage and Redis Streams for async task execution and caching.

Implement vector-based search and retrieval layers for high-throughput and low-latency AI systems.

Observability & Production Deployment

Deploy end-to-end AI systems on AWS EKS / Azure Kubernetes Service, integrated with CI/CD pipelines (Azure DevOps).

Build comprehensive monitoring dashboards using OpenTelemetry and SigNoz, tracking latency, retrieval precision, and application health.

Enforce testing and regression validation using golden datasets and structured assertion checks for all LLM responses.

Collaborate with DevOps, MLOps, and application development teams to integrate AI APIs with React/FastAPI-based user interfaces.

Work with business analysts to translate credit, compliance, and customer-support requirements into actionable AI agent workflows.

Mentor a small team of GenAI developers and data engineers in RAG, embeddings, and orchestration techniques.

Qualifications

Experience: 5+ years as an AI or ML Engineer

Required Skills & Experience

RAG Frameworks: LanceDB, Pinecone, Elasticsearch, FAISS, MongoDB

Agentic AI: LangGraph multi-agent orchestration, routing logic, task decomposition

Fine-Tuning: BERT / domain-specific transformer tuning, evaluation framework design

Knowledge of reranker-based retrieval (MiniLM / CrossEncoder)

Familiarity with prompt evaluation and scoring (BLEU, ROUGE, Faithfulness)

Domain exposure to Credit Risk, Banking, and Investment Analytics

Experience with RAG benchmark automation and model evaluation dashboards

Additional Information

Job Location: Ridgewood (on-site)