Role: Lead AI Engineer, Retirement & Wealth Domain
Location: Boston, MA or Windsor, CT
Key Skills: AI, LLM, API, MLOps, Retirement & Wealth Domain
Experience: 10+ years
Mode of Hire: Full Time
Experience
• 10+ years of progressive software engineering experience with sustained hands-on contributions (aligned with the Citi C14/SVP benchmark for this level).
• 3+ years of dedicated experience building LLM-based systems and agentic architectures in production environments — not research or notebook work.
• Proven success architecting and delivering multiple enterprise-scale AI solutions into production; can speak to architecture decisions, failure modes encountered, and how systems were improved post-launch.
• Prior lead or staff-level role: set technical direction, owned critical systems end-to-end, influenced engineering practices across a team.
• Experience delivering AI systems in a regulated environment (financial services, healthcare, or similar) with compliance, audit trail, and governance requirements.
Programming & Core Engineering
• Rust (required, expert level): production systems development including memory safety, async programming with Tokio, error handling patterns, trait design, and testing — used for performance-critical AI service layers, data pipelines, and backend infrastructure.
• TypeScript / Node.js (required): production API services, async/await patterns, type-safe API contracts, and React-based front-end interfaces for advisor and participant-facing tools; full-stack TypeScript capability is expected, not optional.
• Solana / Solana programs (required): smart contract development using Anchor or native Solana program model; familiarity with Solana’s account model, transaction structure, and program-derived addresses (PDAs) as they apply to on-chain financial data and tokenized retirement or investment products.
• Software engineering fundamentals: system design, CI/CD pipeline ownership, testing strategy (unit, integration, contract, eval), resiliency patterns, security practices for AI services, and operational stability.
• API development: RESTful and event-driven API design using TypeScript/Node.js or Rust (Axum, Actix, or equivalent); authentication, rate limiting, versioning, and API contracts for AI services consumed by downstream systems.
• Data engineering: complex SQL proficiency; data pipeline construction in Rust or TypeScript, alongside transformation and orchestration tools (dbt, Airflow, Prefect, or equivalent); working with structured financial data at scale; experience with Snowflake, Spark, or similar.
• Front-end capability: React with TypeScript to build production-quality interfaces for advisor and participant-facing AI tools; front-end work need not be a specialization, but full ownership of the UI layer is expected.
• Databases: vector databases (Pinecone, Weaviate, pgvector, OpenSearch); relational (PostgreSQL, SQL Server); document (MongoDB); caching (Redis).
LLM & Generative AI Engineering — Required
• Production LLM integration: hands-on experience with OpenAI GPT-4o, Anthropic Claude, Google Gemini/Gemma, and/or AWS Bedrock in user-facing production applications — not just API experimentation.
• RAG system design and implementation: vector store selection and configuration, chunking and embedding strategies, hybrid search, re-ranking, and rigorous evaluation (RAGAS, custom eval frameworks, or equivalent).
• Prompt engineering at an engineering level: system prompt design for financial services safety constraints, few-shot construction, structured output extraction (JSON/XML), prompt version control, and regression testing.
• Agentic AI architecture: tool use and function calling; multi-step reasoning chains; agent orchestration frameworks (LangGraph, LangChain, Google ADK, AutoGen, CrewAI, or custom implementations); MCP (Model Context Protocol) server design and integration for financial data sources.
• LLM evaluation: building eval suites for correctness, hallucination, instruction-following, and task-specific quality; LLM-as-judge patterns; adversarial robustness testing for financial advice contexts.
• Output validation and safety layers: guardrails, output parsers, confidence scoring, fallback logic, and human-in-the-loop escalation patterns for production AI systems handling regulated financial outputs.
• ML frameworks: working knowledge of TensorFlow and PyTorch, sufficient to fine-tune, evaluate, and integrate transformer-based models; building models from scratch is not required, but candidates must understand model mechanics well enough to make architecture decisions.
Cloud, Infrastructure & MLOps
• Cloud platforms: production experience on AWS, Azure, or GCP — AI/ML services (SageMaker, Azure ML, Vertex AI), serverless compute, managed databases, and storage.
• Containerization and orchestration: Docker (required); Kubernetes working knowledge; experience deploying AI inference services in containerized environments with auto-scaling.
• MLOps: experiment tracking (MLflow, Weights & Biases, or equivalent); model versioning; deployment pipelines for AI systems; CI/CD for model updates with automated quality gates.
• Observability: logging, tracing, and metrics for AI services (Datadog, CloudWatch, OpenTelemetry, or equivalent); building dashboards and alerts for model quality, hallucination rates, and system health.