Hybrid in NYC (Tues-Thurs in office)
Minimum 1-year contract with potential for full-time hire
3-step interview process
Compensation: market rate
Summary
We’re looking for an AI Engineer who can design and implement LLM-powered agents that integrate with internal systems to automate operational workflows. This isn’t a research role—it’s a building role. You’ll work at the intersection of software engineering and applied AI, creating reliable, observable, production-grade automations that teams across Operations and Finance can trust and use daily.
You’ll build agents that interact with enterprise data sources and APIs, design tool-use integrations via MCP servers, implement guardrails and human-in-the-loop patterns, and ensure that everything you ship is auditable and operationally sound.
This role begins as a consulting engagement with a right-to-hire path.
What You’ll Do
• Design and implement LLM-powered agents that automate operational workflows—from document processing to data validation to exception handling.
• Build tool-use integrations: connect agents to internal APIs, databases, and enterprise systems via MCP servers and structured tool definitions.
• Implement guardrails, validation layers, and human-in-the-loop patterns that ensure correctness and maintain trust in automated outputs.
• Partner with business stakeholders to identify high-value automation opportunities and translate them into scoped, deliverable agent workflows.
• Design for observability: structured logging, decision traces, cost tracking, and clear “what happened / why” visibility for every agent action.
• Build reusable patterns and frameworks for agent development—prompt management, evaluation harnesses, context assembly, and output validation.
• Stay current on LLM capabilities, API patterns, and tooling (Claude, GPT, open-source models) and make pragmatic recommendations on model selection and architecture.
• Collaborate with the architecture and engineering teams to ensure AI components integrate cleanly with the broader platform (auth, audit, data governance).
What We’re Looking For
• 7+ years of software engineering experience, with at least 2 years of hands-on work building LLM-based applications or AI-powered automation in production.
• Strong proficiency in Python and Java, with experience building production services (not just notebooks and prototypes).
• Hands-on experience with LLM APIs (Claude, OpenAI, or similar), including prompt engineering, function/tool calling, structured outputs, and context management.
• Experience building LLM agents with tool-use capabilities—MCP servers, function calling, API orchestration, and multi-step workflows.
• Strong understanding of AI safety and reliability patterns: output validation, hallucination mitigation, cost controls, rate limiting, and audit trails.
• Practical knowledge of enterprise data sources and integration patterns (REST APIs, SQL databases, messaging systems).
• Excellent engineering fundamentals: clean code, testing discipline, observability, and production-readiness.
• Strong communication skills; you can explain AI capabilities and limitations to non-technical stakeholders with clarity and honesty.
Nice to Have
• Experience with RAG (Retrieval-Augmented Generation) pipelines, vector databases, and document processing at scale.
• Familiarity with evaluation frameworks for LLM outputs (automated scoring, human-in-the-loop review, regression testing).
• Experience in financial services, operations, or control-oriented domains where accuracy and auditability are non-negotiable.
• Exposure to workflow orchestration (Temporal or similar) for managing multi-step agent processes.
Tech Environment
• Python and Java as primary languages.
• LLM APIs: Claude (Anthropic), with exposure to other providers as needed.
• MCP servers for tool-use integration; REST APIs for enterprise system connectivity.
• Azure Kubernetes Service (AKS), PostgreSQL, SQL Server, and enterprise data stores.
• GitHub Actions for CI/CD; observability tooling for agent monitoring.