AI Engineer

BLEN Corp
1 hour ago
Full-time
On-site
Washington, District of Columbia, United States
Job Description:

Design and build agentic systems: multi-step agents that plan, call tools, retrieve context, and take action with appropriate human-in-the-loop checkpoints
Build MCP servers and clients to securely expose client data, internal tools, and APIs to LLMs in a standardized, auditable way
Ship LLM-powered applications: copilots, document intelligence, search, summarization, and workflow automation
Design and maintain RAG pipelines: chunking, embeddings, vector stores, retrieval, reranking, and grounding
Integrate model APIs (OpenAI, Anthropic, Bedrock, Azure OpenAI, open-weight models) and pick the right model for the job based on quality, latency, and cost
Develop evals and observability for agents and AI features so we know what's working in production and what's regressing
Apply prompt engineering, structured outputs, function/tool calling, and guardrails to make agent behavior predictable
Write production Python backends and APIs that expose AI capabilities to web and mobile clients
Collaborate with engineers, designers, and product folks to scope what AI should (and shouldn't) do in a given product
Help shape responsible AI practices for federal use: privacy, security, auditability, and human oversight

Requirements:

5+ years of professional software engineering experience, with at least 1 year shipping LLM-based or AI-powered features to production
Hands-on experience designing or building agentic systems: tool calling, multi-step reasoning, planning loops, or agent orchestration (LangGraph, CrewAI, OpenAI Agents SDK, Claude tool use, or equivalent)
Working knowledge of the Model Context Protocol (MCP), or demonstrated ability to pick it up quickly, plus familiarity with the broader landscape of agent/tool standards
Strong Python and experience building and deploying backend services and APIs (FastAPI, Flask, or similar)
Hands-on experience with at least one major LLM provider (OpenAI, Anthropic, Bedrock, Azure OpenAI, Vertex, or open-weight models via vLLM/Ollama)
Working knowledge of RAG: embeddings, vector databases (pgvector, Pinecone, Weaviate, Qdrant, or similar), and retrieval evaluation
Comfort with prompt engineering, structured outputs (JSON mode, schemas), and tool/function calling
Experience writing evals, even lightweight ones, for non-deterministic systems
Solid SQL and experience with relational and unstructured data
Familiarity with at least one cloud platform (AWS, Azure, or GCP)
Git, code review, and modern collaborative workflows
Strong written and verbal communication: you can explain AI tradeoffs to non-technical stakeholders

Benefits:

Competitive pay
Contribution toward health benefits
Work from anywhere in the US
High-visibility federal projects with real impact
Small team where your ideas actually ship
Generous exposure to the latest AI tooling and models