Position - Senior AI Engineer
Type - Full Time
Location - Seattle WA
Accepting Candidates
We are looking for a senior generalist engineer who is energized by designing and building modern AI systems: agents, retrieval pipelines, evaluation harnesses, and the backend services that hold them together. You are passionate about well-architected software, you make pragmatic build-vs-buy and self-host-vs-API calls, and you reach for the simplest approach that actually solves the problem. You will work on projects with real-world impact for global public good, integrated into a collaborative, mission-driven team alongside IDM researchers and their internal and external collaborators. You will be essential in turning AI ideas into working systems: agents, services, evaluations, and the connective tissue that lets a small team ship a lot.
This is a 36-month limited-term position based in Seattle, WA. Relocation will be provided.
What You’ll Do
Design and build agentic AI systems — multi-step workflows, tool use, and autonomous task orchestration using modern LLM frameworks — and ship them as reliable backend services.
Build and operate retrieval systems: ingestion, chunking, embeddings, vector search, and knowledge-graph-backed retrieval where it earns its keep.
Design AI evaluation pipelines and benchmarks so the team can tell whether an agent, model, or retrieval system is actually getting better — and monitor them in production.
Architect, implement, and maintain scalable backend services and APIs that other engineers, researchers, and applications build on top of.
Make senior-level architectural calls — model choice, hosting (Azure OpenAI vs. self-hosted), framework selection, and infrastructure tradeoffs — and mentor other engineers on AI application patterns.
Develop data pipelines and workflows leveraging Azure (Azure AI Foundry, Azure OpenAI, Azure AI Search, Azure Databricks) and Hugging Face.
Harden successful prototypes into production: tests, observability, cost and latency monitoring, failure handling, and documentation that let other people trust and reuse them.
Collaborate directly with researchers, analysts, and program staff to translate fuzzy domain problems into shippable systems for global health and global development.
Identify knowledge, data, or tooling gaps in the settings in which we work and propose pragmatic solutions.
Your Experience
Bachelor’s degree in a technical field with 5+ years building production software, or equivalent experience. Advanced degree is a plus, not required.
Strong general-purpose backend engineering: you can pick up unfamiliar code, debug across systems, and ship services that hold up in use.
Proficiency in Python, including for AI work (e.g., PyTorch, Hugging Face, or similar).
Hands-on experience building LLM-powered applications: retrieval-augmented generation (RAG), agentic workflows, tool use, and prompt engineering at production scale.
Experience designing and operating backend APIs and services in cloud environments — ideally Azure, but AWS or GCP equivalents are fine.
Experience building AI evaluations and observability — measuring quality, cost, and latency of LLM systems and acting on the results.
Hands-on experience with data pipelines and ETL, MLOps/AI Ops workflows, and cloud data services.
Experience with Git, CI/CD, containerization (Docker), infrastructure-as-code, and broader DevOps practices.
Comfort making and defending architectural tradeoffs (managed service vs. self-host, fine-tune vs. prompt, agent vs. workflow) and mentoring others through them.
Comfort working directly with researchers and non-engineers — able to translate fuzzy problems into concrete software, and to push back when the simplest answer is “we don’t need to build that.”
Track record of taking projects from prototype to something other people rely on.
Other Attributes
Experience with vector databases, knowledge graphs, or information architecture for AI applications.
Experience with fine-tuning, distillation, or other model adaptation techniques.
Exposure to scientific, public health, geospatial, or climate-related datasets.
Engagement with the open-source AI community.
Ability to stand up lightweight interactive demos (Streamlit, Gradio) when needed to show work to non-engineering stakeholders.
Publications, patents, or other public artifacts that show depth in an area — but not required and not weighted over shipped work.