The impact you’ll make
Set the technical architecture for our agent-powered platform, choosing the right mix of LLMs, retrieval tech, and micro-service patterns to meet reliability, cost, and latency targets.
Drive architecture for high-throughput APIs built with .NET, C#, Python 3.11+, FastAPI async, SQLModel, and Semantic Kernel, from design docs through production roll-out.
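For a sense of the stack, here is a minimal sketch of an async FastAPI + SQLModel service; the model, table, and endpoint paths are illustrative examples, not our actual codebase:

```python
# Illustrative sketch only: a minimal async FastAPI + SQLModel service.
# Model name, table, and endpoint paths are hypothetical examples.
from fastapi import FastAPI, HTTPException
from sqlmodel import Field, Session, SQLModel, create_engine

class Item(SQLModel, table=True):
    id: int | None = Field(default=None, primary_key=True)
    name: str
    price: float

engine = create_engine("sqlite:///demo.db")  # swap for Postgres in production
SQLModel.metadata.create_all(engine)

app = FastAPI()

@app.post("/items", response_model=Item)
async def create_item(item: Item) -> Item:
    with Session(engine) as session:
        session.add(item)
        session.commit()
        session.refresh(item)
        return item

@app.get("/items/{item_id}", response_model=Item)
async def read_item(item_id: int) -> Item:
    with Session(engine) as session:
        item = session.get(Item, item_id)
        if item is None:
            raise HTTPException(status_code=404, detail="Item not found")
        return item
```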
Lead multi-agent orchestration (handoff, sequential, concurrent) that combines knowledge-grounded and tool-calling agents across OpenAI GPT-5/4.1/mini, Google Gemini 2.5 Flash, and future providers.
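As a rough, framework-agnostic illustration of the sequential and concurrent patterns above (not the Semantic Kernel API itself), agents can be modeled as async callables that pass context along or run in parallel:

```python
# Illustrative orchestration sketch, independent of any specific framework.
# Agent names and their stubbed logic are hypothetical.
import asyncio
from typing import Awaitable, Callable

Agent = Callable[[str], Awaitable[str]]

async def research_agent(task: str) -> str:
    # A knowledge-grounded agent would do retrieval + an LLM call here.
    return f"notes for: {task}"

async def writer_agent(notes: str) -> str:
    # A tool-calling or drafting agent would run here.
    return f"draft based on ({notes})"

async def run_sequential(task: str, agents: list[Agent]) -> str:
    """Hand each agent's output to the next one (sequential / handoff pattern)."""
    result = task
    for agent in agents:
        result = await agent(result)
    return result

async def run_concurrent(task: str, agents: list[Agent]) -> list[str]:
    """Fan the same task out to several agents at once (concurrent pattern)."""
    return await asyncio.gather(*(agent(task) for agent in agents))

if __name__ == "__main__":
    print(asyncio.run(run_sequential("summarize Q3 churn", [research_agent, writer_agent])))
```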
Guide teams implementing Retrieval-Augmented Generation on Azure AI Search, pgvector, and Chroma, ensuring index quality, filtering, and safety checks.
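A toy sketch of the retrieval step on Chroma, one of the stores listed above; the collection, documents, and metadata filter are made up, and a real pipeline would add reranking and safety checks on the retrieved chunks:

```python
# Toy RAG retrieval sketch on Chroma; data and filters are hypothetical.
import chromadb

client = chromadb.Client()  # in-memory; use a persistent client in production
collection = client.create_collection(name="kb_articles")

collection.add(
    ids=["a1", "a2"],
    documents=[
        "How to reset your account password.",
        "Steps to configure DNS for a custom domain.",
    ],
    metadatas=[{"product": "hosting"}, {"product": "domains"}],
)

# Retrieve top matches, restricted by metadata to keep answers on-topic.
results = collection.query(
    query_texts=["my domain is not resolving"],
    n_results=2,
    where={"product": "domains"},
)
context = "\n".join(results["documents"][0])
# The context would then be injected into the LLM prompt, after safety checks.
```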
Own end-to-end CI/CD pipelines (Bitbucket Pipelines, Jenkins, or similar): lint, type-check, security scan, test, containerize, and deploy.
Mentor and hire engineers; adopt AI coding agents (Cursor, Claude Code) as force multipliers while safeguarding code quality.
Champion observability and FinOps for LLM workloads: structlog JSON, OTEL tracing, LangFuse, cost dashboards; keep p95 latency within target.
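A minimal sketch of what that instrumentation could look like, assuming structlog and OpenTelemetry; the event names, span attributes, and stubbed model call are illustrative:

```python
# Sketch of JSON logging plus a trace span around an LLM call.
# Event names, attributes, and the stubbed provider call are illustrative.
import time
import structlog
from opentelemetry import trace

structlog.configure(
    processors=[
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ]
)
log = structlog.get_logger()
tracer = trace.get_tracer("llm-platform")

def call_llm(prompt: str) -> str:
    with tracer.start_as_current_span("llm.completion") as span:
        start = time.perf_counter()
        response = "stubbed model output"  # real provider call goes here
        latency_ms = (time.perf_counter() - start) * 1000
        span.set_attribute("llm.latency_ms", latency_ms)
        log.info("llm_call", prompt_chars=len(prompt), latency_ms=latency_ms)
        return response

call_llm("Summarize this support ticket.")
```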
Partner with Product and Security to translate business goals, compliance needs, and user feedback into a pragmatic technical roadmap.
5+ years building and scaling production back-ends; 2+ years in a tech-lead or staff role.
Expert in REST API design and development (Python FastAPI or .NET C# APIs), with experience in dependency-injection patterns, middleware, profiling, and optimization.
Hands-on leader with Semantic Kernel (or equivalent agent/LLM frameworks) and production LLM orchestration.
Delivered at least one RAG system/pipeline using a vector store (Azure AI Search, pgvector, Chroma or equivalent) with measurable latency/quality KPIs.
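By way of example, latency and quality KPIs of that kind can be computed from logged query results along these lines; the sample numbers below are invented purely for illustration:

```python
# Toy KPI sketch: p95 retrieval latency and hit-rate@k over logged queries.
# The sample data is made up for illustration only.
import statistics

latencies_ms = [112, 98, 143, 380, 120, 101, 97, 250, 130, 115]
hits_at_k = [True, True, False, True, True, True, True, False, True, True]

p95 = statistics.quantiles(latencies_ms, n=100)[94]  # 95th percentile cut point
hit_rate = sum(hits_at_k) / len(hits_at_k)
print(f"p95 latency: {p95:.0f} ms, hit-rate@k: {hit_rate:.0%}")
```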
Deep Postgres expertise plus SQLModel / SQLAlchemy 2 and Alembic migrations at scale.
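For reference, a backwards-compatible Alembic migration at that level is sketched below; the table, column, and revision ids are placeholders:

```python
"""Hypothetical Alembic migration sketch; table, column, and revision ids are placeholders."""
from alembic import op
import sqlalchemy as sa

revision = "0002_add_sku"
down_revision = "0001_initial"
branch_labels = None
depends_on = None

def upgrade() -> None:
    # Add a nullable, indexed column so the deploy stays backwards compatible.
    op.add_column("items", sa.Column("sku", sa.String(length=64), nullable=True))
    op.create_index("ix_items_sku", "items", ["sku"])

def downgrade() -> None:
    op.drop_index("ix_items_sku", table_name="items")
    op.drop_column("items", "sku")
```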
Proven track record integrating multiple LLM providers and implementing structured-output + tool-calling workflows.
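As one illustration of such a workflow, a single tool-calling request with the OpenAI Python SDK might look like the following; the tool name, JSON schema, and model choice are examples only:

```python
# Hedged example of a tool-calling request with the OpenAI Python SDK.
# Tool name, schema, and model choice are illustrative, not prescriptive.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "lookup_order",
        "description": "Fetch an order's status by id.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content": "Where is order 1234?"}],
    tools=tools,
)

# If the model chose to call the tool, its arguments arrive as structured JSON.
tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)
print(tool_call.function.name, args)
```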
Fluency with Poetry, Docker, GitHub Actions (or Azure/Argo), IaC basics, and blue/green or canary release strategies.
Strong people-leadership: code reviews, technical mentoring, roadmap planning, and cross-team communication.
Message-queue architectures (RabbitMQ, Kafka, Service Bus).
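For instance, a producer publishing a durable task message to RabbitMQ via pika might look like this; the queue name and payload are hypothetical:

```python
# Hypothetical RabbitMQ publish via pika; queue name and payload are examples.
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="agent-tasks", durable=True)

channel.basic_publish(
    exchange="",
    routing_key="agent-tasks",
    body=json.dumps({"task": "reindex", "tenant": "demo"}).encode(),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
connection.close()
```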
Experience with GPU inference fleets or serverless model hosting.
Familiarity with cost-aware prompt engineering and automatic evaluation pipelines.
You’ll steer the core brains of our AI products—building an LLM-agnostic, agent-driven platform that delights millions of users while meeting enterprise-grade reliability and governance. If you thrive on big technical bets, love mentoring, and want room to shape both code and culture, we’d love to meet you.
Employment with Newfold Digital is at-will and nothing in this Job Description should be interpreted or construed to alter the at-will employment relationship.
This Job Description includes the essential job functions required to perform the job described above, as well as additional duties and responsibilities. This Job Description is not an exhaustive list of all functions that the employee performing this job may be required to perform. The Company reserves the right to revise the Job Description at any time, and to require the employee to perform functions in addition to those listed above.