
Applied AI Engineer

Norbert Health
Full-time
On-site
Brooklyn, New York, United States
The company

Norbert is building autonomous robots that deliver healthcare.

Our AI sensing platform enables existing robotic platforms to become care team members: rounding on patients, capturing vitals without contact (FDA-cleared for pulse and respiratory rate, more in the pipeline), running assessments, documenting to the EMR, and escalating when something's wrong. Autonomously.

We're not building demos. We're deployed in real facilities today, monitoring hundreds of patients daily. We're solving one of healthcare's hardest problems: a global nursing shortage projected to reach 40% by 2030.

We're a small, international team backed by top-tier VCs, with offices in Brooklyn, Paris, and Montreal. We ship things that matter.

The position

We're looking for an Applied AI Engineer to take our growing collection of foundation models and ML components from manually run, sometimes locally trained workflows to fully automated, production-grade MLOps pipelines, deployed reliably on robots in nursing facilities. We need someone who knows the model landscape cold, treats evaluation as a first-class engineering problem, and has strong opinions about when to prompt, RAG, fine-tune, swap, or buy.

You'll work across cloud and edge deployments, and some of the systems you'll touch are on a SaMD pathway, so you'll need to be comfortable shipping under regulatory constraints.

What you'll do

Integrate foundation models and ML components (VLMs, LLMs, ASR/TTS, detection/segmentation, embeddings) into our production pipelines, using both open-weight models and third-party APIs

Build RAG and agent-style orchestration for clinical reporting and conversational interfaces

Ship real-time streaming pipelines (voice agents) alongside batch and request-response workloads

Build evaluation harnesses that catch regressions across model swaps and measure performance against clinical-grade accuracy targets

Fine-tune and retrain models (LoRA, PEFT, supervised fine-tuning) using data collected from our deployed fleet

Deploy across our inference surfaces: third-party APIs, self-hosted, and on-robot edge

Build the data flywheel: pipelines that collect, label, version, and feed production data back into model improvement

Partner with the algorithms team (signal processing, computer vision) on integration with their lower-level pipelines

What we're looking for

BS in Computer Science, Engineering, or a related field, or equivalent hands-on experience

4+ years shipping ML/AI systems in production outside of academic settings

Strong working knowledge of the modern foundation model landscape (open-weight LLMs and VLMs, common detection/segmentation backbones, embedding models)

Hands-on experience with PEFT/LoRA and supervised fine-tuning

Strong Python; comfortable with the deployment toolchain (ONNX, quantization, at least one inference runtime—TensorRT, vLLM, llama.cpp, etc.)

Experience with a cloud ML training/MLOps platform (GCP Vertex AI, AWS SageMaker, Azure ML, or equivalent)

Ability to work independently, solve complex problems, and drive projects to completion

Bonus points

Edge ML deployment (Jetson, ARM, mobile NPUs)

Real-time voice AI pipelines (STT, TTS, streaming LLM)

Production RAG systems beyond toy implementations

Medical devices, SaMD, or other regulated ML environments

MLOps tooling (Weights & Biases, MLflow, DVC, etc.)

Active learning or human-in-the-loop labeling workflows

C++ for integrating with our computer vision pipeline

What we offer

Real impact: your code provides care for patients today

High autonomy and technical ownership—you'll define how we operate AI in production

Work at the intersection of cutting-edge AI, edge computing, and healthcare

A talented, diverse, and international team

Equity participation in the company's future

Cutting-edge stack: embedded AI, robotics, LLMs, multimodal sensing

Transparent, mission-driven culture focused on continuous learning

Competitive salary