Role: AI Engineer – Voice & Conversational Systems
Location: Plano, TX - Hybrid
Position Overview:
We are seeking an experienced AI Engineer to design, build, and deploy next-generation conversational AI and real-time voice agents. In this role, you will bridge the gap between advanced Large Language Models (LLMs) and real-world telecommunication systems. You will be responsible for building ultra-low-latency voice pipelines, integrating interactive voice response (IVR) systems, implementing robust agent tool-calling frameworks via the Model Context Protocol (MCP), and ensuring system safety through rigorous evaluation and guardrails.
Key Responsibilities:
Voice Agent Development:
Design, optimize, and deploy end-to-end voice agents and real-time conversational pipelines, ensuring minimal latency and high contextual accuracy.
IVR & Telephony Integration:
Connect AI voice agents seamlessly with Contact Center IVR systems to automate customer interactions.
Context & Tool Orchestration:
Use MCP (Model Context Protocol) and the FastMCP framework to give AI models structured, secure access to data sources and enterprise tools.
Model Selection & Optimization:
Architect solutions leveraging state-of-the-art LLMs, including OpenAI GPT models and AWS Nova models via AWS Bedrock.
Speech Processing Pipelines:
Implement and fine-tune Speech-to-Text (STT) and Text-to-Speech (TTS) pipelines using Deepgram and ElevenLabs.
System Evaluation & Safety:
Establish evaluation frameworks (Evals) to measure agent performance and implement Guardrails to ensure deterministic, safe, and compliant model outputs.
Cloud Infrastructure:
Apply a working understanding of scalable AI microservices built with Python, API Gateway, and AWS S3 storage.
Required Technical Skills:
Core AI & Frameworks:
Strong proficiency in Python and standard AI/ML frameworks.
Hands-on experience with MCP (Model Context Protocol) and FastMCP for standardizing how context and tools are exposed to models.
Large Language Models (LLMs):
Experience deploying and prompting *OpenAI GPT models* and AWS Nova models.
Deep understanding of *AWS Bedrock* and experience orchestrating multi-step workflows with *AWS Strands*.
Voice & Audio Tech:
Speech-to-Text (STT): Production experience with *Deepgram* or similar real-time streaming audio tools.
Text-to-Speech (TTS): Experience generating natural, low-latency speech via *ElevenLabs*.
Production & Infrastructure:
Familiarity with integrating AI pipelines into traditional *Contact Center IVR systems*.
Experience building robust REST and WebSocket APIs using AWS *API Gateway* and managing data persistence in *S3 buckets*.
AI Quality & Safety:
Proven experience building *Evals* to benchmark model accuracy, latency, and hallucination rates.
Experience configuring *Guardrails* (e.g., input/output filtering, PII masking, safety alignment).
Preferred Qualifications:
Background in computational linguistics, audio signal processing, or real-time streaming protocols (WebSockets, WebRTC).
Experience tuning prompts specifically for voice/conversational contexts (where brevity and conversational pacing matter).
Familiarity with agile software development and CI/CD pipelines for AI workloads.