Platform Fullstack AI Engineer
Palo Alto, CA and Santa Clara, CA - Local candidates preferred, as the client requires an in-person interview
Long-term contract
Own end-to-end delivery of major platform initiatives—from architecture and design through deployment and post-launch optimization
Take deep technical ownership of Kubernetes environments, including cluster management, networking, operators, container lifecycle, and multi-tenant orchestration
Design and build scalable, reliable distributed systems and cloud-native infrastructure on AWS and/or GCP
Drive engineering excellence through code quality, design reviews, automation, and CI/CD best practices
Collaborate cross-functionally with Product, AI, and Security teams to align technical solutions with business objectives
Mentor engineers and guide architectural decisions, trade-offs, and delivery approaches
Partner with leadership to shape engineering strategy, roadmap planning, and platform evolution
The experience expected from applicants, as well as additional skills and qualifications needed for this job are listed below.
Required Qualifications
6+ years of experience in software engineering, with a strong focus on backend systems and infrastructure
Proficiency in Python and/or Go, with a track record of delivering production-grade systems
Deep, hands-on experience with Kubernetes, including building and operating clusters in production environments
Proven expertise in designing and managing distributed systems at scale
Strong experience with cloud platforms (AWS and/or GCP), including compute, networking, storage, and IAM
Experience with Infrastructure as Code (Terraform or similar) and CI/CD pipelines
Familiarity with applied AI tools and ecosystems, such as agent frameworks, AI gateways (e.g., LiteLLM), or models like Claude
Strong system design skills and architectural decision-making capability
Excellent communication and collaboration skills across engineering, product, and security teams
Preferred Qualifications
Experience with observability tools such as Prometheus, Grafana, Datadog, and OpenTelemetry
Exposure to multi-cloud or hybrid infrastructure environments
Knowledge of API gateways, AI gateways, and policy frameworks (e.g., ABAC, OPA)
Experience in service mesh architectures or platform-as-a-service design
Demonstrated ability to improve engineering productivity and operational efficiency at scale