Groq delivers fast, reliable AI inference. Our LPU-based system powers GroqCloud™, giving businesses and developers the speed and scale they need. Headquartered in Silicon Valley, we’re on a mission to make high-performance AI compute more accessible and affordable. When real-time AI is within reach, anything is possible.
Responsibilities
- Design, implement, and optimize large-scale multi-cluster Kubernetes deployments supporting mission-critical workloads.
- Build Kubernetes controllers and operators in Go to support continuous deployment strategies for model instances and production workloads (a minimal controller sketch follows this list).
- Implement advanced deployment patterns (blue/green, canary, progressive delivery) to ensure safe and reliable production rollouts; see the canary-gate sketch after this list.
- Drive GitOps practices using, and building on top of, Flux (preferred) or ArgoCD, ensuring reproducible, declarative, and auditable deployments.
- Build observability into every deployment, leveraging Prometheus, VictoriaMetrics, Grafana, and OpenTelemetry for metrics, logging, and tracing; a small metrics-export sketch follows this list.
- Architect automated rollback, health checks, and failover mechanisms to maximize uptime and deployment confidence.
- Operate and optimize deployments across multiple regions, clusters, and heterogeneous workloads.
- Partner with application, platform, and infrastructure engineers to align deployment best practices across the organization.
- Drive standards for deployment reliability, mentor peers on Kubernetes and GitOps practices, and raise the bar for automation across engineering.
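For illustration, here is a minimal sketch of the kind of Kubernetes controller this role involves, built on controller-runtime. The `ModelInstanceReconciler` name and the choice to reconcile `Deployment` objects are assumptions for the example, not Groq's actual design.

```go
package main

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/log"
)

// ModelInstanceReconciler is a hypothetical reconciler that would keep
// model-serving Deployments in their desired state.
type ModelInstanceReconciler struct {
	client.Client
}

func (r *ModelInstanceReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	logger := log.FromContext(ctx)

	var deploy appsv1.Deployment
	if err := r.Get(ctx, req.NamespacedName, &deploy); err != nil {
		// Object may have been deleted; nothing to reconcile.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// ... compare observed state to desired state and converge here ...
	logger.Info("reconciled", "deployment", req.NamespacedName)
	return ctrl.Result{}, nil
}

func main() {
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{})
	if err != nil {
		panic(err)
	}
	// Watch Deployments and route events to the reconciler above.
	if err := ctrl.NewControllerManagedBy(mgr).
		For(&appsv1.Deployment{}).
		Complete(&ModelInstanceReconciler{Client: mgr.GetClient()}); err != nil {
		panic(err)
	}
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
```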
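Progressive delivery of the kind described above typically gates each promotion step on health signals, with automated rollback as the failure path. The sketch below uses only the Go standard library; the `/healthz` URL and the probe count and interval are illustrative assumptions.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// checkCanary is a hypothetical promotion gate for a canary rollout: it
// polls the canary's health endpoint for a window and only signals
// "promote" if every probe succeeds; otherwise the pipeline rolls back.
func checkCanary(url string, probes int, interval time.Duration) bool {
	client := &http.Client{Timeout: 2 * time.Second}
	for i := 0; i < probes; i++ {
		resp, err := client.Get(url)
		if err != nil {
			return false // probe failed to connect
		}
		ok := resp.StatusCode == http.StatusOK
		resp.Body.Close()
		if !ok {
			return false // any failed probe vetoes promotion
		}
		time.Sleep(interval)
	}
	return true
}

func main() {
	// Endpoint and thresholds are examples, not Groq specifics.
	if checkCanary("http://canary.internal/healthz", 10, 3*time.Second) {
		fmt.Println("canary healthy: promote")
	} else {
		fmt.Println("canary unhealthy: roll back")
	}
}
```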
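Deployment tooling feeds dashboards by exporting its own metrics. This sketch uses the Prometheus Go client; the `deployments_total` counter and its `outcome` label are hypothetical names for the example.

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// deploysTotal is an illustrative counter a deployment controller might
// export so rollout activity shows up in Prometheus/Grafana dashboards.
var deploysTotal = promauto.NewCounterVec(
	prometheus.CounterOpts{
		Name: "deployments_total",
		Help: "Deployments processed, labeled by outcome.",
	},
	[]string{"outcome"},
)

func main() {
	deploysTotal.WithLabelValues("success").Inc()

	// Expose the /metrics endpoint for Prometheus to scrape.
	http.Handle("/metrics", promhttp.Handler())
	if err := http.ListenAndServe(":9090", nil); err != nil {
		panic(err)
	}
}
```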
Benefits
- Comprehensive compensation package
- Equity
- Benefits