Curated careers, resources, tips and trends from the DevOps World.
Job Opportunity: This position was originally posted by RemoteOK DevOps Jobs.
Remote
USA
NVIDIA
NVIDIA is seeking a Remote Senior Site Reliability Engineer focused on ML platforms. The ideal candidate will enhance system performance and ensure the reliability of the company's cutting-edge machine learning infrastructure. The role involves collaborating with data scientists and engineers to design scalable, reliable systems while monitoring service health and responding to incidents promptly.
The position requires a deep understanding of cloud architecture, CI/CD methodologies, and strong programming skills in Python, Go, or similar languages. Candidates should have experience with infrastructure as code (IaC) tools like Terraform or Ansible, and familiarity with Kubernetes for orchestration. Strong analytical and problem-solving skills are essential for optimizing system performance and reliability.
This remote role offers competitive compensation, comprehensive benefits, and the opportunity to work in a dynamic environment that values innovation and professional growth. NVIDIA emphasizes a collaborative culture that encourages continuous learning and development. The position is open to candidates across the USA and combines location flexibility with the chance to contribute to groundbreaking technology in the ML domain.
Made with pure grit © 2024 Jetpack Labs Inc. All rights reserved. www.jetpacklabs.com