19 Apr | NVIDIA | Toronto
Apply on Kit Job: kitjob.ca/job/2g93o8
Lead the advancement of innovative AI infrastructure as a Systems Engineer. Focus on building effective, high-performance solutions for training and serving large AI models.
In this role, you will help develop a unified platform that integrates several NVIDIA technologies, from ML compilers to cluster schedulers. You will design systems that efficiently schedule AI workloads on GPU resources across cloud environments, while addressing challenges in performance prediction and resource management. You will apply your expertise in collaboration with various technical teams.
Key Responsibilities:
• Build AI platform solutions for training and inference
• Design effective GPU scheduling systems for AI workloads
• Address challenges in resource management and performance
• Work with cross-functional teams on integrated solutions
• Handle live migration of AI workloads
Requirements:
• Bachelor’s in Computer Science or equivalent experience
• Minimum 5 years in large-scale systems design and development
• Proficient in Python, Go, Rust, or C/C++ programming
• Familiar with container technologies like Kubernetes
• Strong foundational knowledge in algorithms and AI applications
Deploy your skills to enhance our AI platform by developing advanced scheduling and resource management systems.
📌 Systems Engineer for AI Infrastructure (Toronto)
🏢 NVIDIA
📍 Toronto