19 Apr | Doghouse Recruitment | Toronto
Apply on Kit Job: kitjob.ca/job/2g9j3t
Transform AI infrastructure as an AI/ML Architect focused on large-scale GPU systems. This fully remote opportunity involves hands-on work with distributed training and GPU workload optimization.
In this technical role, you'll collaborate with a team of experts to build scalable GPU solutions. Responsibilities include designing production-grade systems, troubleshooting ML workloads, and delivering technical insights to clients. Your work will directly impact operational efficiency and performance in real-world environments, making significant contributions to AI projects.
Key Responsibilities:
• Create scalable distributed training systems for large GPU clusters
• Optimize and debug ML workloads across customer environments
• Guide GPU performance strategies and technical specifications
• Work closely with product and engineering teams
• Actively engage with clients for hands-on implementation
Requirements:
• Proven experience in production-level multi-node GPU workloads
• Strong knowledge of distributed deep learning methodologies
• Understanding of advanced GPU architecture
• Experience with Kubernetes or Slurm for workload management
• Skilled in performance tuning and monitoring for GPUs
Drive innovation and efficiency in AI workloads through expert architectural design and engagement.
📌 AI/ML Architect for GPU Systems (Toronto)