19 Apr | Hiive | Vancouver
Apply on Kit Job: kitjob.ca/job/2g8mkg
Drive reliability and performance as an Infrastructure Engineer. Your efforts will support incident response, automation, and AI workloads to enhance platform availability and effectiveness.
This role involves working with a growing infrastructure team to ensure system reliability and availability. You'll be actively engaged in incident management and optimizing infrastructure processes. Additionally, your collaboration across teams will foster an excellent engineering culture and continuous improvement in platform performance.
Key Responsibilities:
• Optimize platform uptime and reliability
• Maintain performance through proactive issue resolution
• Collaborate to troubleshoot performance challenges
• Implement monitoring and observability for all systems
• Support and scale infrastructure for AI/ML workloads
Requirements:
• Background in Site Reliability Engineering or equivalent
• Experience operating production Kubernetes clusters
• Robust AWS knowledge, focusing on EKS and VPC
• Proficient in Terraform for infrastructure management
• Familiarity with PostgreSQL and observability tools
Contribute significantly to performance enhancements and ensure reliable operation across AI systems while participating in an engaging engineering culture.
📌 Infrastructure Engineer for Reliability and Performance (Vancouver)