AI/ML Engineer (Toronto)

19 Apr | ThoughtStorm | Toronto

The AIOps Engineer is responsible for architecting, provisioning, and operationalizing multi-environment AI platforms on Google Cloud (Sandbox, Dev, Prod). The role includes cloud environment setup, IAM governance, CI/CD pipeline development, AIOps automation, drift detection, lifecycle process design, documentation, and alignment with broader enterprise platforms.

Responsibilities

Conduct workshops to gather GCP environment requirements. Design cloud architecture including VPC, IAM, subnetting, quotas, endpoints, and security controls. Lead the provisioning of Sandbox, Dev, and Prod GCP projects using Terraform. Oversee API enablement, configuration, and validation testing.

Role Definitions and IAM Governance

Define IAM roles for AI platform users (Owner, Support, ML Engineer, Viewer). Create IAM matrices, RACI charts, and detailed access control documentation. Ensure least-privilege access policies across Vertex AI and GCP services. Coordinate reviews and approvals with security and architecture teams.

AIOps Framework Development

Design and implement drift detection, anomaly monitoring, canary releases, automated rollback, and observability components. Build reusable CI/CD pipelines using Vertex Pipelines and Cloud Build. Develop SOPs, diagrams, runbooks, and the full AIOps operations playbook. Execute and validate synthetic drift, monitoring, and pipeline test scenarios.

Lifecycle Processes

Define the complete ML lifecycle from environment setup through deployment, monitoring, retraining triggers, and retirement. Integrate lifecycle processes within CI/CD and AIOps automation. Document all lifecycle flows in Confluence and conduct validation sessions.

Develop team structure, roles, and support plans. Build cost and usage models using GCP calculators and automation scripts. Prepare development and production usage forecasts and long-term TCO estimates.

Core Technical Skills

Deep experience with Terraform and Infrastructure as Code workflows. Practical experience with AIOps and MLOps frameworks. Proficient in Python for automation and monitoring jobs. Experience designing and operating CI/CD pipelines for ML workloads. Knowledge of observability tools such as Cloud Monitoring, Logging, and OpenTelemetry.

Preferred Qualifications

GCP Professional ML Engineer or Cloud Architect certification. Experience with Looker or other operational dashboards.
