Senior Production Support Engineer (Toronto)

17 Apr | Aarorn Technologies | Toronto

Overview

Job Title: Senior Production Support Engineer

Location: Toronto, ON (4 days onsite per week)

Employment Type: Contract

Pay Rate: CAD $45 - $50/hr INC

Interview Type: Face-to-face (onsite interview only)

Job Description

Responsibilities

- Toil Removal & Infrastructure Maintenance (15%)
  - Execute SSL/TLS certificate updates and renewals across production environments
  - Perform Windows and Linux server patching and security updates
  - Manage NPID password updates and credential rotation protocols
  - Implement security vulnerability remediation in production systems
  - Identify, document, and eliminate repetitive manual operational tasks
- Infrastructure & Database Cluster Management (20%)
  - Manage and support Elasticsearch cluster operations (deployment, scaling, monitoring, troubleshooting, performance tuning)
  - Administer MongoDB clusters, including replication, sharding, backup, recovery, and maintenance
  - Operate and maintain Redis instances for caching and session management
  - Monitor cluster health and carry out capacity planning and optimization
  - Execute failover and disaster recovery procedures
  - Ensure data integrity and backup compliance
- Automation & SRE Activities (15%)
  - Develop, maintain, and enhance Ansible playbooks for infrastructure automation
  - Build infrastructure-as-code solutions to reduce manual intervention
  - Create and maintain comprehensive runbooks and operational playbooks
  - Design monitoring, alerting, and observability solutions
  - Implement automated remediation for common operational issues
  - Quantify and prioritize toil reduction opportunities
- Production Application Support (50%)
  - Troubleshoot and resolve production incidents affecting digital applications
  - Collaborate with application development and support teams on issue diagnosis
  - Participate in incident response, root cause analysis, and post-mortems
  - Monitor and respond to application performance degradation

Technical Requirements

Required Expertise (Must-Have)

- Ansible: 2+ years of hands-on experience writing playbooks, roles, and automation workflows
- Elasticsearch: 2+ years managing and troubleshooting Elasticsearch clusters in production
- MongoDB: 2+ years with replica sets, sharding, backup, recovery, and performance tuning
- Redis: Proficiency in deployment, configuration, and operational support
- OpenShift: Experience deploying and managing containerized applications on OpenShift
- Azure: Knowledge of Azure cloud services, resource management, and deployments
- Linux Administration: 3+ years with RHEL, CentOS, or Ubuntu in production environments
- Windows Server Administration: Experience with patching, certificate management, and maintenance
- Shell Scripting: Bash scripting for automation and operational tasks
- Incident Management: Experience responding to and resolving critical production incidents

Preferred Skills

- Kubernetes or other container orchestration platforms
- Python or Go scripting for automation
- CI/CD pipeline experience (Jenkins, GitLab CI, Azure DevOps)
- Monitoring and observability tools (Prometheus, Grafana, ELK Stack, Datadog)
- Infrastructure-as-Code tools (Terraform, CloudFormation)
- Security best practices and vulnerability management
- Relevant certifications (AZ-900, CKA, Elasticsearch, etc.)

Required Qualifications

- Minimum 5 years of production infrastructure support or SRE experience
- Minimum 3 years with at least 2 of the core technologies (Elasticsearch, MongoDB, Ansible, OpenShift)
- Experience working in a regulated financial services environment (preferred)
- Ability to work both independently and as part of a team
- Strong troubleshooting and analytical skills
- Excellent documentation and communication skills
- Must be available for the on-call support rotation with reasonable notice

Operational Expectations

- On-Call Rotation: Participates in the production support on-call schedule
- Incident Response: Available for critical incident resolution outside standard business hours as required
- Availability: Core business hours, plus flexibility for critical production issues
- Response Time: First response to critical incidents within 30 minutes
- Documentation: Maintains detailed runbooks, playbooks, and knowledge base articles
- Collaboration: Regular communication with infrastructure, development, and operations teams

Disclaimer: AI tools may assist in the recruitment process; however, all hiring decisions are made by the recruitment team based on a comprehensive evaluation of candidates.
