Site Reliability Engineer, Observability (Toronto)

19 Apr

Priceline

Toronto

This role is eligible for our hybrid work model: Two days in-office.

Site Reliability Engineer, Observability

Our Technology team is the backbone of our company: constantly creating, testing, learning and iterating to better meet the needs of our customers. If you thrive in a fast‑paced, ideas‑led environment, you’re in the right place.

Why This Job’s a Big Deal

As Priceline continues to scale globally, reliable production visibility is critical to delivering seamless customer experiences. We are investing in strengthening our observability foundations to improve detection, diagnosis, and overall system reliability. This role plays a key part in maturing our observability capabilities: standardizing instrumentation, improving telemetry quality, and enabling faster root cause analysis that directly impacts MTTR and MTTD.

In This Role You Will Get To

Support and evolve end-to-end observability solutions for collecting, shipping, storing, and querying OpenTelemetry signals (metrics, logs, and traces) across infrastructure, containers, and Kubernetes environments.

Administer and operate core observability platforms (Splunk, New Relic, ClickHouse, Grafana, Lightrun), including service onboarding, access management, configuration, upgrades, and ongoing platform health.

Contribute to building and advancing a modern OpenTelemetry-based observability ecosystem that supports multiple telemetry types at scale.

Improve and standardize instrumentation practices across services, driving consistent logging, metrics, and distributed tracing implementation.

Partner with product and platform engineering teams to enhance production visibility and support SLO-driven reliability practices.

Optimize telemetry pipelines for performance, data quality, scalability, and cost efficiency.

Help define and support governance standards for observability, ensuring consistency, reliability, and scalability across teams.

Contribute to evolving our observability platform toward intelligent and AI-enabled capabilities, exploring opportunities to integrate AI or MCP-based solutions to improve signal quality, incident triage, and operational efficiency.

Ensure observability platform reliability, security, and performance meet defined SLAs and operational standards.

Who You Are

Bachelor’s degree in Computer Science or equivalent practical experience.

3+ years of experience in Observability, SRE, DevOps, or platform engineering roles supporting production systems.

Strong understanding of APM and SRE fundamentals, including MELT (Metrics, Events, Logs, Traces), latency analysis, error rate monitoring, service dependency mapping, SLIs/SLOs, alert tuning, and root cause analysis.

Hands‑on experience administering at least one contemporary observability/APM platform (e.g., Splunk, New Relic, Grafana), with practical exposure to metrics, logs, distributed tracing, and platform configuration.

Experience supporting full‑stack observability coverage across infrastructure, application, and browser monitoring layers.

Experience building dashboards and actionable alerts, including configuring alert workflows and integrations with incident management tools such as PagerDuty.

Experience implementing or supporting OpenTelemetry-based instrumentation and improving telemetry quality across services.

Familiarity with Kubernetes and cloud‑native environments, including an understanding of how applications are deployed, monitored, and scaled.

Experience managing telemetry pipelines and agents (e.g., collectors, forwarders, sidecars), including onboarding services and troubleshooting ingestion issues.

Working knowledge of scripting or automation (e.g., Shell, Python) and CI/CD concepts.

Experience or familiarity with infrastructure‑as‑code tools such as Terraform for managing platform configurations and integrations is a plus.

Comfortable collaborating with engineering teams to improve monitoring standards, instrumentation quality, and overall production visibility.

Relevant certifications such as New Relic APM Practitioner, Reliability Engineer – Professional, Splunk Admin, or GCP Associate Cloud Engineer are a plus.

Demonstrated history of living the values important to Priceline: Customer, Innovation, Team, Accountability and Trust.

The Right Results, the Right Way is not just a motto at Priceline; it’s a way of life. It’s therefore essential that you also meet our high standard of ethics, honesty, transparency and compliance.

The salary range for this position is $110,000 to $130,000 CAD.

Benefits

Health & wellness coverage including medical, dental, vision, and mental health resources.

Generous time off including PTO, holidays, a company‑wide Priceline Pause reset week, and paid volunteer days.

Work/life support including the ability to work up to 4 weeks per year from anywhere, parental leave, dependent care and family support resources, Summer Fridays, and office perks like stocked kitchens and catered meals.

Financial security programs such as retirement plans with company contributions, life and disability coverage, and tax‑advantaged accounts.

Signature travel perks including employee‑only discounts on hotels and flights, VIP deals, and Big Deal Bucks credits.

Additional perks & discounts like travel and partner discounts, tuition support, legal support, and pet advantages.

A people‑first culture with Employee Resource Groups (ERGs), social events, recognition programs, and service awards that help you connect, grow, and celebrate together.

Priceline is a proud equal‑opportunity employer. We embrace and celebrate the unique lenses through which our employees see the world.
