This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Data Systems Operations Lead in Canada.
This role sits at the core of mission-critical, real-time data infrastructure powering institutional-grade market data and price feeds used across global financial ecosystems. You will be responsible for ensuring the reliability, accuracy, and resilience of distributed systems operating at high scale and low latency. The position combines deep technical operations ownership with strategic system design, requiring someone who can move seamlessly between incident response, automation design, and long-term infrastructure improvement. You will work in a fast-paced, globally distributed environment where operational excellence directly impacts financial data integrity and user trust. The role also involves close collaboration with institutional partners, translating complex operational challenges into scalable technical solutions. This is a high-impact position for an operator who thrives in ambiguity, values reliability, and enjoys building systems that must perform flawlessly under pressure.
Accountabilities
- Own end-to-end operations for mission-critical real-time price feed systems, including monitoring, provisioning, SLA management, and performance optimization for institutional-grade data services.
- Define operational standards, service targets, and success metrics to continuously improve reliability, lower latency, and reduce risk across distributed systems.
- Identify and drive automation opportunities to eliminate repetitive operational work, reduce human error, and improve system scalability.
- Lead incident response and data forensics efforts, analyzing system failures, quantifying user impact, and implementing corrective actions to prevent recurrence.
- Design resilient system improvements by anticipating failure modes and proactively addressing vulnerabilities across the data pipeline.
- Collaborate with engineering, product, and external institutional partners to ensure operational requirements are clearly defined and consistently met.
- Balance hands-on incident management with long-term strategic improvements that enhance system stability and operational efficiency.
Requirements
- 5+ years of experience in technical operations, SRE, data systems engineering, or infrastructure-focused roles within distributed or real-time systems environments.
- Proven experience managing mission-critical systems with strong exposure to incident management, on-call responsibilities, and high-severity production issues.
- Deep understanding of distributed systems architecture, data pipelines, and operational reliability at scale.
- Experience working with financial markets, trading infrastructure, market data systems, or other low-latency real-time environments.
- Strong ability to analyze complex technical problems and communicate solutions clearly to both engineering teams and senior business stakeholders.
- Experience driving automation, system optimization, or infrastructure improvements in production environments.
- Highly analytical, structured, and proactive mindset with strong ownership of system outcomes and user impact.
- Bonus: exposure to crypto infrastructure or DevOps/SRE tooling, or existing relationships within institutional trading or exchange ecosystems.
Benefits
- Competitive compensation package with performance-based incentives
- Fully remote work setup across North America
- Opportunity to work on high-impact, real-time financial data infrastructure
- Strong autonomy and ownership over mission-critical systems
- Exposure to institutional partners and global financial infrastructure networks
- High-growth, fast-paced environment with significant technical challenges
- Collaborative, globally distributed engineering culture
- Opportunities for deep technical ownership and career growth in distributed systems and operations