Site Reliability Engineer - IoT Fleet

  • Software Development
  • Full-time
  • Copenhagen, DK

About Us

At Teton, we are redefining the role of healthcare workers through cutting-edge AI technology. In the face of a global nursing shortage, our solutions provide vital support to overburdened health systems.

With thousands of residents and patient rooms across 4 countries and 2 continents already benefiting from our technology, we are rapidly expanding our reach. What sets us apart is our relentless focus on product excellence and user experience, deploying solutions at speed that make a tangible difference in patient care.

At this stage of our journey, we require a physical presence at our office in Copenhagen, Denmark. We believe close collaboration enables the fastest and most efficient iteration cycles—helping us build an impactful product that healthcare professionals truly love to use.

The Job

We're seeking a Site Reliability Engineer to join our Platform team and enhance the reliability and performance of our IoT device fleet deployed in healthcare facilities around the world. You will be responsible for ensuring that our AI-driven solutions run reliably in real-world clinical environments, where downtime directly impacts patient care. This means diving deep into device failures, building sophisticated monitoring and alerting systems, and continuously improving the resilience of our fleet platform.

This is not a traditional SRE role managing web services. You'll be working at the intersection of IoT hardware, embedded Linux systems, and distributed infrastructure, solving unique challenges that arise when thousands of medical devices need to operate flawlessly 24/7.

Your day-to-day will include:

  • Incident Response: Investigate and resolve device failures, connectivity issues, and system anomalies across our deployed fleet

  • Monitoring & Observability: Design and build monitoring systems that provide deep visibility into device health, performance metrics, and early warning signals

  • Tooling & Automation: Develop tools and automation to diagnose issues faster, streamline deployments, and reduce manual intervention

  • Fleet Management: Work with our Smith platform to manage device lifecycle, remote debugging, and fleet-wide updates

  • Root Cause Analysis: Conduct thorough post-mortems and implement fixes that prevent recurring issues

  • Reliability Engineering: Identify systemic weaknesses and work with the platform team to improve architecture and resilience

  • On-Call Rotation: Participate in on-call coverage to ensure rapid response to critical incidents affecting patient care

Required Qualifications

  • Strong experience with Linux systems (Ubuntu, systemd, containerization, networking)

  • Proficiency in Python, Bash, or Rust for automation and tooling

  • Deep understanding of debugging methodologies—comfortable reading logs, traces, and using debugging tools to diagnose complex issues

  • Experience with monitoring and observability tools (Prometheus, Grafana, or similar)

  • Familiarity with distributed systems and network protocols

  • Systematic approach to troubleshooting and problem-solving under pressure

Strong Plus

  • Experience managing IoT or edge device fleets at scale

  • Background in embedded systems or hardware-software integration

  • Experience with remote device management, OTA updates, or fleet orchestration

  • Familiarity with AWS, GCP, or Azure infrastructure

  • Healthcare or regulated industry experience

  • Experience building internal developer tools

What makes you successful in this role:

  • You enjoy the detective work of finding root causes in complex systems

  • You think proactively about what could go wrong before it does

  • You care deeply about operational excellence and reliability

  • You communicate clearly during incidents and write thorough documentation

  • You balance urgency (fix it now) with long-term improvements (fix it forever)

What We Offer

  • Participation in our warrant program (stock options).

  • Work with state-of-the-art technology in a pioneering field.

  • A vibrant, learning-focused work environment.

  • A role where you can make a significant impact on healthcare delivery.

Join Our Team

We're looking for individuals who share our vision and value ownership and entrepreneurship over a conventional job.

At Teton, you'll have the chance to make a real impact with your work.

If you're ready for more than just a job, come aboard and be part of our mission to transform healthcare.