How Peloton Shifted Left – and Fixed Their CI Pipeline to Unlock Full-Platform Testing

When Peloton set out to expand its reach across Android TV and wearable devices, their QA pipeline hit a wall. Despite a solid foundation using Firebase Test Lab (FTL), critical limitations around platform support stalled progress and put release timelines at risk. This is how they solved it.
The Challenge:
Peloton’s existing testing infrastructure was built on Firebase Test Lab (FTL), which initially provided value — until it didn’t.
Wearables: FTL’s physical wearable device worked well for 3–4 months, until Peloton’s app required a Google Mobile Services (GMS) library that the device didn’t support. Google’s only response was: “we might add that someday.” With no fix in sight, Peloton dropped remote testing and moved to a local setup using a phone and a wearable. But with no sharding and an expanded test suite, runtimes grew to 1.5 hours per cycle. These tests ran daily, often before and after PR merges and for every release candidate, compounding time and cost.
TV: FTL’s Android TV devices were another source of instability. They were slow, outdated (API 28), and frequently failed tests with errors like "Test Instrumentation Fail", making CI/CD unreliable.

Frequent test runs and reliance on physical devices meant high cloud spend — just as the wearable project entered high development velocity.
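
To make the sharding point above concrete, here is a minimal Python sketch of why an unsharded suite runs so long. Everything in it is an illustrative assumption, not Peloton's suite or Marathon's actual scheduling: the test names, the one-minute durations, the shard count of 9, and the helper names `shard_tests` and `wall_clock_minutes` are all hypothetical. It assigns tests greedily to the lightest shard and treats wall-clock time as the duration of the slowest shard.

```python
from typing import Dict, List


def shard_tests(durations: Dict[str, float], shard_count: int) -> List[List[str]]:
    """Greedy longest-first split: each test goes to the currently lightest shard."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    shards: List[List[str]] = [[] for _ in range(shard_count)]
    loads = [0.0] * shard_count
    for name in sorted(durations, key=durations.get, reverse=True):
        lightest = loads.index(min(loads))
        shards[lightest].append(name)
        loads[lightest] += durations[name]
    return shards


def wall_clock_minutes(durations: Dict[str, float], shards: List[List[str]]) -> float:
    """Shards run in parallel, so the slowest shard sets the total runtime."""
    return max((sum(durations[t] for t in shard) for shard in shards), default=0.0)


# Hypothetical suite: 90 tests of one minute each (assumed numbers, not measured data).
suite = {f"test_{i:02d}": 1.0 for i in range(90)}

serial = wall_clock_minutes(suite, shard_tests(suite, 1))   # one device, no sharding
sharded = wall_clock_minutes(suite, shard_tests(suite, 9))  # nine parallel shards

print(serial, sharded)  # prints: 90.0 10.0
```

With a single shard the whole suite runs back to back. Splitting it across parallel shards cuts wall-clock time roughly in proportion to the shard count, as long as test durations are reasonably balanced.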
The Shift to Marathon Labs:
Peloton’s engineering team turned to Marathon Labs for a scalable, shift-left testing solution. Marathon provided a fast, stable cloud-native test environment that supported Android TV and wearable flows — including pre-instrumented builds, PR-level test triggers, and clean CI integration.
With Marathon:
- Wearable test runtime dropped from 1.5 hours to ~10 minutes
- Android TV tests stopped failing due to outdated devices and instrumentation errors
- PR-level testing could run as often as needed, without overloading infrastructure or budget

Results That Mattered:
With Marathon Labs, Peloton:
- Reduced wearable test time from 1.5 hours to ~10 minutes
- Eliminated Android TV flakiness and outdated APIs (no more "Test Instrumentation Fail")
- Expanded test coverage for PRs and RCs across wearables and TV
- Reduced cloud costs by eliminating overuse of physical devices
- Restored CI/CD pipeline stability
Business Impact:
This transformation didn’t just improve QA metrics. It accelerated Peloton’s roadmap, reduced time-to-market, and helped control costs across their engineering organization.
- 3–5 weeks saved per year in release time
- Developer confidence and velocity restored
- Enabled roadmap execution for TV/wearable experiences
- Eliminated flaky tests and manual reruns

Takeaway:
Peloton’s move to Marathon Labs unlocked the full potential of their multi-platform roadmap. By addressing limitations in Firebase’s support for Android TV and wearables, they turned testing from a bottleneck into a growth enabler. Marathon delivered faster test runs, broader coverage, and the infrastructure to support innovation — all with fewer bugs and higher team morale.