
At Coinbase, we're on a mission to increase economic freedom in the world by building the most trusted and easy-to-use crypto products and services. Coinbase One embodies this mission by offering our members a premium experience across our product suite. Since its launch, Coinbase One has grown rapidly to serve a global community and has continued to expand in 2025 as crypto adoption accelerates globally.
Supporting this scale and rapid evolution requires building a modern subscription platform from the ground up. In this series, we'll take you inside the design and engineering of Coinbase One's subscription platform. Part 1 focuses on how we manage subscription lifecycles, covering the challenge, principles, architecture, and learnings.
At its core, a subscription is a long-running state machine. Each member's subscription transitions through multiple states throughout its lifecycle, from signup activation through recurring billing cycles, with optional plan changes, pauses and cancellations along the way. Each state transition can be triggered either by a user action or automatically on a schedule (e.g., monthly renewals).
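To make the state-machine framing concrete, here is a minimal sketch of a lifecycle expressed as a table of allowed transitions. The state names and event names are illustrative assumptions, not the actual Coinbase One state model.

```python
from enum import Enum


class State(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    PAUSED = "paused"
    CANCELLED = "cancelled"


# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.PENDING, "activate"): State.ACTIVE,   # signup
    (State.ACTIVE, "renew"): State.ACTIVE,       # scheduled, e.g. monthly
    (State.ACTIVE, "pause"): State.PAUSED,       # user action
    (State.PAUSED, "resume"): State.ACTIVE,      # user action
    (State.ACTIVE, "cancel"): State.CANCELLED,
    (State.PAUSED, "cancel"): State.CANCELLED,
}


def apply_event(state: State, event: str) -> State:
    """Return the next state, or raise if the transition is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} from {state.name}") from None
```

Keeping the transitions in data rather than in branching code means a new state is mostly a matter of adding rows to the table.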

The diagram above illustrates a simplified version of Coinbase One subscription states and transitions. Let's walk through what happens when a subscription transitions to the “Active” state during signup.
When a user signs up for Coinbase One, our subscription platform executes the following logic to activate the subscription:
Process signup promotion: validate and apply any promotional discounts
Generate invoice: calculate the payment amount based on the user's plan, promotions, and taxes
Process payment: execute payment via the user's chosen payment method
Activate subscription and benefits: initialize the subscription status and enable benefits
Send notification: send a customized welcome email based on the user's plan
This is a simplified view of the activation flow; the real flow is more nuanced and grows more complex as business requirements evolve. Additionally, this flow depends on 10+ downstream services, each with its own availability SLO, so an issue in any single dependency can block the entire activation flow. For example, invoice generation might fail due to a tax service outage, or payment processing might fail due to payment system unavailability. These intermittent service failures can easily leave a state transition stuck in an intermediate state if the system is not designed to handle them gracefully.
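The five activation steps and the failure mode described above can be sketched as a naive sequential runner. Everything here is hypothetical (function names, the `ctx` dictionary, amounts in cents). The point is only to show how an outage in one dependency leaves the subscription half-activated when nothing retries the failed step.

```python
from typing import Callable


class DependencyUnavailable(Exception):
    """A downstream service (tax, payments, ...) could not be reached."""


def process_promotion(ctx: dict) -> None:
    ctx["discount_cents"] = ctx.get("promo_cents", 0)


def generate_invoice(ctx: dict) -> None:
    if ctx.get("tax_service_down"):  # simulated outage
        raise DependencyUnavailable("tax service outage")
    ctx["amount_cents"] = ctx["price_cents"] - ctx["discount_cents"] + ctx["tax_cents"]


def process_payment(ctx: dict) -> None:
    ctx["paid_cents"] = ctx["amount_cents"]


def activate_subscription(ctx: dict) -> None:
    ctx["status"] = "ACTIVE"


def send_notification(ctx: dict) -> None:
    ctx["email"] = f"welcome-{ctx['plan']}"


ACTIVATION_STEPS: list[Callable[[dict], None]] = [
    process_promotion,
    generate_invoice,
    process_payment,
    activate_subscription,
    send_notification,
]


def run_activation(ctx: dict) -> list[str]:
    """Run the steps in order; stop at the first unavailable dependency."""
    completed = []
    for step in ACTIVATION_STEPS:
        try:
            step(ctx)
        except DependencyUnavailable as exc:
            # Without retries, the flow is now stuck in an intermediate state.
            ctx["stuck_at"] = step.__name__
            ctx["error"] = str(exc)
            break
        completed.append(step.__name__)
    return completed
```

With `tax_service_down` set, only the promotion step completes and the subscription never reaches `ACTIVE`, which is the situation the design principles below are meant to prevent.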
To build a platform capable of managing this complexity reliably and at scale, we established clear design principles that guided our architectural choices and helped us navigate trade-offs throughout development:
Extensibility
The subscription lifecycle should be easy to extend as product requirements evolve. Adding new subscription states or adjusting the timing of lifecycle events should require minimal changes. For example, if we need to introduce a new "grace period" that grants users 14 days to resolve payment failures, the system should allow us to quickly plug this logic into the existing subscription lifecycle.
Scalability
The architecture must scale horizontally to support millions of active subscriptions. As membership grows, the system should handle increased load without fundamental redesign.
Resilience
The subscription platform must be resilient to unpredictable disruptions from dependent services. The system should gracefully handle failures, ensuring that state transitions can recover and proceed even when services are temporarily unavailable.
Testability
Subscription lifecycles are complex and time-dependent, with state transitions occurring months apart. The system should make it easy to simulate, inspect, and test every possible state and transition, allowing engineers to validate behavior throughout the subscription lifecycle with confidence.
Guided by these principles, we designed our subscription platform around four core components: Cadence Service for workflow orchestration, Cadence Workflows for lifecycle events, Scanner Service for workflow scheduling, and MongoDB for state persistence.
Instead of modeling each subscription as a single, long-lived Cadence workflow, we decompose a subscription lifecycle into many short-lived, modular Cadence workflows. Each workflow handles a specific lifecycle event and is scheduled independently. This separation of concerns enables the system to remain extensible and adaptable as product requirements evolve.

We use Cadence, a fault-tolerant workflow orchestration engine, to model and execute business logic for subscription lifecycle events. Cadence provides:
Durable execution state: Workflows resume exactly where they left off after any failure.
Automatic retries: Configurable policies ensure resilience against transient dependency outages.
With Cadence, we can define complex state transitions in fault-tolerant workflows. For example, if payment processing in the Activation Workflow is blocked by a transient issue from a downstream system, Cadence automatically retries the payment according to its configured policy until the operation succeeds or times out.
This fault tolerance also significantly reduces operational overhead. When downstream services are temporarily unavailable during short outages or maintenance, Cadence workflows continue retrying automatically and run to completion once the affected services recover, eliminating the need for on-call intervention or a separate reconciliation process.
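The retry behavior can be illustrated with a small, generic retry loop using exponential backoff. This is a sketch of the general technique and not Cadence's API. The `RetryPolicy` fields, `TransientError`, and `run_with_retry` are names chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Raised by an operation that may succeed if retried later."""


@dataclass
class RetryPolicy:
    initial_interval_s: float = 1.0
    backoff_coefficient: float = 2.0
    max_interval_s: float = 60.0
    max_attempts: int = 5


def run_with_retry(
    op: Callable[[], T],
    policy: RetryPolicy,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call op until it succeeds or the policy's attempts are exhausted."""
    if policy.max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    interval = policy.initial_interval_s
    attempt = 1
    while True:
        try:
            return op()
        except TransientError:
            if attempt == policy.max_attempts:
                raise  # give up: surface the last transient failure
            sleep(interval)
            interval = min(interval * policy.backoff_coefficient, policy.max_interval_s)
            attempt += 1
```

A workflow engine adds what this loop lacks: the attempt count and progress survive process restarts, so retries continue after a crash or deploy.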
Our subscription platform supports 19 distinct workflows that execute for different subscription lifecycle events. We follow two core design principles when building workflows:
Idempotency: Each workflow should produce the same result regardless of how many times it executes, ensuring retries don't create side effects like duplicate notifications. For example, each notification we send includes an idempotency key tied to a specific billing cycle. This key ensures that the same email is not sent multiple times during workflow retries.
Separation of orchestration and execution: A workflow defines the sequence of operations. Each operation is executed through a Cadence Activity, which contains the actual business logic. This separation keeps workflows focused on orchestration while business logic evolves only within Activities. It avoids frequent workflow changes and prevents potential non-deterministic errors in Cadence.
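Both principles can be sketched together: a thin workflow function that only sequences calls, and an activity-like sender that deduplicates on an idempotency key. The key format, class, and function names are assumptions for illustration. A production system would keep the seen keys in a durable store, not in memory.

```python
class NotificationSender:
    """Activity-like component that holds the actual sending logic."""

    def __init__(self) -> None:
        self._seen_keys: set[str] = set()  # stand-in for a durable store
        self.outbox: list[tuple[str, str]] = []

    def send(self, user_id: str, template: str, idempotency_key: str) -> bool:
        """Send once per key; return False if this key was already handled."""
        if idempotency_key in self._seen_keys:
            return False
        self._seen_keys.add(idempotency_key)
        self.outbox.append((user_id, template))
        return True


def renewal_receipt_key(user_id: str, cycle_start: str) -> str:
    # Hypothetical key format: one receipt per user per billing cycle.
    return f"renewal-receipt:{user_id}:{cycle_start}"


def renewal_workflow(sender: NotificationSender, user_id: str, cycle_start: str) -> None:
    """Orchestration only: decides what runs, not how it runs."""
    key = renewal_receipt_key(user_id, cycle_start)
    sender.send(user_id, "renewal_receipt", key)
```

Running `renewal_workflow` twice for the same cycle leaves a single entry in `outbox`, while a new billing cycle produces a new key and a new email.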
Example Workflows:

The Scanner Service is our custom-built workflow scheduler that continuously polls the database for workflows whose scheduled time has arrived and triggers their execution via Cadence Service. Workflow scheduling is sharded by user ID, allowing the Scanner to scale across multiple instances as subscription volume increases.
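As a rough sketch of sharded polling, the snippet below assigns each user to a shard with a stable hash and lets each scanner instance pick up only the due workflows in its own shard. The hashing scheme, the `ScheduledWorkflow` shape, and the in-memory schedule are assumptions. The source only states that scheduling is sharded by user ID.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ScheduledWorkflow:
    workflow_id: str
    user_id: str
    scheduled_at: datetime


def shard_for(user_id: str, num_shards: int) -> int:
    """Stable shard assignment (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def due_workflows(
    schedule: list[ScheduledWorkflow], shard: int, num_shards: int, now: datetime
) -> list[ScheduledWorkflow]:
    """Workflows owned by this shard whose scheduled time has arrived."""
    return [
        w
        for w in schedule
        if shard_for(w.user_id, num_shards) == shard and w.scheduled_at <= now
    ]
```

Because every workflow for a given user lands on the same shard, adding scanner instances spreads the load without two instances competing for the same user's schedule.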
Beyond basic scheduling, the Scanner also provides additional features:
Workflow Dependencies: The Scanner supports workflow dependencies, allowing one workflow to run only after another completes. This is essential for maintaining consistency during complex operations. For example, when a user changes their subscription plan, the new plan's Activation Workflow depends on the old plan's Termination Workflow completing first. By explicitly specifying this dependency in the schedule, we prevent out-of-order plan changes and ensure a clean transition.
Workflow Tracking: All workflows, including those that need immediate execution, are triggered through the Scanner. This ensures every workflow execution is recorded in the database, providing a complete audit trail and enabling the Scanner's built-in validation checks to run consistently across all workflows.
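The dependency rule can be sketched as a filter applied when selecting runnable workflows: an entry is eligible only if it is due and the workflow it depends on has completed. The `Entry` fields and status strings are hypothetical, and a real scheduler would read them from the database.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Entry:
    workflow_id: str
    scheduled_at: datetime
    status: str = "PENDING"
    depends_on: Optional[str] = None  # workflow_id that must complete first


def runnable(entries: dict[str, Entry], now: datetime) -> list[str]:
    """IDs of pending, due workflows whose dependency (if any) is completed."""
    ready = []
    for entry in entries.values():
        if entry.status != "PENDING" or entry.scheduled_at > now:
            continue
        if entry.depends_on is not None:
            dep = entries.get(entry.depends_on)
            if dep is None or dep.status != "COMPLETED":
                continue  # blocked: dependency missing or unfinished
        ready.append(entry.workflow_id)
    return ready
```

For a plan change, the new plan's activation entry would carry `depends_on` pointing at the old plan's termination entry, so it stays blocked until termination is marked completed.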
MongoDB serves as the source of truth for workflow schedules. All workflows are tracked through explicit state transitions (PENDING → READY → RUNNING → COMPLETED), providing full visibility into system behavior and enabling operational control.
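A minimal in-memory sketch of such a tracked record follows. It enforces the PENDING → READY → RUNNING → COMPLETED order and keeps a history of each step. The record shape is an assumption, persistence is omitted, and only the four statuses named above are modeled.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

ORDER = ["PENDING", "READY", "RUNNING", "COMPLETED"]


@dataclass
class WorkflowRecord:
    workflow_id: str
    status: str = "PENDING"
    # (from_status, to_status, timestamp) for every transition applied
    history: list[tuple[str, str, datetime]] = field(default_factory=list)

    def advance(self, to_status: str) -> None:
        """Move to the next status in ORDER; reject skips and rewinds."""
        current = ORDER.index(self.status)
        if current + 1 >= len(ORDER) or ORDER[current + 1] != to_status:
            raise ValueError(f"cannot move {self.status} -> {to_status}")
        self.history.append((self.status, to_status, datetime.now(timezone.utc)))
        self.status = to_status
```

Recording every transition, not just the latest status, is what makes it possible to reconstruct when and in what order a workflow moved through the scheduler.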
With these components working in harmony, our platform delivers extensibility through modular workflow design, scalability via sharded architecture, and resilience with fault-tolerant execution.
Testing is key to building a high-quality subscription product. However, testing subscription states has historically been challenging due to the time-spanning nature of subscription lifecycles. For example, setting up a test subscription in the grace period state previously required either manually overriding multiple database records (fragile and error-prone), or creating a real subscription and waiting for payment results across two billing cycles (30+ days). Neither approach met our lifecycle testing requirement.
We built a generic lifecycle testing approach that allows us to test any subscription state within minutes. Since every subscription event is represented as a scheduled workflow, we can simulate time progression by executing workflows on demand rather than waiting for their scheduled time.
We implemented this through a test clock feature. Each test subscription has an associated test clock that testers can manipulate through our admin tool. We extended our Scanner Service to support the test clock: when a tester advances the test clock to a future date, the Scanner executes all workflows scheduled up to that test clock time instead of the current wall-clock time. This allows test subscriptions to progress through their natural lifecycle.
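The test clock idea can be sketched as a scanner whose notion of "now" comes from the subscription's test clock when one is set, and from the wall clock otherwise. The `Subscription` shape and function names are hypothetical. For brevity the schedule is pre-populated, whereas in a real system executing one workflow would typically schedule the next.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Subscription:
    sub_id: str
    test_clock: Optional[datetime] = None  # set only on test subscriptions
    schedule: list[tuple[datetime, str]] = field(default_factory=list)
    executed: list[str] = field(default_factory=list)


def effective_now(sub: Subscription) -> datetime:
    """Test clock if present, otherwise the real wall-clock time."""
    return sub.test_clock if sub.test_clock is not None else datetime.now(timezone.utc)


def scan(sub: Subscription) -> None:
    """Execute, in time order, every workflow scheduled up to effective_now."""
    now = effective_now(sub)
    for item in sorted(w for w in sub.schedule if w[0] <= now):
        sub.schedule.remove(item)
        sub.executed.append(item[1])  # stand-in for triggering the workflow


def advance_test_clock(sub: Subscription, delta: timedelta) -> None:
    if sub.test_clock is None:
        raise ValueError("not a test subscription")
    sub.test_clock += delta
    scan(sub)
```

Because production subscriptions never have a test clock set, the same scanning path serves both cases and only the time source differs.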

Now, testing the grace period scenario mentioned earlier takes less than a minute: create a subscription, fast-forward the test clock 30 days to trigger renewal, mock a payment failure to enter the grace period, then fast-forward 4 more days to verify the payment retry logic in the grace period. In total, 34 days of subscription lifecycle are exercised in under a minute.
We built this subscription architecture in 2021, and its foundation continues to power Coinbase One's subscription platform today. Throughout this journey, we've identified several key learnings that may help teams building similar systems.
Invest in Platform When It Unlocks Product Velocity
We evaluated third-party subscription platforms but ultimately built our own. This decision paid off as Coinbase One rapidly evolved from a single plan to supporting free trials, multiple tiers, various billing frequencies, and diverse payment methods globally, including crypto. Some of these features would have been not only difficult to implement quickly but possibly infeasible on external platforms. If your product iterates rapidly and requires unique features, investing in a platform upfront provides the flexibility to move quickly without third-party constraints.
Prioritize Extensibility from Day One
The long-running nature of subscriptions demands extensibility. Subscriptions can persist for months or years, during which product requirements inevitably change. For systems managing long-lived entities, prioritize extensibility early. You will likely need to add capabilities and modify behavior throughout the entity's lifecycle without breaking what's already running.
Minimize Operational Overhead
Building products with complex state management inevitably involves failures, and fixing broken states can be time-consuming and error-prone. We minimized the operational burden through Cadence's fault-tolerant retry mechanism, idempotent workflow design, and built-in dependency validation in the scheduling logic. Together, these measures reduce problematic subscription states and allow the team to focus on building features rather than firefighting operational issues.
The subscription platform we started building in 2021 continues to power Coinbase One's growth today, handling complex lifecycle management at scale. By prioritizing extensibility, resilience, and testability from day one, we created a foundation that enables rapid iteration as our product evolves.
In Part 2, we'll explore the platform we built to power the plan expansion from a single US plan to a global offering, supporting multiple tiers, billing frequencies, and international plans. Stay tuned!