Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storcon: Updating of observed state is racy #9124

Open
VladLazar opened this issue Sep 24, 2024 · 0 comments
Open

storcon: Updating of observed state is racy #9124

VladLazar opened this issue Sep 24, 2024 · 0 comments
Labels
c/storage/controller Component: Storage Controller c/storage Component: storage t/bug Issue Type: Bug

Comments

@VladLazar
Copy link
Contributor

The "normal" code path for updating tenant observed state is via Service::process_result: once a reconciler finishes, it pushes the result on a channel. We read from the channel in a background loop and set the observed state of the shard to whatever the ReconcileResult suggests. Reconcilers themselves operate on a snapshot on the observed state.

However, we also update the observed state inline by grabbing the lock (potentially non-exhaustive list below):

  • Service::node_configure (probably the biggest offender): updates observed state in response to nodes coming online/offline
  • Service::re_attach: this looks safe at a first approximation 🤷
  • Service::do_tenant_shard_split
  • Service::node_drop
  • Service::node_delete

It's probably obvious by now, that this pattern is no bueno, but let's use https://github.com/neondatabase/cloud/issues/17362 as an example race:

  1. In response to the attached node A going unavailable, we migrate shard X to node B. We are now in AttachedSingle with both A and B (different generations tho). Node A is unavailable, so we skip detaching from it for now.
  2. Node A comes back online. We update the observed state for all shards on A here. This includes a location with A: AttachedSingle for shard X.
  3. Reconcile from step (1) finishes and clobbers the updates we made in step (2). Now we've lost knowledge of X's stale attachment on A
    and can't detach it.

Note that we now pass the observed state around with storage controller rolling restarts, so these inconsistencies propagate through restarts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c/storage/controller Component: Storage Controller c/storage Component: storage t/bug Issue Type: Bug
Projects
None yet
Development

No branches or pull requests

1 participant