storcon: Updating of observed state is racy #9124
Labels: c/storage/controller (Component: Storage Controller), c/storage (Component: storage), t/bug (Issue Type: Bug)
The "normal" code path for updating tenant observed state is via Service::process_result: once a reconciler finishes, it pushes its result onto a channel. We read from the channel in a background loop and set the observed state of the shard to whatever the ReconcileResult suggests. Reconcilers themselves operate on a snapshot of the observed state.
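As a rough illustration of that flow, here is a minimal sketch using only the standard library. All of the types and functions (LocationConf, the simplified ReconcileResult, reconcile, process_results, run_normal_path) are hypothetical stand-ins made up for this example, not the storage controller's actual definitions.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical, heavily simplified stand-ins for the real types.
#[derive(Clone, Debug, PartialEq)]
enum LocationConf {
    AttachedSingle { generation: u32 },
}

type NodeId = u64;
type ShardId = u32;
type Observed = HashMap<NodeId, LocationConf>;

struct ReconcileResult {
    shard: ShardId,
    observed: Observed,
}

/// A "reconciler": works on a snapshot and reports its view over the channel.
fn reconcile(shard: ShardId, mut snapshot: Observed, tx: mpsc::Sender<ReconcileResult>) {
    // Pretend we attached the shard to node 2 with generation 1.
    snapshot.insert(2, LocationConf::AttachedSingle { generation: 1 });
    tx.send(ReconcileResult { shard, observed: snapshot }).unwrap();
    // `tx` is dropped here, which lets the receiving loop terminate.
}

/// Background loop: overwrite the shard's observed state with each result.
fn process_results(
    rx: mpsc::Receiver<ReconcileResult>,
    state: Arc<Mutex<HashMap<ShardId, Observed>>>,
) {
    for result in rx {
        state.lock().unwrap().insert(result.shard, result.observed);
    }
}

/// Runs one reconcile for shard 0 and returns its final observed state.
fn run_normal_path() -> Observed {
    let shard: ShardId = 0;
    let state = Arc::new(Mutex::new(HashMap::from([(shard, Observed::new())])));
    let (tx, rx) = mpsc::channel();

    let background = {
        let state = Arc::clone(&state);
        thread::spawn(move || process_results(rx, state))
    };

    // The reconciler gets a copy of the observed state, not a reference to it.
    let snapshot = state.lock().unwrap()[&shard].clone();
    reconcile(shard, snapshot, tx);
    background.join().unwrap();

    let observed = state.lock().unwrap()[&shard].clone();
    observed
}
```

The point to notice is that the result is computed from a copy taken when the reconciler started, and the background loop applies it as a wholesale overwrite.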
However, we also update the observed state inline by grabbing the lock (potentially non-exhaustive list below):
- Service::node_configure (probably the biggest offender): updates observed state in response to nodes coming online/offline
- Service::re_attach: this looks safe at a first approximation 🤷
- Service::do_tenant_shard_split
- Service::node_drop
- Service::node_delete
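To make the hazard concrete, the following sketch replays one possible interleaving deterministically, without threads. It assumes the result processing overwrites the shard's observed state wholesale; the Shard type, its methods, and the Unknown marker are hypothetical and only meant to show the shape of the lost update, not the actual storage controller code.

```rust
use std::collections::HashMap;

// Hypothetical, simplified observed location state for one node.
#[derive(Clone, Debug, PartialEq)]
enum Observed {
    AttachedSingle { generation: u32 },
    /// Hypothetical marker for "we no longer know", e.g. after a node went offline.
    Unknown,
}

type NodeId = u64;

/// Hypothetical per-shard state: observed location per node.
#[derive(Default)]
struct Shard {
    observed: HashMap<NodeId, Observed>,
}

impl Shard {
    /// What a reconciler receives when it is spawned.
    fn snapshot(&self) -> HashMap<NodeId, Observed> {
        self.observed.clone()
    }

    /// Inline update under the lock, as done on a node availability change.
    fn mark_unknown(&mut self, node: NodeId) {
        self.observed.insert(node, Observed::Unknown);
    }

    /// Result processing: wholesale overwrite with the reconciler's view.
    fn apply_result(&mut self, result: HashMap<NodeId, Observed>) {
        self.observed = result;
    }
}

/// Replays the interleaving and returns the shard's final state.
fn lost_update() -> Shard {
    let node_a: NodeId = 1;
    let node_b: NodeId = 2;

    let mut shard = Shard::default();
    shard
        .observed
        .insert(node_a, Observed::AttachedSingle { generation: 1 });

    // 1. A reconciler is spawned and takes its snapshot.
    let mut reconciler_view = shard.snapshot();

    // 2. Node A goes offline; the observed state is updated inline.
    shard.mark_unknown(node_a);

    // 3. The reconciler finishes attaching to B, still using the stale snapshot.
    reconciler_view.insert(node_b, Observed::AttachedSingle { generation: 2 });

    // 4. The result is applied, clobbering the inline update from step 2.
    shard.apply_result(reconciler_view);
    shard
}
```

After step 4 the shard again claims node A is AttachedSingle at generation 1, so the inline update from step 2 is silently lost. One way to avoid this class of bug would be to funnel every observed-state change through a single path, or to merge results per node instead of overwriting; either is a design suggestion here, not something this issue prescribes.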
It's probably obvious by now that this pattern is no bueno, but let's use https://github.com/neondatabase/cloud/issues/17362 as an example race:
AttachedSingle
with both A and B (different generations tho). Node A is unavailable, so we skip detaching from it for now.A: AttachedSingle
for shard X.and can't detach it.
Note that we now pass the observed state around with storage controller rolling restarts, so these inconsistencies propagate through restarts.