pull: stale data when converted from "imported" to pipeline-local file #10457
Labels
A: data-sync (Related to dvc get/fetch/import/pull/push)
bug (Did we break something?)
p3-nice-to-have (It should be done this or next sprint)
triage (Needs to be triaged)
Bug Report
Description
Consider a DVC data pipeline that uses a file imported from a Data Registry as a stage dependency. Two users are working on the same pipeline.
If user1 changes the imported data file locally and reproduces the pipeline, the imported file is automatically committed (as with 'dvc commit'), but the local changes are not pushed to the remote. I guess this is by design, because imported files should be changed in the data registry rather than locally. If user2 then clones the pipeline and runs 'dvc pull', they naturally won't get the local changes made by user1.
Further, if user1 tries to rectify the situation by converting the imported data file into a "local" pipeline data file, removing the .dvc file and running 'dvc add', this does indeed do the trick: the updated file is pushed to the remote by 'dvc push'.
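The conversion user1 performs can be sketched roughly as follows. This is only an illustration of the steps described above, not the reporter's actual script: the function name `convert_import_to_local`, the `DVC` override variable, and the assumption that the import's .dvc file sits next to the data as `<target>.dvc` are all hypothetical.

```shell
# Hypothetical helper: turn an imported file into a regular DVC-tracked file.
# Assumes the import's .dvc file is named "<target>.dvc" and sits next to the data.
# DVC may be set to another command (e.g. for a dry run); it defaults to `dvc`.
convert_import_to_local() {
    target="$1"
    # Drop the .dvc file created by the import; the data file itself stays.
    rm -f "${target}.dvc" || return 1
    # Re-track the file as ordinary pipeline-local data.
    "${DVC:-dvc}" add "$target" || return 1
    # Upload the new version to the default remote.
    "${DVC:-dvc}" push
}
```

After this, user1 would still need to commit the regenerated .dvc file with git so that user2 receives it on 'git pull'.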
Nevertheless, and this is where the bug manifests itself, when user2 runs 'git pull' and 'dvc pull', the file content is still the data registry version and not the pipeline-local one.
The workaround for user2 is to remove the data file and the .dvc/cache folder and run 'dvc pull' again, which pulls the correct version of the file.
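As a rough sketch, the workaround for user2 could look like this. The function name `refresh_stale_file` and the `DVC` override variable are hypothetical, and the cache location is assumed to be the default .dvc/cache inside the repository.

```shell
# Hypothetical helper: force a clean re-download of a stale file.
# Assumes the default cache location (.dvc/cache) and is run from the repo root.
# DVC may be set to another command (e.g. for a dry run); it defaults to `dvc`.
refresh_stale_file() {
    target="$1"
    # Remove the stale workspace copy of the data file.
    rm -f "$target" || return 1
    # Remove the local cache so the old content cannot be reused.
    rm -rf .dvc/cache || return 1
    # Pull everything again from the remote.
    "${DVC:-dvc}" pull
}
```

Note that deleting .dvc/cache discards every locally cached object, including data that has not been pushed yet, so this is only safe when nothing local is unpushed.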
Reproduce
Expected
Included as echo "INFO: " in the reproduce script
Environment information
Output of dvc doctor:
Additional Information (if any):