Database schema update v1 #151

Open
tmichela opened this issue Dec 8, 2023 · 6 comments

Comments

@tmichela
Member

tmichela commented Dec 8, 2023

No description provided.

@tmichela tmichela added the enhancement New feature or request label Dec 8, 2023
@tmichela tmichela added this to the sprint 1 milestone Dec 8, 2023
@JamesWrigley
Member

JamesWrigley commented Jan 15, 2024

Copying the checklist from #134:

  • Test the migrations
  • Forbid amore-proto from opening unmigrated databases
  • Add a summary column to store which summary method was used
  • Add a view for the variable timestamps
  • Migrate v0.5 databases to v1
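For the "forbid opening unmigrated databases" item, one possible shape is a version check run right after connecting. This is only a sketch: it assumes the schema version is kept in SQLite's built-in `PRAGMA user_version`, which this thread does not confirm, and `REQUIRED_VERSION`, `UnmigratedDatabaseError` and `check_schema_version` are hypothetical names.

```python
import sqlite3

REQUIRED_VERSION = 1  # hypothetical: the schema version the code understands


class UnmigratedDatabaseError(RuntimeError):
    """Raised when a database is older than the supported schema."""


def check_schema_version(conn):
    """Return the schema version, or raise if the database needs migrating.

    Assumes the version lives in PRAGMA user_version (0 for a database
    that never set it). The real project may store it elsewhere.
    """
    (version,) = conn.execute("PRAGMA user_version").fetchone()
    if version < REQUIRED_VERSION:
        raise UnmigratedDatabaseError(
            f"database is at schema version {version}, "
            f"expected {REQUIRED_VERSION}; migrate it first"
        )
    return version
```

Calling this once at open time keeps the check in a single place, so every entry point that opens a database gets the same refusal.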

@JamesWrigley
Member

When it comes to testing the migrations, how do we feel about generating a small database with all the edge cases we can think of (comments, variables, images, etc) and storing it and the HDF5 files in git? The tests would then migrate the database and check the results.
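A rough sketch of what such a test could look like, using an in-memory database instead of stored files to keep it short. The table layout, the sample rows and the `migrate_add_summary_method` step are assumptions for illustration, not the project's actual schema or migration code.

```python
import sqlite3


def make_old_database():
    """Create an in-memory database in an assumed pre-migration layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE run_variables(run INTEGER, name TEXT, value BLOB)")
    conn.executemany(
        "INSERT INTO run_variables VALUES (?, ?, ?)",
        [(1, "scalar", 3.5), (1, "comment", "it's fine"), (2, "empty", None)],
    )
    conn.commit()
    return conn


def migrate_add_summary_method(conn):
    """Hypothetical migration step: add a nullable summary_method column."""
    conn.execute("ALTER TABLE run_variables ADD COLUMN summary_method TEXT")
    conn.commit()


def test_migration_preserves_rows():
    conn = make_old_database()
    query = "SELECT run, name, value FROM run_variables ORDER BY run, name"
    before = conn.execute(query).fetchall()

    migrate_add_summary_method(conn)

    # Existing data must survive the migration unchanged.
    assert conn.execute(query).fetchall() == before
    # The new column exists and is empty for pre-existing rows.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(run_variables)")]
    assert "summary_method" in columns
    filled = conn.execute(
        "SELECT COUNT(*) FROM run_variables WHERE summary_method IS NOT NULL"
    ).fetchone()[0]
    assert filled == 0
```

With a stored fixture, `make_old_database` would load it instead of building the rows inline; the assertions would stay the same.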

@takluyver
Member

I might store the SQL to recreate the database, rather than the binary database file, assuming the extra size isn't enough to be a problem. Then it's transparent what it does, and easy to add to if we think of more corner cases. If you make the database, you can ask SQLite to dump it out as SQL instructions, so we don't have to write this by hand.
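For reference, the dump can also be produced from Python with `sqlite3.Connection.iterdump()`, which yields the same kind of SQL as the `.dump` command in the `sqlite3` shell. A minimal sketch of the round trip, with a made-up table standing in for the real one:

```python
import sqlite3

# Build a small throwaway database (hypothetical schema, for illustration only).
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE run_comments(run INTEGER, comment TEXT)")
src.execute("INSERT INTO run_comments VALUES (1, 'edge case: it''s quoted')")
src.commit()

# iterdump() yields the SQL statements needed to recreate the database.
# This text is what would be committed to git instead of the binary file.
fixture_sql = "\n".join(src.iterdump())

# In a test, the stored SQL is replayed into a fresh database.
dst = sqlite3.connect(":memory:")
dst.executescript(fixture_sql)
rows = dst.execute("SELECT run, comment FROM run_comments").fetchall()
```

Adding a new corner case is then a matter of appending an `INSERT` statement to the stored SQL file.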

@takluyver
Member

But other than that, it sounds like a good plan. 👍

@JamesWrigley
Member

JamesWrigley commented Jan 19, 2024

Changes to the storage format between the original #134 and today:

  • Removed the stored_type attribute on the summary datasets in the HDF5 files in favour of a _damnit_objtype attribute on the variable group. All of the object type strings were converted to lowercase, and some were deleted.
  • run_variables.stored_type (type in the HDF5 files) -> run_variables.summary_type (type in the DB).
  • Blobs are identified by their binary header rather than an attribute in the HDF5 files.
  • Thumbnails are stored as PNGs rather than arrays in the HDF5 files, and as binary PNG blobs rather than pickled arrays in the DB.
  • Thumbnails are downsampled to 35x35 immediately by the backend.
  • Added a summary_method column.

(commenting to get this right in my head)
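To make the "identified by their binary header" point concrete, here is a minimal sketch of header-based detection. The only format checked is PNG, inferred from the thumbnail change above; the 8-byte signature is the standard PNG one, while `blob_kind` and its return values are hypothetical names, not the project's API.

```python
# Standard 8-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def blob_kind(data: bytes) -> str:
    """Classify a binary blob by its leading bytes (illustrative only)."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    return "unknown"
```

For example, `blob_kind(PNG_SIGNATURE + b"...")` returns `"png"`, while a blob with any other prefix falls through to `"unknown"`. The benefit is that no separate attribute has to be kept in sync with the data itself.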

@CammilleCC

Database v1 has been successfully deployed, and most of the existing databases have been migrated.


4 participants