Registering new file formats for the ContentsManager #1456

Open
martinRenou opened this issue Aug 30, 2024 · 0 comments

cc. @davidbrochart @trungleduc @brichet

Problem

When building extensions like JupyterGIS or JupyterCAD, we quickly need to support advanced file formats such as .qgz and .fcstd. Because those are binary formats, the ContentsManager returns the base64-encoded version of the file when its content is requested. As of today, the default ContentsManager supports text files, binary files as base64, and notebooks as JSON.
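To make the current behaviour concrete, here is a minimal sketch of what a consumer receives for a binary file and what it then has to do with it. The model is built by hand to mimic the shape of a contents API response (`type`, `format`, `content`). The file name, the payload and the `decode_content` helper are made up for illustration, and no real ContentsManager is called.

```python
import base64

# Hand-built stand-in for the model returned for a binary file.
# The name and payload are hypothetical; only the type/format/content
# keys mirror what the contents API returns.
raw_bytes = b"PK\x03\x04 fake archive payload"
model = {
    "name": "project.qgz",
    "path": "project.qgz",
    "type": "file",
    "format": "base64",
    "content": base64.b64encode(raw_bytes).decode("ascii"),
}


def decode_content(model: dict) -> bytes:
    """Return the raw bytes of a file model (hypothetical helper)."""
    if model["format"] == "base64":
        return base64.b64decode(model["content"])
    if model["format"] == "text":
        return model["content"].encode("utf-8")
    raise ValueError(f"unsupported format: {model['format']!r}")


decoded = decode_content(model)
```

Every extension that wants structured data out of such a file has to repeat this decode step and then parse the resulting bytes itself.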

In the case of FreeCAD's .fcstd format, we were able to work around that problem ourselves by handling the base64 source with FreeCAD's Python library, but this means the file is read and written multiple times. It also only works in the specific case of jupyter-collaboration, where we can hook into the file-loading logic to turn the original source into something we understand for the collaboration logic (a JSON representation of the content).

Proposed Solution

It would be great to handle those specific file formats ourselves directly at the ContentsManager level, which would avoid the issues mentioned above.

One solution could be to provide our own ContentsManager, but because we are building libraries, we don't want to override a custom ContentsManager the user may have already set. Our libraries would also collide with each other, each trying to set its own ContentsManager.

I would like to suggest being able to configure (maybe through traitlets) the ContentsManager, providing custom "file adapters" that would handle the read/write for specific file extensions.
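As a rough illustration of the idea, here is a sketch of what a "file adapter" registry keyed by file extension could look like. Everything in it (`FileAdapter`, `AdapterRegistry`, the `.toycad` extension, the JSON-based toy adapter) is hypothetical and not an existing Jupyter Server API. A plain dictionary stands in for whatever traitlets-based configuration would actually be used.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional, Protocol


class FileAdapter(Protocol):
    """Hypothetical interface: converts between bytes on disk and model content."""

    def read(self, data: bytes) -> Any: ...

    def write(self, content: Any) -> bytes: ...


class JSONAdapter:
    """Toy adapter standing in for a real format such as .fcstd or .qgz."""

    def read(self, data: bytes) -> Any:
        return json.loads(data.decode("utf-8"))

    def write(self, content: Any) -> bytes:
        return json.dumps(content).encode("utf-8")


class AdapterRegistry:
    """Maps lowercase file extensions to adapters (hypothetical)."""

    def __init__(self) -> None:
        self._adapters: Dict[str, FileAdapter] = {}

    def register(self, extension: str, adapter: FileAdapter) -> None:
        ext = extension.lower()
        if ext in self._adapters:
            # Surface collisions between libraries instead of silently overriding.
            raise ValueError(f"an adapter is already registered for {ext!r}")
        self._adapters[ext] = adapter

    def lookup(self, path: str) -> Optional[FileAdapter]:
        """Return the adapter for this path, or None to fall back to default handling."""
        return self._adapters.get(Path(path).suffix.lower())


registry = AdapterRegistry()
registry.register(".toycad", JSONAdapter())

adapter = registry.lookup("models/part.ToyCAD")
content = adapter.read(b'{"shapes": ["box"]}')
```

With something along these lines, each library would register only its own extensions, and files without a registered adapter would keep going through the existing text/base64/notebook paths.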

Additional context

Notes from the jupyter-server meeting August 29th 2024 where we discussed this:

  • Could the base ContentsManager register custom file types?
  • Today, we set the type of content using two keys in the contents REST API model, type and format.
  • Extending would mean adding new type/format combinations.
  • Problems we observed:
    • The contents handler in the Jupyter Server owns the logic to check these keys, not the manager.
    • There is a discrepancy between the formats allowed in the ContentsHandlers and those allowed in the ContentsManager. For example, "json" is not a valid format in the handler, but it is valid in the contents manager.
    • There is a lot of logic in the handler that should be pushed down into the manager.
    • The contents manager raises HTTP errors? That's strange (and probably wrong).
  • In general, it would be ideal to have a registration/plugin point to bring your own file types to Jupyter Server without requiring folks to bring a custom contents manager to support them. Instead, we want to keep using the default contents manager while still being able to act on files that aren't notebooks, text, or base64 strings.
  • Comments in the chat:
    • The limitation of the contents GET approach will always be that it does not support streaming the content, unless someone has ideas.
    • Maybe the "adapter" could have a public serialize() method and, in the future, an async aserialize() if we ever support streaming?
    • Compressed notebooks are also an idea.
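
The last two chat comments can be sketched together: an adapter for gzip-compressed notebooks that exposes a synchronous `serialize()` and an async `aserialize()`. This is only an illustration under assumptions. `GzipNotebookAdapter`, `deserialize()`, `chunk_size` and `collect()` are hypothetical names, and the async variant here still builds the full payload before chunking it, so it shows the shape of the interface rather than true streaming.

```python
import asyncio
import gzip
import json
from typing import Any, AsyncIterator


class GzipNotebookAdapter:
    """Hypothetical adapter for gzip-compressed notebook JSON."""

    def deserialize(self, data: bytes) -> Any:
        return json.loads(gzip.decompress(data).decode("utf-8"))

    def serialize(self, content: Any) -> bytes:
        return gzip.compress(json.dumps(content).encode("utf-8"))

    async def aserialize(
        self, content: Any, chunk_size: int = 1024
    ) -> AsyncIterator[bytes]:
        """Yield the serialized bytes in chunks.

        Not real streaming: the payload is produced in one go and then sliced.
        """
        payload = self.serialize(content)
        for start in range(0, len(payload), chunk_size):
            yield payload[start:start + chunk_size]
            await asyncio.sleep(0)  # let other tasks run between chunks


async def collect(adapter: GzipNotebookAdapter, content: Any) -> bytes:
    """Consume the async chunks and join them (stand-in for an HTTP response)."""
    chunks = [chunk async for chunk in adapter.aserialize(content, chunk_size=16)]
    return b"".join(chunks)


notebook = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}
adapter = GzipNotebookAdapter()
streamed = asyncio.run(collect(adapter, notebook))
```

Keeping `serialize()` as the required method and treating `aserialize()` as an optional extension would let existing handlers work unchanged while leaving room for a streaming endpoint later.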