feat: embedding generation from web scraping #16
Merged
Introduce new APIs under the `/embeddings` router. Precisely:

- `POST /embeddings/generate`: given a URL, starts crawling from it, scraping the pages found, and generating embeddings from the scraped text of `text/html` web pages. The task is executed in the background: a response is returned as soon as the process starts. Only one run can execute at a time because of a lock: if another POST is requested while a run is in progress, the response to that request is a 409.
- `GET /embeddings/status`: returns `{ status: "idle" }` if no embeddings are being generated, or `{ status: "running" }` if the process is running.

Also: