feat: embedding generation from web scraping #16

Merged
merged 8 commits into from
Sep 5, 2024

@ThisIsDemetrio ThisIsDemetrio commented Aug 23, 2024

Introduce new APIs under the /embeddings router.

Specifically:

  • POST /embeddings/generate: given a URL, starts crawling the site, scraping the text of its text/html pages, and generating embeddings from the scraped text. The task runs in the background, so the response is returned as soon as the process starts. A lock allows only one run at a time: a POST sent while a run is in progress gets a 409 response.
  • GET /embeddings/status: returns { status: "idle" } if no embeddings are being generated, or { status: "running" } if the process is running.
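
The lock and status behaviour of the two endpoints can be sketched roughly as below. This is not the code from this PR: it is a minimal, framework-free illustration using only the Python standard library. The names `EmbeddingsService`, `generate`, `status` and `pipeline` are hypothetical, and the 202 returned on success is an assumption (the description only says a response is returned as soon as the process starts).

```python
import threading
from typing import Callable, Dict, Tuple


class EmbeddingsService:
    """Hypothetical stand-in for the logic behind the /embeddings router."""

    def __init__(self, pipeline: Callable[[str], None]) -> None:
        # `pipeline` does the crawl + scrape + embedding work for one URL.
        # It is supplied by the caller; this sketch does not implement it.
        self._pipeline = pipeline
        # The lock is held for the whole duration of a run.
        self._lock = threading.Lock()

    def generate(self, url: str) -> Tuple[int, Dict[str, str]]:
        """Mimics POST /embeddings/generate: returns (status_code, body)."""
        # Non-blocking acquire: if a run is in progress, answer 409 at once.
        if not self._lock.acquire(blocking=False):
            return 409, {"error": "embedding generation already running"}
        try:
            worker = threading.Thread(target=self._run, args=(url,), daemon=True)
            worker.start()
        except Exception:
            # Do not leave the lock held if the worker could not be started.
            self._lock.release()
            raise
        # Assumption: 202 Accepted, since the work continues in the background.
        return 202, {"status": "started"}

    def _run(self, url: str) -> None:
        try:
            self._pipeline(url)
        finally:
            # Release even if the pipeline raises, so later runs are possible.
            self._lock.release()

    def status(self) -> Dict[str, str]:
        """Mimics GET /embeddings/status."""
        return {"status": "running" if self._lock.locked() else "idle"}
```

In a web framework, the route handlers would call `generate` and `status` and translate the returned status code into the HTTP response. The non-blocking acquire is what makes a concurrent POST fail fast with a 409 instead of queueing behind the running task.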

Also:

  • add some basic unit tests
  • update the documentation

@ThisIsDemetrio ThisIsDemetrio changed the title feat/embedding generation from scraping feat: embedding generation from web scraping Aug 23, 2024
@ThisIsDemetrio ThisIsDemetrio merged commit f9effe8 into main Sep 5, 2024
1 check passed
@ThisIsDemetrio ThisIsDemetrio deleted the feat/embedding-generation-from-scraping branch September 5, 2024 07:29