
Introduce ZYTE_API_MAX_REQUESTS #114

Merged: 6 commits into main on Aug 3, 2023

Conversation

@BurnzZ (Member) commented Jul 19, 2023

An attempt at #111 to gather feedback.

This contains minimal changes that address only the limit on Zyte API requests. A usage sketch follows the TODO list below.

In the future, we might also want to impose other limits, like the following, which may affect the implementation:

  • item count
  • per data type
  • per page type
  • etc

TODO

  • README
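
For reviewers who want to try this out, here is a minimal sketch of enabling the new setting in a Scrapy project. The handler, middleware, and reactor settings follow the usual scrapy-zyte-api setup from the README; only ZYTE_API_MAX_REQUESTS is introduced by this PR, and the value 100 is just an example:

```python
# settings.py: a minimal sketch. The surrounding setup follows the usual
# scrapy-zyte-api README; only ZYTE_API_MAX_REQUESTS is new in this PR.

DOWNLOAD_HANDLERS = {
    "http": "scrapy_zyte_api.ScrapyZyteAPIDownloadHandler",
    "https": "scrapy_zyte_api.ScrapyZyteAPIDownloadHandler",
}
DOWNLOADER_MIDDLEWARES = {
    "scrapy_zyte_api.ScrapyZyteAPIDownloaderMiddleware": 1000,
}
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
ZYTE_API_TRANSPARENT_MODE = True  # route regular requests through Zyte API

# New in this PR: cap the number of Zyte API requests for the whole crawl.
# Once reached, the spider closes with reason "closespider_max_zapi_requests".
ZYTE_API_MAX_REQUESTS = 100  # example value
```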

codecov bot commented Jul 19, 2023

Codecov Report

Merging #114 (d0d4f8f) into main (23762b6) will increase coverage by 9.00%.
The report is 1 commit behind head on main.
The diff coverage is 100.00%.

❗ Current head d0d4f8f differs from pull request most recent head 2820d56. Consider uploading reports for the commit 2820d56 to get more accurate results

@@            Coverage Diff             @@
##             main     #114      +/-   ##
==========================================
+ Coverage   89.48%   98.49%   +9.00%     
==========================================
  Files           9        9              
  Lines         647      666      +19     
==========================================
+ Hits          579      656      +77     
+ Misses         68       10      -58     
Files Changed                                 Coverage            Δ
scrapy_zyte_api/_downloader_middleware.py     100.00% <100.00%>   (ø)
scrapy_zyte_api/_params.py                    100.00% <100.00%>   (ø)

... and 2 files with indirect coverage changes

@BurnzZ BurnzZ requested review from kmike, Gallaecio and wRAR July 19, 2023 12:04

# Close the spider and skip this request once the Zyte API request limit is hit.
if self._max_requests_reached(downloader):
    self._crawler.engine.close_spider(spider, "closespider_max_zapi_requests")
    raise IgnoreRequest("Reached max Zyte API requests")
Contributor

Since, I think, this code will be executed several times, I wonder how it will look in the log; hopefully not too ugly.

Member Author

There should just be a few of them since the spider is immediately closed.

Contributor

As long as the message reads “this request is skipped because the max has been reached” rather than just “the max has been reached”, I think it is OK for this to appear several times.

Member Author

Message updated in 660cdb2

@wRAR (Contributor) commented Jul 19, 2023

I think the approach is fine, though I haven't thought about the download_req_count calculation.
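
To make the download_req_count concern concrete, here is a hypothetical illustration of how such a check could combine the Zyte API request count from Scrapy stats with the downloader's in-flight requests. The method name comes from the diff above, but the stat key and the way in-flight requests are counted are assumptions for illustration, not necessarily the code merged here:

```python
# Hypothetical sketch of the limit check. The stat key and the in-flight
# request counting are assumptions, not necessarily the merged code.
def _max_requests_reached(self, downloader) -> bool:
    if not self._max_requests:
        return False
    # Zyte API requests already processed (assumed stat key).
    zapi_req_count = self._crawler.stats.get_value("scrapy-zyte-api/processed", 0)
    # Requests the downloader is currently transferring (Scrapy internals).
    download_req_count = sum(
        len(slot.transferring) for slot in downloader.slots.values()
    )
    return zapi_req_count + download_req_count >= self._max_requests
```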

@BurnzZ BurnzZ marked this pull request as ready for review August 2, 2023 11:56
@BurnzZ (Member Author) commented Aug 2, 2023

Hey folks, I've unmarked this one as a draft.

I think we can consider other maximums, such as item count, per data type, and per page type, in a future PR.

Resolved review threads (outdated):
  • README.rst
  • README.rst
  • scrapy_zyte_api/_downloader_middleware.py
Co-authored-by: Adrián Chaves <[email protected]>
@kmike kmike merged commit 3c09dfc into main Aug 3, 2023
14 checks passed
@kmike (Member) commented Aug 3, 2023

Thanks @BurnzZ!
