How to batch crawl? #12

Open
zglaozhu opened this issue Jan 4, 2024 · 6 comments

Comments

@zglaozhu

zglaozhu commented Jan 4, 2024

Can this tool only crawl one URL at a time?
How can I batch crawl a list of URLs?

@enenumxela
Member

You can use --seeds to pass a file of URLs to crawl as a batch. --domain must be specified along with it.

@zglaozhu
Author

zglaozhu commented Jan 5, 2024

Sorry, I'm still not sure. On Windows 10, I have a URL file, 1.txt, containing:
http://t.github.com/1
http://t.github.com/4

What command should I use to batch scan?
Something like xcrawl3r --seeds --domain 1.txt --user-agent --debug -o 2.txt?

@enenumxela
Member

xcrawl3r --seeds 1.txt --domain github.com -o 2.txt

Only URLs that share the same root domain are supported.
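
Since every seed has to fall under the single --domain value, it may help to check the seed list before crawling. The following is a minimal sketch, not part of xcrawl3r: the helper name `seeds_outside_domain` is hypothetical, and it assumes a seed is in scope when its host equals the domain or is one of its subdomains.

```python
from urllib.parse import urlsplit


def seeds_outside_domain(urls, domain):
    """Return the seed URLs whose host is neither `domain` nor one of its subdomains."""
    domain = domain.lower()
    outside = []
    for url in urls:
        # hostname is None for lines that are not absolute URLs; treat those as out of scope.
        host = (urlsplit(url.strip()).hostname or "").lower()
        if host != domain and not host.endswith("." + domain):
            outside.append(url)
    return outside


seeds = [
    "http://t.github.com/1",
    "http://t.github.com/4",
    "http://t.google.com/1",
]
# Lists the seeds that would not match --domain github.com.
print(seeds_outside_domain(seeds, "github.com"))
```

An empty result suggests the file can be passed to --seeds with that --domain value as-is.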

@zglaozhu
Author

zglaozhu commented Jan 5, 2024

What about URLs with different root domains? For example:
http://t.github.com/1
http://t.github.com/4
http://t.google.com/1
http://t.googlw.com/4

Would it be something like this?
xcrawl3r --seeds 1.txt --domain github.com,google.com,googlw.com -o 2.txt

@enenumxela
Member

That's not supported yet. Nice idea, though. I will try to implement it.
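
Until multiple root domains are supported, one possible workaround is to split the seed file by root domain and run one crawl per group. This is a rough sketch under stated assumptions, not an xcrawl3r feature: the helper names and the `seeds-<domain>.txt` / `out-<domain>.txt` file names are hypothetical, and the root domain is guessed naively as the last two labels of the host, which is wrong for suffixes such as co.uk. Only the --seeds, --domain and -o flags shown in this thread are used.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def root_domain(url):
    """Naive root domain: the last two labels of the host (assumption; wrong for e.g. co.uk)."""
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def group_seeds(urls):
    """Map each root domain to its seed URLs, preserving input order."""
    groups = defaultdict(list)
    for url in urls:
        url = url.strip()
        domain = root_domain(url) if url else ""
        if domain:  # skip blank lines and lines without a host
            groups[domain].append(url)
    return dict(groups)


def write_seed_files(groups):
    """Write one hypothetical seeds-<domain>.txt file per root domain."""
    for domain, urls in groups.items():
        with open(f"seeds-{domain}.txt", "w", encoding="utf-8") as fh:
            fh.write("\n".join(urls) + "\n")


def build_commands(groups):
    """Build one xcrawl3r argument list per root domain."""
    return [
        ["xcrawl3r", "--seeds", f"seeds-{domain}.txt",
         "--domain", domain, "-o", f"out-{domain}.txt"]
        for domain in groups
    ]


seeds = [
    "http://t.github.com/1",
    "http://t.github.com/4",
    "http://t.google.com/1",
    "http://t.googlw.com/4",
]
groups = group_seeds(seeds)
for cmd in build_commands(groups):
    print(" ".join(cmd))
```

Calling `write_seed_files(groups)` first creates the per-domain seed files; each printed command can then be run by hand or passed to `subprocess.run`.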

@zglaozhu
Author

zglaozhu commented Jan 7, 2024

Okay, thank you all the same
