mirror of
https://github.com/Garmelon/PFERD.git
synced 2023-12-21 10:23:01 +01:00
Change limiter logic
Now download tasks are a subset of all tasks.
19
CONFIG.md
@@ -64,17 +64,14 @@ crawlers:
   remote file is different.
 - `transform`: Rules for renaming and excluding certain files and directories.
   For more details, see [this section](#transformation-rules). (Default: empty)
-- `max_concurrent_crawls`: The maximum number of concurrent crawl actions. What
-  constitutes a crawl action might vary from crawler to crawler, but it usually
-  means an HTTP request of a page to analyze. (Default: 1)
-- `max_concurrent_downloads`: The maximum number of concurrent download actions.
-  What constitutes a download action might vary from crawler to crawler, but it
-  usually means an HTTP request for a single file. (Default: 1)
-- `request_delay`: Time (in seconds) that the crawler should wait between
-  subsequent requests. Can be used to avoid unnecessary strain for the crawl
-  target. Crawl and download actions are handled separately, meaning that a
-  download action might immediately follow a crawl action even if this is set to
-  a nonzero value. (Default: 0)
+- `max_concurrent_tasks`: The maximum number of concurrent tasks (such as
+  crawling or downloading). (Default: 1)
+- `max_concurrent_downloads`: How many of those tasks can be download tasks at
+  the same time. Must not be greater than `max_concurrent_tasks`. When not set,
+  this is the same as `max_concurrent_tasks`. (Optional)
+- `delay_between_tasks`: Time (in seconds) that the crawler should wait between
+  subsequent tasks. Can be used as a sort of rate limit to avoid unnecessary
+  load for the crawl target. (Default: 0.0)

 Some crawlers may also require credentials for authentication. To configure how
 the crawler obtains its credentials, the `auth` option is used. It is set to the
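The following is a minimal sketch of the "download tasks are a subset of all tasks" idea, assuming an asyncio-based crawler. Only the three option names come from the CONFIG.md text above. The `Limiter` class, its method names, and `run_demo` are hypothetical and are not taken from PFERD's source. The delay handling is a simplification that sleeps before each task instead of tracking the time since the previous one.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Optional, Tuple


class Limiter:
    """Hypothetical limiter: every download also counts as a task."""

    def __init__(
        self,
        max_concurrent_tasks: int = 1,
        max_concurrent_downloads: Optional[int] = None,
        delay_between_tasks: float = 0.0,
    ) -> None:
        # When unset, downloads share the overall task limit.
        if max_concurrent_downloads is None:
            max_concurrent_downloads = max_concurrent_tasks
        if max_concurrent_downloads > max_concurrent_tasks:
            raise ValueError(
                "max_concurrent_downloads must not exceed max_concurrent_tasks"
            )
        self._tasks = asyncio.Semaphore(max_concurrent_tasks)
        self._downloads = asyncio.Semaphore(max_concurrent_downloads)
        self._delay = delay_between_tasks

    @asynccontextmanager
    async def limit_task(self) -> AsyncIterator[None]:
        # Any task (crawl or download) needs one task slot.
        async with self._tasks:
            if self._delay > 0:
                await asyncio.sleep(self._delay)  # simplified rate limit
            yield

    @asynccontextmanager
    async def limit_download(self) -> AsyncIterator[None]:
        # A download needs a download slot AND a task slot.
        async with self._downloads:
            async with self.limit_task():
                yield


async def run_demo() -> Tuple[int, int]:
    """Run 4 downloads and 4 crawls; return (peak tasks, peak downloads)."""
    limiter = Limiter(max_concurrent_tasks=3, max_concurrent_downloads=2)
    active = {"tasks": 0, "downloads": 0}
    peak = {"tasks": 0, "downloads": 0}

    async def job(is_download: bool) -> None:
        slot = limiter.limit_download() if is_download else limiter.limit_task()
        async with slot:
            active["tasks"] += 1
            if is_download:
                active["downloads"] += 1
            for key in active:
                peak[key] = max(peak[key], active[key])
            await asyncio.sleep(0.01)  # stand-in for real network work
            if is_download:
                active["downloads"] -= 1
            active["tasks"] -= 1

    await asyncio.gather(*(job(i % 2 == 0) for i in range(8)))
    return peak["tasks"], peak["downloads"]


if __name__ == "__main__":
    print(asyncio.run(run_demo()))
```

In this sketch a download always takes its download slot before its task slot, and a crawl only ever takes a task slot, so the two semaphores cannot deadlock against each other. The `Limiter` is constructed inside the running coroutine so that the semaphores are tied to the active event loop on older Python versions as well.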