mirror of
https://github.com/Garmelon/PFERD.git
synced 2023-12-21 10:23:01 +01:00
Document local crawler and auth
parent 60cd9873bc
commit f9b2fd60e2
24
CONFIG.md
@@ -65,6 +65,23 @@ crawlers:
 - `transform`: Rules for renaming and excluding certain files and directories.
   For more details, see [this section](#transformation-rules). (Default: empty)
 
+Some crawlers may also require credentials for authentication. To configure how
+the crawler obtains its credentials, the `auth` option is used. It is set to the
+full name of an auth section (including the `auth:` prefix).
+
+Here is a simple example:
+
+```
+[auth:example]
+type = simple
+username = foo
+password = bar
+
+[crawl:something]
+type = some-complex-crawler
+auth = auth:example
+```
+
 ## The `auth:*` sections
 
 Sections whose names start with `auth:` are used to configure authenticators. An
@@ -82,7 +99,12 @@ authenticators is `type`:
 
 ## Crawler types
 
-TODO Fill in as crawlers are implemented
+### The `local` crawler
+
+This crawler crawls a local directory. It is really simple and mostly useful for
+testing different setups.
+
+- `path`: Path to the local directory to crawl. (No default, must be specified)
 
 ## Authenticator types
 
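Review note on the CONFIG.md changes: the sketch below shows one way the `auth` option of a crawl section could be resolved to the auth section it names. It assumes the config is INI-style and can be read with Python's standard `configparser`, which this diff does not confirm. `resolve_auth` is a hypothetical helper, not part of PFERD, and the `[crawl:local-example]` section (including `type = local`) is an illustrative guess based on the new `local` crawler heading.

```python
import configparser
from typing import Optional

# Config text modelled on the example added to CONFIG.md. The
# [crawl:local-example] section is an assumption made for illustration.
CONFIG_TEXT = """
[auth:example]
type = simple
username = foo
password = bar

[crawl:something]
type = some-complex-crawler
auth = auth:example

[crawl:local-example]
type = local
path = ./some-directory
"""


def resolve_auth(
    parser: configparser.ConfigParser, crawl_section: str
) -> Optional[configparser.SectionProxy]:
    """Return the auth section a crawl section refers to, or None if it has no `auth` option."""
    auth_name = parser[crawl_section].get("auth")
    if auth_name is None:
        return None
    # The documentation says the value is the full section name, prefix included.
    if not auth_name.startswith("auth:"):
        raise ValueError(f"auth must name an 'auth:' section, got {auth_name!r}")
    if auth_name not in parser:
        raise KeyError(f"no such auth section: {auth_name!r}")
    return parser[auth_name]


parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

auth = resolve_auth(parser, "crawl:something")
print(auth["type"], auth["username"])
print(resolve_auth(parser, "crawl:local-example"))
```

Requiring the full section name, prefix included, keeps the lookup a plain dictionary access and makes a mistyped reference fail loudly instead of silently crawling without credentials.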
@@ -22,7 +22,7 @@ class LocalCrawler(Crawler):
     ):
         super().__init__(name, config, section)
 
-        self._path = section.path()
+        self._path = config.working_dir / section.path()
 
     async def crawl(self) -> None:
         await self._crawl_path(self._path, PurePath())
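Review note on the changed line: it joins `config.working_dir` with the configured `path`. Assuming both values are `pathlib` paths (the hunk uses `PurePath`, but the types of these two values are not shown), the `/` operator resolves a relative `path` against the working directory, while an absolute `path` replaces it entirely. The sketch below demonstrates that behaviour with `PurePosixPath` so the result is the same on every platform. `resolve_crawl_path` and the example directories are hypothetical.

```python
from pathlib import PurePosixPath


def resolve_crawl_path(working_dir: PurePosixPath, configured: PurePosixPath) -> PurePosixPath:
    """Join the configured path onto the working directory, as the changed line does."""
    # pathlib discards the left operand when the right operand is absolute.
    return working_dir / configured


working_dir = PurePosixPath("/home/user/sync")

relative = resolve_crawl_path(working_dir, PurePosixPath("lectures"))
absolute = resolve_crawl_path(working_dir, PurePosixPath("/srv/data"))

print(relative)  # /home/user/sync/lectures
print(absolute)  # /srv/data
```

Under that assumption, relative `path` values now depend on the working directory, while configs that use absolute paths resolve to the same location as before.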