Document local crawler and auth

Joscha 2021-05-09 01:33:47 +02:00
parent 60cd9873bc
commit f9b2fd60e2
2 changed files with 24 additions and 2 deletions

@@ -65,6 +65,23 @@ crawlers:
- `transform`: Rules for renaming and excluding certain files and directories.
  For more details, see [this section](#transformation-rules). (Default: empty)

Some crawlers may also require credentials for authentication. To configure how
the crawler obtains its credentials, the `auth` option is used. It is set to the
full name of an auth section (including the `auth:` prefix).

Here is a simple example:
```
[auth:example]
type = simple
username = foo
password = bar

[crawl:something]
type = some-complex-crawler
auth = auth:example
```
## The `auth:*` sections

Sections whose names start with `auth:` are used to configure authenticators. An
@@ -82,7 +99,12 @@ authenticators is `type`:
## Crawler types

-TODO Fill in as crawlers are implemented
+### The `local` crawler

This crawler crawls a local directory. It is really simple and mostly useful for
testing different setups.

- `path`: Path to the local directory to crawl. (No default, must be specified)
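
Putting this together, a minimal configuration for the `local` crawler could look like the following (the section name and path are made up for illustration):

```
[crawl:local-example]
type = local
path = ./test-files
```
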
## Authenticator types

@@ -22,7 +22,7 @@ class LocalCrawler(Crawler):
    ):
        super().__init__(name, config, section)

-        self._path = section.path()
+        self._path = config.working_dir / section.path()

    async def crawl(self) -> None:
        await self._crawl_path(self._path, PurePath())
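
The changed line joins the configured path onto the crawler's working directory. Because this uses pathlib's `/` operator, a relative `path` is resolved against `working_dir`, while an absolute `path` is kept as-is. A quick sketch of that behavior (the paths below are made up):

```
from pathlib import Path

working_dir = Path("/home/user/pferd")

# A relative path is resolved against the working directory.
print(working_dir / Path("data"))       # /home/user/pferd/data

# Joining with an absolute path yields the absolute path unchanged.
print(working_dir / Path("/tmp/data"))  # /tmp/data
```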