Joscha
803e5628a2
Clean up logging
...
Paths are now (hopefully) logged consistently across all crawlers
2021-05-23 11:37:19 +02:00
Joscha
c88f20859a
Explain config file dumping
2021-05-23 11:04:50 +02:00
Joscha
54dd2f8337
Clean up main and improve error handling
2021-05-22 16:47:24 +02:00
Joscha
3300886120
Explain config file loading
2021-05-19 18:11:43 +02:00
Joscha
0d10752b5a
Configure explain log level via cli and config file
2021-05-19 17:50:10 +02:00
Joscha
05573ccc53
Add fancy CLI options
2021-05-15 22:22:01 +02:00
Joscha
b70b62cef5
Make crawler sections start with "crawl:"
...
Also, use only the part of the section name after the "crawl:" prefix as the
crawler's output directory. Now, the implementation matches the documentation again.
2021-05-15 17:24:37 +02:00
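A minimal sketch of the section naming rule described in the commit above, assuming an INI-style config read with Python's standard `configparser`. The helper `crawler_output_dirs`, the section names, and the `type` keys are hypothetical illustrations, not taken from the project's code.

```python
from configparser import ConfigParser
from pathlib import Path
from typing import Dict

CRAWL_PREFIX = "crawl:"  # prefix named in the commit message


def crawler_output_dirs(config: ConfigParser) -> Dict[str, Path]:
    """Map each crawler section name to an output directory (hypothetical helper).

    Only sections starting with "crawl:" count as crawlers. The directory
    is the part of the section name after that prefix.
    """
    dirs = {}
    for section in config.sections():
        if section.startswith(CRAWL_PREFIX):
            dirs[section] = Path(section[len(CRAWL_PREFIX):])
    return dirs


# Example config; the section names and keys are made up for illustration.
config = ConfigParser()
config.read_string("""
[auth:example]
type = simple

[crawl:lectures]
type = local
""")

dirs = crawler_output_dirs(config)
```

Here the `auth:example` section is skipped, and `crawl:lectures` maps to the directory `lectures`. The full section name stays the dictionary key, in line with the later decision to treat the prefix as part of the crawler's name.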
Joscha
595de88d96
Fix authenticator and crawler names
...
Now, the "auth:" and "crawl:" parts are considered part of the name. This fixes
crawlers not being able to find their authenticators.
2021-05-15 15:25:05 +02:00
Joscha
296a169dd3
Make limiter logic more complex
...
The limiter can now distinguish between crawl and download actions and has a
fancy slot system and delay logic.
2021-05-15 15:25:05 +02:00
Joscha
0459ed093e
Add simple authenticator
...
... including some required authenticator infrastructure
2021-05-11 00:28:03 +02:00
Joscha
d5f29f01c5
Use global conductor instance
...
The switch from crawler-local conductors to a single pferd-global conductor was
made to prepare for auth section credential providers.
2021-05-11 00:05:04 +02:00
Joscha
273d56c39a
Properly load crawler config
2021-05-05 23:45:10 +02:00
Joscha
91c33596da
Load crawlers from config file
2021-04-30 16:22:14 +02:00
Joscha
fbebc46c58
Load and dump config
2021-04-29 09:51:50 +02:00