Commit Graph

24 Commits

SHA1 Message Date
c454fabc9d Add support for exercises in ILIAS crawler 2021-05-15 21:40:17 +02:00
7d323ec62b Implement video downloads in ILIAS crawler 2021-05-15 21:32:32 +02:00
c7494e32ce Start implementing crawling in ILIAS crawler
The ILIAS crawler can now crawl quite a few file types, split off
folders, and crawl them concurrently.
2021-05-15 20:42:18 +02:00
1123c8884d Implement an IliasPage
This allows PFERD to semantically understand ILIAS HTML and is the
foundation for the ILIAS crawler. This patch extends the ILIAS crawler
to crawl the personal desktop and print the elements on it.
2021-05-15 18:59:23 +02:00
8c32da7f19 Let authenticators provide username and password separately 2021-05-15 18:27:03 +02:00
d63494908d Properly invalidate exceptions
The simple authenticator now properly invalidates its credentials. Also, the
invalidation functions have been given better names and documentation.
2021-05-15 17:37:05 +02:00
868f486922 Rename local crawler path to target 2021-05-15 17:12:25 +02:00
b2a2b5999b Implement ILIAS auth and crawl home page
This commit introduces the necessary machinery to authenticate with
ILIAS and crawl the home page.

It can't do much yet and just silently fetches the homepage.
2021-05-15 15:25:05 +02:00
b0f731bf84 Make crawlers use transformers 2021-05-15 15:25:05 +02:00
302b8c0c34 Fix errors loading local crawler config
Apparently getint and getfloat may return None even though this is not
mentioned in their type annotations.
2021-05-15 15:25:05 +02:00
ed2e19a150 Add reasons for invalid values 2021-05-15 15:25:05 +02:00
1591cb9197 Add options to slow down local crawler
These options are meant to make the local crawler behave more like a
network-based crawler for purposes of testing and debugging other parts of the
code base.
2021-05-15 15:25:01 +02:00
94d6a01cca Use file mtime in local crawler 2021-05-13 19:42:40 +02:00
0acdee15a0 Let crawlers obtain authenticators 2021-05-13 18:57:20 +02:00
c3ce6bb31c Fix crawler cleanup not being awaited 2021-05-11 00:28:45 +02:00
d5f29f01c5 Use global conductor instance
The switch from crawler-local conductors to a single pferd-global conductor was
made to prepare for auth section credential providers.
2021-05-11 00:05:04 +02:00
595ba8b7ab Remove dummy crawler 2021-05-10 23:47:46 +02:00
f9b2fd60e2 Document local crawler and auth 2021-05-09 01:33:47 +02:00
60cd9873bc Add local file crawler 2021-05-06 01:02:40 +02:00
273d56c39a Properly load crawler config 2021-05-05 23:45:10 +02:00
f776186480 Use PurePath instead of Path
Path should only be used when we need to access the file system. For all other
purposes (mainly crawling), we use PurePath instead since the paths don't
correspond to paths in the local file system.
2021-04-29 20:20:25 +02:00
d96a361325 Test and fix exclusive output 2021-04-29 15:27:16 +02:00
ac3bfd7388 Make progress bars easier to use
The crawler now supports two types of progress bars.
2021-04-29 13:53:16 +02:00
bbc792f9fb Implement Crawler and DummyCrawler 2021-04-29 13:44:29 +02:00
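
Note on 302b8c0c34: the behaviour described there can be reproduced with the standard library alone. The sketch below is not PFERD's code. The section names, option names, and the load_delay helper are hypothetical and exist only for illustration. It assumes the config is read through a configparser section proxy, whose getint and getfloat fall back to None when the option is missing.

```python
import configparser

# Hypothetical config text; section and option names are illustrative only.
CONFIG_TEXT = """
[example]
delay = 0.5

[other]
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)
section = parser["example"]

# The option exists, so the value is converted to a float.
present = section.getfloat("delay")

# The option is missing: the section proxy returns its fallback, which
# defaults to None, instead of raising an error.
missing = section.getint("max_connections")


def load_delay(sec: configparser.SectionProxy, default: float = 0.0) -> float:
    """Hypothetical helper: read 'delay', treating a missing option as default."""
    value = sec.getfloat("delay")
    if value is None:
        return default
    return value
```

Checking for None explicitly, as load_delay does, is one way to keep a missing option from propagating as None into code that expects a number.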