217 Commits

SHA1 Message Date
f9b2fd60e2 Document local crawler and auth 2021-05-09 01:33:47 +02:00
60cd9873bc Add local file crawler 2021-05-06 01:02:40 +02:00
273d56c39a Properly load crawler config 2021-05-05 23:45:10 +02:00
5497dd2827 Add @noncritical and @repeat decorators 2021-05-05 23:36:54 +02:00
bbfdadc463 Implement output directory 2021-05-05 18:08:34 +02:00
07e831218e Add sync report 2021-05-02 00:56:10 +02:00
91c33596da Load crawlers from config file 2021-04-30 16:22:14 +02:00
e7a51decb0 Elaborate on transforms and implement changes 2021-04-29 20:24:18 +02:00
f776186480 Use PurePath instead of Path 2021-04-29 20:20:25 +02:00
    Path should only be used when we need to access the file system. For all other
    purposes (mainly crawling), we use PurePath instead since the paths don't
    correspond to paths in the local file system.
0096d83387 Simplify Limiter implementation 2021-04-29 20:20:25 +02:00
502654d853 Fix mypy errors 2021-04-29 15:47:52 +02:00
d2103d7c44 Document crawler 2021-04-29 15:43:20 +02:00
d96a361325 Test and fix exclusive output 2021-04-29 15:27:16 +02:00
2e85d26b6b Use conductor via context manager 2021-04-29 14:23:28 +02:00
6431a3fb3d Fix some mypy errors 2021-04-29 14:23:09 +02:00
ac3bfd7388 Make progress bars easier to use 2021-04-29 13:53:16 +02:00
    The crawler now supports two types of progress bars
3ea86d18a0 Jerry-rig DummyCrawler to run 2021-04-29 13:45:04 +02:00
bbc792f9fb Implement Crawler and DummyCrawler 2021-04-29 13:44:29 +02:00
7e127cd5cc Clean up and fix conductor and limiter 2021-04-29 13:44:04 +02:00
    Turns out you have to await an async lock, who knew...
c4fb92c658 Make type hints compatible with Python 3.8 2021-04-29 13:11:58 +02:00
a18db57e6f Implement terminal conductor 2021-04-29 11:44:47 +02:00
b915e393dd Implement limiter 2021-04-29 10:24:28 +02:00
3a74c23d09 Implement transformer 2021-04-29 09:51:50 +02:00
fbebc46c58 Load and dump config 2021-04-29 09:51:50 +02:00
5595a908d8 Configure entry point 2021-04-27 00:32:21 +02:00
29cd5d1a3c Reflect totality of sanitize_windows_path in return type 2021-04-19 11:10:02 +02:00
1f2af3a290 Retry on more I/O Errors 2021-04-13 11:43:22 +02:00
14cdfb6a69 Fix typo in date demangler doc 2021-04-13 11:19:51 +02:00
946b7a7931 Also crawl .c/.java/.zip from IPD page 2021-02-09 12:30:59 +01:00
fb78a6e98e Retry ILIAS downloads a few times and only fail that file 2021-01-06 13:08:10 +01:00
f0562049b6 Remove Python 3.9 method in crawler 2020-12-30 17:18:04 +01:00
c978e9edf4 Resolve a few pylint warnings 2020-12-30 14:45:46 +01:00
2714ac6be6 Send CSRF token to Shibboleth 2020-12-30 14:34:11 +01:00
9b048a9cfc Canonize meeting names to a properly formatted date 2020-12-30 14:32:59 +01:00
f47b137b59 Fix ILIAS init.py and Pferd.py authenticators 2020-12-06 13:15:32 +01:00
83ea15ee83 Use system keyring service for password auth 2020-12-06 13:15:30 +01:00
0f5e55648b Tell user when the conflict resolver kept existing files 2020-12-05 14:12:45 +01:00
4ce385b262 Treat file overwrite and marked file overwrite differently 2020-12-05 14:03:43 +01:00
fcb3884a8f Add --remote-first, --local-first and --no-delete flags 2020-12-05 13:49:05 +01:00
9f6dc56a7b Use a strategy to decide conflict resolution 2020-12-02 19:32:57 +01:00
f3a4663491 Add passive/no_prompt flag 2020-12-02 18:24:07 +01:00
ba3c7f85fa Replace "\" in ILIAS paths as well 2020-11-19 19:37:28 +01:00
    I am not sure whether anybody really uses a backslash in their names,
    but I guess it can't hurt to do this for windows users.
8ebf0eab16 Sort download summary 2020-11-17 21:36:04 +01:00
cd90a60dee Move "sanitize_windows_path" to PFERD.transform 2020-11-12 20:52:46 +01:00
55e9e719ad Sanitize "/" in ilias path names 2020-11-12 20:21:24 +01:00
316b9d7bf4 Prevent too many retries when fetching an ILIAS page 2020-11-04 22:23:56 +01:00
f830b42a36 Fix duplicate files in download summary 2020-11-04 21:49:35 +01:00
ef343dec7c Merge organizer download summaries 2020-11-04 15:06:58 +01:00
0da2fafcd8 Fix links outside tables 2020-11-04 14:46:15 +01:00
f4abe3197c Add ipd crawler 2020-11-03 21:15:40 +01:00
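
The rule stated in commit f776186480 (PurePath for crawled paths, Path only when the file system is actually touched) can be illustrated with a minimal sketch. The helper `local_target` and the example path are hypothetical and are not taken from the repository; only the standard-library `pathlib` behaviour is assumed.

```python
from pathlib import Path, PurePath
import tempfile


def local_target(output_dir: Path, crawled: PurePath) -> Path:
    """Map a crawled (purely lexical) path onto a concrete local path.

    Hypothetical helper for illustration, not part of the project.
    """
    return output_dir / crawled


# A crawled path is only a sequence of names. PurePath supports lexical
# operations (joining, name, suffix, parent) but offers no file-system
# methods such as exists() or open().
crawled = PurePath("Course", "Week 1", "slides.pdf")
print(crawled.name)
print(crawled.suffix)
print(crawled.parent)

# Only at the point where something is written to disk does the path
# become a concrete Path rooted in an output directory.
with tempfile.TemporaryDirectory() as tmp:
    target = local_target(Path(tmp), crawled)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("placeholder")
    print(target.exists())
```

Keeping the two types apart lets a type checker such as mypy flag code that tries to touch the file system with a path that exists only on the remote side.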