60cd9873bc
Add local file crawler
2021-05-06 01:02:40 +02:00
273d56c39a
Properly load crawler config
2021-05-05 23:45:10 +02:00
5497dd2827
Add @noncritical and @repeat decorators
2021-05-05 23:36:54 +02:00
bbfdadc463
Implement output directory
2021-05-05 18:08:34 +02:00
07e831218e
Add sync report
2021-05-02 00:56:10 +02:00
91c33596da
Load crawlers from config file
2021-04-30 16:22:14 +02:00
e7a51decb0
Elaborate on transforms and implement changes
2021-04-29 20:24:18 +02:00
f776186480
Use PurePath instead of Path
...
Path should only be used when we need to access the file system. For all other
purposes (mainly crawling), we use PurePath instead since the paths don't
correspond to paths in the local file system.
2021-04-29 20:20:25 +02:00
0096d83387
Simplify Limiter implementation
2021-04-29 20:20:25 +02:00
502654d853
Fix mypy errors
2021-04-29 15:47:52 +02:00
d2103d7c44
Document crawler
2021-04-29 15:43:20 +02:00
d96a361325
Test and fix exclusive output
2021-04-29 15:27:16 +02:00
2e85d26b6b
Use conductor via context manager
2021-04-29 14:23:28 +02:00
6431a3fb3d
Fix some mypy errors
2021-04-29 14:23:09 +02:00
ac3bfd7388
Make progress bars easier to use
...
The crawler now supports two types of progress bars
2021-04-29 13:53:16 +02:00
3ea86d18a0
Jerry-rig DummyCrawler to run
2021-04-29 13:45:04 +02:00
bbc792f9fb
Implement Crawler and DummyCrawler
2021-04-29 13:44:29 +02:00
7e127cd5cc
Clean up and fix conductor and limiter
...
Turns out you have to await an async lock, who knew...
2021-04-29 13:44:04 +02:00
c4fb92c658
Make type hints compatible with Python 3.8
2021-04-29 13:11:58 +02:00
a18db57e6f
Implement terminal conductor
2021-04-29 11:44:47 +02:00
b915e393dd
Implement limiter
2021-04-29 10:24:28 +02:00
3a74c23d09
Implement transformer
2021-04-29 09:51:50 +02:00
fbebc46c58
Load and dump config
2021-04-29 09:51:50 +02:00
5595a908d8
Configure entry point
2021-04-27 00:32:21 +02:00
29cd5d1a3c
Reflect totality of sanitize_windows_path in return type
2021-04-19 11:10:02 +02:00
1f2af3a290
Retry on more I/O Errors
2021-04-13 11:43:22 +02:00
14cdfb6a69
Fix typo in date demangler doc
2021-04-13 11:19:51 +02:00
946b7a7931
Also crawl .c/.java/.zip from IPD page
2021-02-09 12:30:59 +01:00
fb78a6e98e
Retry ILIAS downloads a few times and only fail that file
2021-01-06 13:08:10 +01:00
f0562049b6
Remove Python 3.9 method in crawler
2020-12-30 17:18:04 +01:00
c978e9edf4
Resolve a few pylint warnings
2020-12-30 14:45:46 +01:00
2714ac6be6
Send CSRF token to Shibboleth
2020-12-30 14:34:11 +01:00
9b048a9cfc
Canonize meeting names to a properly formatted date
2020-12-30 14:32:59 +01:00
f47b137b59
Fix ILIAS __init__.py and Pferd.py authenticators
2020-12-06 13:15:32 +01:00
83ea15ee83
Use system keyring service for password auth
2020-12-06 13:15:30 +01:00
0f5e55648b
Tell user when the conflict resolver kept existing files
2020-12-05 14:12:45 +01:00
4ce385b262
Treat file overwrite and marked file overwrite differently
2020-12-05 14:03:43 +01:00
fcb3884a8f
Add --remote-first, --local-first and --no-delete flags
2020-12-05 13:49:05 +01:00
9f6dc56a7b
Use a strategy to decide conflict resolution
2020-12-02 19:32:57 +01:00
f3a4663491
Add passive/no_prompt flag
2020-12-02 18:24:07 +01:00
ba3c7f85fa
Replace "\" in ILIAS paths as well
...
I am not sure whether anybody really uses a backslash in their names,
but I guess it can't hurt to do this for windows users.
2020-11-19 19:37:28 +01:00
8ebf0eab16
Sort download summary
2020-11-17 21:36:04 +01:00
cd90a60dee
Move "sanitize_windows_path" to PFERD.transform
2020-11-12 20:52:46 +01:00
55e9e719ad
Sanitize "/" in ilias path names
2020-11-12 20:21:24 +01:00
316b9d7bf4
Prevent too many retries when fetching an ILIAS page
2020-11-04 22:23:56 +01:00
f830b42a36
Fix duplicate files in download summary
2020-11-04 21:49:35 +01:00
ef343dec7c
Merge organizer download summaries
2020-11-04 15:06:58 +01:00
0da2fafcd8
Fix links outside tables
2020-11-04 14:46:15 +01:00
f4abe3197c
Add ipd crawler
2020-11-03 21:15:40 +01:00
38d4f5b4c9
Do not fail on empty courses
2020-11-03 20:09:54 +01:00