208 Commits

Author SHA1 Message Date
Joscha
0096d83387 Simplify Limiter implementation 2021-04-29 20:20:25 +02:00
Joscha
502654d853 Fix mypy errors 2021-04-29 15:47:52 +02:00
Joscha
d2103d7c44 Document crawler 2021-04-29 15:43:20 +02:00
Joscha
d96a361325 Test and fix exclusive output 2021-04-29 15:27:16 +02:00
Joscha
2e85d26b6b Use conductor via context manager 2021-04-29 14:23:28 +02:00
Joscha
6431a3fb3d Fix some mypy errors 2021-04-29 14:23:09 +02:00
Joscha
ac3bfd7388 Make progress bars easier to use
The crawler now supports two types of progress bars
2021-04-29 13:53:16 +02:00
Joscha
3ea86d18a0 Jerry-rig DummyCrawler to run 2021-04-29 13:45:04 +02:00
Joscha
bbc792f9fb Implement Crawler and DummyCrawler 2021-04-29 13:44:29 +02:00
Joscha
7e127cd5cc Clean up and fix conductor and limiter
Turns out you have to await an async lock, who knew...
2021-04-29 13:44:04 +02:00
Joscha
c4fb92c658 Make type hints compatible with Python 3.8 2021-04-29 13:11:58 +02:00
Joscha
a18db57e6f Implement terminal conductor 2021-04-29 11:44:47 +02:00
Joscha
b915e393dd Implement limiter 2021-04-29 10:24:28 +02:00
Joscha
3a74c23d09 Implement transformer 2021-04-29 09:51:50 +02:00
Joscha
fbebc46c58 Load and dump config 2021-04-29 09:51:50 +02:00
Joscha
5595a908d8 Configure entry point 2021-04-27 00:32:21 +02:00
I-Al-Istannen
29cd5d1a3c Reflect totality of sanitize_windows_path in return type 2021-04-19 11:10:02 +02:00
I-Al-Istannen
1f2af3a290 Retry on more I/O Errors 2021-04-13 11:43:22 +02:00
I-Al-Istannen
14cdfb6a69 Fix typo in date demangler doc 2021-04-13 11:19:51 +02:00
I-Al-Istannen
946b7a7931 Also crawl .c/.java/.zip from IPD page 2021-02-09 12:30:59 +01:00
I-Al-Istannen
fb78a6e98e Retry ILIAS downloads a few times and only fail that file 2021-01-06 13:08:10 +01:00
I-Al-Istannen
f0562049b6 Remove Python 3.9 method in crawler 2020-12-30 17:18:04 +01:00
I-Al-Istannen
c978e9edf4 Resolve a few pylint warnings 2020-12-30 14:45:46 +01:00
I-Al-Istannen
2714ac6be6 Send CSRF token to Shibboleth 2020-12-30 14:34:11 +01:00
I-Al-Istannen
9b048a9cfc Canonize meeting names to a properly formatted date 2020-12-30 14:32:59 +01:00
I-Al-Istannen
f47b137b59 Fix ILIAS __init__.py and Pferd.py authenticators 2020-12-06 13:15:32 +01:00
Scriptim
83ea15ee83 Use system keyring service for password auth 2020-12-06 13:15:30 +01:00
I-Al-Istannen
0f5e55648b Tell user when the conflict resolver kept existing files 2020-12-05 14:12:45 +01:00
I-Al-Istannen
4ce385b262 Treat file overwrite and marked file overwrite differently 2020-12-05 14:03:43 +01:00
I-Al-Istannen
fcb3884a8f Add --remote-first, --local-first and --no-delete flags 2020-12-05 13:49:05 +01:00
I-Al-Istannen
9f6dc56a7b Use a strategy to decide conflict resolution 2020-12-02 19:32:57 +01:00
Christophe
f3a4663491 Add passive/no_prompt flag 2020-12-02 18:24:07 +01:00
I-Al-Istannen
ba3c7f85fa Replace "\" in ILIAS paths as well
I am not sure whether anybody really uses a backslash in their names,
but I guess it can't hurt to do this for windows users.
2020-11-19 19:37:28 +01:00
I-Al-Istannen
8ebf0eab16 Sort download summary 2020-11-17 21:36:04 +01:00
I-Al-Istannen
cd90a60dee Move "sanitize_windows_path" to PFERD.transform 2020-11-12 20:52:46 +01:00
I-Al-Istannen
55e9e719ad Sanitize "/" in ilias path names 2020-11-12 20:21:24 +01:00
I-Al-Istannen
316b9d7bf4 Prevent too many retries when fetching an ILIAS page 2020-11-04 22:23:56 +01:00
I-Al-Istannen
f830b42a36 Fix duplicate files in download summary 2020-11-04 21:49:35 +01:00
I-Al-Istannen
ef343dec7c Merge organizer download summaries 2020-11-04 15:06:58 +01:00
I-Al-Istannen
0da2fafcd8 Fix links outside tables 2020-11-04 14:46:15 +01:00
I-Al-Istannen
f4abe3197c Add ipd crawler 2020-11-03 21:15:40 +01:00
I-Al-Istannen
38d4f5b4c9 Do not fail only empty courses 2020-11-03 20:09:54 +01:00
I-Al-Istannen
73c3eb0984 Add option to skip videos in sync_url 2020-10-06 17:20:47 +02:00
I-Al-Istannen
c1ccb6c53e Allow crawling videos with sync_url 2020-10-06 10:46:06 +02:00
I-Al-Istannen
51a713fa04 Allow crawling courses or folders with sync_url
Video folders do not work, if they are passed directly. Their containing
folder must be specified instead.
2020-09-28 20:00:01 +02:00
I-Al-Istannen
e32a49480b Expose methods to look up course/element names by id / url 2020-09-28 19:16:52 +02:00
I-Al-Istannen
3f0ae729d6 Expand "is course" check to not download magazines or other weird things 2020-09-28 16:43:58 +02:00
I-Al-Istannen
55678d7fee Pass string down to FileCookieJar
Some python versions just can't handle it *despite the documentation
stating they should*.
2020-08-12 09:09:14 +02:00
I-Al-Istannen
a57ee8b96b Add timeout to video downloads to work around requests IPv6 bug 2020-08-11 14:40:30 +02:00
Joscha
77a109bb7e Fix ilias shibboleth authenticator
The shibboleth site got a visual overhaul that slightly changed the classes of a
form we need.
2020-07-28 19:13:51 +00:00