42 Commits

Author SHA1 Message Date
I-Al-Istannen
1f2af3a290 Retry on more I/O Errors 2021-04-13 11:43:22 +02:00
I-Al-Istannen
f0562049b6 Remove Python 3.9 method in crawler 2020-12-30 17:18:04 +01:00
I-Al-Istannen
c978e9edf4 Resolve a few pylint warnings 2020-12-30 14:45:46 +01:00
I-Al-Istannen
9b048a9cfc Canonize meeting names to a properly formatted date 2020-12-30 14:32:59 +01:00
I-Al-Istannen
ba3c7f85fa Replace "\" in ILIAS paths as well
I am not sure whether anybody really uses a backslash in their names,
but I guess it can't hurt to do this for windows users.
2020-11-19 19:37:28 +01:00
I-Al-Istannen
55e9e719ad Sanitize "/" in ilias path names 2020-11-12 20:21:24 +01:00
I-Al-Istannen
316b9d7bf4 Prevent too many retries when fetching an ILIAS page 2020-11-04 22:23:56 +01:00
I-Al-Istannen
73c3eb0984 Add option to skip videos in sync_url 2020-10-06 17:20:47 +02:00
I-Al-Istannen
c1ccb6c53e Allow crawling videos with sync_url 2020-10-06 10:46:06 +02:00
I-Al-Istannen
51a713fa04 Allow crawling courses or folders with sync_url
Video folders do not work, if they are passed directly. Their containing
folder must be specified instead.
2020-09-28 20:00:01 +02:00
I-Al-Istannen
e32a49480b Expose methods to look up course/element names by id / url 2020-09-28 19:16:52 +02:00
I-Al-Istannen
3f0ae729d6 Expand "is course" check to not download magazines or other weird things 2020-09-28 16:43:58 +02:00
I-Al-Istannen
2aed4f6d1f Only query the dir_filter for directories 2020-07-13 13:36:12 +02:00
I-Al-Istannen
23bfa42a0d Never use the direct download button, as it is currently broken 2020-06-11 13:31:01 +02:00
I-Al-Istannen
8198c9ecaa Reorder methods a bit 2020-05-30 19:06:36 +02:00
I-Al-Istannen
086b15d10f Crawl a bit more iteratively 2020-05-30 15:47:15 +02:00
I-Al-Istannen
9d6ce331a5 Use IliasCrawlerEntry entries in the ilias scraper 2020-05-30 15:20:51 +02:00
I-Al-Istannen
821c7ade26 Move video url extraction logic to crawler 2020-05-30 00:22:31 +02:00
I-Al-Istannen
62535b4452 Unpack videos in ILIAS downloader 2020-05-21 22:12:52 +02:00
I-Al-Istannen
c0056e5669 Correctly crawl video pages with multiple pages 2020-05-21 21:38:07 +02:00
I-Al-Istannen
4f56c8f192 Pass element type to ilias directory filter 2020-05-12 14:41:13 +02:00
I-Al-Istannen
a0f9d31d94 Use PrettyLogger warning everywhere 2020-05-10 21:56:12 +02:00
I-Al-Istannen
9850ab1d73 Allow crawling the ILIAS Personal Desktop 2020-05-10 12:16:42 +02:00
I-Al-Istannen
f6faacabb0 Move FatalException to errors.py 2020-05-09 00:11:21 +02:00
I-Al-Istannen
19c1e3ac6f Fail on invalid ILIAS course ids 2020-05-09 00:11:20 +02:00
I-Al-Istannen
a4c518bf4c Update date find regex 2020-05-08 22:17:58 +02:00
I-Al-Istannen
42345ecc61 Demangle "Morgen" too 2020-04-30 12:05:25 +02:00
I-Al-Istannen
920d521d68 Change PrettyLogger.warn to PrettyLogger.warning 2020-04-25 20:11:51 +02:00
I-Al-Istannen
e0b46a306a Use warn method in IliasCrawler 2020-04-25 20:07:40 +02:00
I-Al-Istannen
8a42a2a396 Move logging into its own file 2020-04-25 20:02:01 +02:00
Joscha
292e516297 Change crawler and downloader output 2020-04-24 18:24:44 +00:00
I-Al-Istannen
13bc78c889 Display reason for ignoring an element in ilias crawler 2020-04-23 13:54:58 +02:00
I-Al-Istannen
dc964a9d98 Remove finished TODOs 2020-04-23 13:30:34 +02:00
I-Al-Istannen
c2b14f3db9 ilias crawler: Use direct download link if possible 2020-04-23 13:08:12 +02:00
I-Al-Istannen
bef210ae77 Rename and implement IliasDirectoryFilter 2020-04-23 12:35:18 +02:00
Joscha
2de4255a78 Add Pferd class 2020-04-23 09:50:32 +00:00
I-Al-Istannen
a051e3bcca ilias crawler: Add some unhelpful documentation 2020-04-22 17:58:19 +02:00
I-Al-Istannen
eb7df036df WIP: ilias crawler: Also crawl assignments 2020-04-22 14:32:20 +02:00
I-Al-Istannen
23db59e733 WIP: ilias-crawler: Demangle dates 2020-04-22 12:58:44 +02:00
I-Al-Istannen
ac65b06a8e Satisfy pylint a bit 2020-04-22 01:37:34 +02:00
I-Al-Istannen
8891041069 WIP: crawler: Add opencast video crawler 2020-04-21 23:01:19 +02:00
I-Al-Istannen
70d63e3e90 WIP: Start small ILIAS crawler 2020-04-21 13:32:03 +02:00