Commit Graph

28 Commits

Author SHA1 Message Date
I-Al-Istannen
846c29aee1 Download page descriptions 2022-05-11 21:16:56 +02:00
I-Al-Istannen
a5015fe9b1 Correctly parse day-only meeting dates
I failed to recognize the correct format in the previous adjustment, so
this (hopefully) fixes it for good.
Meetings apparently don't always have a time portion.
2022-05-08 23:22:26 +02:00
I-Al-Istannen
bcc537468c Fix crawling of expanded meetings
The last meeting on every page is expanded by default.
Its content is then shown inline *and* in the meeting page itself.
We should skip the inline content.
2022-05-05 22:53:37 +02:00
I-Al-Istannen
694ffb4d77 Fix meeting date parsing
Apparently the new pattern "<relative time qualifier>: <date>," was
added. This patch adds support for it.
2022-05-05 22:28:30 +02:00
I-Al-Istannen
7872fe5221 Fix tables with more columns than expected 2022-01-18 22:38:48 +01:00
I-Al-Istannen
4ee919625d Add rudimentary support for content pages 2022-01-08 20:47:35 +01:00
I-Al-Istannen
43c5453e10 Correctly crawl files on desktop
The files on the desktop do not include a download link, so we need to
rewrite it.
2022-01-08 20:00:53 +01:00
I-Al-Istannen
5f527bc697 Remove Python 3.9 Pattern typehints 2022-01-08 17:14:40 +01:00
I-Al-Istannen
ced8b9a2d0 Fix some accordions 2022-01-08 16:58:30 +01:00
I-Al-Istannen
6f3cfd4396 Fix personal desktop crawling 2022-01-08 16:58:15 +01:00
I-Al-Istannen
a99356f2a2 Fix video stream extraction 2022-01-08 00:27:34 +01:00
I-Al-Istannen
e42ab83d32 Add support for ILIAS cards 2021-10-30 18:13:44 +02:00
I-Al-Istannen
f9a3f9b9f2 Handle multi-stream videos 2021-10-30 18:12:29 +02:00
I-Al-Istannen
ee67f9f472 Sort elements by ILIAS id to ensure deterministic ordering 2021-07-06 17:45:48 +02:00
I-Al-Istannen
8ec3f41251 Crawl ilias booking objects as links 2021-07-06 16:15:25 +02:00
I-Al-Istannen
6e4d423c81 Crawl all video stages in one crawl bar
This ensures folders are not renamed, as they are crawled twice
2021-06-13 17:18:45 +02:00
I-Al-Istannen
70ec64a48b Fix wrong base URL for multi-stage pages 2021-06-13 15:44:47 +02:00
I-Al-Istannen
8ab462fb87 Use the exercise label instead of the button name as path 2021-06-04 19:24:23 +02:00
I-Al-Istannen
1fba96abcb Fix exercise date parsing for non-group submissions
ILIAS apparently changes the order of the fields as it sees fit, so we
now try to parse *every* column, starting at from the right, as a date.
The first column that parses successfully is then used.
2021-05-31 18:15:12 +02:00
I-Al-Istannen
1ca6740e05 Improve log messages when parsing ILIAS HTML
Previously some logs were split around an "await", which isn't a great
idea.
2021-05-27 17:59:22 +02:00
I-Al-Istannen
5beb4d9a2d Fix renaming conflict with multi-stage video elements 2021-05-27 15:41:00 +02:00
Joscha
aabce764ac Clean up TODOs 2021-05-25 15:54:01 +02:00
I-Al-Istannen
651b087932 Use cl/dl deduplication mechanism for ILIAS crawler 2021-05-25 12:15:38 +02:00
I-Al-Istannen
85f89a7ff3 Interpret accordions and expandable headers as virtual folders
This allows us to find a file named "Test" in an accordion "Acc" as "Acc/Test".
2021-05-24 18:54:26 +02:00
I-Al-Istannen
492ec6a932 Detect and skip ILIAS tests 2021-05-24 16:36:15 +02:00
I-Al-Istannen
342076ee0e Handle exercise detail containers in ILIAS html parser 2021-05-24 16:22:51 +02:00
I-Al-Istannen
fca62541ca De-duplicate element names in ILIAS crawler
This prevents any conflicts caused by multiple files with the same name.
Conflicts may still arise due to transforms, but that is out of our
control and a user error.
2021-05-24 00:24:31 +02:00
Joscha
2fdf24495b Restructure crawling and auth related modules 2021-05-23 19:16:42 +02:00