67 Commits

Author SHA1 Message Date
I-Al-Istannen
4c230ef6dd Fix exercise crawling 2025-04-25 13:45:57 +02:00
I-Al-Istannen
bdf17f5c87 Ignore wikis 2025-04-23 16:03:37 +02:00
I-Al-Istannen
477234ad0d Support ILIAS 9 2025-04-15 21:35:20 +02:00
I-Al-Istannen
63f25277b0 Fix crawling of empty forum threads 2025-03-09 23:44:25 +01:00
I-Al-Istannen
be175f9347 Download only new/updated forum threads 2025-02-19 16:16:37 +01:00
I-Al-Istannen
ba2833dba5 Crawl all threads in a forum
Before this patch the row count was unconditionally changed to 800. This
patch tries to detect how many rows the forum has and then fetches this
amount, if it is larger than 800.
2025-02-19 12:19:33 +01:00
I-Al-Istannen
5f88539f7e Fix page size increase for forum threads 2025-02-19 12:19:11 +01:00
I-Al-Istannen
bd9d7efe64 "Fix" mypy errors
Thank you mypy, very cool. These types make things *so much better*.
They don't just complicate everything and don't really help because they
can not detect that an element queried by a tag is no navigable
string...
2025-02-19 12:15:41 +01:00
I-Al-Istannen
712217e959 Handle groups in cards 2024-11-11 12:53:08 +01:00
I-Al-Istannen
fa71a9f44f Add support for mob videos in page descriptions 2024-10-28 20:35:30 +01:00
I-Al-Istannen
8fbd1978af Fix crawling of nested courses 2024-10-28 18:52:27 +01:00
I-Al-Istannen
c54c3bcfa1 Fix crawling of favorites 2024-10-27 10:50:59 +01:00
I-Al-Istannen
52fdeae752 Crawl custom item groups as folders 2024-10-21 23:43:48 +02:00
I-Al-Istannen
f9bb2e41cf Sanitize slashes in exercise container names 2024-10-21 22:30:16 +02:00
I-Al-Istannen
c897d9e2f5 Support finding entries for course overview page
Related to issue #93
2024-06-26 16:54:07 +02:00
Philipp Fruck
b01f093474 fix: Element detection for other universities
Other universities might use other URL schemes
for different element types
2024-05-04 17:52:06 +02:00
I-Al-Istannen
3e831c7e23 Fix normalization of meeting names in cards 2024-04-24 22:32:26 +02:00
I-Al-Istannen
3db186a978 Fix personal desktop crawling HTML warnings 2024-04-10 11:15:25 +02:00
I-Al-Istannen
4a5959fd58 Fix personal desktop crawling without favorites 2024-04-10 11:15:25 +02:00
I-Al-Istannen
eb0c956d32 Add compatibility with ILIAS 8 2024-04-05 19:08:05 +02:00
I-Al-Istannen
266812f90e Move is_logged_in helper to kit_ilias_html 2023-11-16 11:19:20 +01:00
I-Al-Istannen
50b50513c6 Ignore SCORM learning modules 2023-08-29 13:51:19 +02:00
I-Al-Istannen
df3514cd03 Crawl paginated past meetings 2023-08-29 12:41:21 +02:00
I-Al-Istannen
87b67e9271 Crawl files in the info tab 2023-08-29 12:41:15 +02:00
I-Al-Istannen
b54b3b979c Remove size suffix for content pages 2023-08-27 11:43:05 +02:00
I-Al-Istannen
2184ac8040 Add support for ILIAS mediacast listings 2023-08-27 11:43:05 +02:00
I-Al-Istannen
68c398f1fe Add support for ILIAS learning modules 2023-08-02 13:34:54 +02:00
I-Al-Istannen
d204dac8ce Detect unexpected root page redirects and abort operation 2023-07-29 18:36:33 +02:00
I-Al-Istannen
6f30c6583d Fix crawling of cards without descriptions 2023-03-21 23:52:33 +01:00
I-Al-Istannen
467fc526e8 Fix crawling of file/video cards 2023-03-21 23:52:24 +01:00
I-Al-Istannen
722d2eb393 Fix crawling of courses with preselected timeline tab 2023-03-21 23:36:47 +01:00
c0derMo
55a2de6b88 Fix crawling English opencast 2022-11-29 18:13:56 +01:00
I-Al-Istannen
c020cccc64 Include found paths in "second path found" warning 2022-10-29 14:08:29 +02:00
I-Al-Istannen
1b6be6bd79 Handle content pages in cards 2022-10-24 18:37:26 +02:00
I-Al-Istannen
e1430e6298 Handle (and ignore) surveys 2022-10-24 18:37:26 +02:00
I-Al-Istannen
5fdd40204b Unwrap future meetings when ILIAS hides them behind a pagination 2022-10-24 14:33:58 +02:00
I-Al-Istannen
d72fc2760b Handle empty forums 2022-10-24 13:12:17 +02:00
I-Al-Istannen
4a51aaa4f5 Fix forum crawling crashing for empty threads 2022-10-19 22:59:33 +02:00
I-Al-Istannen
46fb782798 Add forum crawling
This downloads all forum posts when needed and saves each thread in its
own html file, named after the thread title.
2022-05-24 23:43:53 +02:00
I-Al-Istannen
846c29aee1 Download page descriptions 2022-05-11 21:16:56 +02:00
I-Al-Istannen
a5015fe9b1 Correctly parse day-only meeting dates
I failed to recognize the correct format in the previous adjustment, so
this (hopefully) fixes it for good.
Meetings apparently don't always have a time portion.
2022-05-08 23:22:26 +02:00
I-Al-Istannen
bcc537468c Fix crawling of expanded meetings
The last meeting on every page is expanded by default.
Its content is then shown inline *and* in the meeting page itself.
We should skip the inline content.
2022-05-05 22:53:37 +02:00
I-Al-Istannen
694ffb4d77 Fix meeting date parsing
Apparently the new pattern "<relative time qualifier>: <date>," was
added. This patch adds support for it.
2022-05-05 22:28:30 +02:00
I-Al-Istannen
7872fe5221 Fix tables with more columns than expected 2022-01-18 22:38:48 +01:00
I-Al-Istannen
4ee919625d Add rudimentary support for content pages 2022-01-08 20:47:35 +01:00
I-Al-Istannen
43c5453e10 Correctly crawl files on desktop
The files on the desktop do not include a download link, so we need to
rewrite it.
2022-01-08 20:00:53 +01:00
I-Al-Istannen
5f527bc697 Remove Python 3.9 Pattern typehints 2022-01-08 17:14:40 +01:00
I-Al-Istannen
ced8b9a2d0 Fix some accordions 2022-01-08 16:58:30 +01:00
I-Al-Istannen
6f3cfd4396 Fix personal desktop crawling 2022-01-08 16:58:15 +01:00
I-Al-Istannen
a99356f2a2 Fix video stream extraction 2022-01-08 00:27:34 +01:00
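
Note on commit ba2833dba5 ("Crawl all threads in a forum"): the message describes detecting how many rows a forum has and requesting that amount only if it exceeds 800. The sketch below illustrates that rule and is not the code from the commit. The function name `rows_to_request`, the constant `DEFAULT_ROWS`, and the use of `None` for a failed detection are assumptions made for this example.

```python
from typing import Optional

# Page size named in the commit message; the constant name is hypothetical.
DEFAULT_ROWS = 800


def rows_to_request(detected_rows: Optional[int], default: int = DEFAULT_ROWS) -> int:
    """Return how many forum rows to request on one page.

    detected_rows is the thread count read from the forum page, or None
    if it could not be detected (an assumption of this sketch).
    """
    if detected_rows is None:
        # Detection failed: fall back to the fixed page size.
        return default
    # Only go above the default when the forum really has more rows.
    return max(detected_rows, default)


if __name__ == "__main__":
    for count in (None, 120, 1500):
        print(count, "->", rows_to_request(count))
```

Falling back to the default when detection fails keeps the behaviour from before the patch, which always requested 800 rows.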