Commit Graph

69 Commits

SHA1 Message Date
34564cedb4 Add support for link collections 2025-05-27 16:25:59 +02:00
2b0d20a1f6 Fix crawling of exercises with instructions
We do not want a second path and the instruction field has an identical
link...
2025-05-26 14:42:38 +02:00
4c230ef6dd Fix exercise crawling 2025-04-25 13:45:57 +02:00
bdf17f5c87 Ignore wikis 2025-04-23 16:03:37 +02:00
477234ad0d Support ILIAS 9 2025-04-15 21:35:20 +02:00
63f25277b0 Fix crawling of empty forum threads 2025-03-09 23:44:25 +01:00
be175f9347 Download only new/updated forum threads 2025-02-19 16:16:37 +01:00
ba2833dba5 Crawl all threads in a forum
Before this patch the row count was unconditionally changed to 800. This
patch tries to detect how many rows the forum has and then fetches this
amount, if it is larger than 800.
2025-02-19 12:19:33 +01:00
5f88539f7e Fix page size increase for forum threads 2025-02-19 12:19:11 +01:00
bd9d7efe64 "Fix" mypy errors
Thank you mypy, very cool. These types make things *so much better*.
They don't just complicate everything and don't really help because they
can not detect that an element queried by a tag is no navigable
string...
2025-02-19 12:15:41 +01:00
712217e959 Handle groups in cards 2024-11-11 12:53:08 +01:00
fa71a9f44f Add support for mob videos in page descriptions 2024-10-28 20:35:30 +01:00
8fbd1978af Fix crawling of nested courses 2024-10-28 18:52:27 +01:00
c54c3bcfa1 Fix crawling of favorites 2024-10-27 10:50:59 +01:00
52fdeae752 Crawl custom item groups as folders 2024-10-21 23:43:48 +02:00
f9bb2e41cf Sanitize slashes in exercise container names 2024-10-21 22:30:16 +02:00
c897d9e2f5 Support finding entries for course overview page
Related to issue #93
2024-06-26 16:54:07 +02:00
b01f093474 fix: Element detection for other universities
Other universities might use other URL schemes
for different element types
2024-05-04 17:52:06 +02:00
3e831c7e23 Fix normalization of meeting names in cards 2024-04-24 22:32:26 +02:00
3db186a978 Fix personal desktop crawling HTML warnings 2024-04-10 11:15:25 +02:00
4a5959fd58 Fix personal desktop crawling without favorites 2024-04-10 11:15:25 +02:00
eb0c956d32 Add compatibility with ILIAS 8 2024-04-05 19:08:05 +02:00
266812f90e Move is_logged_in helper to kit_ilias_html 2023-11-16 11:19:20 +01:00
50b50513c6 Ignore SCORM learning modules 2023-08-29 13:51:19 +02:00
df3514cd03 Crawl paginated past meetings 2023-08-29 12:41:21 +02:00
87b67e9271 Crawl files in the info tab 2023-08-29 12:41:15 +02:00
b54b3b979c Remove size suffix for content pages 2023-08-27 11:43:05 +02:00
2184ac8040 Add support for ILIAS mediacast listings 2023-08-27 11:43:05 +02:00
68c398f1fe Add support for ILIAS learning modules 2023-08-02 13:34:54 +02:00
d204dac8ce Detect unexpected root page redirects and abort operation 2023-07-29 18:36:33 +02:00
6f30c6583d Fix crawling of cards without descriptions 2023-03-21 23:52:33 +01:00
467fc526e8 Fix crawling of file/video cards 2023-03-21 23:52:24 +01:00
722d2eb393 Fix crawling of courses with preselected timeline tab 2023-03-21 23:36:47 +01:00
55a2de6b88 Fix crawling English opencast 2022-11-29 18:13:56 +01:00
c020cccc64 Include found paths in "second path found" warning 2022-10-29 14:08:29 +02:00
1b6be6bd79 Handle content pages in cards 2022-10-24 18:37:26 +02:00
e1430e6298 Handle (and ignore) surveys 2022-10-24 18:37:26 +02:00
5fdd40204b Unwrap future meetings when ILIAS hides them behind a pagination 2022-10-24 14:33:58 +02:00
d72fc2760b Handle empty forums 2022-10-24 13:12:17 +02:00
4a51aaa4f5 Fix forum crawling crashing for empty threads 2022-10-19 22:59:33 +02:00
46fb782798 Add forum crawling
This downloads all forum posts when needed and saves each thread in its
own html file, named after the thread title.
2022-05-24 23:43:53 +02:00
846c29aee1 Download page descriptions 2022-05-11 21:16:56 +02:00
a5015fe9b1 Correctly parse day-only meeting dates
I failed to recognize the correct format in the previous adjustment, so
this (hopefully) fixes it for good.
Meetings apparently don't always have a time portion.
2022-05-08 23:22:26 +02:00
bcc537468c Fix crawling of expanded meetings
The last meeting on every page is expanded by default.
Its content is then shown inline *and* in the meeting page itself.
We should skip the inline content.
2022-05-05 22:53:37 +02:00
694ffb4d77 Fix meeting date parsing
Apparently the new pattern "<relative time qualifier>: <date>," was
added. This patch adds support for it.
2022-05-05 22:28:30 +02:00
7872fe5221 Fix tables with more columns than expected 2022-01-18 22:38:48 +01:00
4ee919625d Add rudimentary support for content pages 2022-01-08 20:47:35 +01:00
43c5453e10 Correctly crawl files on desktop
The files on the desktop do not include a download link, so we need to
rewrite it.
2022-01-08 20:00:53 +01:00
5f527bc697 Remove Python 3.9 Pattern typehints 2022-01-08 17:14:40 +01:00
ced8b9a2d0 Fix some accordions 2022-01-08 16:58:30 +01:00
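
Note on ba2833dba5 (Crawl all threads in a forum): the message describes requesting the detected number of forum rows only when it exceeds 800, where previously 800 was always used. A minimal sketch of that rule follows. The names `DEFAULT_ROW_COUNT` and `rows_to_request` are hypothetical and are not taken from the repository, and falling back to 800 when detection fails is an assumption, suggested only by the message saying the patch "tries to detect" the count.

```python
from typing import Optional

# 800 is the row count named in the commit message; the constant name is hypothetical.
DEFAULT_ROW_COUNT = 800


def rows_to_request(detected_rows: Optional[int], default: int = DEFAULT_ROW_COUNT) -> int:
    """Return how many rows to request for a forum thread listing.

    detected_rows is the row count read from the forum page, or None if it
    could not be determined (assumed fallback: use the default).
    """
    if detected_rows is None:
        return default
    # Only exceed the default when the forum really has more rows.
    return max(detected_rows, default)
```

With this rule, a forum with fewer than 800 rows is still requested with a page size of 800, while a larger forum is requested with its detected row count.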