I-Al-Istannen
46fb782798
Add forum crawling
This downloads all forum posts when needed and saves each thread in its
own html file, named after the thread title.
2022-05-24 23:43:53 +02:00
I-Al-Istannen
846c29aee1
Download page descriptions
2022-05-11 21:16:56 +02:00
I-Al-Istannen
a5015fe9b1
Correctly parse day-only meeting dates
I failed to recognize the correct format in the previous adjustment, so
this (hopefully) fixes it for good.
Meetings apparently don't always have a time portion.
2022-05-08 23:22:26 +02:00
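The day-only fix above can be pictured as a fallback over several formats. This is a minimal sketch, not the project's actual parser: the format strings in `ASSUMED_FORMATS` and the function name `parse_meeting_date` are assumptions made for illustration, since the log does not state the real ILIAS date format.

```python
from datetime import datetime
from typing import Optional

# Assumed formats (hypothetical): one with a time portion, one without.
ASSUMED_FORMATS = ("%d.%m.%Y, %H:%M", "%d.%m.%Y")


def parse_meeting_date(text: str) -> Optional[datetime]:
    """Try each assumed format in order and return the first successful parse."""
    for fmt in ASSUMED_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            # This format did not match, so fall through to the next one.
            continue
    return None
```

A day-only string then parses to midnight of that day instead of failing.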
I-Al-Istannen
bcc537468c
Fix crawling of expanded meetings
The last meeting on every page is expanded by default.
Its content is then shown inline *and* in the meeting page itself.
We should skip the inline content.
2022-05-05 22:53:37 +02:00
I-Al-Istannen
694ffb4d77
Fix meeting date parsing
Apparently the new pattern "<relative time qualifier>: <date>," was
added. This patch adds support for it.
2022-05-05 22:28:30 +02:00
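One way to handle the "<relative time qualifier>: <date>," pattern is to strip the qualifier and keep only the date portion before the comma. The sketch below is an assumption-laden illustration: the regular expression, the helper name `strip_qualifier`, and the sample strings are hypothetical and not taken from the patch itself.

```python
import re
from typing import Optional

# Hypothetical pattern: a qualifier without digits, a colon, then the date
# portion up to (but not including) the first comma.
_QUALIFIED_DATE = re.compile(r"^\s*(?P<qualifier>[^\d:,]+):\s*(?P<date>[^,]+),")


def strip_qualifier(text: str) -> Optional[str]:
    """Return the date portion of '<qualifier>: <date>,' or None if absent."""
    match = _QUALIFIED_DATE.match(text)
    if match is None:
        return None
    return match.group("date").strip()
```

Strings that start directly with a date do not match, so a caller can fall back to the previous parsing path for them.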
Joscha
602044ff1b
Fix mypy errors and add missing await
2022-04-27 22:52:50 +02:00
I-Al-Istannen
a2831fbea2
Fix shib authentication
Authentication failed previously if the shib session was still valid.
If Shibboleth gets a request and the session is still valid, it directly
responds without a second redirect.
2022-04-27 13:55:24 +02:00
I-Al-Istannen
86e2e226dc
Notify user when Shibboleth presents new entitlements
2022-04-03 11:37:08 +02:00
I-Al-Istannen
7872fe5221
Fix tables with more columns than expected
2022-01-18 22:38:48 +01:00
Joscha
4f022e2d19
Reword changelog
2022-01-15 15:06:02 +01:00
I-Al-Istannen
f47e7374d2
Use fixed Windows path for video cache
2022-01-15 12:00:30 +01:00
I-Al-Istannen
57ec51e95a
Fix login after shib url parser change
2022-01-14 20:17:27 +01:00
I-Al-Istannen
4ee919625d
Add rudimentary support for content pages
2022-01-08 20:47:35 +01:00
I-Al-Istannen
d30f25ee97
Detect shib login page as login page
And do not assume we are logged in...
2022-01-08 20:28:45 +01:00
I-Al-Istannen
10d9d74528
Bail out when crawling recursive courses
2022-01-08 20:28:30 +01:00
I-Al-Istannen
43c5453e10
Correctly crawl files on desktop
The files on the desktop do not include a download link, so we need to
rewrite it.
2022-01-08 20:00:53 +01:00
I-Al-Istannen
e32c1f000f
Fix mtime for single streams
2022-01-08 18:05:48 +01:00
I-Al-Istannen
5f527bc697
Remove Python 3.9 Pattern typehints
2022-01-08 17:14:40 +01:00
I-Al-Istannen
ced8b9a2d0
Fix some accordions
2022-01-08 16:58:30 +01:00
I-Al-Istannen
6f3cfd4396
Fix personal desktop crawling
2022-01-08 16:58:15 +01:00
I-Al-Istannen
462d993fbc
Fix local video path cache (hopefully)
2022-01-08 00:27:48 +01:00
I-Al-Istannen
a99356f2a2
Fix video stream extraction
2022-01-08 00:27:34 +01:00
I-Al-Istannen
eac2e34161
Fix is_logged_in for ILIAS 7
2022-01-07 23:32:31 +01:00
I-Al-Istannen
90cb6e989b
Do not download single videos if cache does not exist
2021-11-06 23:21:15 +01:00
I-Al-Istannen
6289938d7c
Do not stop crawling files when encountering a CrawlWarning
2021-11-06 12:09:51 +01:00
I-Al-Istannen
e42ab83d32
Add support for ILIAS cards
2021-10-30 18:13:44 +02:00
I-Al-Istannen
f9a3f9b9f2
Handle multi-stream videos
2021-10-30 18:12:29 +02:00
I-Al-Istannen
ee67f9f472
Sort elements by ILIAS id to ensure deterministic ordering
2021-07-06 17:45:48 +02:00
I-Al-Istannen
8ec3f41251
Crawl ILIAS booking objects as links
2021-07-06 16:15:25 +02:00
I-Al-Istannen
89be07d4d3
Use final crawl path in HTML parsing message
2021-07-03 17:05:48 +02:00
I-Al-Istannen
91200f3684
Fix nondeterministic name deduplication
2021-07-03 12:09:55 +02:00
I-Al-Istannen
6e4d423c81
Crawl all video stages in one crawl bar
This ensures folders are no longer renamed as a result of being crawled twice.
2021-06-13 17:18:45 +02:00
I-Al-Istannen
70ec64a48b
Fix wrong base URL for multi-stage pages
2021-06-13 15:44:47 +02:00
I-Al-Istannen
8ab462fb87
Use the exercise label instead of the button name as path
2021-06-04 19:24:23 +02:00
Joscha
f40820c41f
Warn if using concurrent tasks with kit-ilias-web
2021-05-31 20:18:20 +00:00
I-Al-Istannen
1fba96abcb
Fix exercise date parsing for non-group submissions
ILIAS apparently changes the order of the fields as it sees fit, so we
now try to parse *every* column, starting from the right, as a date.
The first column that parses successfully is then used.
2021-05-31 18:15:12 +02:00
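The right-to-left strategy described above can be sketched as follows. This is an illustration under stated assumptions, not the crawler's real code: `ASSUMED_FORMAT`, `try_parse_date`, and `first_date_from_right` are hypothetical names, and the actual ILIAS date format is not given in the log.

```python
from datetime import datetime
from typing import Optional, Sequence

# Assumed date format (hypothetical); the real one may differ.
ASSUMED_FORMAT = "%d.%m.%Y %H:%M"


def try_parse_date(text: str) -> Optional[datetime]:
    """Return the parsed date, or None if the text is not in the assumed format."""
    try:
        return datetime.strptime(text.strip(), ASSUMED_FORMAT)
    except ValueError:
        return None


def first_date_from_right(columns: Sequence[str]) -> Optional[datetime]:
    """Try every column as a date, starting from the right; first success wins."""
    for column in reversed(columns):
        parsed = try_parse_date(column)
        if parsed is not None:
            return parsed
    return None
```

Because no column index is hard-coded, the lookup keeps working when the field order changes.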
I-Al-Istannen
1ca6740e05
Improve log messages when parsing ILIAS HTML
Previously some logs were split around an "await", which isn't a great
idea.
2021-05-27 17:59:22 +02:00
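Why splitting a log message around an "await" is risky can be shown with a small asyncio sketch. The coroutine names and the list-based log are hypothetical stand-ins for the real logging code: when two tasks run concurrently, the halves of their messages interleave, while a message emitted in one piece stays intact.

```python
import asyncio
from typing import Awaitable, Callable, List


async def split_logging(name: str, log: List[str]) -> None:
    # First half is logged before the await, second half after it.
    log.append(f"Parsing {name}...")
    await asyncio.sleep(0)  # yields control; other tasks may log in between
    log.append("done")


async def joined_logging(name: str, log: List[str]) -> None:
    await asyncio.sleep(0)
    # The whole message is emitted in one step after the await.
    log.append(f"Parsing {name}... done")


async def run_both(worker: Callable[[str, List[str]], Awaitable[None]]) -> List[str]:
    """Run the given worker for two names concurrently and return the log."""
    log: List[str] = []
    await asyncio.gather(worker("a", log), worker("b", log))
    return log
```

Running `asyncio.run(run_both(split_logging))` produces both "Parsing ..." lines before either "done", so it is no longer clear which "done" belongs to which element.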
I-Al-Istannen
5beb4d9a2d
Fix renaming conflict with multi-stage video elements
2021-05-27 15:41:00 +02:00
I-Al-Istannen
19eed5bdff
Fix authentication logic conflicts with videos
2021-05-27 15:41:00 +02:00
I-Al-Istannen
2d8dcc87ff
Send CSRF token in TFA request
2021-05-25 22:50:40 +02:00
I-Al-Istannen
66f0e398a1
Await result in TFA authenticate path
2021-05-25 19:19:51 +02:00
I-Al-Istannen
a848194601
Rename plaintext link option to "plaintext"
2021-05-25 17:15:13 +02:00
Joscha
aabce764ac
Clean up TODOs
2021-05-25 15:54:01 +02:00
I-Al-Istannen
486699cef3
Create anonymous TFA authenticator in ILIAS crawler
This ensures that *some* TFA authenticator is always present when
authenticating, even if none is specified in the config.
The TfaAuthenticator does not depend on any configured values, so it can
be created on-demand.
2021-05-25 15:11:52 +02:00
Joscha
61430c8739
Overhaul config and CLI option names
2021-05-25 14:23:38 +02:00
I-Al-Istannen
651b087932
Use cl/dl deduplication mechanism for ILIAS crawler
2021-05-25 12:15:38 +02:00
I-Al-Istannen
ffda4e43df
Add extension to link files
2021-05-25 11:41:57 +02:00
I-Al-Istannen
69cb2a7734
Add Links option to ILIAS crawler
This allows you to configure what type the link files should have and
whether to create them at all.
2021-05-25 11:41:57 +02:00
I-Al-Istannen
85f89a7ff3
Interpret accordions and expandable headers as virtual folders
This allows us to find a file named "Test" in an accordion "Acc" as "Acc/Test".
2021-05-24 18:54:26 +02:00
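The virtual-folder idea can be illustrated by treating each enclosing accordion or expandable header as a path segment. This is only a sketch: `virtual_path` is a hypothetical helper, and how the crawler actually collects the enclosing containers is not described in the log.

```python
from pathlib import PurePath
from typing import Sequence


def virtual_path(containers: Sequence[str], name: str) -> PurePath:
    """Build a path whose leading segments are the enclosing containers."""
    # Each accordion or expandable header acts like a folder, outermost first.
    return PurePath(*containers, name)
```

With this helper, `virtual_path(["Acc"], "Test").as_posix()` gives `"Acc/Test"`, matching the example in the commit message.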
I-Al-Istannen
492ec6a932
Detect and skip ILIAS tests
2021-05-24 16:36:15 +02:00