484 Commits

Author SHA1 Message Date
I-Al-Istannen
e1430e6298 Handle (and ignore) surveys 2022-10-24 18:37:26 +02:00
I-Al-Istannen
5fdd40204b Unwrap future meetings when ILIAS hides them behind a pagination 2022-10-24 14:33:58 +02:00
I-Al-Istannen
fb4631ba18 Fix ilias background login 2022-10-24 13:13:36 +02:00
I-Al-Istannen
d72fc2760b Handle empty forums 2022-10-24 13:12:17 +02:00
I-Al-Istannen
4a51aaa4f5 Fix forum crawling crashing for empty threads 2022-10-19 22:59:33 +02:00
Joscha
66a5b1ba02 Bump version to 3.4.1 2022-08-17 13:24:01 +02:00
I-Al-Istannen
d9b111cec2 Correctly nest description entries 2022-08-14 21:45:33 +02:00
I-Al-Istannen
345f52a1f6 Detect new login button 2022-08-14 21:41:29 +02:00
Joscha
ed24366aba Add pass authenticator 2022-06-05 10:04:42 +02:00
I-Al-Istannen
46fb782798 Add forum crawling
This downloads all forum posts when needed and saves each thread in its
own html file, named after the thread title.
2022-05-24 23:43:53 +02:00
I-Al-Istannen
846c29aee1 Download page descriptions 2022-05-11 21:16:56 +02:00
I-Al-Istannen
a5015fe9b1 Correctly parse day-only meeting dates
I failed to recognize the correct format in the previous adjustment, so
this (hopefully) fixes it for good.
Meetings apparently don't always have a time portion.
2022-05-08 23:22:26 +02:00
Joscha
616b0480f7 Simplify IPD crawler link regex 2022-05-08 18:18:05 +02:00
I-Al-Istannen
bcc537468c Fix crawling of expanded meetings
The last meeting on every page is expanded by default.
Its content is then shown inline *and* in the meeting page itself.
We should skip the inline content.
2022-05-05 22:53:37 +02:00
I-Al-Istannen
694ffb4d77 Fix meeting date parsing
Apparently the new pattern "<relative time qualifier>: <date>," was
added. This patch adds support for it.
2022-05-05 22:28:30 +02:00
Joscha
af2cc1169a Mention href for users of link_regex option 2022-05-05 14:36:03 +02:00
Joscha
bc3fa36637 Fix IPD crawler crashing on weird HTML comments 2022-05-05 14:35:42 +02:00
I-Al-Istannen
b8fe25c580 Add .cpp to ipd link regex 2022-05-04 14:19:26 +02:00
Joscha
a241672726 Bump version to 3.4.0 2022-05-01 22:29:06 +02:00
Joscha
a8f76e9be7 Use utf-8 for credential file 2022-04-29 23:15:12 +02:00
Joscha
b56475450d Use utf-8 for cookies 2022-04-29 23:12:41 +02:00
Joscha
aa74604d29 Use utf-8 for report 2022-04-29 23:11:27 +02:00
Joscha
d2e6d91880 Make PFERD executable via python -m 2022-04-27 22:52:50 +02:00
Joscha
602044ff1b Fix mypy errors and add missing await 2022-04-27 22:52:50 +02:00
I-Al-Istannen
a709280cbf Try to detect unsupported config file encoding
The encoding detection is quite rudimentary, but should detect the
default windows encoding in many cases.
2022-04-27 22:03:47 +02:00
I-Al-Istannen
a99ddaa0cc Read and write config in UTF-8 2022-04-27 21:47:51 +02:00
I-Al-Istannen
a2831fbea2 Fix shib authentication
Authentication failed previously if the shib session was still valid.
If Shibboleth gets a request and the session is still valid, it directly
responds without a second redirect.
2022-04-27 13:55:24 +02:00
I-Al-Istannen
da72863b47 Placate newer mypy 2022-04-03 13:19:08 +02:00
I-Al-Istannen
86e2e226dc Notify user when shibboleth presents new entitlements 2022-04-03 11:37:08 +02:00
I-Al-Istannen
7872fe5221 Fix tables with more columns than expected 2022-01-18 22:38:48 +01:00
Joscha
86947e4874 Bump version to 3.3.1 2022-01-15 15:11:22 +01:00
Joscha
4f022e2d19 Reword changelog 2022-01-15 15:06:02 +01:00
I-Al-Istannen
f47e7374d2 Use fixed windows path for video cache 2022-01-15 12:00:30 +01:00
I-Al-Istannen
57ec51e95a Fix login after shib url parser change 2022-01-14 20:17:27 +01:00
Joscha
0045124a4e Bump version to 3.3.0 2022-01-09 21:09:09 +01:00
I-Al-Istannen
e467b38d73 Only reject 1970 timestamps on windows 2022-01-09 18:23:00 +01:00
I-Al-Istannen
4bf0c972e6 Update types for rich 11 2022-01-09 11:48:26 +01:00
I-Al-Istannen
4ee919625d Add rudimentary support for content pages 2022-01-08 20:47:35 +01:00
I-Al-Istannen
d30f25ee97 Detect shib login page as login page
And do not assume we are logged in...
2022-01-08 20:28:45 +01:00
I-Al-Istannen
10d9d74528 Bail out when crawling recursive courses 2022-01-08 20:28:30 +01:00
I-Al-Istannen
43c5453e10 Correctly crawl files on desktop
The files on the desktop do not include a download link, so we need to
rewrite it.
2022-01-08 20:00:53 +01:00
I-Al-Istannen
eb4de8ae0c Ignore 1970 dates as windows crashes when calling .timestamp() 2022-01-08 18:14:43 +01:00
I-Al-Istannen
e32c1f000f Fix mtime for single streams 2022-01-08 18:05:48 +01:00
I-Al-Istannen
5f527bc697 Remove Python 3.9 Pattern typehints 2022-01-08 17:14:40 +01:00
I-Al-Istannen
ced8b9a2d0 Fix some accordions 2022-01-08 16:58:30 +01:00
I-Al-Istannen
6f3cfd4396 Fix personal desktop crawling 2022-01-08 16:58:15 +01:00
I-Al-Istannen
462d993fbc Fix local video path cache (hopefully) 2022-01-08 00:27:48 +01:00
I-Al-Istannen
a99356f2a2 Fix video stream extraction 2022-01-08 00:27:34 +01:00
I-Al-Istannen
eac2e34161 Fix is_logged_in for ILIAS 7 2022-01-07 23:32:31 +01:00
I-Al-Istannen
a82a0b19c2 Collect crawler warnings/errors and include them in the report 2021-11-07 21:48:55 +01:00