Commit Graph

601 Commits

Author
SHA1 Message Date
Joscha
4f022e2d19 Reword changelog 2022-01-15 15:06:02 +01:00
I-Al-Istannen
f47e7374d2 Use fixed windows path for video cache 2022-01-15 12:00:30 +01:00
I-Al-Istannen
57ec51e95a Fix login after shib url parser change 2022-01-14 20:17:27 +01:00
Joscha
0045124a4e Bump version to 3.3.0 2022-01-09 21:09:09 +01:00
I-Al-Istannen
9618aae83b Add content pages to changelog 2022-01-09 18:32:58 +01:00
I-Al-Istannen
33453ede2d Update dependency versions in setup.py 2022-01-09 18:31:42 +01:00
I-Al-Istannen
e467b38d73 Only reject 1970 timestamps on windows 2022-01-09 18:23:00 +01:00
I-Al-Istannen
e9d2d05030 Update changelog 2022-01-09 11:48:26 +01:00
I-Al-Istannen
4bf0c972e6 Update types for rich 11 2022-01-09 11:48:26 +01:00
I-Al-Istannen
4ee919625d Add rudimentary support for content pages 2022-01-08 20:47:35 +01:00
I-Al-Istannen
d30f25ee97 Detect shib login page as login page
And do not assume we are logged in...
2022-01-08 20:28:45 +01:00
I-Al-Istannen
10d9d74528 Bail out when crawling recursive courses 2022-01-08 20:28:30 +01:00
I-Al-Istannen
43c5453e10 Correctly crawl files on desktop
The files on the desktop do not include a download link, so we need to
rewrite it.
2022-01-08 20:00:53 +01:00
I-Al-Istannen
eb4de8ae0c Ignore 1970 dates as windows crashes when calling .timestamp() 2022-01-08 18:14:43 +01:00
I-Al-Istannen
e32c1f000f Fix mtime for single streams 2022-01-08 18:05:48 +01:00
I-Al-Istannen
5f527bc697 Remove Python 3.9 Pattern typehints 2022-01-08 17:14:40 +01:00
I-Al-Istannen
ced8b9a2d0 Fix some accordions 2022-01-08 16:58:30 +01:00
I-Al-Istannen
6f3cfd4396 Fix personal desktop crawling 2022-01-08 16:58:15 +01:00
I-Al-Istannen
462d993fbc Fix local video path cache (hopefully) 2022-01-08 00:27:48 +01:00
I-Al-Istannen
a99356f2a2 Fix video stream extraction 2022-01-08 00:27:34 +01:00
I-Al-Istannen
eac2e34161 Fix is_logged_in for ILIAS 7 2022-01-07 23:32:31 +01:00
I-Al-Istannen
a82a0b19c2 Collect crawler warnings/errors and include them in the report 2021-11-07 21:48:55 +01:00
I-Al-Istannen
90cb6e989b Do not download single videos if cache does not exist 2021-11-06 23:21:15 +01:00
I-Al-Istannen
6289938d7c Do not stop crawling files when encountering a CrawlWarning 2021-11-06 12:09:51 +01:00
I-Al-Istannen
13b8c3d9c6 Add regex option to config and CLI parser 2021-11-02 09:30:46 +01:00
I-Al-Istannen
88afe64a92 Refactor IPD crawler a bit 2021-11-02 01:25:01 +00:00
Julius Rüberg
6b2a657573 Fix IPD crawler for different subpages (#42)
This patch reworks the IPD crawler to support subpages which do not use
"/intern" for links and fetches the folder names from table headings.
2021-11-02 01:25:01 +00:00
Toorero
d6f38a61e1 Fixed minor spelling mistakes 2021-11-02 01:54:00 +01:00
I-Al-Istannen
ad3f4955f7 Update changelog 2021-10-30 18:14:39 +02:00
I-Al-Istannen
e42ab83d32 Add support for ILIAS cards 2021-10-30 18:13:44 +02:00
I-Al-Istannen
f9a3f9b9f2 Handle multi-stream videos 2021-10-30 18:12:29 +02:00
I-Al-Istannen
ef7d5ea2d3 Allow storing crawler-specific data in reports 2021-10-30 18:09:05 +02:00
lukasprobst
55ea304ff3 Disable interpolation of ConfigParser 2021-10-25 23:37:42 +02:00
Joscha
fee12b3d9e Fix changelog 2021-10-25 17:44:12 +00:00
I-Al-Istannen
6673077397 Add kit-ipd crawler 2021-10-21 13:20:21 +02:00
Joscha
742632ed8d Bump version to 3.2.0 2021-08-04 18:27:26 +00:00
Joscha
544d45cbc5 Catch non-critical exceptions at crawler top level 2021-07-13 15:42:11 +02:00
Joscha
86f79ff1f1 Update changelog 2021-07-07 15:23:58 +02:00
I-Al-Istannen
ee67f9f472 Sort elements by ILIAS id to ensure deterministic ordering 2021-07-06 17:45:48 +02:00
I-Al-Istannen
8ec3f41251 Crawl ilias booking objects as links 2021-07-06 16:15:25 +02:00
I-Al-Istannen
89be07d4d3 Use final crawl path in HTML parsing message 2021-07-03 17:05:48 +02:00
I-Al-Istannen
91200f3684 Fix nondeterministic name deduplication 2021-07-03 12:09:55 +02:00
Joscha
9ffd603357 Error when using multiple segments with -name->
Previously, PFERD just silently never matched the -name-> arrow. Now, it errors
when loading the config file.
2021-07-01 11:14:50 +02:00
Joscha
80eeb8fe97 Add --skip option 2021-07-01 11:02:21 +02:00
Joscha
75fde870c2 Bump version to 3.1.0 2021-06-13 17:23:18 +02:00
I-Al-Istannen
6e4d423c81 Crawl all video stages in one crawl bar
This ensures folders are not renamed, as they are crawled twice
2021-06-13 17:18:45 +02:00
Joscha
57aef26217 Fix name arrows
I seem to have (re-)implemented them incorrectly and never tested them.
2021-06-13 16:33:29 +02:00
I-Al-Istannen
70ec64a48b Fix wrong base URL for multi-stage pages 2021-06-13 15:44:47 +02:00
Joscha
70b33ecfd9 Add migration notes to changelog
Also clean up some other formatting for consistency
2021-06-13 15:06:50 +02:00
Joscha
601e4b936b Use new arrow logic in README example config 2021-06-12 15:00:52 +02:00