Commit Graph

18 Commits

Author SHA1 Message Date
Joscha
eb8b915813 Fix path prefix on Windows
Previously, the path prefix was only set if "windows_paths" was true, regardless
of OS. Now the path prefix is always set on Windows and never set on other OSes.
2021-05-25 14:23:38 +02:00
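The behaviour described in eb8b915813 can be sketched roughly as follows. The commit message does not say which prefix is meant; this sketch assumes the Windows extended-length prefix `\\?\`, and the function name `apply_path_prefix` and its `os_name` parameter are hypothetical, not taken from the project.

```python
import os

# Assumption: the "path prefix" is the Windows extended-length prefix \\?\ .
WINDOWS_PATH_PREFIX = "\\\\?\\"


def apply_path_prefix(path: str, os_name: str = os.name) -> str:
    """Prefix the path on Windows ("nt"); return it unchanged on any other OS.

    The decision depends only on the OS, not on a "windows_paths" setting.
    """
    if os_name != "nt":
        return path
    if path.startswith(WINDOWS_PATH_PREFIX):
        # Already prefixed, so do not add the prefix a second time.
        return path
    return WINDOWS_PATH_PREFIX + path
```

Passing `os_name` explicitly makes the function testable on any platform; by default it uses the value of `os.name` for the running interpreter.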
I-Al-Istannen
651b087932 Use cl/dl deduplication mechanism for ILIAS crawler 2021-05-25 12:15:38 +02:00
Joscha
bce3dc384d Deduplicate path names in crawler
Also rename files so they follow the restrictions for Windows file names if
we're on Windows.
2021-05-25 12:11:15 +02:00
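A minimal sketch of the two ideas in bce3dc384d, name deduplication and Windows-safe renaming. The helper names, the `_` replacement character, and the `_1`, `_2` suffix scheme are assumptions made for illustration; the actual crawler may use different rules.

```python
import os
import re
from typing import Set

# Characters Windows does not allow in file names, plus ASCII control characters.
_INVALID_WINDOWS_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_windows_name(name: str) -> str:
    """Replace forbidden characters and strip trailing spaces and dots."""
    cleaned = _INVALID_WINDOWS_CHARS.sub("_", name)
    cleaned = cleaned.rstrip(" .")
    return cleaned or "_"


def deduplicate(name: str, taken: Set[str]) -> str:
    """Return a name not yet in `taken`, then record it as taken."""
    stem, ext = os.path.splitext(name)
    candidate = name
    counter = 1
    while candidate in taken:
        # Insert the counter before the extension: a.txt -> a_1.txt
        candidate = f"{stem}_{counter}{ext}"
        counter += 1
    taken.add(candidate)
    return candidate
```

The sketch ignores reserved device names such as `CON` and the case-insensitivity of Windows file systems; a fuller version would need to handle both.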
I-Al-Istannen
ffda4e43df Add extension to link files 2021-05-25 11:41:57 +02:00
I-Al-Istannen
69cb2a7734 Add Links option to ILIAS crawler
This allows you to configure what type the link files should have and
whether to create them at all.
2021-05-25 11:41:57 +02:00
I-Al-Istannen
85f89a7ff3 Interpret accordions and expandable headers as virtual folders
This allows us to find a file named "Test" in an accordion "Acc" as "Acc/Test".
2021-05-24 18:54:26 +02:00
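The "virtual folder" idea from 85f89a7ff3 can be illustrated with a small tree walk. The `File` and `Container` classes and the `file_paths` function are hypothetical stand-ins for whatever the ILIAS HTML parser produces; only the resulting path shape ("Acc/Test") comes from the commit message.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Iterator, List, Union


@dataclass
class File:
    name: str


@dataclass
class Container:
    """An accordion or expandable header, treated like a folder."""
    name: str
    children: List[Union["Container", File]] = field(default_factory=list)


def file_paths(
    container: Container, parent: PurePosixPath = PurePosixPath()
) -> Iterator[PurePosixPath]:
    """Yield the path of every file, with each container as a path segment."""
    here = parent / container.name
    for child in container.children:
        if isinstance(child, Container):
            yield from file_paths(child, here)
        else:
            yield here / child.name


accordion = Container("Acc", [File("Test")])
paths = [str(p) for p in file_paths(accordion)]
```

Here `paths` contains the single entry `"Acc/Test"`, matching the example in the commit message.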
I-Al-Istannen
9ce20216b5 Do not set a timeout for whole HTTP request
Downloads might take longer!
2021-05-24 18:54:26 +02:00
Joscha
86ba47541b Fix cookie loading and saving 2021-05-24 16:55:11 +02:00
I-Al-Istannen
492ec6a932 Detect and skip ILIAS tests 2021-05-24 16:36:15 +02:00
I-Al-Istannen
342076ee0e Handle exercise detail containers in ILIAS HTML parser 2021-05-24 16:22:51 +02:00
I-Al-Istannen
d44f6966c2 Log authentication attempts in HTTP crawler 2021-05-24 16:22:11 +02:00
Joscha
c687d4a51a Implement cookie sharing 2021-05-24 13:10:44 +02:00
I-Al-Istannen
fca62541ca De-duplicate element names in ILIAS crawler
This prevents any conflicts caused by multiple files with the same name.
Conflicts may still arise due to transforms, but that is out of our
control and a user error.
2021-05-24 00:24:31 +02:00
I-Al-Istannen
3ab3581f84 Add timeout for HTTP connection 2021-05-23 23:41:05 +02:00
Joscha
27b5a8e490 Rename log.action to log.status 2021-05-23 22:40:33 +02:00
Joscha
ce1dbda5b4 Overhaul colours
"Crawled" and "Downloaded" are now printed less bright than "Crawling" and
"Downloading" as they're not as important. Explain topics are printed in yellow
to stand out a bit more from the cyan action messages.
2021-05-23 21:33:04 +02:00
Joscha
6ca0ecdf05 Load and store reports 2021-05-23 20:46:29 +02:00
Joscha
2fdf24495b Restructure crawling and auth related modules 2021-05-23 19:16:42 +02:00