Commit Graph

  • 84f775013f Use event loop workaround only on windows Joscha 2021-05-31 11:41:20 +0200
  • b78eb64f3d Document versioning scheme Joscha 2021-05-29 21:38:36 +0200
  • d65efed561 Slightly adjust phrasing Joscha 2021-05-28 21:21:04 +0000
  • 1ca6740e05 Improve log messages when parsing ILIAS HTML I-Al-Istannen 2021-05-27 17:59:22 +0200
  • 474aa7e1cc Use sorted path order when debugging transforms Joscha 2021-05-27 15:41:00 +0000
  • 5beb4d9a2d Fix renaming conflict with multi-stage video elements I-Al-Istannen 2021-05-27 00:55:46 +0200
  • 19eed5bdff Fix authentication logic conflicts with videos I-Al-Istannen 2021-05-27 00:31:36 +0200
  • 6fa9cfd4c3 Fix error when capturing group is None Joscha 2021-05-27 13:56:01 +0200
  • 80acc4b50d Implement new name arrows Joscha 2021-05-27 13:42:49 +0200
  • 2c72a9112c Reword -name-> and -name-re-> docs and remove -name-exact-> Joscha 2021-05-27 13:20:37 +0200
  • 17207546e9 Document --debug-transforms Joscha 2021-05-26 11:47:51 +0200
  • 533f75ea71 Add --debug-transforms flag Joscha 2021-05-26 11:37:32 +0200
  • adb5d4ade3 Print files that are *not* deleted by cleanup Joscha 2021-05-26 10:58:19 +0200
  • a879c6ab6e Fix function being printed Joscha 2021-05-26 10:52:04 +0200
  • 915e42fd07 Fix report not being printed if pferd exits normally Joscha 2021-05-26 10:51:41 +0200
  • 2d8dcc87ff Send CSRF token in TFA request I-Al-Istannen 2021-05-25 19:23:06 +0200
  • 66f0e398a1 Await result in tfa authenticate path I-Al-Istannen 2021-05-25 19:19:51 +0200
  • 30be4e29fa Add workaround for RuntimeError after program finishes on Windows Joscha 2021-05-25 16:34:18 +0000
  • 263780e6a3 Use certifi to ensure CA certificates are bundled in pyinstaller I-Al-Istannen 2021-05-25 18:09:51 +0200
  • 07a75a37c3 Fix FileNotFoundError on Windows Joscha 2021-05-25 15:57:03 +0000
  • f85b75df8c Switch from exit() to sys.exit() Joscha 2021-05-25 17:33:05 +0200
  • 6644126b5d Fix package discovery Joscha 2021-05-25 17:29:39 +0200
  • c665c36d88 Update README, CHANGELOG Joscha 2021-05-25 17:16:57 +0200
  • 519a7ef435 Split --dump-config into two options Joscha 2021-05-25 16:57:14 +0200
  • a848194601 Rename plaintext link option to "plaintext" I-Al-Istannen 2021-05-25 17:15:13 +0200
  • aabce764ac Clean up TODOs Joscha 2021-05-25 15:54:01 +0200
  • 5a331663e4 Rename functions for consistency Joscha 2021-05-25 15:49:06 +0200
  • 40144f8bd8 Fix rule error messages Joscha 2021-05-25 15:47:09 +0200
  • f68849c65f Fix rules not being parsed entirely Joscha 2021-05-25 15:42:46 +0200
  • edb52a989e Print report even if exiting due to Ctrl+C Joscha 2021-05-25 15:35:36 +0200
  • 980578d05a Avoid downloading in some cases Joscha 2021-05-25 15:12:07 +0200
  • 486699cef3 Create anonymous TFA authenticator in ilias crawler I-Al-Istannen 2021-05-25 15:11:52 +0200
  • 0096a0c077 Remove section and config parameter from Authenticator I-Al-Istannen 2021-05-25 15:11:33 +0200
  • d905e95dbb Allow invalidation of keyring authenticator I-Al-Istannen 2021-05-25 15:02:35 +0200
  • 61430c8739 Overhaul config and CLI option names Joscha 2021-05-25 14:12:19 +0200
  • eb8b915813 Fix path prefix on windows Joscha 2021-05-25 14:21:52 +0200
  • 22c2259adb Clean up authenticator exceptions Joscha 2021-05-25 14:21:12 +0200
  • c15a1aecdf Rename keyring authenticator file for consistency Joscha 2021-05-25 13:32:00 +0200
  • 16d50b6626 Document why /pferd.py exists Joscha 2021-05-25 13:31:29 +0200
  • 651b087932 Use cl/dl deduplication mechanism for ILIAS crawler I-Al-Istannen 2021-05-25 12:15:38 +0200
  • bce3dc384d Deduplicate path names in crawler Joscha 2021-05-25 11:58:01 +0200
  • c21ddf225b Add a CLI option to configure ILIAS links behaviour I-Al-Istannen 2021-05-25 11:58:41 +0200
  • 4fefb98d71 Add a wrapper to pretty-print ValueErrors in argparse parsers I-Al-Istannen 2021-05-25 11:57:59 +0200
  • ffda4e43df Add extension to link files I-Al-Istannen 2021-05-25 11:40:41 +0200
  • 69cb2a7734 Add Links option to ilias crawler I-Al-Istannen 2021-05-25 11:33:45 +0200
  • c33de233dc Add script for releasing new versions Joscha 2021-05-24 20:08:49 +0200
  • 85f89a7ff3 Interpret accordions and expandable headers as virtual folders I-Al-Istannen 2021-05-24 18:53:00 +0200
  • 9ce20216b5 Do not set a timeout for whole HTTP request I-Al-Istannen 2021-05-24 18:32:18 +0200
  • 1739c54091 Add checklist for releasing new versions Joscha 2021-05-24 17:50:17 +0200
  • d8bd1f518a Set up build and release workflow Joscha 2021-05-24 15:43:53 +0200
  • 86ba47541b Fix cookie loading and saving Joscha 2021-05-24 16:53:50 +0200
  • 492ec6a932 Detect and skip ILIAS tests I-Al-Istannen 2021-05-24 16:32:29 +0200
  • 342076ee0e Handle exercise detail containers in ILIAS html parser I-Al-Istannen 2021-05-24 16:22:51 +0200
  • d44f6966c2 Log authentication attempts in HTTP crawler I-Al-Istannen 2021-05-24 16:22:11 +0200
  • 5c76193045 Set up pyinstaller Joscha 2021-05-24 15:21:25 +0200
  • 1c1f781be4 Reword some log messages Joscha 2021-05-24 13:17:28 +0200
  • c687d4a51a Implement cookie sharing Joscha 2021-05-24 13:10:19 +0200
  • fca62541ca De-duplicate element names in ILIAS crawler I-Al-Istannen 2021-05-24 00:24:31 +0200
  • 3ab3581f84 Add timeout for HTTP connection I-Al-Istannen 2021-05-23 23:40:28 +0200
  • 8dd0689420 Add keyring authentication to ILIAS CLI I-Al-Istannen 2021-05-23 23:04:18 +0200
  • be4b1040f8 Document status and report options Joscha 2021-05-23 22:51:42 +0200
  • 79be6e1dc5 Switch some other options to BooleanOptionalAction Joscha 2021-05-23 22:49:09 +0200
  • edbd92dbbf Add --status and --report flags Joscha 2021-05-23 22:41:59 +0200
  • 27b5a8e490 Rename log.action to log.status Joscha 2021-05-23 22:39:07 +0200
  • 1f400d5964 Implement BooleanOptionalAction Joscha 2021-05-23 22:26:41 +0200
  • 0ca0680165 Simplify --version Joscha 2021-05-23 21:40:48 +0200
  • ce1dbda5b4 Overhaul colours Joscha 2021-05-23 21:27:37 +0200
  • 9cce78669f Print report after all crawlers have finished Joscha 2021-05-23 21:13:06 +0200
  • 6ca0ecdf05 Load and store reports Joscha 2021-05-23 20:46:12 +0200
  • 6e9f8fd391 Add a keyring authenticator I-Al-Istannen 2021-05-23 19:44:12 +0200
  • 2fdf24495b Restructure crawling and auth related modules Joscha 2021-05-23 19:16:42 +0200
  • bbf9f8f130 Add -C as alias for --crawler Joscha 2021-05-23 19:05:56 +0200
  • 37f8d84a9c Output total amount of http requests in HTTP Crawler I-Al-Istannen 2021-05-23 19:00:01 +0200
  • 5edd868d5b Fix always-smart redownloading the wrong files Joscha 2021-05-23 18:49:34 +0200
  • e4e5e83be6 Fix downloader using crawl bar Joscha 2021-05-23 18:39:43 +0200
  • 74c7b39dc8 Clean up files in alphabetical order Joscha 2021-05-23 18:39:25 +0200
  • 445dffc987 Reword some explanations Joscha 2021-05-23 18:35:32 +0200
  • d97d6bf147 Fix handling nested ILIAS folders I-Al-Istannen 2021-05-23 18:29:28 +0200
  • 79efdb56f7 Adjust ILIAS html explain messages I-Al-Istannen 2021-05-23 18:22:29 +0200
  • a9af56a5e9 Improve specifying crawlers via CLI Joscha 2021-05-23 18:16:25 +0200
  • 59f13bb8d6 Explain ILIAS HTML parsing and add some warnings I-Al-Istannen 2021-05-23 18:12:51 +0200
  • 463f8830d7 Add warn_contd I-Al-Istannen 2021-05-23 18:12:34 +0200
  • 05ad06fbc1 Only enclose get_page in iorepeat in ILIAS crawler I-Al-Istannen 2021-05-23 17:24:05 +0200
  • 29d5a40c57 Replace asyncio.gather with custom Crawler function Joscha 2021-05-23 17:25:16 +0200
  • c0cecf8363 Log crawl and download actions more extensively Joscha 2021-05-23 16:22:58 +0200
  • b998339002 Fix cleanup logging of paths Joscha 2021-05-23 16:22:38 +0200
  • 245c9c3dcc Explain output dir decisions and steps Joscha 2021-05-23 16:22:14 +0200
  • d8f26a789e Implement CLI Command for ilias crawler I-Al-Istannen 2021-05-23 13:26:40 +0200
  • e1d18708b3 Rename "no_videos" to videos I-Al-Istannen 2021-05-23 13:26:23 +0200
  • b44b49476d Fix noncritical and anoncritical decorators Joscha 2021-05-23 13:23:28 +0200
  • 7e0bb06259 Clean up TODOs Joscha 2021-05-23 12:47:30 +0200
  • ecdedfa1cf Add no-videos flag to ILIAS crawler I-Al-Istannen 2021-05-23 12:36:09 +0200
  • 3d4b997d4a Retry crawl_url and work around Python's closure handling I-Al-Istannen 2021-05-23 12:24:10 +0200
  • e81005ae4b Fix CLI arguments Joscha 2021-05-23 11:57:59 +0200
  • 33a81a5f5c Document authentication in HTTP crawler and rename prepare_request I-Al-Istannen 2021-05-23 11:55:34 +0200
  • 25e2abdb03 Improve transformer explain wording Joscha 2021-05-23 11:45:14 +0200
  • 803e5628a2 Clean up logging Joscha 2021-05-23 11:30:16 +0200
  • c88f20859a Explain config file dumping Joscha 2021-05-23 11:04:50 +0200
  • ec3767c545 Create crawler base dir at start of crawl Joscha 2021-05-23 10:52:02 +0200
  • 729ff0a4c7 Fix simple authenticator output Joscha 2021-05-23 10:44:59 +0200