diff --git a/CHANGELOG.md b/CHANGELOG.md
index 519c046..3f032cf 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -8,13 +8,28 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## Unreleased
 
 ### Added
-- Support for concurrent downloads
-- Support for proper config files
+- Proper config files
+- Concurrent crawling
+- Crawl external ILIAS links
+- Crawl uploaded exercise solutions
+- Explain what PFERD is doing and why (`--explain`)
+- More control over output (`--status`, `--report`)
+- Print report after exiting via Ctrl+C
+- Store crawler reports in `.report` JSON file
+- Extensive config file documentation (`CONFIG.md`)
+- Documentation for developers (`DEV.md`)
 - This changelog
 
 ### Changed
 - Rewrote almost everything
+- Better error messages
 - Redesigned CLI
+- Redesigned transform rules
+- ILIAS crawling logic (paths may be different)
+- Better support for weird paths on Windows
+- Set user agent (`PFERD/`)
 
 ### Removed
 - Backwards compatibility with 2.x
+- Python files as config files
+- Some types of crawlers
diff --git a/CONFIG.md b/CONFIG.md
index 2f2dbbe..b48a2dd 100644
--- a/CONFIG.md
+++ b/CONFIG.md
@@ -90,7 +90,7 @@ full name of an auth section (including the `auth:` prefix).
 
 Here is a simple example:
 
-```
+```ini
 [auth:example]
 type = simple
 username = foo
diff --git a/README.md b/README.md
index f9d718e..8e3b387 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,14 @@ Other resources:
 - [Changelog](CHANGELOG.md)
 - [Development Guide](DEV.md)
 
-## Installation with pip
+## Installation
+
+### Direct download
+
+Binaries for Linux, Windows and Mac can be downloaded directly from the
+[latest release](https://github.com/Garmelon/PFERD/releases/latest).
+
+### With pip
 
 Ensure you have at least Python 3.8 installed.
 Run the following command to install PFERD or upgrade it to the latest version:
@@ -18,3 +25,111 @@ $ pip install --upgrade git+https://github.com/Garmelon/PFERD@latest
 ```
 
 The use of [venv](https://docs.python.org/3/library/venv.html) is recommended.
+
+## Basic usage
+
+PFERD can be run directly from the command line with no config file.
+Run `pferd -h` to get an overview of available commands and options.
+Run `pferd -h` to see which options a command has.
+
+For example, you can download your personal desktop from the KIT ILIAS like
+this:
+
+```
+$ pferd kit-ilias-web desktop
+```
+
+Also, you can download most ILIAS pages directly like this:
+
+```
+$ pferd kit-ilias-web
+```
+
+However, the CLI only lets you download a single thing at a time, and the
+resulting command can grow long quite quickly. Because of this, PFERD can also
+be used with a config file.
+
+To get started, just take a command you've been using and add `--dump-config`
+directly after `pferd`, like this:
+
+```
+$ pferd --dump-config kit-ilias-web
+```
+
+This will make PFERD write its current configuration to its default config file
+path. You can then run `pferd` without a command and it will execute the config
+file. Alternatively, you can use `--dump-config-to` and specify a path yourself.
+Using `--dump-config-to -` will print the configuration to stdout instead of a
+file, which is a good way to see what is actually going on when using a CLI
+command.
+
+Another good way to see what PFERD is doing is the `--explain` option. When
+enabled, PFERD explains in detail what it is doing and why. This can help with
+debugging your own config, for example.
+
+If you don't want to run all crawlers from your config file, you can specify the
+crawlers you want to run with `--crawler` or `-C`, like this:
+
+```
+$ pferd -C crawler1 -C crawler2
+```
+
+## Advanced usage
+
+PFERD supports lots of different options. For example, you can configure PFERD
+to [use your system's keyring](CONFIG.md#the-keyring-authenticator) instead of
+prompting you for your username and password. PFERD also supports
+[transformation rules](CONFIG.md#transformation-rules) that let you rename or
+exclude certain files.
+
+For more details, see the comprehensive [config format documentation](CONFIG.md).
+
+## Example
+
+This example downloads a few courses from the KIT ILIAS with a common keyring
+authenticator. It reorganizes and ignores some files.
+
+```ini
+[DEFAULT]
+# All paths will be relative to this.
+# The crawler output directories will be /Foo and /Bar.
+working_dir = ~/stud
+# If files vanish from ILIAS the local files are not deleted, allowing us to
+# take a look at them before deleting them ourselves.
+on_conflict = no-delete
+
+[auth:ilias]
+type = keyring
+username = foo
+
+[crawl:Foo]
+type = kit-ilias-web
+auth = auth:ilias
+# Crawl a course by its ID (found as `ref_id=ID` in the URL)
+target = 1234567
+
+# Plaintext files are easier to read by other tools
+links = plaintext
+
+transform =
+  # Ignore unneeded folders
+  Online-Tests --> !
+  Vorlesungswerbung --> !
+
+  # Move exercises to own folder. Rename them to "Blatt-XX.pdf" to make them sort properly
+  "Übungsunterlagen/(\d+). Übungsblatt.pdf" -re-> Blätter/Blatt-{i1:02}.pdf
+  # Move solutions to own folder. Rename them to "Blatt-XX-Lösung.pdf" to make them sort properly
+  "Übungsunterlagen/(\d+). Übungsblatt.*Musterlösung.pdf" -re-> Blätter/Blatt-{i1:02}-Lösung.pdf
+
+  # The course has nested folders with the same name - flatten them
+  "Übungsunterlagen/(.+?)/\\1/(.*)" -re-> Übung/{g1}/{g2}
+
+  # Rename remaining folders
+  Übungsunterlagen --> Übung
+  Lehrbücher --> Vorlesung
+
+[crawl:Bar]
+type = kit-ilias-web
+auth = auth:ilias
+target = 1337420
+```
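
Review note on the `-re->` rules in the README example: they can be hard to read at a glance. The Python sketch below illustrates one plausible reading of them. It assumes that the left side is a regular expression that must match the whole path, and that `{g1}` and `{i1}` on the right side stand for capture group 1 as a string and as an integer. The helper `apply_regex_rule` is hypothetical and exists only for illustration; it is not PFERD's implementation, and `CONFIG.md` remains the authoritative description of the rule syntax.

```python
import re
from typing import Optional


def apply_regex_rule(pattern: str, template: str, path: str) -> Optional[str]:
    """Hypothetical helper: rewrite `path` if `pattern` matches all of it.

    Assumption (not taken from PFERD's code): each capture group N is exposed
    to the template as `gN` (string) and, if it is purely numeric, also as
    `iN` (integer), so that format specs such as `{i1:02}` work.
    """
    match = re.fullmatch(pattern, path)
    if match is None:
        return None  # the rule does not apply to this path

    fields = {}
    for index, text in enumerate(match.groups(), start=1):
        fields[f"g{index}"] = text
        if text is not None and text.isdecimal():
            fields[f"i{index}"] = int(text)
    return template.format(**fields)


# Zero-padded sheet numbers, as in the "Blatt-XX.pdf" rule.
print(apply_regex_rule(
    r"Übungsunterlagen/(\d+). Übungsblatt.pdf",
    "Blätter/Blatt-{i1:02}.pdf",
    "Übungsunterlagen/3. Übungsblatt.pdf",
))  # prints "Blätter/Blatt-03.pdf"

# Flattening a folder nested inside a folder of the same name.
# The config writes the backreference as \\1 inside a quoted string;
# the regex itself is \1.
print(apply_regex_rule(
    r"Übungsunterlagen/(.+?)/\1/(.*)",
    "Übung/{g1}/{g2}",
    "Übungsunterlagen/Blatt 1/Blatt 1/a.pdf",
))  # prints "Übung/Blatt 1/a.pdf"
```

Under this reading, the unescaped `.` in `(\d+). Übungsblatt.pdf` matches any character, so the pattern is slightly more permissive than a literal dot would be. That appears harmless for the file names in the example.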