Mirror of https://github.com/Garmelon/PFERD.git (synced 2023-12-21 10:23:01 +01:00)

Update README, CHANGELOG

parent 519a7ef435
commit c665c36d88
CHANGELOG.md (19 lines changed)
@@ -8,13 +8,28 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## Unreleased
 
 ### Added
-- Support for concurrent downloads
-- Support for proper config files
+- Proper config files
+- Concurrent crawling
+- Crawl external ILIAS links
+- Crawl uploaded exercise solutions
+- Explain what PFERD is doing and why (`--explain`)
+- More control over output (`--status`, `--report`)
+- Print report after exiting via Ctrl+C
+- Store crawler reports in `.report` JSON file
+- Extensive config file documentation (`CONFIG.md`)
+- Documentation for developers (`DEV.md`)
 - This changelog
 
 ### Changed
 - Rewrote almost everything
+- Better error messages
 - Redesigned CLI
+- Redesigned transform rules
+- ILIAS crawling logic (paths may be different)
+- Better support for weird paths on Windows
+- Set user agent (`PFERD/<version>`)
 
 ### Removed
 - Backwards compatibility with 2.x
+- Python files as config files
+- Some types of crawlers
@@ -90,7 +90,7 @@ full name of an auth section (including the `auth:` prefix).
 
 Here is a simple example:
 
-```
+```ini
 [auth:example]
 type = simple
 username = foo
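A note on the hunk above: a crawler points at its authenticator by the section's full name, `auth:` prefix included. The following is a minimal sketch of resolving such a reference with Python's standard `configparser`. The `[crawl:example]` section and the helper `resolve_auth` are hypothetical and exist only for this illustration, and the use of `configparser` here is an assumption for the sketch, not a description of PFERD's internals.

```python
from configparser import ConfigParser

# Hypothetical config text: [auth:example] mirrors the hunk above,
# [crawl:example] is made up for this sketch.
CONFIG_TEXT = """
[auth:example]
type = simple
username = foo

[crawl:example]
type = kit-ilias-web
auth = auth:example
"""


def resolve_auth(config: ConfigParser, crawl_section: str) -> dict:
    """Return the options of the auth section that a crawl section refers to."""
    auth_name = config[crawl_section]["auth"]  # full name, e.g. "auth:example"
    if not auth_name.startswith("auth:"):
        raise ValueError(f"{auth_name!r} lacks the 'auth:' prefix")
    if not config.has_section(auth_name):
        raise KeyError(f"no section named {auth_name!r}")
    return dict(config[auth_name])


# Interpolation is disabled so that '%' in values is taken literally.
config = ConfigParser(interpolation=None)
config.read_string(CONFIG_TEXT)
auth_options = resolve_auth(config, "crawl:example")
print(auth_options)
```

Writing `auth = example` instead would make the sketch raise a `ValueError`, which is the mistake the prefix rule guards against.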
README.md (117 lines changed)
@@ -8,7 +8,14 @@ Other resources:
 - [Changelog](CHANGELOG.md)
 - [Development Guide](DEV.md)
 
-## Installation with pip
+## Installation
+
+### Direct download
+
+Binaries for Linux, Windows and Mac can be downloaded directly from the
+[latest release](https://github.com/Garmelon/PFERD/releases/latest).
+
+### With pip
 
 Ensure you have at least Python 3.8 installed. Run the following command to
 install PFERD or upgrade it to the latest version:
@@ -18,3 +25,111 @@ $ pip install --upgrade git+https://github.com/Garmelon/PFERD@latest
 ```
 
 The use of [venv](https://docs.python.org/3/library/venv.html) is recommended.
+
+## Basic usage
+
+PFERD can be run directly from the command line with no config file.
+Run `pferd -h` to get an overview of available commands and options.
+Run `pferd <command> -h` to see which options a command has.
+
+For example, you can download your personal desktop from the KIT ILIAS like
+this:
+
+```
+$ pferd kit-ilias-web desktop <output_directory>
+```
+
+Also, you can download most ILIAS pages directly like this:
+
+```
+$ pferd kit-ilias-web <url> <output_directory>
+```
+
+However, the CLI only lets you download a single thing at a time, and the
+resulting command can grow long quite quickly. Because of this, PFERD can also
+be used with a config file.
+
+To get started, just take a command you've been using and add `--dump-config`
+directly after `pferd`, like this:
+
+```
+$ pferd --dump-config kit-ilias-web <url> <output_directory>
+```
+
+This will make PFERD write its current configuration to its default config file
+path. You can then run `pferd` without a command and it will execute the config
+file. Alternatively, you can use `--dump-config-to` and specify a path yourself.
+Using `--dump-config-to -` will print the configuration to stdout instead of a
+file, which is a good way to see what is actually going on when using a CLI
+command.
+
+Another good way to see what PFERD is doing is the `--explain` option. When
+enabled, PFERD explains in detail what it is doing and why. This can help with
+debugging your own config, for example.
+
+If you don't want to run all crawlers from your config file, you can specify the
+crawlers you want to run with `--crawler` or `-C`, like this:
+
+```
+$ pferd -C crawler1 -C crawler2
+```
+
+## Advanced usage
+
+PFERD supports lots of different options. For example, you can configure PFERD
+to [use your system's keyring](CONFIG.md#the-keyring-authenticator) instead of
+prompting you for your username and password. PFERD also supports
+[transformation rules](CONFIG.md#transformation-rules) that let you rename or
+exclude certain files.
+
+For more details, see the comprehensive [config format documentation](CONFIG.md).
+
+## Example
+
+This example downloads a few courses from the KIT ILIAS with a common keyring
+authenticator. It reorganizes and ignores some files.
+
+```ini
+[DEFAULT]
+# All paths will be relative to this.
+# The crawler output directories will be <working_dir>/Foo and <working_dir>/Bar.
+working_dir = ~/stud
+# If files vanish from ILIAS the local files are not deleted, allowing us to
+# take a look at them before deleting them ourselves.
+on_conflict = no-delete
+
+[auth:ilias]
+type = keyring
+username = foo
+
+[crawl:Foo]
+type = kit-ilias-web
+auth = auth:ilias
+# Crawl a course by its ID (found as `ref_id=ID` in the URL)
+target = 1234567
+
+# Plaintext files are easier to read by other tools
+links = plaintext
+
+transform =
+  # Ignore unneeded folders
+  Online-Tests --> !
+  Vorlesungswerbung --> !
+
+  # Move exercises to own folder. Rename them to "Blatt-XX.pdf" to make them sort properly
+  "Übungsunterlagen/(\d+). Übungsblatt.pdf" -re-> Blätter/Blatt-{i1:02}.pdf
+  # Move solutions to own folder. Rename them to "Blatt-XX-Lösung.pdf" to make them sort properly
+  "Übungsunterlagen/(\d+). Übungsblatt.*Musterlösung.pdf" -re-> Blätter/Blatt-{i1:02}-Lösung.pdf
+
+  # The course has nested folders with the same name - flatten them
+  "Übungsunterlagen/(.+?)/\\1/(.*)" -re-> Übung/{g1}/{g2}
+
+  # Rename remaining folders
+  Übungsunterlagen --> Übung
+  Lehrbücher --> Vorlesung
+
+[crawl:Bar]
+type = kit-ilias-web
+auth = auth:ilias
+target = 1337420
+```
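A note on the `-re->` rules in the README example above: the sketch below shows one plausible reading of them in plain Python. It assumes that the pattern must match the whole path, that `{g1}` stands for capture group 1 as a string, and that `{i1:02}` stands for capture group 1 as an integer, zero-padded to two digits. These readings and the helper `apply_regex_rule` are assumptions made for illustration; `CONFIG.md` is the authority on what PFERD actually does.

```python
import re
from typing import Optional


def apply_regex_rule(pattern: str, template: str, path: str) -> Optional[str]:
    """Hypothetical helper: rewrite `path` if `pattern` matches all of it."""
    match = re.fullmatch(pattern, path)
    if match is None:
        return None  # the rule does not apply to this path
    fields = {}
    for index, text in enumerate(match.groups(), start=1):
        if text is None:
            continue  # optional group that did not take part in the match
        fields[f"g{index}"] = text  # group as a string
        try:
            fields[f"i{index}"] = int(text)  # group as an integer, if numeric
        except ValueError:
            pass
    return template.format(**fields)


new_path = apply_regex_rule(
    r"Übungsunterlagen/(\d+). Übungsblatt.pdf",
    "Blätter/Blatt-{i1:02}.pdf",
    "Übungsunterlagen/3. Übungsblatt.pdf",
)
print(new_path)
```

Under these assumptions the call prints `Blätter/Blatt-03.pdf`. The zero padding is what makes sheet 3 sort before sheet 10 in a file listing.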