mirror of
https://github.com/Garmelon/PFERD.git
synced 2023-12-21 10:23:01 +01:00
Update README, CHANGELOG
parent
519a7ef435
commit
c665c36d88
19
CHANGELOG.md
@ -8,13 +8,28 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## Unreleased

### Added
- Support for concurrent downloads
- Support for proper config files
- Proper config files
- Concurrent crawling
- Crawl external ILIAS links
- Crawl uploaded exercise solutions
- Explain what PFERD is doing and why (`--explain`)
- More control over output (`--status`, `--report`)
- Print report after exiting via Ctrl+C
- Store crawler reports in `.report` JSON file
- Extensive config file documentation (`CONFIG.md`)
- Documentation for developers (`DEV.md`)
- This changelog

### Changed
- Rewrote almost everything
- Better error messages
- Redesigned CLI
- Redesigned transform rules
- ILIAS crawling logic (paths may be different)
- Better support for weird paths on Windows
- Set user agent (`PFERD/<version>`)

### Removed
- Backwards compatibility with 2.x
- Python files as config files
- Some types of crawlers
@ -90,7 +90,7 @@ full name of an auth section (including the `auth:` prefix).

Here is a simple example:

```
```ini
[auth:example]
type = simple
username = foo
117
README.md
@ -8,7 +8,14 @@ Other resources:
- [Changelog](CHANGELOG.md)
- [Development Guide](DEV.md)

## Installation with pip
## Installation

### Direct download

Binaries for Linux, Windows and Mac can be downloaded directly from the
[latest release](https://github.com/Garmelon/PFERD/releases/latest).

### With pip

Ensure you have at least Python 3.8 installed. Run the following command to
install PFERD or upgrade it to the latest version:
@ -18,3 +25,111 @@ $ pip install --upgrade git+https://github.com/Garmelon/PFERD@latest
```

The use of [venv](https://docs.python.org/3/library/venv.html) is recommended.

## Basic usage

PFERD can be run directly from the command line with no config file.
Run `pferd -h` to get an overview of available commands and options.
Run `pferd <command> -h` to see which options a command has.

For example, you can download your personal desktop from the KIT ILIAS like
this:

```
$ pferd kit-ilias-web desktop <output_directory>
```

Also, you can download most ILIAS pages directly like this:

```
$ pferd kit-ilias-web <url> <output_directory>
```

However, the CLI only lets you download a single thing at a time, and the
resulting command can grow long quite quickly. Because of this, PFERD can also
be used with a config file.

To get started, just take a command you've been using and add `--dump-config`
directly after `pferd`, like this:

```
$ pferd --dump-config kit-ilias-web <url> <output_directory>
```

This will make PFERD write its current configuration to its default config file
path. You can then run `pferd` without a command and it will execute the config
file. Alternatively, you can use `--dump-config-to` and specify a path yourself.
Using `--dump-config-to -` will print the configuration to stdout instead of a
file, which is a good way to see what is actually going on when using a CLI
command.

Another good way to see what PFERD is doing is the `--explain` option. When
enabled, PFERD explains in detail what it is doing and why. This can help with
debugging your own config, for example.

If you don't want to run all crawlers from your config file, you can specify the
crawlers you want to run with `--crawler` or `-C`, like this:

```
$ pferd -C crawler1 -C crawler2
```

## Advanced usage

PFERD supports lots of different options. For example, you can configure PFERD
to [use your system's keyring](CONFIG.md#the-keyring-authenticator) instead of
prompting you for your username and password. PFERD also supports
[transformation rules](CONFIG.md#transformation-rules) that let you rename or
exclude certain files.

For more details, see the comprehensive [config format documentation](CONFIG.md).
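
To make the idea of a regex rename rule concrete, here is a minimal Python sketch of how a rule such as `"Übungsunterlagen/(\d+). Übungsblatt.pdf" -re-> Blätter/Blatt-{i1:02}.pdf` could be evaluated. It is an illustration only, not PFERD's implementation: the helper `apply_re_rule` is hypothetical, and the sketch assumes that the pattern must match the whole path, that `{g1}` stands for the first capture group as text, and that `{i1}` is the same group converted to an integer. CONFIG.md remains the reference for the actual semantics.

```python
import re
from typing import Optional


def apply_re_rule(pattern: str, template: str, path: str) -> Optional[str]:
    """Return the renamed path, or None if the rule does not apply.

    Hypothetical helper: it mimics a "-re->" rule under the assumptions
    stated above and is not taken from PFERD's source code.
    """
    match = re.fullmatch(pattern, path)
    if match is None:
        return None

    fields = {}
    for index, text in enumerate(match.groups(), start=1):
        # {g1}, {g2}, ... expose each capture group as plain text.
        fields[f"g{index}"] = text
        # {i1}, {i2}, ... are only offered when the group is purely decimal.
        if text is not None and text.isdecimal():
            fields[f"i{index}"] = int(text)

    return template.format(**fields)


renamed = apply_re_rule(
    r"Übungsunterlagen/(\d+). Übungsblatt.pdf",
    "Blätter/Blatt-{i1:02}.pdf",
    "Übungsunterlagen/3. Übungsblatt.pdf",
)
print(renamed)  # Blätter/Blatt-03.pdf
```

The `{i1:02}` part reuses Python's format specification, so the captured number is padded to two digits and the renamed files sort in numeric order.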

## Example

This example downloads a few courses from the KIT ILIAS with a common keyring
authenticator. It reorganizes and ignores some files.

```ini
[DEFAULT]
# All paths will be relative to this.
# The crawler output directories will be <working_dir>/Foo and <working_dir>/Bar.
working_dir = ~/stud
# If files vanish from ILIAS the local files are not deleted, allowing us to
# take a look at them before deleting them ourselves.
on_conflict = no-delete

[auth:ilias]
type = keyring
username = foo

[crawl:Foo]
type = kit-ilias-web
auth = auth:ilias
# Crawl a course by its ID (found as `ref_id=ID` in the URL)
target = 1234567

# Plaintext files are easier to read by other tools
links = plaintext

transform =
  # Ignore unneeded folders
  Online-Tests --> !
  Vorlesungswerbung --> !

  # Move exercises to own folder. Rename them to "Blatt-XX.pdf" to make them sort properly
  "Übungsunterlagen/(\d+). Übungsblatt.pdf" -re-> Blätter/Blatt-{i1:02}.pdf
  # Move solutions to own folder. Rename them to "Blatt-XX-Lösung.pdf" to make them sort properly
  "Übungsunterlagen/(\d+). Übungsblatt.*Musterlösung.pdf" -re-> Blätter/Blatt-{i1:02}-Lösung.pdf

  # The course has nested folders with the same name - flatten them
  "Übungsunterlagen/(.+?)/\\1/(.*)" -re-> Übung/{g1}/{g2}

  # Rename remaining folders
  Übungsunterlagen --> Übung
  Lehrbücher --> Vorlesung

[crawl:Bar]
type = kit-ilias-web
auth = auth:ilias
target = 1337420
```
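
One way to sanity-check a config like the one above is to load it with Python's standard `configparser` module. The sketch below assumes the file is plain INI that `configparser` can read (the README does not say which parser PFERD uses) and works on a shortened copy of the example without the `transform` value. It lists the `crawl:` sections and shows that values from `[DEFAULT]`, such as `on_conflict`, are visible in every section.

```python
import configparser

# Shortened copy of the example config; the transform rules are left out.
CONFIG_TEXT = """
[DEFAULT]
working_dir = ~/stud
on_conflict = no-delete

[auth:ilias]
type = keyring
username = foo

[crawl:Foo]
type = kit-ilias-web
auth = auth:ilias
target = 1234567

[crawl:Bar]
type = kit-ilias-web
auth = auth:ilias
target = 1337420
"""

# Disable interpolation so that "%" in values is never treated specially.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(CONFIG_TEXT)

# sections() never includes DEFAULT, so only auth:* and crawl:* remain.
crawlers = [name for name in parser.sections() if name.startswith("crawl:")]

for name in crawlers:
    section = parser[name]
    # on_conflict is not set in the crawl sections; it falls back to DEFAULT.
    print(name, section["target"], section["on_conflict"])
```

Keeping shared settings in `[DEFAULT]` means each crawler section only has to state what differs, which is why `[crawl:Bar]` in the example needs just three keys.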