Joscha
aa74604d29
Use utf-8 for report
2022-04-29 23:11:27 +02:00
I-Al-Istannen
a82a0b19c2
Collect crawler warnings/errors and include them in the report
2021-11-07 21:48:55 +01:00
I-Al-Istannen
ef7d5ea2d3
Allow storing crawler-specific data in reports
2021-10-30 18:09:05 +02:00
Joscha
7b062883f6
Use raw paths for --debug-transforms
...
Previously, the already-transformed paths were used, which meant that
--debug-transforms was cumbersome to use (as you had to remove all transforms
and crawl once before getting useful results).
2021-05-31 12:33:37 +02:00
Joscha
adb5d4ade3
Print files that are *not* deleted by cleanup
...
These are files that are not present on the remote source any more, but still
present locally. They also show up in the report.
2021-05-26 10:58:19 +02:00
Joscha
bce3dc384d
Deduplicate path names in crawler
...
Also rename files so they follow the restrictions for windows file names if
we're on windows.
2021-05-25 12:11:15 +02:00
Joscha
6ca0ecdf05
Load and store reports
2021-05-23 20:46:29 +02:00
Joscha
b4d97cd545
Improve output dir and report error handling
2021-05-22 20:54:42 +02:00
Joscha
989032fe0c
Fix cookies getting deleted
2021-05-15 22:25:48 +02:00
Joscha
d565df27b3
Add HttpCrawler
2021-05-13 22:28:14 +02:00
Joscha
60cd9873bc
Add local file crawler
2021-05-06 01:02:40 +02:00
Joscha
bbfdadc463
Implement output directory
2021-05-05 18:08:34 +02:00
Joscha
07e831218e
Add sync report
2021-05-02 00:56:10 +02:00