I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						3453bbc991 
					 
					
						
						
							
							Add basic auth to KIT-IPD crawler  
						
						 
						
						
						
						
					 
					
						2025-10-29 13:11:34 +01:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								randomNumber101 
							
						 
					 
					
						
						
							
						
						6353571eb4 
					 
					
						
						
							
							Added Ilias configuration for HHU Duesseldorf to Config file
						
						 
						
						
						
						
						Closes #125
						
						
					 
					
						2025-10-19 16:43:54 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Christian Schliz 
							
						 
					 
					
						
						
							
						
						56e3065950 
					 
					
						
						
							
							Document usage of pilot.ilias.studium.kit.edu (#111)
						
						 
						
						
						
						
					 
					
						2025-05-30 17:13:45 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								PinieP 
							
						 
					 
					
						
						
							
						
						596b6a7688 
					 
					
						
						
							
							Add support for non-KIT shibboleth login (#98)
						
						 
						
						
						
						
						Co-authored-by: Mr-Pine <git@mr-pine.de>
Co-authored-by: I-Al-Istannen <I-Al-Istannen@users.noreply.github.com>
						
						
					 
					
						2024-11-05 18:30:34 +01:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Mr. Pine 
							
						 
					 
					
						
						
							
						
						4f9e2ab48d 
					 
					
						
						
							
							Support named capture groups in regex transformers (#94)
						
						 
						
						
						
						
					 
					
						2024-10-21 15:21:33 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						19beb8f07b 
					 
					
						
						
							
							Document course overview downloading in config.md  
						
						 
						
						
						
						
					 
					
						2024-07-31 22:02:43 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						89b44c69a7 
					 
					
						
						
							
							Update docs  
						
						 
						
						
						
						
						All config file options must be documented in CONFIG.md. The README.md
is just a starting point. To avoid duplicated info, I've moved most of
the docs to CONFIG.md. 
						
						
					 
					
						2024-05-10 14:36:01 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Florian Raith 
							
						 
					 
					
						
						
							
						
						bbcfe9c8dd 
					 
					
						
						
							
							Fix typo in CONFIG.md (#89)
						
						 
						
						
						
						
					 
					
						2024-04-19 16:52:18 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Mr. Pine 
							
						 
					 
					
						
						
							
						
						dbc2553b11 
					 
					
						
						
							
							Add default show-not-deleted option  
						
						 
						
						
						
						
						If set to `no`, PFERD won't print status or report messages for not deleted files 
						
						
					 
					
						2023-08-26 18:43:01 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Mr. Pine 
							
						 
					 
					
						
						
							
						
						443f7fe839 
					 
					
						
						
							
							Add no-delete-prompt-overwrite crawler conflict resolution option (#75)
						
						 
						
						
						
						
					 
					
						2023-07-29 18:36:33 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						635caa765d 
					 
					
						
						
							
							Fix typo  
						
						 
						
						
						
						
						Thanks, burg113 
						
						
					 
					
						2022-11-15 17:17:57 +01:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						07200bbde5 
					 
					
						
						
							
							Document ilias web crawler's forums option  
						
						 
						
						
						
						
					 
					
						2022-10-31 14:12:27 +01:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						ed24366aba 
					 
					
						
						
							
							Add pass authenticator  
						
						 
						
						
						
						
					 
					
						2022-06-05 10:04:42 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						616b0480f7 
					 
					
						
						
							
							Simplify IPD crawler link regex  
						
						 
						
						
						
						
					 
					
						2022-05-08 18:18:05 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						afbd03f777 
					 
					
						
						
							
							Fix docs  
						
						 
						
						
						
						
					 
					
						2022-05-05 14:35:42 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						13b8c3d9c6 
					 
					
						
						
							
							Add regex option to config and CLI parser  
						
						 
						
						
						
						
					 
					
						2021-11-02 09:30:46 +01:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Toorero 
							
						 
					 
					
						
						
							
						
						d6f38a61e1 
					 
					
						
						
							
							Fixed minor spelling mistakes  
						
						 
						
						
						
						
					 
					
						2021-11-02 01:54:00 +01:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								lukasprobst 
							
						 
					 
					
						
						
							
						
						55ea304ff3 
					 
					
						
						
							
							Disable interpolation of ConfigParser  
						
						 
						
						
						
						
					 
					
						2021-10-25 23:37:42 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						6673077397 
					 
					
						
						
							
							Add kit-ipd crawler  
						
						 
						
						
						
						
					 
					
						2021-10-21 13:20:21 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						70b33ecfd9 
					 
					
						
						
							
							Add migration notes to changelog  
						
						 
						
						
						
						
						Also clean up some other formatting for consistency 
						
						
					 
					
						2021-06-13 15:06:50 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						a292c4c437 
					 
					
						
						
							
							Add example for ">>" arrow heads  
						
						 
						
						
						
						
					 
					
						2021-06-12 14:57:29 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						f28bbe6b0c 
					 
					
						
						
							
							Update transform rule documentation  
						
						 
						
						
						
						
						It's still missing an example that uses rules with ">>" arrows. 
						
						
					 
					
						2021-06-09 22:45:52 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						df3ad3d890 
					 
					
						
						
							
							Add 'skip' option to crawlers  
						
						 
						
						
						
						
					 
					
						2021-06-04 18:47:13 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						1fc8e9eb7a 
					 
					
						
						
							
							Document credential file authenticator config options  
						
						 
						
						
						
						
					 
					
						2021-06-01 10:01:14 +00:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						9d5ec84b91 
					 
					
						
						
							
							Add credential file authenticator  
						
						 
						
						
						
						
					 
					
						2021-05-31 18:33:34 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						6fa9cfd4c3 
					 
					
						
						
							
							Fix error when capturing group is None  
						
						 
						
						
						
						
					 
					
						2021-05-27 15:41:00 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						2c72a9112c 
					 
					
						
						
							
							Reword -name-> and -name-re-> docs and remove -name-exact->  
						
						 
						
						
						
						
					 
					
						2021-05-27 13:20:37 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						17207546e9 
					 
					
						
						
							
							Document --debug-transforms  
						
						 
						
						
						
						
					 
					
						2021-05-26 11:47:51 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						c665c36d88 
					 
					
						
						
							
							Update README, CHANGELOG  
						
						 
						
						
						
						
					 
					
						2021-05-25 17:18:31 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						61430c8739 
					 
					
						
						
							
							Overhaul config and CLI option names  
						
						 
						
						
						
						
					 
					
						2021-05-25 14:23:38 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						bce3dc384d 
					 
					
						
						
							
							Deduplicate path names in crawler  
						
						 
						
						
						
						
						Also rename files so they follow the restrictions for windows file names if
we're on windows. 
						
						
					 
					
						2021-05-25 12:11:15 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						c687d4a51a 
					 
					
						
						
							
							Implement cookie sharing  
						
						 
						
						
						
						
					 
					
						2021-05-24 13:10:44 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						3ab3581f84 
					 
					
						
						
							
							Add timeout for HTTP connection  
						
						 
						
						
						
						
					 
					
						2021-05-23 23:41:05 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						be4b1040f8 
					 
					
						
						
							
							Document status and report options  
						
						 
						
						
						
						
					 
					
						2021-05-23 22:51:42 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						6e9f8fd391 
					 
					
						
						
							
							Add a keyring authenticator  
						
						 
						
						
						
						
					 
					
						2021-05-23 19:44:12 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						ecdedfa1cf 
					 
					
						
						
							
							Add no-videos flag to ILIAS crawler  
						
						 
						
						
						
						
					 
					
						2021-05-23 12:37:01 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						0d10752b5a 
					 
					
						
						
							
							Configure explain log level via cli and config file  
						
						 
						
						
						
						
					 
					
						2021-05-19 17:50:10 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						db1219d4a9 
					 
					
						
						
							
							Create a link file in ILIAS crawler  
						
						 
						
						
						
						
						This allows us to crawl links and represent them in the file system.
Users can choose between an ILIAS-imitation (that optionally
auto-redirects) and a plain text variant. 
						
						
					 
					
						2021-05-17 21:44:54 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						467ea3a37e 
					 
					
						
						
							
							Document ILIAS-Crawler arguments in CONFIG.md  
						
						 
						
						
						
						
					 
					
						2021-05-16 13:26:58 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						e1104f888d 
					 
					
						
						
							
							Add tfa authenticator  
						
						 
						
						
						
						
					 
					
						2021-05-15 18:27:16 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						8c32da7f19 
					 
					
						
						
							
							Let authenticators provide username and password separately  
						
						 
						
						
						
						
					 
					
						2021-05-15 18:27:03 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						b70b62cef5 
					 
					
						
						
							
							Make crawler sections start with "crawl:"  
						
						 
						
						
						
						
						Also, use only the part of the section name after the "crawl:" as the crawler's
output directory. Now, the implementation matches the documentation again 
						
						
					 
					
						2021-05-15 17:24:37 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						868f486922 
					 
					
						
						
							
							Rename local crawler path to target  
						
						 
						
						
						
						
					 
					
						2021-05-15 17:12:25 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						a6fdf05ee9 
					 
					
						
						
							
							Allow variable whitespace in arrow rules  
						
						 
						
						
						
						
					 
					
						2021-05-15 15:25:05 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						f897d7c2e1 
					 
					
						
						
							
							Add name variants for all arrows  
						
						 
						
						
						
						
					 
					
						2021-05-15 15:25:05 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						302b8c0c34 
					 
					
						
						
							
							Fix errors loading local crawler config  
						
						 
						
						
						
						
						Apparently getint and getfloat may return a None even though this is not
mentioned in their type annotations. 
						
						
					 
					
						2021-05-15 15:25:05 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						acd674f0a0 
					 
					
						
						
							
							Change limiter logic  
						
						 
						
						
						
						
						Now download tasks are a subset of all tasks. 
						
						
					 
					
						2021-05-15 15:25:05 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						296a169dd3 
					 
					
						
						
							
							Make limiter logic more complex  
						
						 
						
						
						
						
						The limiter can now distinguish between crawl and download actions and has a
fancy slot system and delay logic. 
						
						
					 
					
						2021-05-15 15:25:05 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						1591cb9197 
					 
					
						
						
							
							Add options to slow down local crawler  
						
						 
						
						
						
						
						These options are meant to make the local crawler behave more like a
network-based crawler for purposes of testing and debugging other parts of the
code base. 
						
						
					 
					
						2021-05-15 15:25:01 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						961f40f9a1 
					 
					
						
						
							
							Document simple authenticator  
						
						 
						
						
						
						
					 
					
						2021-05-13 19:55:04 +02:00