Joscha 
							
						 
					 
					
						
						
							
						
						edbd92dbbf 
					 
					
						
						
							
							Add --status and --report flags  
						
						 
						
						
						
						
					 
					
						2021-05-23 22:41:59 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						27b5a8e490 
					 
					
						
						
							
							Rename log.action to log.status  
						
						 
						
						
						
						
					 
					
						2021-05-23 22:40:33 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						1f400d5964 
					 
					
						
						
							
							Implement BooleanOptionalAction  
						
						 
						
						
						
						
					 
					
						2021-05-23 22:26:59 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						0ca0680165 
					 
					
						
						
							
							Simplify --version  
						
						 
						
						
						
						
					 
					
						2021-05-23 21:40:48 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						ce1dbda5b4 
					 
					
						
						
							
							Overhaul colours  
						
						 
						
						
						
						
						"Crawled" and "Downloaded" are now printed less bright than "Crawling" and
"Downloading" as they're not as important. Explain topics are printed in yellow
to stand out a bit more from the cyan action messages. 
						
						
					 
					
						2021-05-23 21:33:04 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						9cce78669f 
					 
					
						
						
							
							Print report after all crawlers have finished  
						
						 
						
						
						
						
					 
					
						2021-05-23 21:17:13 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						6ca0ecdf05 
					 
					
						
						
							
							Load and store reports  
						
						 
						
						
						
						
					 
					
						2021-05-23 20:46:29 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						6e9f8fd391 
					 
					
						
						
							
							Add a keyring authenticator  
						
						 
						
						
						
						
					 
					
						2021-05-23 19:44:12 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						2fdf24495b 
					 
					
						
						
							
							Restructure crawling and auth related modules  
						
						 
						
						
						
						
					 
					
						2021-05-23 19:16:42 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						bbf9f8f130 
					 
					
						
						
							
							Add -C as alias for --crawler  
						
						 
						
						
						
						
					 
					
						2021-05-23 19:06:09 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						37f8d84a9c 
					 
					
						
						
							
							Output total number of HTTP requests in HTTP crawler  
						
						 
						
						
						
						
					 
					
						2021-05-23 19:00:01 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						5edd868d5b 
					 
					
						
						
							
							Fix always-smart redownloading the wrong files  
						
						 
						
						
						
						
					 
					
						2021-05-23 18:49:34 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						e4e5e83be6 
					 
					
						
						
							
							Fix downloader using crawl bar  
						
						 
						
						
						
						
						Looks like I made a dumb copy-paste error. Now the download bar shows the proper
progress and speed again. 
						
						
					 
					
						2021-05-23 18:39:43 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						74c7b39dc8 
					 
					
						
						
							
							Clean up files in alphabetical order  
						
						 
						
						
						
						
					 
					
						2021-05-23 18:39:25 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						445dffc987 
					 
					
						
						
							
							Reword some explanations  
						
						 
						
						
						
						
					 
					
						2021-05-23 18:35:32 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						d97d6bf147 
					 
					
						
						
							
							Fix handling nested ILIAS folders  
						
						 
						
						
						
						
					 
					
						2021-05-23 18:29:28 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						79efdb56f7 
					 
					
						
						
							
							Adjust ILIAS html explain messages  
						
						 
						
						
						
						
					 
					
						2021-05-23 18:24:25 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						a9af56a5e9 
					 
					
						
						
							
							Improve specifying crawlers via CLI  
						
						 
						
						
						
						
						Instead of removing the sections of unselected crawlers from the config file,
crawler selection now happens in the Pferd after loading the crawlers and is
more sophisticated. It also has better error messages. 
						
						
					 
					
						2021-05-23 18:18:50 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						59f13bb8d6 
					 
					
						
						
							
							Explain ILIAS HTML parsing and add some warnings  
						
						 
						
						
						
						
					 
					
						2021-05-23 18:14:54 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						463f8830d7 
					 
					
						
						
							
							Add warn_contd  
						
						 
						
						
						
						
					 
					
						2021-05-23 18:14:54 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						05ad06fbc1 
					 
					
						
						
							
							Only enclose get_page in iorepeat in ILIAS crawler  
						
						 
						
						
						
						
						We previously also gathered in there, which could lead to some more
surprises when the method was retried. 
						
						
					 
					
						2021-05-23 18:14:51 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						29d5a40c57 
					 
					
						
						
							
							Replace asyncio.gather with custom Crawler function  
						
						 
						
						
						
						
					 
					
						2021-05-23 17:25:16 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						c0cecf8363 
					 
					
						
						
							
							Log crawl and download actions more extensively  
						
						 
						
						
						
						
					 
					
						2021-05-23 16:25:44 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						b998339002 
					 
					
						
						
							
							Fix cleanup logging of paths  
						
						 
						
						
						
						
					 
					
						2021-05-23 16:25:44 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						245c9c3dcc 
					 
					
						
						
							
							Explain output dir decisions and steps  
						
						 
						
						
						
						
					 
					
						2021-05-23 16:25:44 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						d8f26a789e 
					 
					
						
						
							
							Implement CLI Command for ilias crawler  
						
						 
						
						
						
						
					 
					
						2021-05-23 13:30:42 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						e1d18708b3 
					 
					
						
						
							
							Rename "no_videos" to videos  
						
						 
						
						
						
						
					 
					
						2021-05-23 13:30:42 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						b44b49476d 
					 
					
						
						
							
							Fix noncritical and anoncritical decorators  
						
						 
						
						
						
						
						I must've forgotten to update the anoncritical decorator when I last changed the
noncritical decorator. Also, every exception should mark the crawler as not
error_free, not just CrawlErrors. 
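
A rough sketch of what this fix describes, not the project's actual code: only the names noncritical, anoncritical, error_free and CrawlError come from the commit message. DemoCrawler, Example and _record_failure are hypothetical, and so is the choice to swallow CrawlErrors while re-raising everything else, which the commit does not state.

```python
import asyncio
import functools
from typing import Any, Awaitable, Callable


class CrawlError(Exception):
    """Stand-in for an expected, crawler-specific error (hypothetical body)."""


class DemoCrawler:
    """Hypothetical crawler that only tracks whether a run was error free."""

    def __init__(self) -> None:
        self.error_free = True

    def _record_failure(self, exc: Exception) -> None:
        # Any exception, not just CrawlError, clears the flag.
        self.error_free = False
        print(f"{type(exc).__name__}: {exc}")


def noncritical(f: Callable[..., None]) -> Callable[..., None]:
    """Sync variant: flag every failure, swallow only CrawlError."""

    @functools.wraps(f)
    def wrapper(self: DemoCrawler, *args: Any, **kwargs: Any) -> None:
        try:
            f(self, *args, **kwargs)
        except CrawlError as exc:
            self._record_failure(exc)
        except Exception as exc:
            self._record_failure(exc)
            raise  # assumption: unexpected errors still propagate

    return wrapper


def anoncritical(f: Callable[..., Awaitable[None]]) -> Callable[..., Awaitable[None]]:
    """Async variant with the same behaviour as noncritical."""

    @functools.wraps(f)
    async def wrapper(self: DemoCrawler, *args: Any, **kwargs: Any) -> None:
        try:
            await f(self, *args, **kwargs)
        except CrawlError as exc:
            self._record_failure(exc)
        except Exception as exc:
            self._record_failure(exc)
            raise  # assumption: unexpected errors still propagate

    return wrapper


class Example(DemoCrawler):
    @noncritical
    def parse(self) -> None:
        raise ValueError("not a CrawlError")

    @anoncritical
    async def fetch(self) -> None:
        raise CrawlError("page unavailable")
```

Keeping the two wrappers structurally identical makes it harder for one of them to fall behind when the other changes, which is the kind of drift this commit fixes.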
						
						
					 
					
						2021-05-23 13:24:53 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						7e0bb06259 
					 
					
						
						
							
							Clean up TODOs  
						
						 
						
						
						
						
					 
					
						2021-05-23 12:47:30 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						ecdedfa1cf 
					 
					
						
						
							
							Add no-videos flag to ILIAS crawler  
						
						 
						
						
						
						
					 
					
						2021-05-23 12:37:01 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						3d4b997d4a 
					 
					
						
						
							
							Retry crawl_url and work around Python's closure handling  
						
						 
						
						
						
						
						Closures capture the scope and not the variables. Therefore, any
type-narrowing performed by mypy on captured variables is lost inside
the closure. 
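
A small illustration of the closure behaviour described here, as a sketch rather than the code touched by this commit. All function names below are hypothetical. Whether a given mypy version keeps the narrowing inside the nested function depends on the version and on how the variable is assigned, so the comments only say what may happen.

```python
from typing import Callable, List, Optional


def late_bound() -> List[int]:
    # Each lambda looks up `i` in the enclosing scope when it is called,
    # so all of them see the final value of the loop variable.
    callbacks = [lambda: i for i in range(3)]
    return [cb() for cb in callbacks]


def early_bound() -> List[int]:
    # Binding the current value as a default argument captures the value
    # itself instead of the scope.
    callbacks = [lambda i=i: i for i in range(3)]
    return [cb() for cb in callbacks]


def make_greeter(name: Optional[str]) -> Callable[[], str]:
    if name is None:
        name = "world"
    # `name` is narrowed to `str` here, but inside `greet` a type checker
    # may still treat it as Optional[str], because the closure reads the
    # variable from the scope at call time.
    # Workaround: copy the narrowed value into a variable declared as `str`.
    narrowed: str = name

    def greet() -> str:
        return "Hello, " + narrowed.upper()

    return greet
```

The same scope-based lookup explains both effects: the loop lambdas all see the last `i`, and a type checker cannot assume the captured variable still has its narrowed type when the closure eventually runs.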
						
						
					 
					
						2021-05-23 12:28:15 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						e81005ae4b 
					 
					
						
						
							
							Fix CLI arguments  
						
						 
						
						
						
						
					 
					
						2021-05-23 12:24:21 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						33a81a5f5c 
					 
					
						
						
							
							Document authentication in HTTP crawler and rename prepare_request  
						
						 
						
						
						
						
					 
					
						2021-05-23 11:55:34 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						25e2abdb03 
					 
					
						
						
							
							Improve transformer explain wording  
						
						 
						
						
						
						
					 
					
						2021-05-23 11:45:14 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						803e5628a2 
					 
					
						
						
							
							Clean up logging  
						
						 
						
						
						
						
						Paths are now (hopefully) logged consistently across all crawlers 
						
						
					 
					
						2021-05-23 11:37:19 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						c88f20859a 
					 
					
						
						
							
							Explain config file dumping  
						
						 
						
						
						
						
					 
					
						2021-05-23 11:04:50 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						ec3767c545 
					 
					
						
						
							
							Create crawler base dir at start of crawl  
						
						 
						
						
						
						
					 
					
						2021-05-23 10:52:02 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						729ff0a4c7 
					 
					
						
						
							
							Fix simple authenticator output  
						
						 
						
						
						
						
					 
					
						2021-05-23 10:45:37 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						6fe51e258f 
					 
					
						
						
							
							Number rules starting at 1  
						
						 
						
						
						
						
					 
					
						2021-05-23 10:45:37 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						44ecb2fbe7 
					 
					
						
						
							
							Fix cleanup deleting crawler's base directory  
						
						 
						
						
						
						
					 
					
						2021-05-23 10:45:37 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						53e031d9f6 
					 
					
						
						
							
							Reuse dl/cl for I/O retries in ILIAS crawler  
						
						 
						
						
						
						
					 
					
						2021-05-23 00:28:27 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						8ac85ea0bd 
					 
					
						
						
							
							Fix a few typos in HttpCrawler  
						
						 
						
						
						
						
					 
					
						2021-05-22 23:37:34 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						adfdc302d7 
					 
					
						
						
							
							Save cookies after successful authentication in HTTP crawler  
						
						 
						
						
						
						
					 
					
						2021-05-22 23:30:32 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						3053278721 
					 
					
						
						
							
							Move HTTP crawler to own file  
						
						 
						
						
						
						
					 
					
						2021-05-22 23:23:21 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						4d07de0d71 
					 
					
						
						
							
							Adjust forum log message in ilias crawler  
						
						 
						
						
						
						
					 
					
						2021-05-22 23:20:21 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								I-Al-Istannen 
							
						 
					 
					
						
						
							
						
						953a1bba93 
					 
					
						
						
							
							Adjust to new crawl / download names  
						
						 
						
						
						
						
					 
					
						2021-05-22 23:18:05 +02:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						e724ff7c93 
					 
					
						
						
							
							Fix normal arrow  
						
						 
						
						
						
						
					 
					
						2021-05-22 20:44:59 +00:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						62f0f7bfc5 
					 
					
						
						
							
							Explain crawling and partially explain downloading  
						
						 
						
						
						
						
					 
					
						2021-05-22 20:39:57 +00:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						9cb2b68f09 
					 
					
						
						
							
							Fix arrow parsing error messages  
						
						 
						
						
						
						
					 
					
						2021-05-22 20:39:29 +00:00  
					
					
						 
						
						
							
							
							 
							
							
							
							
							 
						
					 
				 
			
				
					
						
							
							
								 
								Joscha 
							
						 
					 
					
						
						
							
						
						1bbc0b705f 
					 
					
						
						
							
							Improve transformer error handling  
						
						 
						
						
						
						
					 
					
						2021-05-22 20:38:56 +00:00