IMAP: replace non-UTF-8 characters rather than aborting
Received emails are not always valid UTF-8. The following error was observed
on one specific mail:
Traceback (most recent call last):
  File "/home/tdescham/repo/offlineimap3/offlineimap/threadutil.py", line 146, in run
    Thread.run(self)
  File "/usr/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/tdescham/repo/offlineimap3/offlineimap/folder/Base.py", line 850, in copymessageto
    message = self.getmessage(uid)
  File "/home/tdescham/repo/offlineimap3/offlineimap/folder/IMAP.py", line 327, in getmessage
    data = self._fetch_from_imap(str(uid), self.retrycount)
  File "/home/tdescham/repo/offlineimap3/offlineimap/folder/IMAP.py", line 844, in _fetch_from_imap
    ndata1 = data[0][1].decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 10177: invalid start byte
This completely aborted offlineimap3, blocking further mail reception.
Instead, use the 'replace' error strategy in Python:
    Replace with a suitable replacement character; Python will use the
    official U+FFFD REPLACEMENT CHARACTER for the built-in Unicode codecs on
    decoding and ‘?’ on encoding.
    https://docs.python.org/2/library/codecs.html#codec-base-classes
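
As a minimal sketch of the difference between the default strict decoding and
the 'replace' strategy (the byte string below is made up for illustration and
is not taken from the failing mail):

```python
# Hypothetical payload: 0xa0 is a non-breaking space in Latin-1,
# but on its own it is not a valid start byte in UTF-8.
raw = b"price:\xa0100"

# Default (strict) decoding raises, which is what aborted the sync.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors='replace', the offending byte becomes U+FFFD and
# decoding continues with the rest of the data.
decoded = raw.decode("utf-8", errors="replace")

print(strict_failed)
print(repr(decoded))
```

The trade-off is that the replaced byte is lost in the decoded string, so this
favours continuing the sync over preserving the original bytes exactly.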
@@ -839,7 +839,7 @@ class IMAPFolder(BaseFolder):
 
         # Convert bytes to str
         ndata0 = data[0][0].decode('utf-8')
-        ndata1 = data[0][1].decode('utf-8')
+        ndata1 = data[0][1].decode('utf-8', errors='replace')
         ndata = [ndata0, ndata1]
 
         return ndata
	 Thomas De Schampheleire