Previously, we instanciated an MappedImapFolder, and would cleverly (too
cleverly?) invoke methods on it casting it to an IMAPFolder by calling
methods such as: self._mb.cachemessages(self) where self._MB is the class
IMAPFolder and self and instance of MappedImapFolder. If
e.g. cachemessages() invokes a method uidexists() which exists for
MappedImapFolder, but not directly in IMAPFolder, I am not sure if
Python would at some point attempt to use the method of the wrong class.
Also, this leads to some twisted thinking as our class would in same
cases act as an IMAPFolder and in some cases as an MappedImapFOlder and
it is not always clear if we mean REMOTE UID or LOCAL UID.
This commit simplifies the class, by a)doing away with the complex Mixin
construct and directly inheriting from IMAPFOlder (so we get all the
IMAPFOlder methods that we can inherit). We instantiate self._mb as a
new instance of IMAPFolder which represents the local IMAP using local
UIDs, separating the MappedIMAPFolder construct logically from the
IMAPFolder somewhat.
In the long run, I would like to remove self._mb completely and simply
override any method that needs overriding, but let us take small and
understandable baby steps here.
Reported-and-tested-by: Vincent Beffara <vbeffara@gmail.com>
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
Folder.savemessage() is supposed to return the new UID that a backend
assigned, and it BaseFolder.copymessageto() fails if we don't return a
non-negative number in the savemessage() there.
For some reason, the UIDMappedFolder was not returning anything in
savemessage, despite clearly stating in the code docs that it is
supposed to return a UID. Not sure how long this has already been the
case. This patch fixes the UIDMappedFolder to behave as it should.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
During a sync run, someone might remove or move IMAP messages. As we
only cache the list of UIDs in the beginning, we might be requesting
UIDs that don't exist anymore. Protect folder.IMAP.getmessage() against
the response that we get when we ask for unknown UIDs.
Also, if the server responds with anything else than "OK", (eg. Gmail
seems to be saying frequently ['NO', 'Dave I can't let you do that now']
:-) so we should also be throwing OfflineImapErrors here rather than
AssertionErrors.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
Previously we were attempting to save out mails according to
http://www.qmail.org/man/man5/maildir.html in 4 steps:
1 Create a unique filename
2 Do stat(tmp/<filename>). If it found a file, wait 2 sec and go back to 1.
3 Create and write the message to the tmp/<filename>.
4 Link from tmp/* to new/*
(we did step 2 up to 15 times) But as stated by
http://wiki1.dovecot.org/MailboxFormat/Maildir (see section 'Issues with
the specification'), this is a pointless approach, e.g. there are race
issues between stating that the filename does not exist and the actual
moving (when it might exist).
So, we can simplify the steps as suggested in the dovecot wiki and
tighten up our safety at the same time.
One improvement that we do is to open the file, guaranteeing that it did
not exist before in an atomic manner, thus our simplified approach is
really more secure than what we had before.
Also, we throw an OfflineImapError at MESSAGE level when the supposedly
unique filename already exists, so that we can skip this message and
still continue with other messages.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
MaildirFolder.messagelist[*]['filename'] was storing the absolute file
paths for all stored emails. While this is convenient, it wastes much
space, as the folder prefix is always the same and it is known to the
MaildirFolder. Just 40 chars in a folder with 100k mails waste >4MB of
space. Adapt the few locations where we need the full path to construct
it dynamically.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
We use getfullname() very often (thousands to millions), yet we
dynamically calculate the very same value over and over. Optimize this
by caching the value once and be done with it.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
we do:
for msgid in imapdata:
maxmsgid = max(long(msgid), maxmsgid)
and then basically immediately:
maxmsgid = long(imapdata[0])
throwing away the first assignment although the first method of
assigning is the correct one. The second had been forgotten to be
removed when we introduced the above iteration. This bug would fix a
regression with those broken ZIMBRA servers that send multiple EXISTS
replies.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
All months names are 3-letter abbreviated, but accidentally June and
July slipped through. Thanks to the heads up by
Philipp Kern <pkern@debian.org>.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
The semaphorewait()/waitforthread() logic is usefull for IMAP starting
connections. We actually use it in imapserver only.
This patch removes the over-engineered factorized methods. It tend to simplify
the code by cleaning out a chain of two direct calls with no other processes.
Reviewed-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
We were outputting full message bodies to the debug log (often stderr),
and then again (as they go over the imaplib2 wire, imaplib logs
everything too). Not only is quite a privacy issue when sending in debug
logs but it can also freeze a console for quite some time. Plus it
bloats debug logs A LOT.
Only output the first and last 100 bytes of each message body to the
debug log (we still get the full body from imaplib2 logging). This
limits privacy issues when handing the log to someone else, but usually
still contains all the interesting bits that we want to see in a log.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
The syntax was not right, and deleting messages from the LocalStatus
failed. (We passed in the full list of uids and we need to pass in one
uid at a time (as a tuple). Deleting messages works now.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
Throw an OfflineImapError when SELECTing a folder is unsuccessful and
bail out with a FOLDER serverity. In accounts.py catch all
OfflineImapErrors and either just log the error and skip the folder or
bubble it up if it's severe.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
Make the folder classes use uidexists() more. Add some code
documentation while going through.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
Test if sqlite is multithreading-safe and bail out if not. sqlite
versions since at least 2008 are.
But, as it still causes errors when 2
threads try to write to the same connection simultanously (We get a
"cannot start transaction within a transaction" error), we protect
writes with a per class, ie per-connection lock. Factor out the retrying
to write when the database is locked.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
doautosave was a useless variable before (it was *always* 1). So we
remove the self.dofsync variable and store in doautosave whether we
should fsync as often as possible (which really hurts performance).
The sqlite backend could (at one point) use the doautosave variable to
determine if it should autocommit after each modification.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
Based on patches by Stewart Smith, updated by Rob Browning.
plus:
- Inherit LocalStatusSQLFolder from LocalStatusFolder
This lets us remove all functions that are available via our ancestors
classes and are not needed.
- Don't fail if pysql import fails. Fail rather at runtime when needed.
- When creating the db file, create a metadata table which contains the
format version info, so we can upgrade nicely to other formats.
- Create an upgrade_db() function which allows us to upgrade from any
previous file format to the current one (including plain text)
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
When our LocalStatus cache is corrupt, ie e.g. it contains lines not in
the form number:number, we would previously just raise a ValueError
stating things like "too many values". In case we encounter clearly
corrupt LocalStatus cache entries, clearly raise an exception stating
the filename and the line, so that people can attempt to repair it.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
This function will need much more "robustifying", but the very least we
can do is to print the file name and line that are giving trouble.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
Add some comments how the data structures actually look like.
Describe the function properly, and make sure we only hold on to the
data connection as quickly as possible.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
This was in the code for a very long time, it seems.
Remove one instance, no functional changes.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
Just clean up lines which end in whitespaces.
Also adapt the copyright to the current year while touching the file.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
In order to optimize performance, we fold the 1st and 2nd pass of our
sync strategy into one. They were essentially doing the same thing:
uploading a message to the other side. The only difference was that in
one case we have a negative UID locally, and in the other case, we have
a positive one already.
This saves some time, as we don't have to run through that function on
IMAP servers anyway (they always have positive UIDs), and 2nd were we
stalling further copying until phase 1 was finished. So uploading a
single new message would prevent us from starting to copy existing
regular messages.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
For each folder we were making a second IMAP request asking for the
latest UID and compared that with the highest UID in our
statusfolder. This catched the case that 1 mail has been deleted by
someone else and another one has arrived since we checked, so that the
total number of mails appears to not have changed.
We don't capture anymore this case in the quickchanged() case.
It improves my performance from 8 to about 7.5 seconds per check (with lots of
variation) and we would benefit even more in the IMAP<->IMAP case as we do one
additional IMAP lookup per folder on each side then.
Do cleanups on whitespaces while in this file.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
quickchanged() was iterating a lot, make use of some of the helper
functions that had been introduced recently and document the function a
bit better. No functional change.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
getmessagelist() is slow for the mapped UID case, so replace some of its
occurences with calls that are optimized for this case, ie
getmessagecount() and uidexists().
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Reviewed-and-tested-by: Vincent Beffara <vbeffara@ens-lyon.fr>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
We are calling getmessagelist() internally a lot, e.g. just to check if
a UID exists (from uidexist()). This is a very expensive operation in
the UIDMapped case, as we reconstruct the whole messagelist dict every
single time, involving lots of copying etc.
So we provide more efficient implementations for the uidexists()
getmessageuidlist() and getmessagecount() functions that are fast in the
UIDMapped case. This should solve the performance regression that was
recently observed in the Mapped UID case.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Reviewed-and-tested-by: Vincent Beffara <vbeffara@ens-lyon.fr>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
Signed-off-by: David Favro <offlineimap@meta-dynamic.com>
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
When uploading a new message to Gmail we need to find out the UID it
assigned it, but Gmail does not advertize the UIDPLUS extension (in all
cases) and it fails to find the email that we just uploaded when
searching for it. This prevented us effectively from uploading to
gmail.
See analysis in
http://lists.alioth.debian.org/pipermail/offlineimap-project/2011-March/001449.html
for details on what is going wrong.
This patch increases compatability with Gmail by checking for APPENDUID
responses to an APPEND action even if the server did not claim to
support it. This restores the capability to upload messages to the
*broken* Gmail IMAP implementation.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
As the LocalStatus and UIDMap backend already did: If the uid already
exists for savemessage(), only modify the flags and don't append a new
message.
We don't invoke savemessage() on messages that already exist in our sync
logic, so this has no change on our current behavior. But it makes
backends befave more consistent with each other.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
- Some documentation improvements, this is a severely underdocumented
class. This still needs some further improvements though.
- Don't use apply(Baseclass) (which is going away in Python 3), use
IMAPFolder.__init__(self, *args, **kwargs).
- Don't call ValueError, string. It is ValueError(string)
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
The biggest change here is that imapobj.untagged_responses is no
longer a dictionary, but a list. To access it, I use the semi-private
_get_untagged_response method.
* offlineimap/folder/IMAP.py (IMAPFolder.quickchanged,
IMAPFolder.cachemessagelist): imaplib2 now explicitly removes its
EXISTS response on select(), so instead we use the return values from
select() to get the number of messages.
* offlineimap/imapserver.py (UsefulIMAPMixIn.select): imaplib2 now
stores untagged_responses for different mailboxes, which confuses us
because it seems like our mailboxes are "still" in read-only mode when
we just re-opened them. Additionally, we have to return the value
from imaplib2's select() so that the above thing works.
* offlineimap/imapserver.py (UsefulIMAPMixIn._mesg): imaplib2 now
calls _mesg with the name of a thread, so we display this
information in debug output. This requires a corresponding change to
imaplibutil.new_mesg.
* offlineimap/imaplibutil.py: We override IMAP4_SSL.open, whose
default arguments have changed, so update the default arguments. We
also subclass imaplib.IMAP4 in a few different places, which now
relies on having a read_fd file descriptor to poll on.
Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
imaplib2 has slightly different semantics than standard imaplib, so
this patch will break the build, but I thought it was helpful to have it as
a separate commit.
Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
The rfc822 module has been deprecated since python 2.3, and conversion to
the email module is straightforward, so let us do that. rfc822 is
completely gone in python3.
This also fixes a bug that led to offlineimap abortion (but that code path
is apparently usually not exercised so I did not notice:
rfc822|email.utils.parsedate return a tuple which has no named attributes,
but we were using them later in that function. So pass the tuple into a
struct_time() to get named attributes.
While reading the docs, I noticed that email.parsedate returns invalid
daylight savings information (is_dst attribute), and we are using it
anyway. Oh well, the imap server might think the mails are off by an hour
at worst.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
More convenient way to test if a certain uid exists and getting a list
of all uids. Also, the SQL backend will have efficient overrides for
these methods.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
Rather than always having to call len(getmessagelist.keys()) as was done
before. No functional change, just nicer looking code. Also the SQLite
backend or other backends could implement more efficient implementations.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
We only have one "dstfolder" at a time when deleting/adding flags, so no
need to pass in a list of those to the ui functions that output the log
info.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
This enables us to just use the folder instance in the ui output and get
a name rather than having to call getname() all the time.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
The previous syncing strategy was doing more than we needed to and was a
bit underdocumented. This is an attempt to clean it up.
1) Do away with the previous different code paths depending on
whether there is a LocalStatus file or not (the isnewfolder() test). We
always use the same strategy now, which makes the strategy easier to
understand. This strategy is simply:
a) Sync remote to local folder first
b) Sync local to remote
Where each sync implies a 4 pass strategy which does basically the same
as before (explained below).
2) Don't delete messages on LOCAL which don't exist on REMOTE right at
the beginning anymore. This prevented us e.g. from keeping local
messages rather than redownloading everything once LocalStatus got
corrupted or deleted. This surprised many who put in an existing local
maildir and expected it to be synced to the remote place. Instead, the
local maildir was deleted. This is a data loss that actually occured to
people!
3) No need to separately sync the statusfolder, we update that one
simultanously with the destfolders...
3) Simplified the sync function API by only taking one destdir rather
than a list of destdirs, we never used more anyway. This makes the code
easier to read.
4) Added plenty of code comments while I was going through to make sure
the strategy is easy to understand.
-----------------------------------------
Pass1: Transfer new local messages
Upload msg with negative/no UIDs to dstfolder. dstfolder should
assign that message a new UID. Update statusfolder.
Pass2: Copy existing messages
Copy messages in self, but not statusfolder to dstfolder if not
already in dstfolder. Update statusfolder.
Pass3: Remove deleted messages
Get all UIDS in statusfolder but not self. These are messages
that we have locally deleted. Delete those from dstfolder and
statusfolder.
Pass4: Synchronize flag changes
Compare flags in self with those in statusfolder. If msg has a
valid UID and exists on dstfolder (has not e.g. been deleted
there), sync the flag change to dstfolder and statusfolder.
The user visible implications of this change should be unnoticable
except in one situation:
Blowing away LocalStatus will not require you to redownload ALL of
your mails if you still have the local Maildir. It will simply recreate
LocalStatus.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
Removing this lock makes the function not threadsafe, but then it is
only ever called from one thread, the main account syncer. Also, it
doesn't make it worse than most of the other functions in that class
which are also not threadsafe.
Removing this makes the code simpler, and removes the need to import the
threading module.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
Rather than inserting our own home-grown header, everytime we save a
message to an IMAP server, we check if we suport the UIDPLUS extension
which provides us with an APPENDUID reply. Use that to find the new UID
if possible, but keep the old way if we don't have that extension.
If a folder is read-only, return the uid that we have passed in per API
description in folder.Base.py
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>
The working horse of the savemessage() function, imaplib.append() was
hidden away in an assert statement. Pull the real functions out of the
asserts and simply assert on the return values. This looks less
convoluted and makes this easier to understand in my opinion.
Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
Signed-off-by: Nicolas Sebrecht <nicolas.s-dev@laposte.net>