How does eM Client implement threading to construct a conversation?

VanguardLH · September 15, 2024, 1:24pm

For clarification, how does eM Client thread separate messages into a conversation? Does it rely on the References or In-Reply-To headers, which carry the Message-IDs (MIDs) of earlier messages and so track the hierarchy for threading?

The Subject header cannot be used for threading messages into a conversation. You could have multiple senders saying the same thing in the Subject header, or the same sender saying the same thing in the Subject, but the body of their message relates to a different prior message. The References header is used both in e-mail and Usenet to track the threading of messages by their MIDs.

https://www.rfc-editor.org/rfc/rfc1036
Section 2.2.5 - References

I know the References header tracks the threading in a conversation. Does eM Client employ any other criteria to group messages into a conversation?
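To make the header-based approach concrete, here is a minimal Python sketch of how a client could find a message’s parent from those headers. It only illustrates the general technique, not how eM Client actually does it; the function names `extract_mids` and `parent_of` are made up for this example.

```python
import re
from email import message_from_string

# A Message-ID token looks like <something@host>; this pattern is deliberately loose.
MID_RE = re.compile(r"<[^<>\s]+>")


def extract_mids(value):
    """Return every <message-id> token found in a header value, in order."""
    return MID_RE.findall(value or "")


def parent_of(raw_message):
    """Return the Message-ID this message replies to, or None for a thread starter."""
    msg = message_from_string(raw_message)
    refs = extract_mids(msg.get("References"))
    if refs:
        # By convention the last entry in References is the direct parent.
        return refs[-1]
    in_reply_to = extract_mids(msg.get("In-Reply-To"))
    return in_reply_to[0] if in_reply_to else None


reply = "Message-ID: <c@x>\nReferences: <a@x> <b@x>\nSubject: Re: hi\n\nbody\n"
print(parent_of(reply))
```

A message with neither header comes back as `None`, which is exactly the case where a client has to either treat it as a new thread or start guessing.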

There can be a problem using the References header. By RFC norm (RFC 5322, section 2.1.1), a line must be no more than 998 characters long, excluding the CRLF. That restriction is per physical line, so a folded header can run longer across its continuation lines. In any case, any client would have a maximum that it could parse to track and construct hierarchy in the threading. When discussions get long, and the hierarchy gets deep, the References header gets truncated. The first (original) message’s MID is retained, and the last X MIDs are retained, but the MIDs in between may get deleted (truncated) to keep the References header under the maximum length.
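The truncation described above can be sketched roughly like this in Python. The 998-character budget used as the default and the name `truncate_references` are assumptions for illustration; real clients pick their own limits and may trim differently.

```python
def truncate_references(mids, max_len=998):
    """Keep the first MID plus as many trailing MIDs as fit within max_len characters.

    `mids` is the ordered list of Message-IDs from a References header.
    The length counted is that of the MIDs joined by single spaces.
    """
    if not mids:
        return []
    if len(" ".join(mids)) <= max_len:
        return list(mids)

    root, rest = mids[0], mids[1:]
    kept = []
    length = len(root)
    # Walk backwards from the newest MID, keeping each one that still fits.
    for mid in reversed(rest):
        if length + 1 + len(mid) > max_len:
            break
        kept.append(mid)
        length += 1 + len(mid)
    return [root] + kept[::-1]


chain = ["<1@x>", "<2@x>", "<3@x>", "<4@x>"]
print(truncate_references(chain, max_len=17))
```

With the tiny 17-character budget in the usage line, the root and the two newest MIDs survive while the one in the middle is dropped, which is the gap that can break threading for clients that only see the shortened header.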

I’ve seen trolls deliberately use excessively long MIDs trying to break apart the threading. I’ve dealt with users whose client (usually a webmail client, or web-based forum sending e-mails) doesn’t add the References header. I get an e-mail. My reply has the References header. However, when they reply, there is no References header in their message, so threading gets broken. Their reply appears as a new thread as though it were the first or originating message.

I know how the References header gets used for threading. I think the In-Reply-To header evolved to perform the identical threading purpose of the References header. Does eM Client employ anything else in attempting to thread together multiple messages into a conversation?

I did search here on “conversations”, but no one asked just how eM Client constructs conversations. Threading by the References header is the norm. Threading by Subject is prone to mixing unrelated messages into the same conversation. Other clients use References to thread messages into a conversation. What does eM Client use to construct threads aka conversations?

Will eM Client’s conversations include messages in any folder, like the Trash folder? There are times when I want a conversation to include deleted messages for completeness, but often I want to exclude the Trash folder (I deleted them, so I don’t want them in the conversation anymore).

Note: I went to FAQ - Getting Started | Frequently Asked Questions | Support, clicked on the search icon (magnifier glass), but the input textbox was dead. I could not enter anything on which to search, so I couldn’t search on “conversation” to check if the online help mentions how eM Client builds conversations. So, I tried an external search into the web site using:

https://www.google.com/search?q=site%3Ahttps%3A%2F%2Fwww.emclient.com%2F%20conversation

which found:

https://www.emclient.com/blog/conversations-188

which said:

In short, eM Client selects messages with the same subject and sorts them into one thread.

Ouch. Threading by Subject is fraught with false positives. That is the ancient way to thread before the References header was defined back in 1982 per RFC 822.

I was going to switch to Conversation view not because I prefer it, but because it might address another need: grouping related conversations rather than just related messages (so far, the only way I’ve found to do that is assigning tags, then using the left pane Tag → tagname to show conversations with the selected tagname). I wanted to determine how reliable conversation threading is in eM Client, and, if possible, restrict it to only using the headers, and not making guesses.

The above blog describes other criteria in attempting to build conversations other than just by Subject. While MID headers are mentioned, the References header for threading the MIDs together is not mentioned. Without using the References header, the algorithm to build conversations seems a bit too, um, loose. All those guesses seem a poor substitute to using the References header, but might help in those cases where a sender’s client didn’t add the References header in a reply, or the thread was so deep as to cause truncation of the References header.

If the References header exists in a message, it should be used alone for threading. Only if the References header is missing from a message, and the “RE:” prefix (indicating a reply message) is present in the Subject header, should eM Client guess which other messages this one should be threaded with in a conversation. There will be no References header and no “RE:” in the Subject on the original or starter message, only in reply messages.

(References|In-Reply-To) header absent AND subject starts with "re:" then guess
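Spelled out as runnable Python, that rule might look like the sketch below. It is only my proposed condition, not anything eM Client is known to implement, and `should_guess` is a hypothetical name.

```python
import re
from email import message_from_string

# "RE:" in any letter case, allowing leading whitespace and a space before the colon.
REPLY_PREFIX = re.compile(r"^\s*re\s*:", re.IGNORECASE)


def should_guess(raw_message):
    """True only when both threading headers are absent AND the subject starts with "re:"."""
    msg = message_from_string(raw_message)
    has_thread_headers = bool(
        (msg.get("References") or "").strip()
        or (msg.get("In-Reply-To") or "").strip()
    )
    subject = msg.get("Subject") or ""
    return (not has_thread_headers) and bool(REPLY_PREFIX.match(subject))


print(should_guess("Subject: RE: lunch\n\nbody"))
print(should_guess("Subject: Re: lunch\nReferences: <a@x>\n\nbody"))
```

Any message that carries References or In-Reply-To never reaches the guessing path, and a starter message with no “RE:” is left alone as well.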

Alas, not everyone replying adds the “RE:” prefix starting after any leading whitespace characters in their Subject header. They may be missing the References or In-Reply-To header, too. I’d rather not have those replies thrown into a conversation, but have them shown separate. No guessing, please.

In addition to References, the Reply-To and In-Reply-To headers can indicate how to thread messages into a conversation, but the above blog doesn’t mention using those headers. I think the In-Reply-To header duplicates the same thread tracking as the References header. Reply-To is where a sender wants a reply to get sent, and may not be the same as the From, Sender, or other headers indicating who was the sender, so Reply-To doesn’t provide reliable threading. Other than References and In-Reply-To, there doesn’t seem to be a reliable means of threading messages into a conversation.

Does eM Client have an option either in its config screens, or an advanced account setting as an argument string, to only use the References or In-Reply-To headers, if present, to thread together messages into a conversation? I’d rather have replies missing the References or In-Reply-To headers get orphaned outside of any conversation than get unrelated messages mixed together and hidden in the wrong conversation. I’d prefer a safe scheme where eM Client uses the headers to build conversations, and use nothing else to guess.
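For what I mean by a “safe scheme”, here is a small Python sketch that groups messages using only Message-ID, References, and In-Reply-To, and leaves everything else orphaned as its own one-message conversation. It is an illustration under my own assumptions, not eM Client’s algorithm; `group_by_headers` and the `<missing-N@local>` placeholder are invented for the example.

```python
import re
from email import message_from_string

MID_RE = re.compile(r"<[^<>\s]+>")


def group_by_headers(raw_messages):
    """Group messages into conversations by header links only (no Subject guessing).

    Returns a list of conversations, each a list of indexes into raw_messages.
    """
    parent = {}  # union-find forest keyed by Message-ID

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    own_ids = []
    for i, raw in enumerate(raw_messages):
        msg = message_from_string(raw)
        ids = MID_RE.findall(msg.get("Message-ID") or "")
        # Hypothetical placeholder so a message without a Message-ID still gets a slot.
        own = ids[0] if ids else f"<missing-{i}@local>"
        own_ids.append(own)
        find(own)
        linked = MID_RE.findall(msg.get("References") or "")
        linked += MID_RE.findall(msg.get("In-Reply-To") or "")
        for mid in linked:
            union(mid, own)

    groups = {}
    for i, own in enumerate(own_ids):
        groups.setdefault(find(own), []).append(i)
    return list(groups.values())


msgs = [
    "Message-ID: <a@x>\nSubject: lunch\n\n",
    "Message-ID: <b@x>\nReferences: <a@x>\nSubject: Re: lunch\n\n",
    "Message-ID: <c@x>\nSubject: Re: lunch\n\n",
    "Message-ID: <d@x>\nIn-Reply-To: <b@x>\nSubject: something else\n\n",
]
print(group_by_headers(msgs))
```

In the sample, the third message has a matching Subject but no threading headers, so it stays on its own, while the fourth joins the conversation despite its different Subject because In-Reply-To links it.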

Gary · September 15, 2024, 1:32pm

It depends on where the messages are.

If they are synced from a Google or Exchange server, eM Client shows the conversations as the server says it should be rather than using its own heuristics.

For other servers and Local Folders, it is based on the Message-ID, In-Reply-To: and References: headers. There are some exceptions where the subject, senders/recipients or date imply they are in the same conversation even if the other headers don’t say so, and then they are grouped.

All folders for that account except for Trash and Junk.

It is a very old blog post, but read further. It is explained as I have said above.

The RE and FWD additions to the subject are ignored anyway.
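As a rough illustration of what ignoring those additions can mean in practice, a subject can be reduced to a base form before any comparison. This Python sketch is an assumption about the general technique, not eM Client’s actual code, and `base_subject` is a hypothetical name.

```python
import re

# One or more leading "Re:", "Fw:", or "Fwd:" prefixes, in any letter case.
PREFIX_RE = re.compile(r"^\s*(?:(?:re|fwd?)\s*:\s*)+", re.IGNORECASE)


def base_subject(subject):
    """Strip leading reply/forward prefixes so only the underlying subject remains."""
    return PREFIX_RE.sub("", subject).strip()


print(base_subject("RE: Fwd: re: Lunch plans"))
```

Localized prefixes (such as “AW:” or “SV:”) are not covered by this pattern.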

VanguardLH · September 15, 2024, 2:23pm

Didn’t realize Exchange created the conversations, but it makes sense when using an Exchange-capable e-mail client. However, because Microsoft yanked EWS (Exchange Web Services) from non-hosted accounts (i.e., hosted = pay for an MS 365 subscription), eM Client doesn’t use EWS to my Hotmail and Outlook.com accounts. It uses IMAP for e-mail, and ActiveSync (the old name for EAS 1.0) for calendars and contacts. I’m assuming since eM Client is using IMAP to access e-mail in my MS accounts that eM Client is doing the threading of conversations.

I might later subscribe to MS 365 which would give Exchange access to eM Client, but it’s iffy. I had an MS 365 subscription for 5 years, but dropped it a few years back. Not enough bang-for-the-buck for me to continue paying for subscriptionware.

I do have a Gmail account defined in eM Client. From what I read in other posts here, looks like Google has their own algorithms for threading into conversations, and eM Client won’t step on that order. Okay by me. However, wouldn’t that depend on whether the client uses IMAP, or uses Google’s Mail API to access the Gmail account? In the past, users were getting “exceeded quota” errors when eM Client was using the Google Mail API which requires getting a Google project account that includes a minimum default quota. More quota can be purchased, but that costs money. It took quite a long time for eMC to buy more quota, but within 3 days the larger quota got consumed, and users were again getting “synchronization quota exceeded” errors in eM Client that prevent them from getting or sending e-mails in their Gmail accounts. See the forum thread at:

https://forum.emclient.com/t/exceeded-api-quota-error/49437

The only way to have a reliable Gmail account in eM Client was to delete the old account, and create a new IMAP account for Gmail in eM Client. I’ve been using an IMAP account for Gmail in eM Client ever since. That was 2 years ago, so I don’t know if eMC added an abundance of quota to their Google project to cover all free and paid users along with a lot of extra quota to preempt the error.

While the References or In-Reply-To headers would obviate the need for the “RE:” prefix in the Subject header, some clients don’t add the prefix, and the References or In-Reply-To headers may be absent. There are still crappy e-mail clients and web-based forums out there. Mobile e-mail apps had a nasty habit for a long time (I haven’t checked recently) of omitting the References header. Same for web-based forums sending notices of updates to a watched thread. If “RE:” is in the Subject indicating a reply, but References and In-Reply-To are missing, it seems eM Client should assume the message is a reply. If “RE:” is always getting ignored, and References and In-Reply-To are absent, it seems eM Client would assume it was an originating or starter message to either display alone, or as the first message in a conversation.

I like that eM Client will use the References and In-Reply-To headers to track the threading of messages into a conversation. I would prefer if there were a choice to NOT use other methods to link together messages into a conversation; i.e., opt out of the guessing algorithms. However, that level of detail in enabling some but disabling other schemes in the grouping algorithm probably isn’t available, so it’s enable all schemes to build conversations, or disable conversation mode. Since the vast majority of users don’t understand how e-mail works, they probably don’t care about customizing what schemes get employed for building conversations. ’Tis probably a case of a feature wanted by very few; however, I’ve seen other forum posts here from users complaining about unrelated messages getting mixed together into a conversation.

So, for now, I’m using IMAP with Gmail, not the Google Mail API, in eM Client. Under that scenario, is eM Client doing the threading for conversations, or is eM Client somehow deferring the threading to something Gmail reports to eM Client? I know eM Client supports IMAP extensions that apply only to Gmail, but I don’t know if there is another Gmail-specific IMAP extension supported by eM Client that synchronizes Gmail’s threading into eM Client.

https://developers.google.com/gmail/imap/imap-extensions

From what I can tell, that Google-only IMAP extension allows eM Client to get the labels assigned to messages in the Gmail account. While Gmail uses labels to pretend to IMAP clients those are folders on the server, I don’t think labels are used for threading messages into conversations. When I use the webmail client to Gmail, the labels I see listed are the standard ones, and I haven’t used labels in Gmail to add my own.

Per the other thread about a missing “Save sent in same folder” option, conversation mode was mentioned as a solution which would show messages in any folder, including the Sent folder, grouped into a conversation. I haven’t used conversation mode before, because unrelated messages could get melded into the wrong conversation. Before I switch to conversation mode, I wanted to get a feel for just how conversations are compiled, and how [un]reliable they may be. I’ll probably go with conversation mode for now to see if I get too many unrelated messages mixed into the wrong conversations.

Thanks for the assist.