I tried using Deduplicator. As far as I can see it compares ONLY THE SUBJECT, nothing else. Not the body of the message. Not even the date. Only the subject. So if you have a thousand unique emails with “Re:” as the subject, then 999 of those get deleted!
Tested repeatedly and verified, also the same result on two different installations.
I’ve seen previous complaints that this tool doesn’t work, but nobody seems to realise how bad it is. This tool is DANGEROUS and should be removed immediately from the program until it is fixed.
The Deduplicator considers several properties: identical subject lines but also the sender, recipients, message ID, and other data.
If you encountered a case where it flagged unique messages incorrectly, we definitely want to investigate further. Could you please send me two sample messages that were marked as duplicates to [email protected]? Which email provider are you using?
I had the opposite experience where eM Client did a fantastic job of removing a large number of duplicates that I’d lived with for years because of an issue with a different E-mail client and a lack of a solution. It managed to get rid of ~250,000 E-Mails and my different folders became the sizes I would have expected, threads were cleaned up, and I was not aware of the loss of anything that shouldn’t be removed.
I use Exchange and wondered if there was a possibility that differences between E-mail services/providers might come into play here, but @Kim_Fisher’s response suggests to me that might not be an issue.
It doesn’t consider several properties. As I said, it is only considering the subject and thereby flagging every duplicate subject for deletion regardless of date or message body. Have you tested it and can you confirm it is working correctly for you? (I guess a lot of people use it and don’t realise what it is doing.) If it works for some people and not for others, or for certain email types and not for others, or for certain email providers and not for others, or for emails that have been imported from other programs and not for others - well, that means it’s flaky and can’t be trusted. I have seen others point this out in the past. I think you need to get a lot of people to test it. I’m content to report it and don’t wish to help with the forensics.
The Swiss company Fookes Software Ltd has a range of programs: Aid4Mail Converter, Aid4Mail Investigator, and Aid4Mail Enterprise.
They all include a utility named “Eliminate duplicates” that is not only configurable but also very effective in removing duplicate emails.
We have used the Aid4Mail Investigator version with no issues.
Good luck!
skybat
Greetings from sunny Seville, Spain!
Best wishes and stay safe!
v 10.3.2412.
I can’t send sample emails as they’re all personal and have names and email addresses in. Also, they date back twenty years over many email providers, so that can’t have anything to do with it.
When I run deduplicator on several thousand messages, the results window appears instantly - it shouldn’t. It should take some time to process all that data. A couple of other duplicate remover programs that I’ve tried take 10-20 seconds. Deduplicator is just not processing.
After running it, then clicking on an “x duplicates” link on the right, it’s clear in the bottom panel how it is not working. For each subject one will be marked as ‘keep’ and all the rest ‘remove’. Yes, you can manually go through the list of intended removals and change them one by one to ‘keep’ but that’s nuts and can’t be how it’s intended to work.
We haven’t seen this behavior in our testing. If something is unusual about your specific messages, seeing two examples would help us look into it. If it happens with all messages, there should be at least two that do not contain any sensitive data?
Sorry, can’t help with that - when I moved to eM Client I weeded out all my junk and all my email archives are now personal messages with other people’s email addresses in them.
I keep trying deduplicator but it’s still checking only subjects.
Did I mention that all my emails are in data files? Does that make a difference? They look alright in the message list and when opened and I can convert them backwards and forwards (to eml and pst) and they are still fine in eM Client, except for deduplication.
One of our trial users found this thread and sent me two EML files, so we were able to figure out why you were encountering these false positives. (Many thanks to Jake!)
The Deduplicator’s weak spot was blank message IDs. Since it looks for identical IDs, along with the other factors, blank IDs = technically identical. It is obviously wrong, though, so our developers are going to fix this ASAP.
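To illustrate how a blank ID can produce the false positives reported earlier in this thread, here is a minimal Python sketch. It is not eM Client's code: the message dictionaries, the `naive_key` and `guarded_key` functions, and the fallback fingerprint (sender, date, body hash) are all assumptions made for the example. The only point it shows is that an empty ID should count as "unknown" rather than as a value that can match.

```python
import hashlib
from collections import defaultdict


def naive_key(msg):
    # Hypothetical key: subject plus Message-ID. When the ID is blank,
    # this collapses to "same subject", so unrelated messages match.
    return (msg["subject"], msg.get("message_id", "").strip())


def guarded_key(msg):
    # Treat a blank Message-ID as "unknown", never as a matching value.
    message_id = msg.get("message_id", "").strip()
    if message_id:
        return ("id", message_id)
    # Fallback fingerprint: an assumption for this sketch, not the vendor's fix.
    body_hash = hashlib.sha256(msg.get("body", "").encode("utf-8")).hexdigest()
    return ("fp", msg["subject"], msg.get("sender", ""), msg.get("date", ""), body_hash)


def find_duplicates(messages, key_func):
    # Group messages by key; keep the first of each group, flag the rest.
    groups = defaultdict(list)
    for msg in messages:
        groups[key_func(msg)].append(msg)
    return [m for group in groups.values() for m in group[1:]]


# Three messages with the subject "Re:" and no Message-ID.
# The first and third are true duplicates; the second is unique.
messages = [
    {"subject": "Re:", "sender": "a@example.com", "date": "2004-01-01",
     "body": "Hello", "message_id": ""},
    {"subject": "Re:", "sender": "b@example.com", "date": "2011-06-30",
     "body": "Invoice attached", "message_id": ""},
    {"subject": "Re:", "sender": "a@example.com", "date": "2004-01-01",
     "body": "Hello", "message_id": ""},
]

naive_removed = find_duplicates(messages, naive_key)
guarded_removed = find_duplicates(messages, guarded_key)
print(len(naive_removed), len(guarded_removed))
```

With the naive key, the unique second message is flagged along with the real duplicate. With the guarded key, only the third message is flagged.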
The ID information needs to be provided by the sender’s server (or correctly exported), and this would be an edge case where the information is missing in the message data.
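If you want to check whether your own exported messages are affected, a rough way is to scan the EML files for a missing or empty `Message-ID` header. The sketch below uses only Python's standard `email` package. The helper names `has_message_id` and `files_missing_message_id` are made up for this example, and it assumes the messages have been exported as individual `.eml` files in one folder.

```python
from email.parser import BytesParser
from pathlib import Path


def has_message_id(raw_bytes):
    # Parse the headers only; the body is not needed for this check.
    msg = BytesParser().parsebytes(raw_bytes, headersonly=True)
    value = msg.get("Message-ID")
    return value is not None and str(value).strip() != ""


def files_missing_message_id(folder):
    # Return the .eml files in `folder` whose Message-ID is absent or blank.
    return sorted(
        path for path in Path(folder).glob("*.eml")
        if not has_message_id(path.read_bytes())
    )


# In-memory samples so the check can be tried without any files.
with_id = b"Message-ID: <abc123@example.com>\r\nSubject: Re:\r\n\r\nHello\r\n"
blank_id = b"Message-ID: \r\nSubject: Re:\r\n\r\nHello\r\n"
no_id = b"Subject: Re:\r\n\r\nHello\r\n"

for label, raw in [("with_id", with_id), ("blank_id", blank_id), ("no_id", no_id)]:
    print(label, has_message_id(raw))
```

Calling `files_missing_message_id` with the path to an export folder would list the files that are likely to trigger the blank-ID false positives described above.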
We appreciate that you opened this topic, @DonaldMcDonald! I hope that after the fix, you can have the quality Deduplicator experience other users report.