Disabling Attachment Search Index Over Security Concerns

I’ve just been testing the program out and happened to see that it can search within attachments. However, I’m a little concerned about how secure this is, and I’m a bit disappointed that there don’t seem to be any options to control this feature.

I often get spam that spoofs its From address to look like it comes from Gmail, PayPal, Google, etc. These messages sometimes even make it to my inbox, and they often have attachments pretending to be different file types (HTML, PDF, etc.). I’m concerned about how eM Client handles indexing these attachments.

If I get some spam that looks like it’s from a trusted sender and that happens to have some script posing as a PDF, is eM Client just going to read that file willy-nilly in the background in the name of trying to index it? If so, what sort of security vulnerability does that pose? How can I disable this behavior or control it with some specific rules, i.e. only index attachments in a set of folders? Rather than having it try to index all attachments in my Inbox or Junk folder, I’d prefer it only cache PDFs in folders where I only store trusted/validated emails.

Any insight on this would be appreciated. I like the program but that potential security hole seems just a bit too risky and glossed over.

Thanks

In the search dropdown there is an option that controls attachment search… just make sure it is NOT checked

While at it, what does “Use server search if available” mean?

No idea, perhaps one of the other volunteers will be able to answer that question.

This option allows you to search the messages on the IMAP or Exchange server before they have been synced with eM Client.

It used to be that the default setting for IMAP in eM Client was to download the header only, and the message body would be downloaded when you opened the message. In that scenario a search would not include content that was yet to be downloaded. This feature overcomes that limitation by searching the message bodies directly on the server.
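For anyone curious what a server-side search looks like at the protocol level, here is a minimal sketch using Python's standard `imaplib` and the IMAP `SEARCH BODY` command. It only illustrates the general idea; how eM Client actually implements the option is not documented in this thread, and the helper names, host and credentials are made up for the example.

```python
import imaplib


def quote_imap_string(text: str) -> str:
    """Wrap text as an IMAP quoted string, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def server_side_body_search(conn: imaplib.IMAP4, mailbox: str, text: str) -> list[bytes]:
    """Ask the server which messages in `mailbox` contain `text` in the body.

    The server does the matching, so nothing has to be downloaded first.
    """
    # Open the mailbox read-only so the search does not change any flags.
    conn.select(mailbox, readonly=True)
    status, data = conn.search(None, "BODY", quote_imap_string(text))
    if status != "OK":
        raise RuntimeError(f"search failed: {status}")
    # The server answers with one space-separated list of message numbers.
    return data[0].split()


# Hypothetical usage (placeholder host and credentials):
#   conn = imaplib.IMAP4_SSL("imap.example.com")
#   conn.login("user@example.com", "app-password")
#   print(server_side_body_search(conn, "INBOX", "July Proposal"))
```

Whether a given server also looks inside attachments for a `BODY` search depends on the server.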

I think the default setting is now to download for offline use, so this new feature is only useful if you have disabled that. You will find the setting for offline use in your account IMAP settings.

Wow, thank you @Gary!

Ok but will that stop the indexing or just not search/display those results?

Guess that answers that… moving on to find another solution than EM Client. Too bad really.

Yes, it does.

I guess we got a little off-topic with the server search, but as @sunriseal originally commented, disable attachments in the search parameter. I don’t think you can find a better security option if you are concerned about searching attachments, than just not to search them at all.

Except it didn’t answer my follow-up question: “Will that stop the indexing or just not search/display those results?”

The security risk is with the program arbitrarily opening attachments in the background to cache them for search. Pretty sure that tick box is just a filter to be applied to their index/cache that’s already created in the background.

What are you talking about?

eM Client does not arbitrarily open attachments in the background. You have an option in the search parameters to include attachments. If you have that option deselected, a search will not include attachments.

I’m talking about search indexing. And more specifically I’m talking about this piece of software accessing a file in the background to read it when it could in fact be a virus posing as a legit PDF.

Let’s break this down…

EM Client can search for text within PDF attachments. So if I do a search for “July Proposal” it will return all emails that have that text in subject, body or even contained within PDF attachments.

However, like virtually everything that has search functionality these days it does not set out scanning every email for that content at the time I hit enter on the search as it would take too long to produce results. So it uses indexing and caching. In essence this means going out in the background (prior to a search being made) to scan every email and store the information in an organized relational database for easy lookup in the future based on keyword.

So rather than having to go to each email and checking if it contains the words “July Proposal” it instead references the cached index that it created previously in the background. It looks for “July Proposal” in that database and it returns a list of IDs for specific emails/attachments that contain those words.
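To make the indexing idea concrete, here is a minimal sketch of an inverted index in Python. It is a generic illustration only; the class and function names are hypothetical, and eM Client's real database layout is not known from this thread.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Maps each word to the set of message IDs whose text contains it."""

    def __init__(self) -> None:
        self._postings: dict[str, set[str]] = defaultdict(set)

    def add(self, doc_id: str, text: str) -> None:
        # Done ahead of time, in the background, for every message or attachment.
        for word in tokenize(text):
            self._postings[word].add(doc_id)

    def search(self, query: str) -> set[str]:
        # At search time only the index is consulted, never the original files.
        words = tokenize(query)
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result


index = InvertedIndex()
index.add("mail-1", "July Proposal attached")
index.add("mail-2", "June proposal")
print(index.search("July Proposal"))
```

Note that this sketch matches messages containing every query word anywhere in the text, not the exact phrase. The point is that `add` has to read the content before any search happens, which is exactly where the concern lies.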

Now, in order to create that index it needs to read all the emails and attachments ahead of time. So it scans through the emails. If it sees there is an attachment it then decides how to process it. And this is where the security risk is depending how it’s coded and handled.

It’s probably going to look at the attachment file type to see if it’s a “supported” attachment that can be searched. (Do they only support searching PDF, or others?) If it has the right extension, the program opens the file. And when I say it opens the file, I don’t mean it opens it in Adobe Reader or something. I mean it accesses the file and reads its contents as a binary file in the background. PDFs don’t store their content as plain text that can be read like traditional text. So the application reads that binary and processes what it’s reading. A PDF can contain a LOT of stuff: text, images, forms, macros, security, signatures, etc. So the software needs to work out what it’s reading in that file, and decide what is text and should be indexed and what is not and should be ignored.
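As an illustration of why the extension alone says little, here is a small Python sketch that also checks the file's leading bytes before handing it to a parser. A well-formed PDF begins with the bytes `%PDF-`. The function names are hypothetical, and nothing here is known to match what eM Client does.

```python
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """True if the content starts with the PDF header bytes."""
    return data.startswith(PDF_MAGIC)


def should_parse_as_pdf(filename: str, data: bytes) -> bool:
    """Require both the .pdf extension and a matching header."""
    return filename.lower().endswith(".pdf") and looks_like_pdf(data)


# A script renamed to .pdf fails the header check:
print(should_parse_as_pdf("invoice.pdf", b"<script>alert(1)</script>"))  # False
print(should_parse_as_pdf("report.pdf", b"%PDF-1.7\n..."))  # True
```

Passing this check only weeds out crude fakes. A file with a valid header can still be crafted to attack the parser, so this narrows the risk rather than removing it.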

All well and good, HOWEVER. It poses a big security risk if the PDF file it’s scanning for text isn’t a PDF at all and has some code in it designed to exploit vulnerabilities in applications reading or indexing the file. For example, maybe eM Client is reading the binary of that PDF and thinks it’s reading text, but there is some buffer overflow or escape character that allows other code in the fake PDF to be executed. Or maybe the text is actually some JavaScript that seems harmless to index, but there is a flaw in eM Client where JavaScript from a search result gets rendered when displayed. Who knows? There are tons of exploits like this, and people who spend a lot of time trying to find and use them.

All of which could be avoided if it were possible to control which attachments are being read. To explicitly tell the program NOT to index attachments unless they are on emails in a certain folder. Even having a “whitelist” based on the sender isn’t safe because anyone can make an email look like it’s coming from anywhere without any coding or skill required.
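The kind of control being asked for could look something like the sketch below: a per-folder allowlist that is checked before any attachment is read. This is purely hypothetical. eM Client exposes no such setting as far as this thread shows, and the class name and folder names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttachmentIndexPolicy:
    """Hypothetical rule set deciding which attachments may be indexed."""

    # Empty by default, so nothing is indexed until the user opts a folder in.
    trusted_folders: frozenset[str] = frozenset()
    allowed_extensions: frozenset[str] = frozenset({".pdf"})

    def allows(self, folder: str, filename: str) -> bool:
        # The folder is checked first, so untrusted mail is never even opened.
        if folder not in self.trusted_folders:
            return False
        name = filename.lower()
        return any(name.endswith(ext) for ext in self.allowed_extensions)


policy = AttachmentIndexPolicy(trusted_folders=frozenset({"Clients/Validated"}))
print(policy.allows("Junk", "invoice.pdf"))  # False
print(policy.allows("Clients/Validated", "July Proposal.pdf"))  # True
```

Keying the rule on the folder rather than the sender sidesteps the spoofing problem, because a message only ends up in a trusted folder after the user has moved it there.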

Anyways, I just tested it. And yes, it does open and cache attachments in the background when you view an email, regardless of what is set in the search filter. So if you click to preview an email in the junk folder before deleting it and it’s got an attachment, eM Client goes ahead and reads the attachment for its cache. So… pretty insecure.

For detailed technical information about the application you will need to contact the company as this is just a user supported forum. You can find contact details on their web site.

According to the company, eM Client uses the native system features to index the attachments, but attachments should be indexed only if they are downloaded. The security risk is then the same as if you had downloaded the file to your device from anywhere else. Windows Defender should also work in this case and keep the indexing safe.

So the only way to avoid the indexing in the first place would be to NOT use download for offline use for attachments in the account settings.

Thanks for that, but I don’t have the download for offline use option selected and this issue still exists.

Maybe you just misunderstood whatever process you used to prove that the attachment was being indexed because if it has not been downloaded, how is it being indexed?

Items are still cached when you click on them, which opens the preview in eM Client. So yes, while that option is unchecked it’s not going to download every attachment in every email. But it still caches the contents of attachments in any email you click on so that it opens in the preview pane.

This isn’t a hard thing to replicate / test.

  1. Add account to EM Client (leave the download for offline use unchecked)
  2. Send an email with a PDF attachment (that has selectable text in it) to yourself from a different account / client. For example, use Gmail webmail and send to your Outlook account that was added to eM Client (or whatever).
  3. Don’t open or click on the new email in eM Client yet. Instead, do a search for some of the text that you know is in that PDF. There will be no results found. You can wait and try again and the search will still show nothing.
  4. Click the new email to have it open in the preview pane. Don’t click to show remote images or any of that. Just click it once.
  5. Search again for that same text that you know is in the PDF. And boom that email will show up.

So it’s reading and indexing the contents of that PDF attachment as soon as you click to view that email, regardless of settings. And this is the issue. If that email was something posing as Amazon or PayPal, etc., with a fake attachment, and you click on the email (if only as a means to select it for deletion because you know it’s spam), it’s too late. eM Client has already gone ahead and read that attached “pdf” in the background, added it to the cache/index for searching, and potentially executed harmful code in doing so.

Having the ability to search in PDFs is great. But it’s idiotic for the program’s default (and unchangeable) behavior to be to trust that every attachment in every email in your inbox, junk folder or otherwise is safe and legit, and that their code for reading those files for indexing is infallible. Why is there a security check that lets you choose whether to download and view images contained in an email, but not one to control which attachments eM Client reads in the background?

It’s not. The attachment will only be downloaded, and therefore the OS will index it, when you open the message.

What are you talking about, the OS will index it? It’s got nothing to do with the OS… the OS doesn’t handle the indexing of your emails within eM Client. eM Client has its own database it uses for search. If it were using the Cortana system in Windows, that would pose its own set of risks. That’s all you would need, an email client that’s able to poke around in every file on your system with the freedom of Cortana. LOL

Look, this is getting nowhere. I appreciate your effort in trying to offer a solution, but the clear answer is “you can’t disable or control this behavior”. And that’s unfortunate due to the inherent security risks. Unless I’m able to see the source code or there is an official statement from the developers about this, I’m not going to be convinced otherwise. And since you’ve gotta pay for access to contact their support team, and I don’t think their support team even looks at this forum, this is where the issue dies.

I’ll keep an eye on the program for updates in the future… Maybe after there’s a breach due to this issue they’ll fix it lol Who knows. Great program otherwise, lots of options and control - not sure why they added exactly 0 for this functionality though.

Again, I very much appreciate your dedication to responding to this thread. If you hadn’t replied there would have been no response at all. So thanks for that :)

Anyway, I posted the reply from eM Client staff.

Do with it what you will.