Methodologies

Malware Removal Test Methodology

Operating System

Microsoft Windows; the exact details of the version used are noted in the individual test reports.

Aim of the test

This test aims to determine how effectively and easily different antivirus products remove malware that has already infected a system.

Target Audience

Anyone who is likely to try removing a malware infection from a PC will benefit from reading the results of this test. It will be of particular interest to IT staff and computer technicians who carry out malware removal as part of their jobs, but will also be applicable to computer users who undertake their own computer maintenance. It is assumed that the user does not have any specialist technical knowledge (other than the ability to boot Windows into Safe Mode), and will rely on intuitive use of the antivirus software concerned to remove the malware from the system.

Definition of the threat

Malware can be defined as computer programs that have a clear and significant malicious purpose. This excludes some programs such as commercial keyloggers which can be used legitimately, e.g. in computer training, and also excludes “potentially unwanted programs” such as some browser toolbars, which are irritating but cannot be classed as malicious.

Scope of the test

The test is solely concerned with each product’s ability to remove malware from an already infected system. It does not measure detection or protection.

Test Setup

The operating system is installed on each test PC and fully patched.

Settings

Products are tested with default settings. In the event that a scan fails to remove the malware, settings may then be changed for subsequent scans in individual cases. This is noted in the test report where relevant.

Sources and numbers of test cases

The samples were selected according to the following criteria:

• All participating anti-virus products must be able to detect the malware dropper used when it is inactive.
• The sample must have been prevalent (according to metadata) and/or seen in the field on at least two PCs of our local customers in the last six months.
• The malware must be non-destructive; in other words, it must be possible for an anti-virus product to repair/clean the system without having to replace Windows system files etc.
• The malware must show common malware behaviour under the operating system used, so that it also represents the behaviour observed in many other malware samples.

Around a dozen randomly picked malware samples are taken from the pool of samples matching the above criteria.

Test procedure

Thorough malware analysis is performed on each sample, to determine exactly what changes it makes. The physical machine is infected with one threat and rebooted, and a check is made to ensure that the threat is fully running. The anti-virus product is then installed and updated. If this is not possible, the PC is rebooted into Safe Mode; if Safe Mode is not available and a rescue disk for the relevant product exists, this is used to run a full system scan before installation. A thorough/full system scan is run, and the instructions given by the anti-virus product are followed to remove the malware, as a typical home user would do. The machine is rebooted, and a manual inspection/analysis of the system is made, to check for malware removal and remnants.

Ratings

We allow certain negligible/unimportant traces to be left behind, mainly because a perfect score cannot be reached, owing to the system modifications made by some of the malware samples used. “Removal of malware” and “removal of remnants” are combined into one dimension; convenience is also taken into consideration. Ratings are given as follows, where A is the highest mark and D the lowest:

  1. Removal of malware/traces
  • Malware removed, only negligible traces left (A)
  • Malware removed, but some executable files, MBR and/or registry changes (e.g. loading points) remaining (B)
  • Malware removed, but annoying or potentially dangerous problems (e.g. error messages, compromised hosts file, disabled Task Manager, disabled folder options, disabled Registry Editor, detection loop) remaining (C)
  • Only the malware dropper was neutralised and/or most of the other dropped malicious files/changes were not removed (i.e. malicious files are still on the system); or the system is no longer normally usable; removal failed (D)
  2. Convenience
  • Removal could be done in normal mode (A)
  • Removal requires booting into Safe Mode or using other built-in utilities, plus manual actions (B)
  • Removal requires a rescue disk (C)
  • Removal or installation requires contacting support, or similar; removal failed (D)

False positives

“Aggressive cleaning” – i.e. a program deleting more than it should – is regarded as a false positive in this test.

Summary

Detection Yes
False Positives Yes
Cloud connectivity Yes
Updates allowed Yes
Default configuration Yes

Anti-Phishing Test Methodology

Operating system / Browser

Microsoft Windows; details of the exact version and architecture used will be given in the individual test reports. Please note that phishing tests can be carried out on the anti-phishing features built into individual browsers, without an additional security product, or on the anti-phishing measures provided by security products – hence the use of the term “browser/security product” in this document. The browser(s) used in security product tests will be specified in the individual test report.

Aim of the test

The test is intended to demonstrate how effective the participating browser/security products are at recognising and blocking phishing websites, and thus protecting the user from being defrauded by these sites.

Target Audience

Any computer user who does not feel completely confident of their own ability to recognise and avoid phishing attacks will benefit from using a security product/browser with effective phishing protection. Any computer enthusiast or professional who provides technical support for family, friends, colleagues or clients will also be concerned with installing or recommending products that provide phishing protection for their supported users.

Definition of the threat

A phishing site is a website that attempts to mimic a pre-existing, legitimate website, or which purports to be from a pre-existing, legitimate body such as a bank, and aims to obtain user credentials with a view to directly or indirectly defrauding the user or carrying out some other sort of crime.

One very common type of phishing attack involves sending out spam mails purporting to be from a bank, with a message to recipients that they need to log on to their Internet banking account for one reason or another. A hyperlink is provided in the mail, supposedly giving the victims easy access to their online accounts. In reality, the link leads to a fake copy of the bank’s login page. This will capture the user’s login credentials, which can then be used by the perpetrators of the scam to e.g. steal money from the victim’s account.

It should be noted that a phishing website does not affect the user’s computer or device in any way. Provided it uses standard technologies such as HTML, a phishing website can be effective on any type of device, and is independent of operating system and browser.

There are numerous types of online fraud which are not counted as phishing, and are consequently not considered in this test; for example, fraudulent sites that encourage users to enter personal data under the guise of offering a new service (as opposed to mimicking an existing service).

Many web-based malware attacks use legitimate web servers to host the malware executables. Equally, it is possible for phishing attacks to host their webpages on the servers of reputable organisations which the perpetrators have managed to compromise. A phishing page should be recognised regardless of where it is hosted, though if an entire legitimate domain is blocked, this is regarded as a false positive. For example, if a phishing page is hosted under the URL www.lycos.com/user2035/personal/index.htm, this particular URL should be blocked, but blocking lycos.com (a legitimate domain) would be a false positive.
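
To make the distinction concrete, the following minimal Python sketch (our illustration, not part of any product) shows a block-list check that matches the full host-plus-path rather than the bare domain, so that a phishing page on a shared host can be blocked without flagging the whole legitimate domain. The block-list contents and the helper name are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical block list containing one phishing URL hosted on a legitimate domain.
BLOCKLIST = {"www.lycos.com/user2035/personal/index.htm"}

def is_blocked(url: str) -> bool:
    """Match on host + path, not on the bare domain."""
    parsed = urlparse(url if "//" in url else "//" + url)
    key = (parsed.netloc + parsed.path).rstrip("/")
    return key in BLOCKLIST

print(is_blocked("http://www.lycos.com/user2035/personal/index.htm"))  # True: the phishing page is blocked
print(is_blocked("http://www.lycos.com/"))  # False: blocking the whole domain would be a false positive
```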

Scope of the test

The test is optional; vendors who have joined the main-test series can decide whether or not to participate. As noted above, phishing attacks commonly use links in spam mails to persuade users to visit the phishing websites. Our phishing-protection test is exclusively concerned with the ability of the browser/security product to identify the website itself as fraudulent and warn the user. The vector leading to each phishing URL, be it spam mail or links in other web pages, is not considered. Our Spam Protection Test will indicate to readers which security products are most effective at filtering out spam emails.

Test Setup

The test is carried out on identical test machines.

If Windows 8.x is used as the operating system, the test will be done using both the Desktop and Modern versions of the browser where available.

The operating system and browser(s) to be used in a test will be announced to participating vendors before the test begins. The browser used will be a popular mainstream one, which is supported by all the participating vendors.

Identical operating system and browser configurations are installed on all test computers. Any anti-phishing mechanisms within the OS are deactivated. For security-product tests, anti-phishing features in the browser are deactivated (obviously they are left active in browser tests). One browser/security product is then installed on each machine and updated.

Settings

All settings are left at their default values. The products have unrestricted cloud access throughout the test. Before the test proper is run, all products will be tested to ensure that they are correctly configured and functioning properly.

Sources and numbers of test cases

The phishing URLs used in the test are extracted from spam emails and collected from the web using a crawler.

A minimum of 100 phishing sites will be used, but possibly many more, depending on the duration of the test. For false-positive testing, at least 100 legitimate online banking websites are used.

Test procedure for browsers/security products

Phishing websites have very short lives, and may be taken down only hours after they are put online. To ensure that as many phishing pages as possible can be tested while they are still active, we test all the phishing URLs we receive immediately; any that turn out to be inappropriate are excluded from the results. A URL may turn out to be unsuitable if it is not a genuine phishing site, is offline, is evidently a duplicate, or can be seen to be malfunctioning (e.g. error messages appear when the page is opened). Our automated test procedure feeds one phishing URL to the test PCs, and all machines then browse to it simultaneously (ensuring that the availability of the page is the same for all products). This is done in a way that replicates a user clicking on a link in real life (as opposed to an argument being passed directly to the browser). Screenshots are taken to indicate whether or not a phishing page has been blocked and/or a warning message displayed. Each test PC is then rebooted and reset to its original configuration before the next test case begins. This ensures a level playing field for each test case, and prevents any possible confusion between the warning messages for one test case and those for the following one.
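
A drastically simplified Python sketch of such an automated procedure is shown below, using Selenium to drive the browser and capture a screenshot. It is an illustration only, not our actual testing framework: threads stand in for the separate physical test PCs, and the reboot/reset step is reduced to a comment.

```python
import threading
from selenium import webdriver

def visit_and_capture(machine_id: int, url: str) -> None:
    """Browse to the candidate phishing URL and record the outcome as a screenshot."""
    driver = webdriver.Chrome()
    try:
        # Simplification: the real procedure replicates a user clicking a link,
        # rather than passing the URL directly to the browser.
        driver.get(url)
        driver.save_screenshot(f"machine{machine_id}_result.png")
    finally:
        driver.quit()

def run_test_case(url: str, machine_count: int = 4) -> None:
    # All "machines" browse to the same URL simultaneously, so the
    # availability of the page is identical for every product under test.
    threads = [threading.Thread(target=visit_and_capture, args=(i, url))
               for i in range(machine_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Each test PC is then rebooted and reset to a clean snapshot (omitted here).
```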

To be deemed successful, a product must warn the user that a site is considered unsafe. It does not have to physically block the site completely. We note that some URL-blocking products will display a warning notice in the browser, with a darkened and inactive representation of the web page in the background; a button or link in the warning box will allow the user to proceed to the page. We accept this as protected – there are no “user-dependent” results in this test. The same principle applies with false-positive testing; a warning message that would allow the user to continue to the page by clicking the appropriate button or link still counts as a false positive if it appears for a harmless site.

False positives

As with many of our other tests, we check that products are not reaching high detection rates at the expense of a high rate of false positives. A false-positives test is carried out using a number of popular legitimate websites that ask for user credentials or personal information; there will be an emphasis on online banking sites worldwide. A single false positive from an online banking site is sufficient to downgrade a program’s rating.

Summary

Detection Yes
False Positives Yes
Cloud connectivity Yes
Updates allowed Yes
Default configuration Yes

Performance Test Methodology

Operating system

Microsoft Windows; the exact version will be noted in each test report.

Aim of the test

The test aims to compare how much the tested security-products slow down everyday tasks, such as launching applications, while protecting the system.

Target Audience

The test is of interest to all users of Windows PCs who are concerned that security products may reduce the performance of their system.

Definition of Performance

Performance is defined here as the speed with which a PC running a particular security product carries out a particular task, relative to an otherwise identical PC without any security software. It should be noted that perceived performance, i.e. whether the user subjectively feels that the system is being slowed down, is important. In some cases, the impact of security software on some actions, such as opening a program, may in itself have a negligible effect on the user’s actual productivity during a working day, but nonetheless irritate or frustrate the user; this can then indirectly cause him or her to be less productive. As we do not condone using a PC without antivirus software, the relative scores of the products tested should be considered more important than the difference between the fastest product and a PC without any security software.

Scope of the test

The test measures the performance (speed) reduction caused by the tested programs when carrying out a number of everyday tasks, such as opening and saving files. It does not measure the impact of the security products on boot time, because some products give a false impression of having loaded quickly: they delay launching their protection features so as to suggest that they have not slowed down the boot process, meaning that full protection is only offered some time after the Windows Desktop appears, the appearance of speed coming at the cost of security. Please note that this test does not in any way test the malware detection/protection abilities of the participating products. AV-Comparatives carries out a number of separate malware protection tests, most notably the Real-World Protection Test, details of which can be found on our website, www.av-comparatives.org. We do not recommend choosing an antivirus product solely on the basis of low performance impact, as it must provide effective protection too.

Test Setup

The operating system is installed on a single test PC and updated. Care is taken to minimize other factors that could influence the measurements and/or the comparability of the systems; for example, Windows Update is configured so that it will not download or install updates during the test. Additionally, Microsoft Office and Adobe Reader are installed, so that the speed of opening/saving Microsoft Office and PDF files can be measured. Two further programs, both relating to performance testing, are installed: the UL Procyon® benchmark suite, which is an industry-recognised performance-testing suite, and our own automation tool, which runs specified tasks and logs the time taken to complete them; it is used to simulate various file operations that a computer user would execute.

Settings

Default settings are used for all consumer products.

Test procedure

All the tests are performed with an active Internet connection, to allow for the real-world impact of cloud services and features. Optimization processes/fingerprinting used by the products are also taken into account – this means that the results represent the impact on a system which has already been operated by the user for a while. The tests are repeated at least 10 times (five times with fingerprinting and five times without), in order to get mean values and filter out measurement errors. In the event of fluctuations, the tests are repeated additional times. After each run, the workstation is rebooted. The automated test procedures are as follows:

File copying

We copied a set of various common file types from one physical hard disk to another. Some anti-virus products might ignore certain files by design/default (e.g. based on their file type), or use fingerprinting technologies, which may skip already-scanned files in order to increase speed.
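
A minimal sketch of how such a file-copy timing might be implemented is shown below. It is illustrative only: the paths, the file set and the run count are placeholders, and the split into fingerprinting/non-fingerprinting passes described above is reduced to a first-run vs subsequent-runs comparison.

```python
import shutil
import time
from pathlib import Path

SOURCE = Path("D:/testset")     # placeholder: set of common file types on one physical disk
TARGET = Path("E:/copytarget")  # placeholder: directory on a second physical disk

def timed_copy(run: int) -> float:
    """Copy the whole file set and return the elapsed time in seconds."""
    dest = TARGET / f"run{run}"
    start = time.perf_counter()
    shutil.copytree(SOURCE, dest)
    elapsed = time.perf_counter() - start
    shutil.rmtree(dest)  # clean up before the next run
    return elapsed

times = [timed_copy(run) for run in range(10)]
print(f"first run (cold):       {times[0]:.2f} s")
print(f"subsequent runs (mean): {sum(times[1:]) / len(times[1:]):.2f} s")
```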

Archiving and unarchiving

Archives are commonly used for file storage, and the impact of anti-virus software on the time taken to create new archives, or to unarchive files from existing archives, may be of interest to most users. We archived a set of different file types that are commonly found on home and office workstations.

Installing applications

We installed several common applications with the silent install mode and measured how long it took. We did not consider fingerprinting, because usually an application is installed only once.

Launching applications

Microsoft Office (Word, Excel, PowerPoint) and PDF documents are very common. We opened, and then later closed, various documents in Microsoft Office and in Adobe Acrobat Reader. The time taken for the viewer or editor application to launch was measured. Although we list the results for both the first opening and the subsequent openings, we consider the subsequent openings more important, as this operation is normally performed several times by users, and the anti-virus products’ optimizations then take place, minimizing their impact on the system.

Downloading files

Common files are downloaded from a webserver on the Internet.

Browsing websites

Common websites are opened with Google Chrome. The time taken to completely load and display the website is measured. We only measure the time taken to navigate to the website when an instance of the browser has already been started.
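
By way of illustration, the page-load time can be read from the browser’s own Navigation Timing data, as in the sketch below. This is not our actual harness; the site list is a placeholder, and repetitions and averaging are omitted.

```python
from selenium import webdriver

driver = webdriver.Chrome()  # the browser instance is started once, before measuring

def page_load_ms(url: str) -> int:
    driver.get(url)
    # Navigation Timing API: milliseconds from navigation start to the load event.
    return driver.execute_script(
        "const t = performance.timing;"
        "return t.loadEventEnd - t.navigationStart;"
    )

for site in ["https://example.com", "https://example.org"]:  # placeholder site list
    print(site, page_load_ms(site), "ms")

driver.quit()
```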

Procyon Test

The UL Procyon® benchmark suite is run, to measure the system impact during real-world product usage.

Ratings

To produce an overall rating for each product, we combine the scores from the Procyon test and from our own automated tests, as follows. The range of times taken to complete each of the automated tasks is clustered into the categories Very Fast, Fast, Mediocre and Slow. Points are then awarded for each task, with products in the Very Fast category gaining 15 points, Fast getting 10 points, Mediocre 5 points and Slow zero points. For each task, a rounded-up average of the points awarded over the individual runs produces the task score. For the File Copying test, the score for the first run (with a brand-new image) is taken into account, along with the average of the subsequent runs (to allow for e.g. fingerprinting by the security product). For the Launching Applications test, only the scores of the subsequent runs are used to create the average. As there are 6 individual tasks, a product can get up to 90 points overall. This number of points is then added to the Procyon score, which goes up to 100.
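
The aggregation can be sketched as follows. This is our reading of the scheme described above: the category boundaries are derived from the measured clusters and are not shown, the per-run categories are hypothetical, and the special first-run handling for File Copying and Launching Applications is omitted for brevity.

```python
import math

POINTS = {"Very Fast": 15, "Fast": 10, "Mediocre": 5, "Slow": 0}

def task_score(run_categories: list[str]) -> int:
    """Rounded-up average of the points awarded over the runs of one task."""
    points = [POINTS[category] for category in run_categories]
    return math.ceil(sum(points) / len(points))

# Hypothetical per-run categories for the six automated tasks.
tasks = {
    "file copying":          ["Very Fast", "Very Fast", "Fast"],
    "archiving/unarchiving": ["Very Fast", "Very Fast", "Very Fast"],
    "installing apps":       ["Fast", "Fast", "Fast"],
    "launching apps":        ["Very Fast", "Very Fast", "Very Fast"],
    "downloading files":     ["Fast", "Very Fast", "Very Fast"],
    "browsing websites":     ["Very Fast", "Very Fast", "Very Fast"],
}

automated = sum(task_score(runs) for runs in tasks.values())  # up to 90 points
procyon = 87                                                  # hypothetical Procyon score (up to 100)
print(f"automated: {automated}/90, overall: {automated + procyon}/190")
```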

Real-World Protection Test Methodology

Operating system

Microsoft Windows; the exact version will be noted in the individual test report.

Aim of the test

This test aims to find out how effective the security products are at protecting the computer against active real-world malware threats while using the Internet. Any of the protection components in each product can be used to block the malware. The goal is to find out whether the security software protects the computer by preventing the malware from making any system changes (or by remediating all the changes made); the question of how it does so is of secondary importance.

Target Audience

Anyone who browses the Internet or checks webmail using a computer running Microsoft Windows is vulnerable to web-based malware attacks, and should be informed of the effectiveness of security products that can help to protect against such attacks. IT professionals and enthusiasts who provide technical support for colleagues, clients, family and friends will want to know which security products will provide the best protection for their supported users.

Definition of the threat

Web-based malware is any type of malicious program (e.g. virus, worm, Trojan) originating from the Internet and which can be run on the user’s computer, either using an exploit, or by fooling the user into thinking that it is a harmless file such as a game (e.g. by social engineering).

Scope of the test

The test is concerned with malware coming from the Internet, and does not consider other vectors by which malware can enter a computer, e.g. via USB flash drive or local area network. Tested products have full Internet access during the test, and can make use of reputation features or other cloud services. Consequently the test does not indicate how well a product protects a PC that is offline. Readers who are concerned about the offline proactive protection afforded by security products are recommended to read the report of our Heuristic/Behaviour Test, which covered precisely this scenario. In the event that a product reaches 100% detection in this test or a section thereof, this only demonstrates that it has protected against all the particular samples in this individual test/section.

Test Setup

The test is carried out on identical systems. At the beginning of each month we check that the most recent major version of each product is installed. Minor version updates are installed as they become available, and the latest signatures are applied before each single test case. Testing-framework software is installed on all test machines. This runs the test procedure by browsing to malicious URLs submitted by the testers, which would lead to the system being compromised if not successfully protected by the security software. The testing framework also includes our own monitoring software, so that any changes made to the system during the test will be recorded. Furthermore, the recognition algorithms check whether the antivirus program reacts to the malware. Our test system monitors whether any changes are made to files, the Windows Registry, MBR, running processes, network traffic etc. It does not rely on the tested product to determine whether the malware has been blocked, meaning that security appliances can be tested just as effectively as locally installed software solutions. Another component of the testing-framework software resets the test machines to their starting configuration after each test case.
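
As a drastically simplified illustration of the monitoring idea, the sketch below takes a baseline of file hashes and diffs it against the post-execution state. The real framework also monitors the registry, the MBR, running processes and network traffic, none of which is represented here, and the monitored path is a placeholder.

```python
import hashlib
from pathlib import Path

def snapshot(root: Path) -> dict[str, str]:
    """Map each file under `root` to a SHA-256 hash of its contents."""
    return {
        str(path): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*") if path.is_file()
    }

def diff(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """List files added, removed or modified between two snapshots."""
    return {
        "added":    [p for p in after if p not in before],
        "removed":  [p for p in before if p not in after],
        "modified": [p for p in after if p in before and after[p] != before[p]],
    }

baseline = snapshot(Path("C:/Windows/System32"))  # placeholder monitored area
# ... the test case (browsing to the malicious URL) is executed here ...
changes = diff(baseline, snapshot(Path("C:/Windows/System32")))
```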

Settings

In the consumer test-series, all product settings are left at their default values. In the enterprise test-series, setting changes are allowed and documented inside the reports. The products have unrestricted cloud access throughout the test. Before the test proper is run, all products will be tested to ensure that they are correctly configured and functioning properly. Additionally, we employ tools to check and log whether the connection to the product’s cloud service (if there is one) is working properly.

If user interactions are requested by a product, we always choose “Allow” or equivalent. If the product protects the system anyway, we count the malware as blocked, even though we allowed the program to run when asked to make a decision. If the system then becomes compromised, we count the result as user-dependent. We consider “protection” to mean that the system is not compromised: the malware is not running (or has been removed/terminated) and there are no system changes. We do not consider an outbound-firewall alert about a running malware process, asking whether or not to block traffic from the user’s workstation to the Internet, to be protection.

Sources and numbers of test cases

We aim to use visible and relevant malicious websites/malware that are currently out there and present a risk to ordinary users. We try to include as many in-the-wild exploits as possible; these are usually well covered by almost all major security products, which may be one reason why the scores look relatively high, besides the fact that on a fully patched Windows system there are not many different exploits online to test against. The remaining test cases are URLs that point directly to malware executables; this causes the malware file to be downloaded, thus replicating a scenario in which the user is tricked by social engineering into following links in spam mails or on websites, or into installing a Trojan or other malicious software.

We use our own crawling system to search continuously for malicious sites and extract malicious URLs (including spammed malicious links). We also search manually for malicious URLs. In the rare event that our in-house methods do not find enough valid malicious URLs on one day, we have contracted some external researchers to provide additional malicious URLs (initially for the exclusive use of AV-Comparatives) and look for additional (re)sources.

In this kind of testing, it is very important to use enough test cases. If an insufficient number of samples is used in comparative tests, the differences in results may not indicate actual differences in protective capabilities among the tested products. Our tests use many more test cases (samples) per product and month than any similar test performed by other testing labs. Because of the higher statistical significance this achieves, we consider all the products within a given results cluster to be equally effective, assuming that they have a false-positives rate below the industry average.
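
To illustrate why the number of test cases matters, the sketch below computes a textbook normal-approximation 95% confidence interval for a protection rate (this is an illustration, not our actual statistical analysis, and the approximation is crude near 100%). With 50 test cases, observed rates of 98% and 94% have overlapping intervals and are statistically indistinguishable; with 500 test cases, the same rates are clearly separated.

```python
import math

def confidence_interval(blocked: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation 95% confidence interval for a protection rate."""
    p = blocked / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return (p - margin, p + margin)

for n in (50, 500):
    a = confidence_interval(round(0.98 * n), n)  # product A: 98% observed
    b = confidence_interval(round(0.94 * n), n)  # product B: 94% observed
    print(f"n={n}: A {a[0]:.3f}-{a[1]:.3f}, B {b[0]:.3f}-{b[1]:.3f}")
```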

Test procedure

Before browsing to each new malicious URL we update the programs/signatures (as described above). Our automated test procedure feeds one malicious URL to all test machines simultaneously, to help ensure the same testing conditions for each product. The evaluation process for each test case will recognise any variations among the malware files executed on each test machine. After the malware is executed (if not blocked before), we wait several minutes for malicious actions and also to give e.g. behaviour-blockers time to react and remedy actions performed by the malware. If the malware is not detected and the system is indeed infected/compromised, the process goes to “System Compromised”. Otherwise the product is deemed to have protected the test system, unless a user interaction is required. In this case, if the worst possible decision by the user results in the system becoming compromised, we rate this as “user-dependent”. Where a tested product requests a user decision, we always select the option to let the program run (e.g. “Allow”). In cases where we do this, but the program is blocked anyway, the program is deemed to have protected the system. After each test case the machine is reset to its clean state.
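
The per-test-case decision logic can be summarised by the sketch below, which paraphrases the rules above; it is not the framework’s actual code, and the two observation flags are hypothetical inputs derived from the system monitoring.

```python
from enum import Enum

class Outcome(Enum):
    PROTECTED = "Protected"
    USER_DEPENDENT = "User-dependent"
    COMPROMISED = "System Compromised"

def classify(compromised: bool, asked_user: bool) -> Outcome:
    """Apply the rating rules for one test case."""
    if not compromised:
        # Blocked outright, or all malicious changes were remediated.
        return Outcome.PROTECTED
    if asked_user:
        # We chose "Allow"; the compromise followed the worst possible user
        # decision, so the case is rated user-dependent rather than failed.
        return Outcome.USER_DEPENDENT
    return Outcome.COMPROMISED

print(classify(compromised=True, asked_user=True))  # Outcome.USER_DEPENDENT
```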

False positives

The false-alarm test in the Whole-Product Dynamic “Real-World” Protection Test consists of two parts: wrongly blocked domains (while browsing) and wrongly blocked files (while downloading/installing). It is necessary to test both scenarios because testing only one of the two above cases could penalize products which focus mainly on one type of protection method, either URL filtering or on-access/behaviour/reputation-based file protection.

  a) Wrongly blocked domains (while browsing)

We use around one thousand randomly chosen popular domains. Blocked non-malicious domains/URLs are counted as false positives (FPs).

  b) Wrongly blocked files (while downloading/installing)

We use about one hundred different applications listed either as top downloads or as new/recommended downloads from ten different popular download portals. The applications are downloaded from the original software developers’ websites (instead of the download portal host), saved to disk and installed to see if they are blocked at any stage of this procedure. Additionally, we include a few clean files that were encountered and disputed over the past months of the Real-World Protection Test.
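
A sketch of the download step is shown below. It is illustrative only: the URL list is a placeholder, and in the real test the blocking (or not) of each download and installation is observed from the security product’s behaviour, not detected by the script itself.

```python
import requests
from pathlib import Path

# Placeholder URLs: in the test, ~100 applications are fetched from the
# original developers' websites rather than from the download portals.
DOWNLOADS = [
    "https://example-developer.com/setup.exe",
]

target = Path("C:/fp-test-downloads")
target.mkdir(parents=True, exist_ok=True)

for url in DOWNLOADS:
    name = url.rsplit("/", 1)[-1]
    response = requests.get(url, timeout=60)
    response.raise_for_status()
    (target / name).write_bytes(response.content)  # save to disk...
    # ...then install the application and note whether the security
    # product blocks it at any stage of the procedure.
```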

Summary

Detection Yes
False Positives Yes
Cloud connectivity Yes
Updates allowed Yes
Default configuration Yes (for consumer products)