Real-World Protection Test Methodology

24 November 2015

Methodologies

Operating system

Microsoft Windows; the exact version will be noted in the individual test report.

Aim of the test

This test aims to find out how effectively security products protect the computer against active real-world malware threats while the user is on the Internet. Any of the protection components in a product can be used to block the malware. The goal is to find out whether the security software protects the computer, either by preventing the malware from making any system changes or by remediating all changes it makes; how the product achieves this is of secondary importance.

Target Audience

Anyone who browses the Internet or checks webmail using a computer running Microsoft Windows is vulnerable to web-based malware attacks, and should be informed of the effectiveness of security products that can help to protect against such attacks. IT professionals and enthusiasts who provide technical support for colleagues, clients, family and friends will want to know which security products will provide the best protection for their supported users.

Definition of the threat

Web-based malware is any type of malicious program (e.g. virus, worm, Trojan) that originates from the Internet and can be run on the user’s computer, either by means of an exploit or through social engineering, i.e. by fooling the user into thinking that it is a harmless file such as a game.

Scope of the test

The test is concerned with malware coming from the Internet, and does not consider other vectors by which malware can enter a computer, e.g. via USB flash drive or local area network. Tested products have full Internet access during the test, and can make use of reputation features or other cloud services. Consequently, the test does not indicate how well a product protects a PC that is offline. Readers who are concerned about the offline proactive protection afforded by security products are advised to read the report of our Heuristic/Behaviour Test, which covered precisely this scenario. If a product reaches 100% detection in this test or a section thereof, this only demonstrates that it has protected against all the particular samples in this individual test/section.

Test Setup

The test is carried out on identical systems. At the beginning of each month we check that the most recent major version of each product is installed. Minor version updates are installed as they become available, and the latest signatures are applied before every test case. Testing-framework software is installed on all test machines. This runs the test procedure by browsing to malicious URLs submitted by the testers, which would lead to the system being compromised if not successfully protected by the security software. The testing framework also includes our own monitoring software, so that any changes made to the system during the test are recorded. Furthermore, the recognition algorithms check whether the antivirus program reacts to the malware. Our test system monitors whether any changes are made to files, the Windows Registry, MBR, running processes, network traffic etc. It does not rely on the tested product to determine whether the malware has been blocked, meaning that security appliances can be tested just as effectively as locally installed software solutions. Another component of the testing-framework software resets the test machines to their starting configuration after each test case.
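As an illustration only, the idea of recording system changes independently of the tested product can be sketched as a comparison of two snapshots taken before and after a test case. This is not the monitoring software described above, whose internals are not documented here; the function name, the snapshot layout (a mapping from item path to content hash) and the example paths are all hypothetical.

```python
def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots mapping an item (file path, registry key, ...) to a content hash.

    Hypothetical sketch: a real monitor would also cover the MBR,
    running processes and network traffic.
    """
    added = sorted(after.keys() - before.keys())
    removed = sorted(before.keys() - after.keys())
    modified = sorted(k for k in before.keys() & after.keys() if before[k] != after[k])
    return {"added": added, "removed": removed, "modified": modified}


def has_changes(diff: dict[str, list[str]]) -> bool:
    """True if any item was added, removed or modified."""
    return any(diff.values())


# Example with made-up paths and hashes.
clean = {r"C:\Windows\notepad.exe": "aa11", r"HKLM\Software\Run": "bb22"}
later = {r"C:\Windows\notepad.exe": "aa11", r"HKLM\Software\Run": "cc33",
         r"C:\Users\test\dropper.exe": "dd44"}
print(diff_snapshots(clean, later))
```

Because the comparison uses only the state of the machine, it works the same way whether the protection comes from locally installed software or from an external appliance.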

Settings

In the consumer test-series, all product settings are left at their default values. In the enterprise test-series, setting changes are allowed and documented inside the reports. The products have unrestricted cloud access throughout the test. Before the test proper is run, all products will be tested to ensure that they are correctly configured and functioning properly. Additionally, we employ tools to check and log whether the connection to the product’s cloud service (if there is one) is working properly.

If user interactions are requested by a product, we always choose “Allow” or equivalent. If the product protects the system anyway, we count the malware as blocked, even though we allow the program to run when the user is asked to make a decision. If the system then becomes compromised, we count it as user-dependent. We consider “protection” to mean that the system is not compromised. This means that the malware is not running (or is removed/terminated) and there are no system changes. We do not consider an outbound-firewall alert about a running malware process, which asks whether or not to block traffic from the user’s workstation to the Internet, to be protection.

Sources and numbers of test cases

We aim to use visible and relevant malicious websites/malware that are currently active and present a risk to ordinary users. We try to include as many in-the-wild exploits as possible. These are usually well covered by almost all major security products, which may be one reason why the scores look relatively high, besides the fact that on a fully patched Windows system there are not many different exploits online to test against. The rest are URLs that point directly to malware executables; this causes the malware file to be downloaded, thus replicating a scenario in which the user is tricked by social engineering into following links in spam mails or websites, or installing a Trojan or other malicious software.

We use our own crawling system to search continuously for malicious sites and extract malicious URLs (including spammed malicious links). We also search manually for malicious URLs. In the rare event that our in-house methods do not find enough valid malicious URLs on a given day, we draw on external researchers whom we have contracted to provide additional malicious URLs (initially for the exclusive use of AV-Comparatives), and we look for additional (re)sources.

In this kind of testing, it is very important to use enough test cases. If an insufficient number of samples is used in comparative tests, differences in results may not indicate actual differences in protective capabilities among the tested products. Our tests use many more test cases (samples) per product and month than any similar test performed by other testing labs. Because of the higher statistical significance this achieves, we consider all the products in each results cluster to be equally effective, assuming that they have a false-positive rate below the industry average.
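The effect of sample size can be illustrated with a confidence interval around a measured protection rate. The sketch below uses the Wilson score interval, a standard formula for binomial proportions; it is an assumption chosen for illustration, not the statistical method used to form the results clusters, and the counts are invented.

```python
import math


def wilson_interval(blocked: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a protection rate (z = 1.96 gives roughly 95%)."""
    if total <= 0 or not 0 <= blocked <= total:
        raise ValueError("need 0 <= blocked <= total and total > 0")
    p = blocked / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Invented counts: two products scoring 98% and 95% at two sample sizes.
for n in (100, 1000):
    a = wilson_interval(round(0.98 * n), n)
    b = wilson_interval(round(0.95 * n), n)
    print(n, a, b, "overlap" if intervals_overlap(a, b) else "separated")
```

With these invented numbers, the two intervals overlap at 100 test cases and separate at 1,000. Overlapping intervals are only a rough indicator, not a formal significance test, but they show why the same percentage gap is more meaningful when more test cases are used.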

Test procedure

Before browsing to each new malicious URL we update the programs/signatures (as described above). Our automated test procedure feeds one malicious URL to all test machines simultaneously, to help ensure the same testing conditions for each product. The evaluation process for each test case will recognise any variations among the malware files executed on each test machine. After the malware is executed (if not blocked before), we wait several minutes for malicious actions to occur, and also to give protection features such as behaviour blockers time to react and remediate actions performed by the malware. If the malware is not detected and the system is indeed infected/compromised, the process goes to “System Compromised”. Otherwise the product is deemed to have protected the test system, unless a user interaction is required. In this case, if the worst possible decision by the user results in the system becoming compromised, we rate this as “user-dependent”. Where a tested product requests a user decision, we always select the option to let the program run (e.g. “Allow”). In cases where we do this, but the program is blocked anyway, the program is deemed to have protected the system. After each test case the machine is reset to its clean state.
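The rating rules in this procedure can be summarised as a small decision function. The following is a minimal sketch under the assumption that two facts are known at the end of the waiting period: whether the system ended up compromised, and whether the product asked the user for a decision (which is always answered with “Allow”). The type and function names are hypothetical and do not come from the testing framework.

```python
from enum import Enum


class Outcome(Enum):
    BLOCKED = "Blocked"
    USER_DEPENDENT = "User-dependent"
    COMPROMISED = "System Compromised"


def classify_test_case(system_compromised: bool, user_prompt_shown: bool) -> Outcome:
    """Rate one test case after the waiting period.

    system_compromised: final state of the machine, with "Allow" chosen
        at every prompt (the worst possible user decision).
    user_prompt_shown: whether the product asked the user to decide.
    """
    if not system_compromised:
        # Protected, even if the user was asked and "Allow" was chosen.
        return Outcome.BLOCKED
    if user_prompt_shown:
        # Compromised only because the user allowed the program to run.
        return Outcome.USER_DEPENDENT
    return Outcome.COMPROMISED


print(classify_test_case(system_compromised=True, user_prompt_shown=True).value)
```

Checking the final system state first reflects the rule that a prompt answered with “Allow” still counts as protection when the program is blocked anyway.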

False positives

The false-alarm test in the Whole-Product Dynamic “Real-World” Protection Test consists of two parts: wrongly blocked domains (while browsing) and wrongly blocked files (while downloading/installing). It is necessary to test both scenarios because testing only one of the two could penalize products which focus mainly on one type of protection method, either URL filtering or on-access/behaviour/reputation-based file protection.

a) Wrongly blocked domains (while browsing)

We use around one thousand randomly chosen popular domains. Blocked non-malicious domains/URLs are counted as false positives (FPs).

b) Wrongly blocked files (while downloading/installing)

We use about one hundred different applications listed either as top downloads or as new/recommended downloads from ten different popular download portals. The applications are downloaded from the original software developers’ websites (instead of the download portal host), saved to disk and installed to see if they are blocked at any stage of this procedure. Additionally, we include a few clean files that were encountered and disputed over the past months of the Real-World Protection Test.
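As a rough sketch of how the two parts could be tallied for one product, the example below counts each wrongly blocked clean domain once and counts a clean application as a false positive if it is blocked at any stage (download, save or install). The record layout and names are hypothetical, the sample data is invented, and the sketch does not describe how the two counts are weighted in the published results.

```python
from dataclasses import dataclass


@dataclass
class FileCheck:
    """Result of handling one clean application (hypothetical record)."""
    name: str
    download_blocked: bool = False
    save_blocked: bool = False
    install_blocked: bool = False

    def wrongly_blocked(self) -> bool:
        # Blocked at any stage of the procedure counts as a false positive.
        return self.download_blocked or self.save_blocked or self.install_blocked


def count_false_positives(blocked_clean_domains: list[str],
                          file_checks: list[FileCheck]) -> dict[str, int]:
    """Tally both parts of the false-alarm test for one product."""
    return {
        "domains": len(set(blocked_clean_domains)),
        "files": sum(1 for check in file_checks if check.wrongly_blocked()),
    }


# Invented sample data.
checks = [
    FileCheck("editor-setup.exe"),
    FileCheck("player-setup.exe", install_blocked=True),
    FileCheck("archiver-setup.exe", download_blocked=True, save_blocked=True),
]
print(count_false_positives(["example.org"], checks))
```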

Summary

Detection	Yes
False Positives	Yes
Cloud connectivity	Yes
Updates allowed	Yes
Default configuration	Yes (for consumer products)