Retrospective / Proactive Test 2012

Date	March 2012
Language	English
Last Revision	July 19th 2012

Heuristic and behavioural protection against new/unknown malicious software

Release date	2012-07-20
Revision date	2012-07-19
Test Period	March 2012
Number of Testcases	4138
Online with cloud connectivity
Update allowed
False Alarm Test included
Platform/OS	Microsoft Windows


Introduction

Many new viruses and other types of malware appear every day, which is why it is important that antivirus products not only provide new updates, as frequently and as quickly as possible, but also that they are able to detect such threats in advance (preferably without having to execute them or contact the cloud) with generic/heuristic techniques; failing that, with behavioural protection measures. Even if nowadays most antivirus products provide daily, hourly or cloud updates, without proactive methods there is always a time-frame where the user is not reliably protected.

The data shows how good the proactive heuristic/generic detection capabilities of the scanners were at detecting the new threats (sometimes called zero-hour threats by others) used in this test. By design and scope of the test, only the heuristic/generic detection capability and behavioural protection capabilities (on-execution) were tested (offline). Additional protection technologies (which are dependent on cloud-connectivity) are considered by AV-Comparatives in e.g. whole-product dynamic (“real-world”) protection tests and other tests, but are outside the scope of retrospective tests.

This test report is the second part of the March 2012 test. The report is delivered in late July due to the large amount of work required for deeper analysis, preparation and dynamic execution of the retrospective test-set. This year the test is performed only once, but it also includes a behavioural protection element.

The products used the same updates and signatures they had on the 1st March 2012, and the same detection settings as used in March (see page 5 of this report) were used for the heuristic detection part. In the behavioural test we used the default settings. This test shows the proactive detection and protection capabilities that the products had at that time. We used 4,138 new malware variants that appeared around the 2nd March 2012. The following products were tested:

Tested Products

Test Procedure

What about the cloud? Even in June (months later), many of the malware samples used were still not detected by certain products which rely heavily on the cloud. Consequently, we consider it a marketing excuse if retrospective tests – which test the proactive detection against new malware – are criticized for not being allowed to use cloud resources. This is especially true considering that in many corporate environments the cloud connection is disabled by the company policy, and the detection of new malware coming into the company often has to be provided (or is supposed to be provided) by other product features. Clouds are very (economically) convenient for security software vendors and allow the collection and processing of large amounts of data. However, in most cases (not all) they still rely on blacklisting known malware, i.e. if a file is completely new/unknown, the cloud will usually not be able to determine if it is good or malicious.

There are two major changes in this test relative to our previous proactive tests. Firstly, because of the frequency of updates now provided by the vendors, the window between malware appearing and a signature being provided by the vendor is much shorter. Consequently we collected malware over a shorter period (~1 day), and the test scores are correspondingly higher than in earlier tests. Secondly, we have introduced a second (optional) element to the test: behavioural protection. In this, any malware samples not detected in the scan test are executed, and the results observed. A participating product has the opportunity to increase its overall score by blocking the malware on/after execution, using behavioural monitoring. The following vendors asked to be included in the new behavioural test: Avast, AVG, AVIRA, BitDefender, ESET, F-Secure, G DATA, GFI, Kaspersky, Panda and PC Tools. The results published in this report show results for all programs for the scan test, plus any additional protection by those products participating in the behavioural test. Although it was a lot of work, we received good feedback from various vendors, as they were able to find bugs and areas for improvement in the behavioural routines.

AV-Comparatives prefer to test with default settings. Almost all products nowadays run by default with highest protection settings or switch automatically to highest settings in the event of a detected infection. Due to this, in order to get comparable results for the heuristic detection part, we also set the few remaining products to highest settings (or leave them at default settings) in accordance with the respective vendor's wishes. In the behavioural protection part, we tested ALL products with DEFAULT settings. Below are notes about the settings used for some products (scan of all files etc. is always enabled):

F-Secure: asked to be tested and awarded based on their default settings (i.e. without using their advanced heuristics).
AVG, AVIRA: asked us not to enable/consider the informational warnings of packers as detections. Because of this, we did not count them as such.
Avast, AVIRA, Kaspersky: the heuristic detection test was done with heuristics set to high/advanced.

Testcases

This time, the retrospective test-set included only new malware that was seen in the field and was prevalent in the few days after the last update in March. Additionally, we took care to include malware samples which belong to different clusters and which appeared in the field only after the freezing date. Due to the use of only one sample per malware variant and the shortened period (~1 day) of new samples, the detection rates are higher than in previous tests. We adapted the award system accordingly.

Samples which were not detected by the heuristic/generic on-demand/on-access detection of the products were then executed in order to see if they would be blocked using behaviour-analysis features. As can be seen in the results, in at least half of the products the behaviour analyser (if present at all) did not provide much additional protection. Good heuristic/generic detection remains one of the core components of protection against new malware. In several cases, we observed behaviour analysers only warning about detected threats without taking any action, or alerting to some dropped malware components or system changes without protecting against all malicious actions performed by the malware. If only some dropped files or system changes were detected/blocked, but not the main file that showed the behaviour, it was not counted as a block.

As behaviour analysis only comes into play after the malware is executed, a certain risk of getting compromised remains (even when the security product claims to have blocked/removed the threat). Therefore, it is preferable that malware gets detected before it gets executed, e.g. by the on-access scanner using heuristics (this is also one of the reasons for the different thresholds on the next page). Behaviour analysers/blockers should be considered a complement to the other features inside a security product (multi-layer protection), and not a replacement.
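The counting rule for executed samples described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the tooling used in the test; the `Observation` class, its field names and the function `counts_as_protected` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """What was observed for one sample (all field names are hypothetical)."""
    detected_by_scan: bool            # caught by heuristic/generic detection before execution
    main_file_blocked: bool           # behaviour blocker stopped the main malicious file
    dropped_components_blocked: bool  # only dropped files or system changes were blocked


def counts_as_protected(obs: Observation) -> bool:
    """Return True if the sample counts as detected or blocked under the stated rule."""
    if obs.detected_by_scan:
        # Detected before execution: no behavioural stage is needed.
        return True
    # After execution, only blocking the main file counts; blocking some
    # dropped files or system changes alone is not counted as a block.
    return obs.main_file_blocked
```

For example, a sample that slipped past the scan and for which only a dropped component was blocked would not count as protected under this sketch.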

Ranking System

The awards are given by the testers after consulting a number of statistical methods, including hierarchical clustering. We based our decisions on the following scheme:

	Proactive Protection Rates
	Under 50%	Cluster 3	Cluster 2	Cluster 1
None - Few FPs	TESTED	STANDARD	ADVANCED	ADVANCED+
Many FPs	TESTED	TESTED	STANDARD	ADVANCED
Very many FPs	TESTED	TESTED	TESTED	STANDARD
Crazy many FPs	TESTED	TESTED	TESTED	TESTED
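Read as a lookup table, the scheme above maps a protection-rate column and a false-positive level to an award. The following Python sketch encodes the table as printed; the names `AWARDS`, `COLUMNS` and `award` are hypothetical, and the final awards in the report are decided by the testers rather than computed mechanically, so treat this as a reading aid only.

```python
# Columns of the scheme, left to right: under 50%, then clusters 3, 2 and 1.
COLUMNS = ["under 50%", "3", "2", "1"]

# Rows of the scheme, keyed by false-positive level (labels as in the table).
AWARDS = {
    "none-few":   ["TESTED", "STANDARD", "ADVANCED", "ADVANCED+"],
    "many":       ["TESTED", "TESTED", "STANDARD", "ADVANCED"],
    "very many":  ["TESTED", "TESTED", "TESTED", "STANDARD"],
    "crazy many": ["TESTED", "TESTED", "TESTED", "TESTED"],
}


def award(rate_column: str, fp_level: str) -> str:
    """Look up the award for a given rate column and false-positive level."""
    return AWARDS[fp_level][COLUMNS.index(rate_column)]
```

Under this reading, `award("1", "many")` gives `"ADVANCED"`, one level below the `"ADVANCED+"` that the same cluster earns with none to few false positives.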

Test Results

To know how these antivirus products perform with updated signatures and cloud connection against prevalent malware files, please have a look at our File Detection Tests of March and September. To find out about real-life online protection rates provided by the various products, please have a look at our ongoing Whole-Product Dynamic “Real-World” Protection tests. Readers should look at the results and decide on the best product for them based on their individual needs. For example, laptop users who are worried about infection from e.g. infected flash drives whilst offline should pay particular attention to this Proactive test.

False Positive (False Alarm) Test Result

To better evaluate the proactive detection capabilities, the false-alarm rate has to be taken into account too. A false alarm (or false positive [FP]) occurs when an antivirus product flags an innocent file as infected. False alarms can sometimes cause as much trouble as real infections.

The false-alarm test results were already included in the March test report. For details, please read the report, False Alarm Test March 2012.

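1.	Microsoft	0	very few FPs (0-3)
2.	ESET	2	very few FPs (0-3)

3.	Bitdefender, F-Secure	4	few FPs (4-15)
4.	BullGuard	5	few FPs (4-15)
5.	Kaspersky	9	few FPs (4-15)
6.	Panda	10	few FPs (4-15)
7.	eScan	11	few FPs (4-15)
8.	G DATA	13	few FPs (4-15)
9.	Avast	14	few FPs (4-15)
10.	Avira	15	few FPs (4-15)

11.	Tencent	18	many FPs (over 15)
12.	PC Tools	22	many FPs (over 15)
13.	Fortinet	32	many FPs (over 15)
14.	AVG	38	many FPs (over 15)
15.	AhnLab	64	many FPs (over 15)
16.	GFI	79	many FPs (over 15)

17.	Qihoo	149	very many FPs (over 100)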
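The category ranges quoted in the list (0-3, 4-15, over 15, over 100) can be expressed as a small Python function. This is a sketch based only on those printed ranges; the function name `fp_category` is hypothetical, and a count of exactly 100 is treated here as "many" because the top category is stated as "over 100".

```python
def fp_category(count: int) -> str:
    """Map a false-alarm count to the category ranges quoted in the list."""
    if count < 0:
        raise ValueError("false-alarm count cannot be negative")
    if count <= 3:
        return "very few"   # 0-3
    if count <= 15:
        return "few"        # 4-15
    if count <= 100:
        return "many"       # over 15, up to and including 100
    return "very many"      # over 100
```

For instance, a count of 9 falls in the 4-15 range and is classed as "few", while a count of 149 is classed as "very many".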

Summary Result

The results show the proactive (generic/heuristic/behavioural) protection capabilities of the various products against new malware. The percentages are rounded to the nearest whole number.

Below you can see the proactive protection results over our set of new and prevalent malware files/families that appeared in the field (4,138 malware samples):

	Heuristic Detection	Heuristic + Behavioural Protection Rate[1]	False Alarms	Cluster
Kaspersky	90%	97%	few	1
BitDefender	82%	97%	few	1
Qihoo	95%	–	very many	1
F-Secure	82%	91%	few	1
G DATA	90%	90%	few	1
ESET	87%	87%	very few	1
Avast	77%	87%	few	1
Panda	75%	85%	few	1
AVIRA	84%	84%	few	1

AVG	77%	83%	many	2
BullGuard, eScan	82%	–	few	2
PC Tools	53%	82%	many	2
Microsoft	77%	–	very few	2
Tencent	75%	–	many	2
Fortinet	64%	–	many	2

GFI	51%	51%	many	3
AhnLab	47%	–	many	3

[1] User-dependent cases were given a half credit. Example: if a program blocks 80% of malware by itself, plus another 20% user-dependent, we give it 90% altogether, i.e. 80% + (20% x 0.5).
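The half-credit rule in the footnote can be written out as a short Python function. This is a minimal sketch of the arithmetic in the footnote's example; the function name `combined_rate` and its parameter names are hypothetical.

```python
def combined_rate(blocked_pct: float, user_dependent_pct: float) -> float:
    """Protection rate with user-dependent cases given half credit."""
    if blocked_pct < 0 or user_dependent_pct < 0:
        raise ValueError("percentages cannot be negative")
    if blocked_pct + user_dependent_pct > 100:
        raise ValueError("percentages cannot sum to more than 100")
    # Full credit for cases blocked by the product itself,
    # half credit for cases that depend on the user's decision.
    return blocked_pct + 0.5 * user_dependent_pct
```

With the footnote's figures, `combined_rate(80, 20)` evaluates to 90.0, matching 80% + (20% x 0.5).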

Award levels reached in this Heuristic / Behavioural Test

The following awards are for the results reached in the proactive/behavioural test, considering not only the protection rates against new malware, but also the false alarm rates:

AhnLab

* these products got lower awards due to false alarms

Notes

This test is an optional part of our public main test-series, that is to say, manufacturers can decide at the beginning of the year whether they want their respective products to be included in the test. The test is currently done as part of the public main-test series only if a minimum number of vendors choose to participate in it.

Microsoft security products are not included in the awards page, as their out-of-box protection is (optionally) included in the operating system and is currently considered out-of-competition.

Readers may be interested to see a summary and commentary of our test methodology which was published by PC Mag two years ago: http://securitywatch.pcmag.com/security-software/315053-can-your-antivirus-handle-a-zero-day-malware-attack

Copyright and Disclaimer

This publication is Copyright © 2012 by AV-Comparatives ®. Any use of the results, etc. in whole or in part, is ONLY permitted after the explicit written agreement of the management board of AV-Comparatives prior to any publication. AV-Comparatives and its testers cannot be held liable for any damage or loss, which might occur as result of, or in connection with, the use of the information provided in this paper. We take every possible care to ensure the correctness of the basic data, but a liability for the correctness of the test results cannot be taken by any representative of AV-Comparatives. We do not give any guarantee of the correctness, completeness, or suitability for a specific purpose of any of the information/content provided at any given time. No one else involved in creating, producing or delivering test results shall be liable for any indirect, special or consequential damage, or loss of profits, arising out of, or related to, the use or inability to use, the services provided by the website, test documents or any related data.

For more information about AV-Comparatives and the testing methodologies, please visit our website.

AV-Comparatives
(July 2012)