Real-World Protection Test August-November 2010
|Test Period||August - November 2010|
|Number of Testcases||1968|
|Online with cloud connectivity|
|False Alarm Test included|
The threat posed by malicious software is growing day by day. Not only is the number of malware programs increasing, but the very nature of the threats is also changing rapidly. The way in which harmful code gets onto computers is shifting from simple file-based methods to distribution via the Internet. Malware increasingly infects PCs through users who are deceived into, e.g., visiting infected web pages, installing rogue/malicious software or opening emails with malicious attachments.
The scope of protection offered by antivirus programs is extended by the inclusion of e.g. URL-blockers, content filtering, anti-phishing measures and user-friendly behavior-blockers. If these features are perfectly coordinated with the signature-based and heuristic detection, the protection provided against threats increases.
In spite of these new technologies, it remains very important that the signature-based and heuristic detection abilities of antivirus programs continue to be tested. It is precisely because of the new threats that signature/heuristic detection methods are becoming ever more important too. The growing frequency of zero-day attacks means that there is an increasing risk of malware infection. If the malware is not intercepted by “conventional” or “non-conventional” methods, the computer will be infected, and it is only by using an on-demand scan with signature and heuristic-based detection that the malware can be found, and hopefully removed. The additional protection technologies also offer no means of checking existing data stores for already-infected files, which can be found on the file servers of many companies. Those new security layers should be understood as an addition to good detection rates, not as a replacement.
In this test all features of the product contribute to protection, not only one part (such as signature/heuristic file scanning). The protection achieved should therefore be higher than when only parts of the product are tested. We recommend that all parts of a product provide strong detection, not only single components (e.g. URL blocking protects only while browsing the web, but not against malware introduced by other means or already present on the system).
The Whole-Product-Dynamic test is a project of AV-Comparatives and the University of Innsbruck, faculty of Computer Science and Quality Engineering. It is partially supported by the Austrian Government. Some details about the test process cannot be disclosed, as they could easily be misused by vendors to game the test systems.
The following products take part in the official Whole-Product-Dynamic main test-series. We may also test other products which are not part of the main test-series, but only separately and for a limited time period. In this type of test we usually include Internet Security Suites, although other product versions would also fit, because what is tested is the “protection” provided by the various products against a set of real-world threats. Main product versions used for the monthly test-runs:
|Product||August||September||October||November|
|Kingsoft Internet Security||2010||2010||2011||2011|
|Norman Security Suite Pro||8.0||8.0||8.0||8.0|
|PC Tools Internet Security||2010||2011||2011||2011|
Testing hundreds of URLs a day with dozens of antivirus programs adds up to thousands of URL tests, and only a high degree of automation makes this possible. This automation has been developed jointly by the Institute of Computer Science of the University of Innsbruck and AV-Comparatives.
Over the year we had to introduce several changes in the automated systems to circumvent and prevent attempts by some AV vendors to “game” the system, and to update/rewrite our tools due to unannounced changes in the security products which made it harder to create automated systems. Because of that, our whole-product-dynamic test started with some delay. We kindly ask vendors to inform us in advance of product changes which can affect automated testing systems.
Preparation for Test Series
Every antivirus program to be tested is installed on its own test computer (please note that the term “antivirus program” as used here may also mean a full Internet Security Suite). All computers are connected to the Internet, each with its own external IP address. The system is frozen, with the operating system and antivirus program installed.
The entire test is performed on real workstations. We do not use any kind of virtualization. Each workstation has its own internet connection with its own external IP. We have special agreements with several providers (failover clustering and not blocking any traffic) to ensure a stable internet connection. The tests are performed using a live internet connection. We took the necessary precautions (with specially configured firewalls, etc.) not to harm others (i.e. not to cause outbreaks).
Hardware and Software
For this test we used identical workstations, an IBM Bladecenter and network attached storage (NAS).
|Type||Vendor||Model||CPU||RAM||Hard disk|
|Workstations||Fujitsu||E3521 E85+||Intel Core2 Duo||4 GB||80 GB|
|Blades||IBM||LS200||AMD Dual Opteron||8 GB||76 GB|
|NAS||QNAP||TS-859U-RP||Atom Dual Core||1 GB||16 TB Raid 6|
The tests are performed under Windows XP SP3 with no further updates. Further installed (vulnerable) software includes:
|Adobe||Flash Player ActiveX||10.1||Microsoft||Internet Explorer||7|
|Adobe||Flash Player Plug-In||10||Microsoft||Office Professional||2003|
|Adobe||Acrobat Reader||8.0||Microsoft||.NET Framework||3.5|
We use every security suite with its default (out-of-the-box) settings. If user interaction is required, we choose the default option. Our whole-product dynamic test aims to simulate real-world conditions as experienced every day by users. Therefore, if there is no predefined action, we always use the same action where we consider the warning/message to be very clear and definitive. If the message leaves the decision up to the user, we mark the case as such, and if the message is very vague, misleading or even suggests trusting e.g. the malicious file/URL/behavior, we consider it a miss, as the ordinary user would. We count a case as protected if the system is not compromised. This means that the malware is not running (or is removed/terminated) and there are no significant/malicious system changes. An outbound-firewall alert about a running malware process, which asks whether to block traffic from the user’s workstation to the internet, is too little, too late and is not considered protection by us.
Preparation for every testing day
Every morning, any available antivirus software updates are downloaded and installed, and a new base image is made for that day. This ensures that even if the antivirus does not finish a bigger update during the day, it at least uses the morning’s updates, as would happen to a user in the real world.
Testing Cycle for each malicious URL
The first step is research. Our own crawler constantly searches the web for malicious sites. We are not focusing on zero-day malware/exploits (although it is possible that they are also present in the URL pool); we are looking for malicious websites that are currently out there and are a threat to ordinary users. Before browsing to each new malicious URL/test-case we update the programs/signatures. New major product versions are installed once a month, which is why each monthly report gives only the main product version number. Our test software starts monitoring the PC, so that any changes made by the malware are recorded. Furthermore, the detection algorithms check whether the antivirus program detects the malware. After each test case the machine is reverted to its clean state.
Security products should protect the user’s PC. It is not very important at which stage the protection takes place. This can be while browsing to the website (e.g. protection through a URL blocker), while an exploit tries to run, while the file is being downloaded/created, or while the malware is executed (either by the exploit or by the user). After the malware is executed (if not blocked before), we wait several minutes for malicious actions and also to give e.g. behavior-blockers time to react and remedy actions performed by the malware. If the malware is not detected and the system is indeed infected/compromised, the process goes to “Malware Not Detected”. If user interaction is required and it is up to the user to decide whether something is malicious, and the system gets compromised in the case of the worst user decision, we rate this as “user-dependent”. Because of that, the yellow bars in the results graph can be interpreted either as protected or not protected (it’s up to the user).
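The rating rule described above can be sketched in a few lines of Python. This is only an illustration of the three outcomes named in the text; the names `Outcome` and `rate_test_case` and the two boolean inputs are assumptions made for this sketch and are not part of the actual test framework.

```python
from enum import Enum


class Outcome(Enum):
    BLOCKED = "blocked"
    USER_DEPENDENT = "user-dependent"
    COMPROMISED = "compromised"  # "Malware Not Detected" in the text


def rate_test_case(user_prompted: bool, compromised_worst_case: bool) -> Outcome:
    """Rate one test case (hypothetical helper, not the real test tool).

    user_prompted: the product left the decision to the user.
    compromised_worst_case: the system ends up compromised, assuming the
        worst possible user decision whenever the user was prompted.
    """
    if not compromised_worst_case:
        # Malware not running and no significant system changes.
        return Outcome.BLOCKED
    if user_prompted:
        # Protection depends on what the user decides.
        return Outcome.USER_DEPENDENT
    return Outcome.COMPROMISED


print(rate_test_case(user_prompted=True, compromised_worst_case=True).value)
```

A case is only ever rated “user-dependent” when a prompt was shown and the worst answer leads to a compromise; a prompt whose worst answer is still harmless counts as blocked in this sketch.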
Due to the dynamic nature of the test, which mimics real-world conditions, and due to the way several different technologies work (like cloud scanners, reputation services, etc.), such tests cannot be repeated or replicated like e.g. static detection rate tests. Nevertheless, we try to log as much as reasonably possible to support our findings and results. Vendors are invited to provide useful logs inside their products which can provide them with the additional proof/data they want. Vendors were given one to two weeks after each testing month to dispute our conclusions about the compromised cases, so that we could recheck whether there were problems in the automation or in our analysis of the results.
In the case of cloud products, we only consider the results that the products had at the time of testing; sometimes the cloud services provided by the security vendors are down due to faults or maintenance by the vendors, but this is often not disclosed/communicated to the users by the vendors. This is also a reason why products relying too much on cloud services can be risky, as in such cases the security provided by the products can decrease significantly. Cloud signatures/detection/reputation should be implemented in the products to complement the other protection features (like local real-time scan detection and heuristics, behavior-blockers, etc.) and not replace them completely, as e.g. offline cloud services mean the PCs may be exposed to higher risks.
Source of test cases
We use our own crawling system to search continuously for malicious sites and extract malicious URLs (including spammed malicious links). We also research manually for malicious URLs. In case our in-house crawler does not find enough valid malicious URLs on a given day, we have contracted some external researchers to provide additional malicious URLs exclusively to AV-Comparatives. Although we have access to URLs shared between vendors and other public sources, we refrain from using these for the tests.
We are not focusing on zero-day exploits/malware, but on current and relevant malware that is out there and problematic to users. We try to include about 30-50% URLs pointing directly to malware, for example where the user is tricked by social engineering into following links in spam mails or on websites, or into installing some Trojan or other rogue software. The rest (the bigger part) were exploits / drive-by downloads. Those seem to be usually well covered by security products.
In this kind of testing, it is very important to use enough test cases. If an insufficient number of samples is used in comparative tests, differences in results may not indicate actual differences among the tested products.
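To illustrate why the number of test cases matters, the sketch below computes a 95% Wilson score interval for a protection rate. The report does not say which statistical models were used, so the Wilson interval is only one common choice assumed here, and the counts (49 of 50 and 1929 of 1968 cases blocked) are hypothetical, not results from this test.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a proportion (z = 1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts: roughly the same 98% rate on a small and a large test-set.
small_low, small_high = wilson_interval(49, 50)
large_low, large_high = wilson_interval(1929, 1968)

print(f"n=50:   {small_low:.3f} .. {small_high:.3f}")
print(f"n=1968: {large_low:.3f} .. {large_high:.3f}")
```

With the small test-set the interval is several times wider, so two products a few percentage points apart could not be told apart reliably; with the larger set the same difference becomes meaningful.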
Below you can see an overview of the individual testing months:
August 2010 – 304 test cases
September 2010 – 702 test cases
October 2010 – 454 test cases
November 2010 – 508 test cases
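As a quick consistency check, the monthly figures above add up to the total of 1968 test cases stated for the whole period. The dictionary name below is chosen only for this sketch.

```python
# Monthly test-case counts as listed in the overview above.
monthly_test_cases = {
    "August 2010": 304,
    "September 2010": 702,
    "October 2010": 454,
    "November 2010": 508,
}

total_test_cases = sum(monthly_test_cases.values())
print(total_test_cases)  # 1968
```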
We deliberately do not give exact per-product numbers for the single months in this report, to prevent small differences of 1-2 cases from being misused to claim that one product is better than another in a given month and test-set size. We give the total numbers in the summary, where the test-set is bigger and more significant differences may be observed.
False Positive (False Alarm) Test Result
The false alarm test in the Whole-Product-Dynamic test consists of two parts:
- False Alarms on domains (while browsing)
- False Alarms on files (while downloading/installing)
It is necessary to test both scenarios because testing only one of the two above cases could penalize products which focus mainly on one type of protection method, either e.g. URL/reputation-filtering or e.g. on-access/behavior/reputation-based file protection.
False Alarms on domains (while browsing)
For this False Alarm test we used domains listed in the Google Top1000 sites list of August 2010. We tested against those domains twice, in September and in October. Non-malicious domains which were blocked at any time (either September or October) were counted as FPs (as they should never have been blocked). All of the affected websites are among the most popular websites on the web (ranked on Alexa between place ~300 and ~3000 worldwide).
The affected domains have been reported to the respective vendors for review and are no longer blocked. We do not display the domains, as we do not know whether they will remain clean in future (and we also want to avoid giving publicity to those domains).
By blocking whole domains, as in these cases, the security products cause potential financial damage (besides the damage to the website’s reputation) to the domain owners, including loss of e.g. ad revenue. For this reason, we strongly recommend that vendors block whole domains only where the domain’s sole purpose is to carry/deliver malicious code, and otherwise block just the malicious pages (as long as they are indeed malicious).
Of the tested vendors, the following had FPs on the tested domains during the testing period:
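The counting rule for domain false alarms (blocked in either test round counts once) can be sketched as a set union. The function name and the reserved example domains below are placeholders assumed for illustration; they are not the domains or tools used in the test.

```python
def domain_false_positives(blocked_september, blocked_october, clean_domains):
    """Return the clean domains that were blocked in at least one round."""
    blocked_in_any_round = set(blocked_september) | set(blocked_october)
    # Only domains known to be non-malicious can be false positives.
    return blocked_in_any_round & set(clean_domains)


# Placeholder data using reserved example domains.
clean = {"example.com", "example.org", "example.net"}
september = ["example.com"]
october = ["example.com", "example.net"]

fps = domain_false_positives(september, october, clean)
print(sorted(fps), len(fps))
```

A domain blocked in both rounds is counted only once, which matches the "at any time" wording of the rule.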
|G DATA||1 FP|
|Trend Micro||4 FP|
A few more websites were blocked by various products, but were not counted as FPs this time. Those cases were mainly websites or download portals that still host/promote some adware, unlicensed software, etc. Many products continue to block websites even when they are no longer malicious and have been clean for some time. This also happens with popular websites, but of course even more with less popular/prevalent websites, with the risk of turning security products into web-censoring tools which go too far in blocking websites (based on what the security vendor considers a risk or potentially unwanted content for the user). It would be much better if the product blocked access only to the malicious part/file instead of a whole website/domain which is not malicious in itself (e.g. not containing any drive-by downloads/exploits), unless the user wants and enables e.g. a parental control setting or similar. Products which tend to block URLs based e.g. on reputation may be more prone to this and may also score higher in protection tests, as they block many unpopular and strange-looking websites. A further option for future FP testing could be to use URLs which are discarded as clean or down during malware testing.
At the moment the AV industry is discussing when and under which circumstances a blocked website which is not, or is no longer, malicious in itself can be considered a “false alarm”, as opinions vary even among vendors. We will look at the outcome of that discussion and take it into account if it also makes sense from a user perspective.
False Alarms on files (while downloading/installing)
For this False Alarm test we used software listed as Top 100 Freeware downloads in 2010 on a popular German download portal (http://www.pcwelt.de). We may change the download portals and clean-site sources used every time (and maybe also no longer disclose which portals/lists were used), to prevent vendors from focusing on whitelisting/training mainly against those sites/lists/sources.
We tested with only 100 applications, as this test was done manually in order to install the programs completely and also use them afterwards to see if they get blocked. We may also automate this type of FP testing in order to get bigger test-sets in future.
None of the products had a false alarm on those very popular applications. There were some firewall alerts, but as we do not consider firewall alerts (for programs trying to access the internet) as protection in the dynamic tests, we also do not count them as FPs. It would be surprising to encounter FPs on very popular software with well-known and internationally used AV products, especially as the test is done at one point in time and FPs on very popular software are noticed and fixed within a few hours. It would probably make more sense to test against lower-prevalence software. We also happened to observe some FPs on less popular software/websites, but have not included them this time, as vendors do not see them as a big issue. If you think differently, please let them (not us, as we already know!) know what you as a user think about FPs that happen to you on less popular files and websites.
As we do not yet have much experience with FP rates in Whole-Product-Dynamic testing, we are not considering the FPs in the awards this time, but we may give lower awards to products which have FPs (or many user interactions) in future Whole-Product-Dynamic Tests.
Test period: August – November 2010 (1968 Test cases)
The graph below shows the overall protection rate (all samples), including the minimum and maximum protection rates for the individual months.
|Blocked||User dependent||Compromised||Protection Rate [Blocked % + (User dependent % / 2)]|
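The protection rate formula in the table header above counts user-dependent cases as half protected. A minimal sketch of that calculation follows; the function name and the example counts are hypothetical and do not correspond to any tested product.

```python
def protection_rate(blocked: int, user_dependent: int, compromised: int) -> float:
    """Protection rate in percent: Blocked % + (User dependent % / 2)."""
    total = blocked + user_dependent + compromised
    if total == 0:
        raise ValueError("at least one test case is required")
    blocked_pct = 100 * blocked / total
    user_dependent_pct = 100 * user_dependent / total
    return blocked_pct + user_dependent_pct / 2


# Hypothetical counts that add up to the 1968 test cases of this period.
rate = protection_rate(blocked=1900, user_dependent=40, compromised=28)
print(f"{rate:.1f}%")
```

Halving the user-dependent share reflects that such cases can end either way, depending on the user's decision.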
Award levels reached in this Real-World Protection Test
We provide a 3-level ranking system (STANDARD, ADVANCED and ADVANCED+). Overviews of levels reached in previous main tests can be found on our website. The awards are decided and given by the testers based on the observed test results (after consulting statistical models). The following certification levels are for the results reached in the Whole-Product-Dynamic Test:
Most operating systems already include their own firewalls and automatic updates, and may even ask users whether they really want to download or execute a file, warning that doing so can be dangerous. Mail clients and webmail services include spam filters too. Furthermore, most browsers include pop-up blockers, phishing/URL filters and the possibility to remove cookies. Those are just some of the built-in protections, but despite all of them, systems can get infected anyway. The reason for this is in most cases the ordinary user, who may get tricked by social engineering into visiting malicious websites or installing malicious software.
Users expect a security product not to ask them whether they really want to execute a file etc., but to protect the system in any case without them having to think about it, and regardless of what they do (i.e. executing unknown files / malware). We try to keep in mind the interests of the users and deliver good and easy-to-read test reports. We are continuously working on further improving our automated systems to deliver a better overview of product capabilities.
Copyright and Disclaimer
This publication is Copyright © 2010 by AV-Comparatives ®. Any use of the results, etc. in whole or in part, is ONLY permitted after the explicit written agreement of the management board of AV-Comparatives prior to any publication. AV-Comparatives and its testers cannot be held liable for any damage or loss, which might occur as result of, or in connection with, the use of the information provided in this paper. We take every possible care to ensure the correctness of the basic data, but a liability for the correctness of the test results cannot be taken by any representative of AV-Comparatives. We do not give any guarantee of the correctness, completeness, or suitability for a specific purpose of any of the information/content provided at any given time. No one else involved in creating, producing or delivering test results shall be liable for any indirect, special or consequential damage, or loss of profits, arising out of, or related to, the use or inability to use, the services provided by the website, test documents or any related data.
For more information about AV-Comparatives and the testing methodologies, please visit our website.