I look forward to reading the statistics every year. How many records were stolen last year? What did it cost the victims? How did the attackers pull it off?
I love that organizations like Verizon Business and Ponemon Institute compile and publish all this data. It helps shed a lot of light on what’s going on in the information security world.
I’ll admit it now – I’m a bit of a stats guy. I look for patterns whenever I see data, and am always trying to make sense of the big picture. I eagerly dug into this year’s round of reports.
Almost as quickly as I got started reading, I realized something had changed. Of course the data is new this year, but I found it perplexing. It’s presented just the same way it’s been in the past, and the quality of the research and writing seems to be as good if not better than ever. It didn’t take me long to realize that what changed was my expectations.
After the year we had in 2011, with data breaches happening all around us almost every day, I half expected legions of hacktivists to jump off the pages and into my office. I thought we’d see reports that counted up the better part of a half-billion records stolen – in my head I can add up a solid 300 million from big public disclosures alone – and confirm 2011 as the biggest year ever for data theft.
Heck, a quick glance at the breaches reported via DataLossDB shows that the top 20 breaches worldwide (ranked by number of records breached) exceed 330 million records. In just the U.S., there were 11 individual breaches of over 1 million records each. Those 11 breaches alone totaled more than the 174 million reported in the Verizon Data Breach Investigations Report (DBIR).
The trusty calculator shows that those 11 breaches accounted for 175,161,416 records. Then there were another 19 U.S.-based breaches of between 100,000 and 1 million records each, tacking on another 5,754,299 records and giving us 180,915,715 records breached across just 30 U.S. breaches. You get the picture.
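For anyone who wants to check the calculator, here is a minimal sketch of that tally. The record counts are the ones quoted above; the variable names are mine, and Verizon's figure is the rounded 174 million from the DBIR, so the gap it prints is only approximate.

```python
# Record counts as quoted in this post (DataLossDB tallies for 2011).
us_breaches_over_1m = 175_161_416      # 11 U.S. breaches of over 1 million records each
us_breaches_100k_to_1m = 5_754_299     # 19 U.S. breaches of 100,000 to 1 million records

# Verizon DBIR total, rounded to the nearest million as reported.
verizon_dbir_total = 174_000_000

# Total across the 30 largest U.S. breaches.
us_top_30_total = us_breaches_over_1m + us_breaches_100k_to_1m

# How far the 11 biggest U.S. breaches alone exceed the DBIR count.
excess_over_dbir = us_breaches_over_1m - verizon_dbir_total

print(f"Top 30 U.S. breaches: {us_top_30_total:,} records")
print(f"11 largest U.S. breaches exceed the DBIR total by about {excess_over_dbir:,} records")
```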
And the victims’ costs: oh, how I want to understand a victim’s costs after a data breach. Is it really worth paying what it takes to be secure? Or do organizations see better returns if they do the bare minimum and expect to take their lumps now and again? With at least 20 well-publicized mega-breaches in 2011 (over 1M records lost at once), there has got to be some great case study material out there.
I did find quite a few gems among the data in the reports. Verizon mentions that 96-98% of all stolen records they looked at came out of databases. I guess that’s not a surprise, since pretty much all businesses store most of their data in databases, but still, those numbers are staggering.
Verizon also concluded that 97% of breaches were avoidable through simple controls, and that nearly all data is stolen by outside attackers who hack into the server infrastructure. Ponemon says the average data breach costs $5.5M, and notes that for the first time the per-record cost of a data breach has gone down (to just under $200/record). However, there is a catch.
The reports only cover the incidents they cover. Verizon counted 174 million records stolen in 2011, but we know from public disclosures (see above) that the real numbers were drastically higher. Ponemon’s data only covers 49 breaches (I’ve seen counts of over 800 for the year), and they’re all small breaches with fewer than 100,000 records stolen.
I don’t know what that tells us about the costs of a mega-breach – but if a small breach of fewer than 100,000 records costs $5.5M on average, it’s safe to assume that a big breach (say 20 million records) is going to cost at least a few times more.
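To put rough numbers on that, here is a back-of-the-envelope sketch using the two Ponemon figures quoted above. It rests on assumptions of mine: it rounds "just under $200" up to an even $200 per record, and it scales the per-record cost linearly to 20 million records, which the Ponemon sample of small breaches cannot confirm. Treat the result as a naive ceiling, not an estimate.

```python
# Ponemon figures as quoted in this post.
avg_breach_cost = 5_500_000    # average cost per breach, in dollars
cost_per_record = 200          # "just under $200"; rounded up here as an assumption

# Average breach size implied by those two figures.
implied_avg_records = avg_breach_cost / cost_per_record

# Hypothetical mega-breach, scaled linearly (an assumption, not a Ponemon finding).
mega_breach_records = 20_000_000
naive_mega_cost = mega_breach_records * cost_per_record
multiple_of_average = naive_mega_cost / avg_breach_cost

print(f"Implied average breach size: about {implied_avg_records:,.0f} records")
print(f"Naive linear cost of a 20M-record breach: ${naive_mega_cost:,}")
print(f"That is roughly {multiple_of_average:,.0f} times the average breach cost")
```

Per-record costs almost certainly fall as breaches get bigger, so the real figure probably lands somewhere between "a few times more" and this linear ceiling. The reports simply don't tell us where.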
And it’s not just me noticing. Others are finding disillusionment in the data too – but in different ways. I was shooting a video with Jack Daniel of Tenable the other day, right as the DBIR came out. He made some great points about bias in the data sets.
Verizon only gets data from the companies that can afford its services, and of those, only the companies that don’t choose someone else, perhaps someone more specialized. The same goes for Mandiant’s customers and its report – the data is heavily influenced by which victims hire which data breach investigators. I won’t try to put words in Jack’s mouth – he has written down his own thoughts elsewhere.
At the end of the day, these reports are important. They provide much needed insight into at least some data breaches. But we have to accept that this isn't the U.S. Census.
The data represents a small, non-random sample that we can't count on to be indicative of the whole data breach story. But these are data points: solid, well-researched data points.
We must learn what we can from them without becoming hypnotized by the hype that can surround them.