- Storage is getting cheaper each year, which has made it easy for companies to capture terabytes (and sometimes even petabytes) of data related to their customers, suppliers, and operations. For example, on social networking sites like Facebook, users have shared 30+ billion pieces of content.
- A spate of technologies has been introduced in the last few years to help both manage and analyze large data sets; many of these technologies are open source, which has led to their increased adoption and proliferation. These technologies are designed to overcome the limitations of traditional database systems, which can buckle under the weight of a tremendous load of data.
In a previous blog post, I discussed the role of large-scale data analytics in the context of endpoint malware defense. In this post, I’ll take a step back and make the broader claim that the IT security industry as a whole also needs to utilize big data technologies. Defending a network requires one to analyze large amounts of data:
- Network and security devices generate continuously growing amounts of log and event data that contain insights about potentially malicious activity. Hard drive capacities continue to grow at exponential rates while prices remain the same, which means that more of this data is being stored.
- The number of unique threats is growing each day. Several hundred million unique malware variants are seen each year and the vast majority of these are seen only once.
We are seeing concrete evidence of this trend. For example, according to the latest Verizon data breach report, organizations used their own event monitoring or log analysis tools only 6% of the time to discover that they had been compromised. The report states: “Many smaller organizations do not have the awareness, aptitude, funding, or technical support to [use these tools] with the sophistication of the threats they face… even large businesses seem to have a difficult time utilizing their investments for significant return.”
The security industry as a whole needs automated tools that can analyze data (including data generated by other tools) and provide actionable intelligence on the state of threats to its customers’ information assets.
Fortunately, this demand for analyzing large security-related datasets has been met with a supply of new, readily available technologies. For example, Hadoop, an open source implementation of the MapReduce framework for processing massive amounts of data, has been used in numerous areas such as social-media analysis and targeted marketing. More recently, security was mentioned as a “killer app” for Hadoop because it can use a “divide and conquer” strategy to automatically process the growing amount of security data in an enterprise.
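To make the “divide and conquer” idea a little more concrete, here is a minimal sketch in plain Python. It does not use Hadoop itself; it only imitates the map and reduce steps on a single machine. The log format (timestamp, source IP, action), the sample lines, and all function names are assumptions made up for illustration.

```python
from collections import Counter
from functools import reduce

# Hypothetical firewall log lines. The "timestamp source_ip action"
# layout is an assumption for this sketch, not a real product's format.
LOG_LINES = [
    "2012-05-01T10:00:01 10.0.0.5 DENY",
    "2012-05-01T10:00:02 10.0.0.7 ALLOW",
    "2012-05-01T10:00:03 10.0.0.5 DENY",
    "2012-05-01T10:00:04 10.0.0.9 DENY",
    "2012-05-01T10:00:05 10.0.0.5 ALLOW",
]


def split_into_chunks(lines, chunk_size):
    """Divide: cut the input into independent chunks."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def map_chunk(chunk):
    """Map: count denied connections per source IP within one chunk."""
    counts = Counter()
    for line in chunk:
        _timestamp, source_ip, action = line.split()
        if action == "DENY":
            counts[source_ip] += 1
    return counts


def reduce_counts(left, right):
    """Reduce: merge two partial counts into one."""
    return left + right


def count_denied(lines, chunk_size=2):
    """Run the map step on every chunk, then combine the partial results."""
    partials = [map_chunk(chunk) for chunk in split_into_chunks(lines, chunk_size)]
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    print(count_denied(LOG_LINES).most_common())
```

Because each chunk is processed without reference to any other, the map step could in principle run on many machines at once, with only the small partial counts being combined at the end. A framework like Hadoop takes care of that distribution; this sketch only shows the shape of the computation.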
At the same time, it is important to keep in mind that Hadoop is only a tool and not a complete solution in and of itself. In general, multiple tools and vendors are starting to emerge to form an ecosystem. It is our hope that the IT security industry can contribute to forming a more mature solution for security analysts to deal with their “data deluge.”
