"If a SIEM product cannot meet a few specific performance requirements, its underlying features, no matter how well implemented, become useless in day-to-day operations."
Discussions of Security Information and Event Management (SIEM) often focus on features, but it is important to remember that all features must support an overarching purpose or function. The two most important functions of SIEM are Incident Detection and Incident Response: one requires the simplification of large amounts of data; the other requires depth of detail and the addition of context. Many SIEMs fail to meet these basic expectations because both functions require better system performance than is commonly available. By setting performance baselines for event collection, concurrent analysis, and report response time, however, it is possible to establish the minimum requirements for a truly functional SIEM.
Ultimately, it all comes down to data. With the increasing proliferation of data networks, and the reliance on those networks for critical business applications, the amount of raw information that needs to be managed is already beyond the capabilities of most data processing systems—and the situation is getting worse. To meet the fundamental requirements of SIEM (incident detection and incident response), the first necessity is to develop a data processing architecture capable of meeting these scalability and performance requirements. At a minimum, a SIEM must be capable of all of the following, without exception:
The ability to access data and retrieve actionable intelligence, on all collected data, in "a timely manner."
A SIEM must be able to provide actionable intelligence on tens of millions of events per minute, to enable mitigation of detected incidents.
Once these basic requirements are met, additional features add value to the system: correlation of individual events to detect larger incidents; notification and workflow management; vulnerability and risk assessment; etc. However, all of these features add value only if the appropriate amount of observed network and device activity is being analyzed, and the results are achieved within an appropriate timeframe.
A SIEM needs to detect risks and threats to our infrastructures (Incident Detection) and then allow forensic investigations, in a timely manner, for the remediation of those incidents (Incident Response). Examining both of these key functions, it becomes clear that each is dependent upon the collection, storage, and analysis of data, with more value being realized as the amount and scope of data is increased.
Incident Detection is the first goal of SIEM, and once detection is broken down into its fundamental parts, it becomes clear that the amount of data monitored is critical to the process. It is therefore a strict requirement that all available data be supported, at the rates actually observed in the network, for a SIEM to provide any value at all. Incident Detection involves:
The end result of incident detection is the need to take further action. What is often overlooked is that hard- and soft-dollar costs, as well as potential exposure and liability concerns, continue to accumulate throughout the process of incident response. For example, certain vulnerabilities, such as DNS cache poisoning, take time to exploit. In the case of DNS cache poisoning, internal tests have shown that an average of four minutes is required for the exploit to succeed. While the result of successful DNS cache poisoning can be extremely detrimental to ongoing business operations, the exploit can be detected almost immediately, leaving a window of just under four minutes for an incident response team to act preventatively on that notification.
Tasks that might be taken include looking at the returned IP address, issued by the invalid DNS response, and proactively restricting traffic to or from that host.
Consider the following example. A SQL-slammer attack is detected by an intrusion prevention device, and a notification is sent to an administrator. He or she must use a SIEM to:
So, when is incident response fast enough? When it is faster than the attack.
This translates to the ability to perform multiple queries per minute on all collected information. This is required to fully detect attack vectors and to determine the propagation of an attack through the network (the above example would require a minimum of six event queries). Because flow information, event information, and log information are all needed to perform these types of investigations, the system must support high collection rates and, as a result, much larger total volumes of events. Despite the high volume of data that must be managed, high-performance analysis and reporting are strict requirements. Without the ability to meet these basic performance requirements, the system cannot perform these functions fast enough to allow mitigation of the threat.
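The arithmetic behind "faster than the attack" can be sketched as a simple time budget. The sketch below combines two figures from separate examples in the text as an assumption: the roughly four-minute window from the DNS cache poisoning case and the six-query minimum from the SQL Slammer case. The 30 seconds of analyst time per query is a purely hypothetical value chosen for illustration.

```python
def per_query_budget(window_seconds: float, queries: int,
                     overhead_seconds: float = 0.0) -> float:
    """Longest time each query may take if all queries must fit in the window."""
    if queries <= 0:
        raise ValueError("queries must be positive")
    usable = window_seconds - overhead_seconds
    if usable <= 0:
        return 0.0  # no time left for the SIEM at all
    return usable / queries


# Assumption: borrow the four-minute window and the six-query minimum
# from the two separate examples in the text.
WINDOW_SECONDS = 4 * 60
QUERIES = 6

# Upper bound if the analyst needed no time at all to read results.
ceiling = per_query_budget(WINDOW_SECONDS, QUERIES)

# Hypothetical: the analyst spends 30 seconds per query reading and deciding.
ANALYST_SECONDS_PER_QUERY = 30
realistic = per_query_budget(
    WINDOW_SECONDS, QUERIES, overhead_seconds=ANALYST_SECONDS_PER_QUERY * QUERIES
)

print(f"ceiling per query:   {ceiling:.0f} s")
print(f"realistic per query: {realistic:.0f} s")
```

Under these assumptions, every second of human decision time comes directly out of the time the SIEM has to answer, which is why query response measured in seconds rather than minutes matters.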
Looking at Incident Response outside of the previous example, it is clear that the immediate availability of as much relevant information as possible is important, because the relationships among events and activities can provide valuable context during the investigative process. However, the speed at which that data can be accessed is just as important. If cost, exposure, and liability can be minimized only when an incident is addressed in under four minutes, a SIEM that cannot return search results, queries, and other analytical requests within a few seconds holds little value in terms of incident response. Though often overlooked when evaluating a SIEM, incident response is an important and necessary function, taking over where Incident Detection ends:
Available database technologies are built upon a schema of relational tables. Each table consists of a series of columns and rows, with each column being an index and each row being a unique data record. As records are added, the number of rows increases. Because each network flow, log and event represents an added row to a database table, these tables grow in size rapidly when collecting data at the rates required for network security. Tables can also grow through the addition of indices: adding new columns for specific, parsed data such as an IP address, username, or event identifier.
As tables grow, either through the addition of indices or data records, the processing required to perform a table scan increases. Table scans are required to search collected data, making them the primary bottleneck in forensic operations. In fact, in a moderately indexed table (consisting only of date, time, event ID, description, source IP, and destination IP), a single scan of one million records can take several minutes. Scaling the database to ten million records can stretch response time to hours. This is because, for each search, every record in the table must be examined in sequence, pushing the limits of the host processor and memory.
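A minimal sketch of why a table scan scales with table size: every record is examined in sequence, so the work grows with the row count even when only one record matches. The record layout and the helper names `table_scan` and `make_rows` are hypothetical and are not drawn from any particular database product.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def table_scan(rows: List[Row],
               predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
    """Examine every row in order; return the matches and the rows examined."""
    examined = 0
    matches = []
    for row in rows:
        examined += 1  # each record is touched, whether it matches or not
        if predicate(row):
            matches.append(row)
    return matches, examined


def make_rows(n: int) -> List[Row]:
    """Build n illustrative event records with synthetic source IPs."""
    return [
        {"event_id": i, "src_ip": f"10.0.{(i // 256) % 256}.{i % 256}"}
        for i in range(n)
    ]


def wanted(row: Row) -> bool:
    return row["src_ip"] == "10.0.0.7"


small = make_rows(1_000)
large = make_rows(10_000)

matches_small, examined_small = table_scan(small, wanted)
matches_large, examined_large = table_scan(large, wanted)

# The same single record matches in both tables, but the larger table
# requires ten times as many row examinations to find it.
print(len(matches_small), examined_small)
print(len(matches_large), examined_large)
```

The sketch counts row examinations rather than measuring wall-clock time, since actual timings depend on hardware, storage, and the database engine.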
To avoid these performance issues, SIEMs, which require tables consisting of many millions or even billions of unique records, employ a few tricks. These include breaking data apart into separate tables, or even entirely separate databases, to limit the total number of records in any one table. This can improve performance somewhat for specific searches, but it limits the usefulness of broad correlation. For example, putting events in one database table and network flow information in another may allow each to scale better, but the underlying performance limitations are then compounded when performing conditional searches on both events and flows concurrently (in other words, when performing correlation). Another method is to reduce the total number of records stored at any given time by rolling older events and flows into an archive. Alternatively, the period of time over which queries are allowed may be limited to more manageable timeframes, such as 24 or 48 hours. This again improves performance under certain conditions, but compounds the problem when comparing new data to old, a common requirement in forensic investigations. Other methods include reducing the number of indices to shrink overall table size, in-memory analysis to remove database limitations entirely, and parallel processing (server clusters) to improve overall database performance.
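To illustrate why splitting events and flows into separate stores complicates correlation, the sketch below pairs events with flows from the same source IP within a time window. The two lists stand in for two separate tables; the field names (`id`, `ts`, `src_ip`), the 60-second window, and the sample records are all hypothetical. The point is only that a correlated search has to read both stores, so the cost of each is paid together.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def correlate(events: List[Record], flows: List[Record],
              window_seconds: int = 60) -> List[Tuple[str, str]]:
    """Pair each event with flows from the same source IP close in time."""
    # Pass over the flow store: group flows by source IP.
    flows_by_ip: Dict[str, List[Record]] = {}
    for flow in flows:
        flows_by_ip.setdefault(flow["src_ip"], []).append(flow)

    # Pass over the event store: compare each event to that host's flows.
    pairs = []
    for event in events:
        for flow in flows_by_ip.get(event["src_ip"], []):
            if abs(flow["ts"] - event["ts"]) <= window_seconds:
                pairs.append((event["id"], flow["id"]))
    return pairs


# Illustrative records only; timestamps are seconds from an arbitrary start.
events = [
    {"id": "e1", "ts": 100, "src_ip": "10.0.0.5"},
    {"id": "e2", "ts": 500, "src_ip": "10.0.0.9"},
]
flows = [
    {"id": "f1", "ts": 130, "src_ip": "10.0.0.5"},  # same host, 30 s apart
    {"id": "f2", "ts": 900, "src_ip": "10.0.0.5"},  # same host, too late
    {"id": "f3", "ts": 510, "src_ip": "10.0.0.8"},  # different host
]

pairs = correlate(events, flows)
print(pairs)
```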
While each solution helps to overcome specific limitations, each presents new problems (fewer searchable data points within the system, reduced historical analysis, added cost, and extensive maintenance) while failing to fully resolve the issue. Even with high-end server clusters, massive amounts of RAM, limited indexing, frequent archiving, and strict restrictions on searches, commercial databases simply can't keep up with the heavy requirements of data collection, storage, retrieval, and analysis imposed by the fundamental practice of incident detection and incident response.
To solve the need for greater scalability of the core database, and greater performance when retrieving data from that database, NitroSecurity developed NitroEDB: a data management engine that was designed from the start to support high-speed insertions, and fast access to stored data in very large datastores. With hundreds of man-years invested into research and development in a focused effort to accomplish these specific performance goals, NitroEDB is uniquely capable of extremely high performance—more than enough to overcome the inherent limitations that hinder other solutions. Applied to SIEM, NitroEDB enables NitroView Enterprise Security Manager (ESM) to: collect more information from more sources (more total events per second); store it for long periods of time (total events); and provide extremely fast access to all stored data, even over time (query response time). This translates to:
The ability to immediately access data collected over very long periods of time, performing multiple queries per minute on the billions of stored events, without impacting event collection and storage.
As a result, NitroView ESM is better equipped to:
Because more data is collected and managed concurrently, there is a greater ability to trace an event to some other relevant context (such as identity, physical location in the network, patch levels, authority, etc). Also, because the investigations occur very rapidly, it is possible to respond to an exploit or incident very quickly after the initial notification, reducing total risk, exposure and liability.
Without both the features necessary to perform rudimentary incident detection and response, and the basic performance characteristics needed to support data collection, correlation, and analysis, a SIEM cannot meet basic customer requirements. A solution that succeeds in one area but not in the other is effectively only "one half of a SIEM," and will provide little if any value to daily security operations. As a baseline, a mid-size company requires a SIEM capable of:
Without first meeting these base requirements, no SIEM is capable of its full potential. NitroSecurity's NitroView Enterprise Security Manager (ESM) provides a core architecture capable of exceeding these requirements, allowing for quick and easy access to actionable intelligence on all collected security event data.
| Vendor | Device | Daily Events | EPS | Qty | Total EPS |
| --- | --- | --- | --- | --- | --- |
| Netscreen | Firewall | 50,112 | 0.58 | 1 | 0.58 |
| Microsoft | Windows WMI | 250,560 | 0.29 | 10 | 2.90 |
| Cisco Pix | Firewall | 300,672 | 1.74 | 2 | 3.48 |
| Cisco ASA | Firewall / VPN | 400,032 | 4.63 | 1 | 4.63 |
| Cisco IOS | Switch/Router (ACL) | 207,360 | 0.12 | 20 | 2.40 |
| HP | Procurve Switch | 300,672 | 1.74 | 2 | 3.48 |
| Linux | Server | 103,680 | 0.06 | 20 | 1.20 |
| squid | Proxy | 50,112 | 0.29 | 2 | 0.58 |
| NitroGuard | IPS | 500,256 | 5.79 | 1 | 5.79 |
| Cisco | Catalyst Switch | 99,999,360 | 57.87 | 20 | 1,157.40 |
| TOTAL EVENTS | | 102,162,816 | | | 1,182 |
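The totals in the table can be checked with a short calculation, assuming that events per second (EPS) is simply daily events divided by the 86,400 seconds in a day. On that reading, the Daily Events column appears to be the combined count for all devices in the row, which is how the Total EPS figures work out. The variable names are illustrative only.

```python
SECONDS_PER_DAY = 86_400

# (vendor and device, combined daily events for the row), copied from the table.
daily_events = [
    ("Netscreen Firewall", 50_112),
    ("Microsoft Windows WMI", 250_560),
    ("Cisco Pix Firewall", 300_672),
    ("Cisco ASA Firewall / VPN", 400_032),
    ("Cisco IOS Switch/Router (ACL)", 207_360),
    ("HP Procurve Switch", 300_672),
    ("Linux Server", 103_680),
    ("squid Proxy", 50_112),
    ("NitroGuard IPS", 500_256),
    ("Cisco Catalyst Switch", 99_999_360),
]


def eps(events_per_day: int) -> float:
    """Average events per second for a given daily event count."""
    return events_per_day / SECONDS_PER_DAY


total_daily = sum(count for _, count in daily_events)
total_eps = eps(total_daily)

for name, count in daily_events:
    print(f"{name:32s} {count:>12,d} {eps(count):>10.2f}")
print(f"{'TOTAL':32s} {total_daily:>12,d} {total_eps:>10.2f}")
```

These are daily averages; the calculation says nothing about peak rates, which may be considerably higher than the average during an incident.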