Return on Investment (ROI) on SOC Operations – From Noise to Signal
by Guest Blogger Markus Malewski, Head of SOC/SIEM at ThyssenKrupp
A common question is how to measure the return on investment of a SOC. The ROI of a SOC is hard to define, and calculating it is more complex than calculating the ROI of an antivirus solution.
A SOC is generally not an investment that results in a profit. It is more about loss prevention!
We therefore have to look at it from a different point of view.
Security Information and Event Management (SIEM) systems help to analyse and identify gaps in the implementation of your IT security solutions. In combination with aligned processes, well-defined workflows, and security orchestration, automation and response (SOAR), a SIEM can help to improve your IT security program in several ways:
- Elevate IT security posture
- Enrich events with contextual information
- Improve reporting for risk and threat assessment
- Correlate different types of events across many vendors
- Elevate threat response and defense
- Enhance threat detection and fill gaps in existing solutions
From that point of view, you could add the benefits of your SOC to the ROI calculation of your existing solutions. The result should be better than the previous outcome, and that improvement would be the ROI of your SOC.
Not very satisfying? You are right!
ROSI approach – Return on Security Investment
ROSI compares the annual loss expectancy (ALE) with the expected loss saving. ALE is the product of the Annual Rate of Occurrence (ARO) and the Single Loss Expectancy (SLE). With ALE you can calculate the monetary loss reduction as the difference between the ALE without the security solution and the modified ALE (mALE) with the security solution in place. Since not every security solution provides 100% mitigation, you also have to apply the mitigation ratio to your ALE.
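To make the calculation concrete, here is a minimal sketch in Python. It assumes the commonly used form of ROSI, in which the loss reduction is set against the annual cost of the security solution. The cost term, the function names and all figures are illustrative assumptions, not values from a real SOC.

```python
def ale(aro: float, sle: float) -> float:
    """Annual loss expectancy = annual rate of occurrence x single loss expectancy."""
    return aro * sle


def modified_ale(aro: float, sle: float, mitigation_ratio: float) -> float:
    """ALE that remains once the solution mitigates a share of the loss (0.0 to 1.0)."""
    return ale(aro, sle) * (1.0 - mitigation_ratio)


def rosi(aro: float, sle: float, mitigation_ratio: float,
         annual_solution_cost: float) -> float:
    """Assumed ROSI form: (loss reduction - solution cost) / solution cost."""
    loss_reduction = ale(aro, sle) - modified_ale(aro, sle, mitigation_ratio)
    return (loss_reduction - annual_solution_cost) / annual_solution_cost


# Illustrative figures only: 4 incidents per year at 50,000 each,
# 75% mitigation, 100,000 per year for the solution.
print(ale(4, 50_000))                      # ALE without the solution
print(modified_ale(4, 50_000, 0.75))       # mALE with the solution
print(rosi(4, 50_000, 0.75, 100_000))      # ROSI as a ratio (0.5 means 50%)
```

A positive result means the expected loss reduction exceeds the cost of the solution. Because ARO, SLE and the mitigation ratio are all estimates, it is worth running the calculation with a range of values rather than a single set.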
Still, it is not that easy, and the approach has its limits, because the ROSI calculation is the result of many approximations. That’s annoying!
Following the Security Principles
The governance of your organisation has to answer the question “How much security is enough?”. The security principles must support the mission of the organisation and, with that, the concepts of effective security, including people, process and technology in terms of costs and benefits.
As the manager of the SOC, you should ask yourself these questions:
- Is my organization paying too much for its SOC?
- Is the SOC beneficial?
That means you have to calculate all costs incurred by a SOC and prove how you can save money by comparing your budget for people, process and technology with the capability and maturity of your SOC.
Infrastructure and environment costs
- SIEM / data lake solution
- Knowledge base & ticketing system
- Communication and collaboration tools
- Network infrastructure and connectivity
- Software Licenses
Personnel costs for
- Operations of environment
- Monitoring and Analytics
- Engineering and Architecture
And do not forget the additional costs for managing the portfolio, services, processes and people.
OK – what’s next?
From Noise to Signal
Without events and information it is hard to keep a SOC service running. In other words, “Your SOC can’t live without it!”.
Therefore let’s start with the process flow of your events. You should
- Estimate the costs of an event ingested into your existing solution (SIEM / data lake).
- Verify the meaning of your events with regard to your use cases.
- Figure out how beneficial the events are for your use cases.
- Be aware of the false positive ratio of your use cases, as verified by your analysts vs. your customers.
- Measure the time your analysts spend on initial event triage per ticket and use case.
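The measurements in this list can be sketched in a few lines of Python. This is only an illustration: the `Ticket` fields, the use case names and all numbers are hypothetical, and the cost per event is simplified to annual platform cost divided by annual event volume.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Ticket:
    use_case: str          # hypothetical use case name
    triage_minutes: float  # analyst time spent on initial triage
    false_positive: bool   # verdict after verification


def cost_per_event(annual_platform_cost: float, events_per_year: int) -> float:
    """Simplified ingestion cost: annual SIEM / data lake cost per ingested event."""
    return annual_platform_cost / events_per_year


def triage_stats(tickets: list[Ticket]) -> dict[str, dict[str, float]]:
    """False positive ratio and mean triage time, grouped by use case."""
    grouped: dict[str, list[Ticket]] = defaultdict(list)
    for ticket in tickets:
        grouped[ticket.use_case].append(ticket)
    return {
        use_case: {
            "tickets": len(group),
            "false_positive_ratio": sum(t.false_positive for t in group) / len(group),
            "mean_triage_minutes": mean(t.triage_minutes for t in group),
        }
        for use_case, group in grouped.items()
    }


# Illustrative data only.
tickets = [
    Ticket("brute_force", 10, True),
    Ticket("brute_force", 20, False),
    Ticket("malware_beacon", 30, False),
]
print(cost_per_event(500_000, 1_000_000_000))
for use_case, stats in triage_stats(tickets).items():
    print(use_case, stats)
```

Grouping by use case shows which use cases produce the most false positives and consume the most analyst time, which is the input you need for the improvement steps.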
In the next step you should define the points of improvement:
- Find a way to drop events that are useless for your use cases and investigations.
- Measure the costs of a SIEM vs. a data lake for holding events for threat intelligence and hunting (depending on the capability of your SOC).
- Assess the reasons for the false positive ratio of your use cases.
- Review the event triage process and check whether automation is beneficial.
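Two of these improvement points lend themselves to a rough estimate: the ingestion cost saved by dropping unused events, and the net benefit of automating part of the triage. The following Python sketch shows one way to do that. The `EventSource` structure, the source names, the cost per million events and the automation cost are all hypothetical assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class EventSource:
    name: str
    events_per_year: int
    use_cases: set[str] = field(default_factory=set)  # use cases that rely on this source
    needed_for_investigation: bool = False            # kept for hunting / forensics


def droppable_sources(sources: list[EventSource]) -> list[EventSource]:
    """Sources that feed no use case and are not needed for investigations."""
    return [s for s in sources if not s.use_cases and not s.needed_for_investigation]


def ingestion_saving(sources: list[EventSource], cost_per_million_events: float) -> float:
    """Annual ingestion cost avoided by dropping the droppable sources."""
    dropped = sum(s.events_per_year for s in droppable_sources(sources))
    return dropped / 1_000_000 * cost_per_million_events


def automation_net_benefit(tickets_per_year: int, minutes_saved_per_ticket: float,
                           analyst_cost_per_hour: float,
                           annual_automation_cost: float) -> float:
    """Analyst time saved in money terms, minus the cost of running the automation."""
    hours_saved = tickets_per_year * minutes_saved_per_ticket / 60
    return hours_saved * analyst_cost_per_hour - annual_automation_cost


# Illustrative figures only.
sources = [
    EventSource("firewall", 800_000_000, {"port_scan"}),
    EventSource("proxy", 300_000_000, set(), needed_for_investigation=True),
    EventSource("debug_logs", 200_000_000),
]
print([s.name for s in droppable_sources(sources)])
print(ingestion_saving(sources, cost_per_million_events=500.0))
print(automation_net_benefit(12_000, 5, 60.0, 20_000.0))
```

A negative net benefit suggests that automating this triage step would cost more than the analyst time it frees up, at least with the assumed figures.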
By the way, this article is not complete, and you are not limited to the approaches described here.
There is much more to explore, e.g. the computing resources for processing the events, the storage required for your backup and disaster recovery solution, and business continuity management for the immediate restoration of systems and services. But that would go beyond the scope of this article.
With this article I have tried to give you some hands-on starting points!