There are plenty of articles on security metrics, but this one, true to form, will be simple, practical and contrarian. Although according to Jennifer from securitymetrics.org: "I am not sure there is much contrarian in the post, other than that an random engineer will be better at security metrics than a security person. That may be worth a lightning". Apart from WTF is a lightning, maybe it is not contrarian and is just common sense, but it is definitely not as common as I would like to see. Read on...
I'm going to assume if you are here you know what a security metric is and how to write bad ones. So some quick lessons learned on preparation and then on to what I have found works.
Personnel - so who should work on this metrics project (and you must treat it as a project)? Definitely not a security person, not an awareness person, not a "metrics specialist", not a person who has 10 minutes spare every week - throw an engineer at it. Preferably one who is self-starting, can code, understands a variety of systems, and knows how to get information out of lots of disparate systems, stick it in a database and put a web front end on it. Or you could buy some software, but be very, very careful with this: even then I would still put an engineer on it, because they will need to integrate a large number of systems into the tool. Also put the spreadsheet down, put it down, don't make me hurt you! This data must go in a database of some sort with a web front end - it is the minimum requirement to be effective.
Now that you have a person (or have hired a good contractor), ACT! Do not have meetings, do not "talk to the business", do not "get requirements from management" (the answer is always earn more money, cut costs and reduce risk) - remember the line usually attributed to Henry Ford: "If I had asked people what they wanted, they would have said faster horses". You can do all this stuff later, just get one metric done properly end to end.
The crucial aspect: AUTOMATE, AUTOMATE, AUTOMATE - I cannot stress enough how important this is. I have lost count of the metrics programs I have seen where there is a cottage industry just to collect all these metrics each month, copy them from one place to another, one spreadsheet to another, summarize them and present them to management (who have no idea what they mean or what to do about them - more on that later). It is a mind-numbing, "I am going to burn this building down" type of task that never gets done well, because no one cares and it is rarely the main job of the person responsible or what they will be measured on.
Do not try to boil the ocean - start with one metric and add one every month or every 3 months. But make sure each metric is well defined, as I describe later, and completely automated from end to end. Do not at any stage compromise and say "we will start this manually and automate later" - it will never happen. Do not accept that something cannot be stored in an automated form, e.g. control self assessments or awareness - score it, put it in a database and measure it.
Fully automated means:
- Collection - there must be a system that is the single source of truth for the data, e.g. the antivirus console that talks to all its agents and knows all the engine and signature versions, or a patching tool that knows in near real time, or at least to within a day, what the patch versions of all the systems are. If you don't have this, stop and run a project to get it in place first, at least for one metric, before you proceed. If this data is already in a central repository like a data warehouse, central log repository or SIEM, you are in luck
- Query - the query of the collection system must be automated, e.g. by connecting to the SQL database, using a defined API or web service, or collecting a file from an FTP server or Windows share
- Aggregation and analysis - once queried the software must be able to put this data into a usable form without manual intervention. It must automatically be able to run the required rules, generate the percentages and scores and assign the right RAG colour
- Report - the monthly, quarterly, real time report etc must be generated automatically. Ideally this is simply a website that is updated in near real time with the data and allows daily, weekly, quarterly and yearly views. If you want to see a good example, refer to Google Analytics or even the Stats page on Blogger.com
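To make the four steps concrete, here is a minimal sketch in Python of one metric going from query to report with no human in the loop. Everything named here is an assumption for illustration: the in-memory SQLite database stands in for whatever your antivirus console exposes, the `av_agents` table and its columns are made up, and the 2% / 3% thresholds are placeholders.

```python
import sqlite3

# Hypothetical thresholds: Amber from 2%, Red from 3%.
AMBER_FROM = 2.0
RED_FROM = 3.0


def collect(conn):
    """Query step: pull every agent's lag from the source-of-truth system."""
    return conn.execute("SELECT host, days_behind FROM av_agents").fetchall()


def aggregate(rows, max_days_behind=1):
    """Aggregation step: % of hosts more than max_days_behind days behind."""
    if not rows:
        return 0.0
    stale = sum(1 for _host, days in rows if days > max_days_behind)
    return 100.0 * stale / len(rows)


def rag(percent):
    """Analysis step: map the percentage to a RAG colour."""
    if percent >= RED_FROM:
        return "Red"
    if percent >= AMBER_FROM:
        return "Amber"
    return "Green"


def report(name, percent):
    """Report step: one line that a web page or dashboard could render."""
    return f"{name}: {percent:.1f}% [{rag(percent)}]"


def demo_source():
    """Stand-in for the antivirus console database (in-memory, made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE av_agents (host TEXT, days_behind INTEGER)")
    rows = [(f"srv{i:03d}", 0) for i in range(97)]
    rows += [("srv097", 2), ("srv098", 5), ("srv099", 1)]
    conn.executemany("INSERT INTO av_agents VALUES (?, ?)", rows)
    return conn


if __name__ == "__main__":
    conn = demo_source()
    pct = aggregate(collect(conn))
    print(report("Windows servers >1 day behind on signatures", pct))
```

In a real setup the `collect` function would be the only part that knows about the source system, so adding the next metric mostly means writing one new query and scheduling it.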
Defining the metrics
- Measure relative - 18234 in April and 234 in May does not mean anything to anyone that matters. Just like pricing and sales, humans need something to compare a number against before it means anything or prompts them to act. Make sure that any metric is expressed as a percentage (%), preferably limited to 0 - 100. Where you cannot, or it does not make sense to express it as a %, the RAG becomes even more important
- Define RAG - agree at the outset which values make the metric Red, Amber or Green
- Define actions - for Amber and Red there must be a clear action defined at the outset, a clear owner responsible for that and when the action would be completed so that the metric becomes Green. This is crucial
- Provide trending - again linked to relative: human beings are pattern machines (ask any technical stock trader), we see patterns everywhere - so provide them! Provide the past periods as comparison and use infographics; it is amazing how oddities will jump out even from a lot of information
- Twitter length - a really convoluted metric is not useful for anyone - if you can't say it in 140 characters maybe it is 2 metrics
- Encourage competition - we are competitive animals, especially guys in IT. So provide a metric that shows the Windows desktop team is beating the Windows server team in patching, or that Germany is beating the UK region, and you will provide the best incentive to improve. Publish the leaders and the laggards for the month, name and shame! Measure and contrast both the internal IT and any outsourced vendors and Cloud IT (just put it in the contracts that they must provide you with these data feeds - do not use reports).
- Drill down - You must be able to aggregate up to a top level number and drill down by a specific area, e.g. IT silo, region etc - basically following however you have organized yourselves, down to a relatively senior person. This should be relatively easy if you have a relational database with some tagging.
- Format - The key items are the control, the value, the RAG status and the action owner
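The drill down and competition points are where the relational database with some tagging earns its keep. Below is a rough sketch, assuming a made-up `hosts` table where every row is tagged with a `team` and a `region` and carries a 0/1 `compliant` flag. The table, the tags and the data are all hypothetical; the point is only that one GROUP BY gives you both the top level number and the league table.

```python
import sqlite3

# Only these tag columns may be used for drill-down (this also guards the SQL below).
TAGS = ("team", "region")


def overall_percent(conn):
    """Top-level number: % of all hosts that are compliant."""
    (pct,) = conn.execute(
        "SELECT 100.0 * SUM(compliant) / COUNT(*) FROM hosts"
    ).fetchone()
    return pct


def percent_by(conn, tag):
    """Drill-down: % compliant per tag value, best first (the league table)."""
    if tag not in TAGS:
        raise ValueError(f"unknown tag: {tag}")
    sql = (
        f"SELECT {tag}, 100.0 * SUM(compliant) / COUNT(*) AS pct "
        f"FROM hosts GROUP BY {tag} ORDER BY pct DESC, {tag}"
    )
    return conn.execute(sql).fetchall()


def demo_db():
    """In-memory database with made-up hosts, purely for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hosts (host TEXT, team TEXT, region TEXT, compliant INTEGER)"
    )
    conn.executemany(
        "INSERT INTO hosts VALUES (?, ?, ?, ?)",
        [
            ("d1", "desktop", "DE", 1),
            ("d2", "desktop", "UK", 1),
            ("d3", "desktop", "UK", 1),
            ("d4", "desktop", "DE", 0),
            ("s1", "server", "DE", 1),
            ("s2", "server", "DE", 1),
            ("s3", "server", "UK", 0),
            ("s4", "server", "UK", 0),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_db()
    print("Overall:", overall_percent(conn))
    for tag in TAGS:
        print(tag, percent_by(conn, tag))
```

Adding a new way of slicing the organization is then a new tag column plus one entry in `TAGS`, not a new report.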
Bad metrics (aka what you probably have now)
- Anything you cannot automate
- Long and unclear
- Anything where there is no clear action for going red or amber
- Anything that cannot have a clear owner
Like most things in life it is not how many metrics you have but the quality of them that matters
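Most of the bad metric tests above are mechanical enough to check before a metric ever gets onto the dashboard. Here is a small sketch of such a check. The `MetricDefinition` fields are my own invention for illustration, and "unclear" is not something code can judge, so only the length rule is tested.

```python
from dataclasses import dataclass

MAX_LENGTH = 140  # the "Twitter length" rule


@dataclass
class MetricDefinition:
    """Hypothetical record describing one metric."""
    description: str
    automated: bool
    owner: str = ""
    amber_action: str = ""
    red_action: str = ""


def problems(metric):
    """Return the reasons a metric definition counts as a bad metric."""
    found = []
    if not metric.automated:
        found.append("not automated")
    if len(metric.description) > MAX_LENGTH:
        found.append("longer than 140 characters")
    if not metric.amber_action.strip() or not metric.red_action.strip():
        found.append("no clear action for amber or red")
    if not metric.owner.strip():
        found.append("no clear owner")
    return found


if __name__ == "__main__":
    draft = MetricDefinition(
        description="% of firewall rules detected as insecure compared to total rulebase",
        automated=True,
        owner="",  # nobody has signed up yet
        amber_action="Review flagged rules",
        red_action="",
    )
    print(problems(draft))
```

An empty list means the definition passes the mechanical checks; it still needs a human to decide whether it is worth measuring.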
Metrics that work
Here are some examples of metrics that I have seen work well. What does that mean? i.e. what does success look like? I define success as a metric that allows you to effectively measure a process or technology and improve it over time without costing more in time or dollars than that benefit.
Aim for a metric in each major security category:
- % of Windows servers that are more than 1 day behind the latest antivirus signature version
  - Green: 1%, Amber: 2%, Red: 3% or more
  - Amber action: Mr Joe Blogs to investigate and push signatures to those missing etc
  - Red action: Offshore team to priority 1 push signatures to missing machines etc
- % of production and BCP Unix servers that are more than 30 days behind the latest security patch
  - Green: 5%, Amber: 8%, Red: 10% or more
  - Amber action: Ms Jane Doe to investigate and patch missing servers
  - Red action: Offshore team to plan and execute priority 1 weekend change to patch missing servers
- % of production servers that have 3 or more settings that differ from the current gold build
  - Green: 5%, Amber: 8%, Red: 10% or more
  - Amber action: Ms Alys to investigate and remediate the deviating servers by next month
  - Red action: Offshore team to plan and execute priority 1 weekend change to bring servers to compliance
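The three examples above can live as plain data, so the dashboard can show the status and the action to take side by side. This is only a sketch: the metric keys are made up, and I am assuming one particular reading of the thresholds, namely that the Amber and Red numbers are the points at which the metric turns that colour and anything below the Amber number counts as Green.

```python
# Assumed reading: Amber and Red values are "from this % upwards";
# anything below the Amber value is Green. Keys are hypothetical names.
METRICS = {
    "av_signatures": {
        "amber_from": 2.0,
        "red_from": 3.0,
        "amber_action": "Joe Blogs to investigate and push signatures",
        "red_action": "Offshore team to priority 1 push signatures",
    },
    "unix_patching": {
        "amber_from": 8.0,
        "red_from": 10.0,
        "amber_action": "Jane Doe to investigate and patch missing servers",
        "red_action": "Offshore team to run priority 1 weekend patch change",
    },
    "gold_build": {
        "amber_from": 8.0,
        "red_from": 10.0,
        "amber_action": "Alys to remediate deviating servers by next month",
        "red_action": "Offshore team to run priority 1 weekend compliance change",
    },
}


def evaluate(metric, percent):
    """Return (status, action) for a metric value; Green needs no action."""
    definition = METRICS[metric]
    if percent >= definition["red_from"]:
        return ("Red", definition["red_action"])
    if percent >= definition["amber_from"]:
        return ("Amber", definition["amber_action"])
    return ("Green", None)


if __name__ == "__main__":
    for name, value in [("av_signatures", 2.5), ("unix_patching", 4.0)]:
        status, action = evaluate(name, value)
        print(name, f"{value}%", status, action or "no action needed")
```

If you read the thresholds differently (for example Green only up to the Green number), the only thing that changes is the comparison in `evaluate`.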
You get the idea on the RAG and actions so just the metrics:
- Vulnerability management: % of internet accessible servers that have a vulnerability of CVSS 9 or above detected
- Vulnerability management: mean time to close a penetration testing finding rated medium or above
- Privileged access management: su to root events outside of standard shift hours as a % of those within shift hours
- Data loss: volume of data copied to removable media, emailed or sent over the internet by employees in their last week, as a % of their average
- Data loss: % of people in the company who have write access to removable media (USB/CD/DVD)
- Data loss: % of people sending an office file to a non corporate email address (e.g. gmail)
- Awareness: % of people completing the CBT awareness with a score of 4/5 or above
- Network: % of firewall rules detected as insecure compared to total rulebase
- Incident: % of unencrypted mobile devices lost or stolen compared to population
- Policy compliance: % of BUs scoring 90% compliance or higher in a self assessment questionnaire
- Project / SDLC: % of projects that have a score of 95% or higher in the security go live questionnaire
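Not every metric on this list is a percentage. As a rough illustration, here is how the mean time to close metric could be computed from an export of penetration test findings. The severity scale, the tuple layout and the dates are all assumptions, and I have chosen to count only findings that are actually closed.

```python
from datetime import date
from statistics import mean

# Hypothetical severity scale, lowest to highest.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def mean_days_to_close(findings, minimum="medium"):
    """Mean days from opened to closed for closed findings rated >= minimum.

    findings is an iterable of (severity, opened, closed) tuples, where
    closed is None for findings that are still open. Returns None if no
    finding qualifies.
    """
    floor = SEVERITY_RANK[minimum]
    durations = [
        (closed - opened).days
        for severity, opened, closed in findings
        if closed is not None and SEVERITY_RANK[severity] >= floor
    ]
    return mean(durations) if durations else None


if __name__ == "__main__":
    # Made-up findings, purely for illustration.
    findings = [
        ("high", date(2024, 1, 1), date(2024, 1, 11)),
        ("medium", date(2024, 1, 5), date(2024, 1, 25)),
        ("low", date(2024, 1, 7), date(2024, 1, 8)),  # below the floor
        ("critical", date(2024, 2, 1), None),         # still open
    ]
    print(mean_days_to_close(findings), "days")
```

Because this is a raw number of days rather than a %, the RAG thresholds and a trend line matter even more when you put it in front of anyone.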
Metrics are worth your time - once you have a few in place, plus the systems and the know-how to add new ones, it just becomes another step in everything you do: new policy - define metrics, new system - define metrics, new cloud partner - define metrics. Once you start getting real data you can improve everything, show real value to your business and other stakeholders and, more importantly, justify your existence as a security team. So get on it and good luck!
NB: on a side note if you got this far, I am interested in what you think of this format for the blog and whether pictures add anything to an article (fun, humour, 1000 words etc) or whether you prefer pure text for ease of reading and reading in the office - please comment, tweet or email me.