The Key to Achieving Effective Vulnerability Remediation
Vulnerability remediation is the act of fixing cybersecurity weaknesses detected in software code, applications, and enterprise assets. Security teams continue to deploy many tools and scanners to identify security weaknesses and vulnerabilities; however, identifying these issues is only the first step in the overall risk management process. Understanding which issues to remediate first, and getting them remediated by the appropriate IT and engineering teams, remains a challenge for almost every organization.
Traditional vulnerability management has frustrated most security and engineering teams because it often doesn’t provide a true understanding of risk. Today, most organizations scan only the assets they know about, which are a small subset of what actually exists in a cloud-native environment. These scans produce a significant number of false positives, which leads to alert fatigue. Additionally, most traditional vulnerability management products use NIST, CVSS and similar sources to rank vulnerabilities, but those sources are only one component of the risk ranking criteria and generally miss the business context of the assets. And if you look at the aggregate output of the many tools deployed to scan everything from code to cloud, security teams are overwhelmed by hundreds of thousands of issues that have no context and are not actionable.
Vulnerability Management
Traditional vulnerability management programs focus mostly on identifying vulnerabilities, possibly ranking them based on CVSS or other non-contextual systems, and aggregating them in a centralized system. While centralized reporting is important for understanding risk, the key objective of security teams is to drive risk remediation, not just to report risks.
Across many environments, we have seen security teams do a fantastic job of identifying issues and vulnerabilities but continue to struggle with getting them remediated. This large volume of unresolved and unprioritized security issues eventually leads to one or more of these outcomes:
- Friction Between Development and Security Teams – the large volume of security issues creates a burdensome amount of triage and prioritization work for development teams, which causes frustration and eventually leads them to ignore security. Security teams, in turn, grow frustrated that developers don’t work down the security tech debt and that KPIs continue to be missed.
- Cost – manually triaging and prioritizing hundreds of thousands of security issues is very expensive because it forces security teams to hire more staff. Some teams also end up building homegrown tooling to automate the process, which in almost all cases is also expensive to build and maintain.
- Failed Compliance – the vast majority of security issues tend not to be resolved within the SLA, which violates many compliance requirements (e.g. SOC2, ISO27001, HITRUST, FedRAMP, PCI-DSS) and can also violate customer contractual agreements.
- Breaches – finally, it is well documented that many breaches result from known, unresolved vulnerabilities that the organization failed to fix in a timely manner.
Accelerating Vulnerability Remediation
The ideal outcome of any vulnerability management program is remediation. However, it is not practical or feasible to fix every single security issue reported by tools that lack business context. Taking a risk-based approach helps security teams work more efficiently, optimize finite resources, and target the vulnerabilities that pose the highest risk to the organization.
Vulnerability remediation steps generally consist of the following:
- Vulnerability Identification – identifying and understanding the weaknesses in your code, applications and underlying infrastructure.
- Vulnerability Triaging – not all security issues are applicable to all environments, and in many cases the tools might be reporting false positives. Handing over un-triaged issues to the engineering teams almost always creates friction and significantly reduces the likelihood of the issues getting resolved. Some simple, broad-stroke questions that can help triage security issues are:
- Is the issue applicable to the relevant code / system / tech stack / environment?
- Can the identified issue be replicated in real-life environments?
- Could an exploitation of the issue cause any impact to confidentiality, integrity or availability of the system or data?
- Is the issue fixable?
- An expected outcome of the triage process is a decision on next steps for the identified issue, which should be one of the following:
- Fix the issue, in which case the issue needs to be prioritized, assigned, governed and reported.
- Acknowledge the presence of the issue but accept the risk and decide not to fix it, in which case the risk acceptance needs to be documented along with the justification.
- Continue triage until more information is available about the impacted asset or the issue itself. This could also be thought of as a risk acceptance, since making no decision on the issue’s disposition is, in itself, a decision.
- Vulnerability Prioritization – Accurate prioritization is incredibly important because it is the biggest indicator of when engineering teams should fix an issue and how many resources they should invest in doing so. A good partnership between security and engineering generally requires strong alignment on how issues are prioritized and how they should be addressed. This is also an area where security engineers should follow an agreed-upon prioritization process to avoid over- or under-prioritizing issues. Vulnerabilities can generally be prioritized based on five factors: severity, threats, asset exposure, business criticality and mitigating controls. Where organizations struggle today is in identifying these factors and evaluating the true risk at scale when faced with thousands of vulnerabilities on a daily basis.
- Asset Ownership Assignment – A key aspect of getting to remediation is identifying who needs to remediate the issues. In most environments with fast-moving dev teams, security teams continue to struggle to find the right owners because the dev teams constantly evolve and developers move between teams. With proper asset and code ownership in place, risk remediation and governance efforts can be routed automatically to the appropriate owner.
- Governance – In the context of vulnerabilities, governance can be thought of as the processes to ensure remediation of issues on time by the right owners. Once the issues are reported to the risk owners, governance processes should include things like:
- Risk Exceptions – Depending on security policies, security teams will have to concede some exceptions because of conflicting organizational or functional priorities.
- Risk Acceptances – In cases where the cost of fixing issues outweighs the risks introduced by them, appropriate decision-making authorities can approve accepting the risk instead of mitigating it. The key item to note here is that acceptances should require approval from appropriate senior leadership, generally a combination of the asset / risk owner and someone from the security / compliance team. Additionally, such acceptances should be temporary and tied to an agreed-upon time window, not permanent.
- Service Level Agreement (SLA) Extensions – Security and compliance policies usually define the number of days within which security issues need to be remediated. However, in many cases those timelines may not be feasible for the responding teams. To ensure that security teams and the responding engineering teams are on the same page and meeting compliance requirements, any extensions to remediation due dates should be approved and documented.
- Reporting – Establishing a clear set of agreed-upon metrics between application/product security teams and development is imperative for any organization. While this may seem easy, it is one of the most fundamental and important endeavors security can lead. These Key Performance Indicators (KPIs) are the basis of what will be measured and shared with executive leadership. They can also be the basis for assigning security scores to individual teams, so that teams that are performing well are rewarded and those that are underperforming can receive support. Last but not least, clear reporting helps with regulatory compliance audits.
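As a rough illustration of the prioritization step above, here is a minimal Python sketch that combines the five factors (severity, threats, asset exposure, business criticality and mitigating controls) into a single sortable score. The `Finding` fields, the weights and the sample findings are all assumptions made up for this example; a real program would need weights agreed between security and engineering.

```python
from dataclasses import dataclass

# Hypothetical weights: the five factors come from the text above, but the
# numbers used to combine them are illustrative only.
SEVERITY_POINTS = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass
class Finding:
    title: str
    severity: str            # scanner-reported severity
    known_exploit: bool      # threats: is exploitation known or likely?
    internet_facing: bool    # asset exposure
    business_critical: bool  # business criticality of the affected asset
    mitigated: bool          # a mitigating control is already in place


def risk_score(f: Finding) -> int:
    """Combine the five factors into one number; higher means fix sooner."""
    score = SEVERITY_POINTS[f.severity]
    score += 2 if f.known_exploit else 0
    score += 2 if f.internet_facing else 0
    score += 2 if f.business_critical else 0
    score -= 3 if f.mitigated else 0
    return max(score, 0)


# Made-up findings for demonstration.
findings = [
    Finding("Outdated library in internal tool", "critical", False, False, False, True),
    Finding("SQL injection in checkout API", "high", True, True, True, False),
    Finding("Verbose error page", "low", False, True, False, False),
]

ranked = sorted(findings, key=risk_score, reverse=True)
for f in ranked:
    print(risk_score(f), f.title)
```

Note that in this sketch the "critical" finding on an internal, already-mitigated tool ranks below the "high" finding on an exposed, business-critical API, which is the kind of reordering that scanner severity alone would not produce.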
Why is Vulnerability Remediation Important?
Customers, partners, employees and regulators expect organizations to put in place policies and processes that continuously and effectively identify and remediate security risks resulting from vulnerabilities. There is also zero tolerance for system disruptions or slowdowns that could be caused by unresolved vulnerabilities. All of these factors make meeting vulnerability remediation challenges a business-critical activity.
Making this more challenging is the adoption of cloud-native architectures, DevOps and a self-service culture where developers go from code to cloud in a matter of hours, often introducing more vulnerabilities than can be remediated. Meanwhile, legacy application security systems and processes like traditional vulnerability management have stayed highly manual and have kept security teams from scaling at the speed of DevOps. In this agile world, vulnerability remediation must be fast and informed, and must go beyond just scanning for vulnerabilities.
According to the NIST National Vulnerability Database, the number of Common Vulnerabilities and Exposures (CVEs) observed in devices, networks and applications has quintupled in a decade. This explosion in the volume of vulnerabilities is why the focus needs to be on remediating vulnerabilities rather than just detecting them. Remediating vulnerabilities helps reduce the risk of breaches, denial of service attacks, and interruptions in operations. Minimizing your attack surface and overall exposure is paramount.
Accelerating Risk Remediation, from Code to Cloud
When looking to build a successful vulnerability management program, leading organizations have leveraged Tromzo’s Intelligence Graph to implement advanced prioritization techniques and automated workflows on a solid foundation of software asset inventory, ownership, and business context.
Aggregation & Deduplication
- Integrate and standardize vulnerability data from all of your code, container, cloud, and infrastructure scanners.
- Automatically deduplicate and group vulnerabilities.
- Patent-pending Intelligence Graph built to bring context from code to cloud and to scale across large environments with hundreds of millions of vulnerabilities.
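To make the idea of aggregation and deduplication concrete, the sketch below groups findings from several scanners by asset and vulnerability ID. It is a generic Python illustration, not Tromzo's implementation or API: the scanner names, the record layout and the choice of deduplication key are all assumptions.

```python
from collections import defaultdict

# Hypothetical findings, already normalized into one record layout.
raw_findings = [
    {"scanner": "sca-tool", "asset": "payments-api", "vuln_id": "CVE-2021-44228"},
    {"scanner": "container-scanner", "asset": "Payments-API", "vuln_id": "cve-2021-44228"},
    {"scanner": "cloud-scanner", "asset": "billing-worker", "vuln_id": "CVE-2021-44228"},
]


def dedup_key(finding: dict) -> tuple:
    """Assumed rule: same asset plus same vulnerability ID means a duplicate."""
    return (finding["asset"].lower(), finding["vuln_id"].upper())


def deduplicate(findings: list) -> dict:
    """Group findings by key and record which scanners reported each one."""
    grouped = defaultdict(set)
    for finding in findings:
        grouped[dedup_key(finding)].add(finding["scanner"])
    return dict(grouped)


unique = deduplicate(raw_findings)
for (asset, vuln_id), scanners in sorted(unique.items()):
    print(asset, vuln_id, sorted(scanners))
```

Here three raw findings collapse into two unique issues. Real scanners rarely agree on asset names this neatly, so in practice the normalization step tends to be the harder part.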
Contextual Prioritization
- Automatically leverage business context and environmental context to prioritize the actionable few vulnerabilities that matter.
- Run custom vulnerability remediation campaigns for addressing specific classes of bugs.
- Reprioritize severities and group issues together based on context specific to your business.
Automated Workflows
- Automate assignment of alerts to the right individual or team within your company based on asset ownership.
- Automatically create remediation tickets across one or more Jira projects, Azure DevOps (ADO) boards, etc., and use bi-directional sync to track remediation status at scale.
- Use built-in governance and approval workflows for risk acceptances, false positives, and SLA and due date extensions.
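The ownership-based assignment described above can be pictured as a lookup from asset to owning team, with a fallback for assets that have no recorded owner. The following Python sketch is only an illustration under that assumption; the ownership map, team names and fallback queue are hypothetical, and it does not call any real ticketing API.

```python
# Hypothetical ownership map: asset name -> owning team.
ASSET_OWNERS = {
    "payments-api": "team-payments",
    "billing-worker": "team-billing",
}

# Assumed fallback so that unowned assets still land in someone's queue.
FALLBACK_QUEUE = "security-triage"


def route(finding: dict) -> str:
    """Return the team that should receive this finding."""
    return ASSET_OWNERS.get(finding["asset"], FALLBACK_QUEUE)


def build_ticket_queues(findings: list) -> dict:
    """Group finding titles by the team responsible for fixing them."""
    queues = {}
    for finding in findings:
        queues.setdefault(route(finding), []).append(finding["title"])
    return queues


# Made-up findings for demonstration.
findings = [
    {"asset": "payments-api", "title": "SQL injection in checkout API"},
    {"asset": "billing-worker", "title": "Outdated base image"},
    {"asset": "legacy-cron", "title": "Hard-coded credential"},
]

queues = build_ticket_queues(findings)
for team, titles in sorted(queues.items()):
    print(team, titles)
```

The fallback queue matters as much as the map itself: findings on assets without an owner are the ones most likely to go unresolved if they are silently dropped.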
Comprehensive Dashboards
- Build your own KPIs with custom visualizations to report triaged vulnerabilities and missing compliance controls.
- Generate reports for every asset and team across the organization.
- Gamify remediation using leaderboards with metrics like SLA compliance, MTTR, and burndown charts.
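For readers who want to see how metrics such as SLA compliance and MTTR (mean time to remediate) might be computed, here is a small Python sketch. The SLA windows per severity and the sample issues are assumptions chosen for illustration; actual windows come from your own security and compliance policies.

```python
from datetime import date

# Hypothetical SLA policy: days allowed to remediate, per severity.
SLA_DAYS = {"critical": 7, "high": 30, "medium": 90}

# Made-up resolved issues: (severity, date opened, date resolved).
resolved_issues = [
    ("critical", date(2024, 1, 1), date(2024, 1, 5)),
    ("high", date(2024, 1, 1), date(2024, 2, 15)),
    ("medium", date(2024, 1, 10), date(2024, 2, 9)),
]


def days_to_fix(opened: date, resolved: date) -> int:
    """Number of calendar days between opening and resolving an issue."""
    return (resolved - opened).days


def mttr_days(issues: list) -> float:
    """Mean time to remediate, in days, across resolved issues."""
    durations = [days_to_fix(opened, resolved) for _, opened, resolved in issues]
    return sum(durations) / len(durations)


def sla_compliance(issues: list) -> float:
    """Fraction of resolved issues that were fixed within their SLA window."""
    within = sum(
        1
        for severity, opened, resolved in issues
        if days_to_fix(opened, resolved) <= SLA_DAYS[severity]
    )
    return within / len(issues)


print(f"MTTR: {mttr_days(resolved_issues):.1f} days")
print(f"SLA compliance: {sla_compliance(resolved_issues):.0%}")
```

This sketch only counts resolved issues. A fuller report would also track open issues that are already past their due date, since those are invisible to MTTR until they are closed.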