Vulnerability management – building a programme that works

Vulnerability management is one of those topics everyone agrees is important and very few organisations do well. The job is not "run a scanner" — it is a continuous, cross-team process that turns thousands of findings into a small number of fixed problems. A good programme reduces real risk, not just dashboard numbers, and gives leadership a clear view of how exposed the organisation actually is.

The vulnerability management lifecycle

Most mature programmes follow a similar loop, regardless of the tooling.

Discover – Maintain an authoritative asset inventory: servers, endpoints, cloud workloads, containers, SaaS apps, network devices, and code repositories. You cannot scan what you do not know exists.
Assess – Run authenticated scans, agent-based checks, container and image scanners, software composition analysis (SCA) on code dependencies, and cloud configuration assessments. Combine these with manual testing (pen tests, red team exercises) to catch issues scanners miss.
Prioritise – Rank findings by exploitability, business impact, and exposure. Severity alone is not enough.
Remediate – Patch, reconfigure, upgrade, or accept the risk. Track the work like any other engineering ticket with clear ownership and deadlines.
Verify – Re-scan or re-test to confirm the fix landed and the issue has not resurfaced.
Report – Show trends, SLA performance, and residual risk to stakeholders.

The loop never stops; new vulnerabilities appear daily, and the environment changes underneath them.

Discovery: knowing what you have

Discovery is the unglamorous foundation. Without it, scanning misses assets and prioritisation becomes guesswork. In practice this means combining several sources: your CMDB, EDR or endpoint agent inventory, cloud provider APIs, container registries, DNS records, identity provider logs for SaaS, and external attack-surface monitoring for shadow IT. Reconcile them regularly — assets that exist in one source but not another are usually where the worst findings hide. Tag assets with owner, environment, data classification, and exposure (internet-facing vs internal) so prioritisation later has the context it needs.
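
The reconciliation step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the source names (`cmdb`, `edr`, `cloud_api`), the hostnames, and the `reconcile` helper are all hypothetical, and the sketch assumes every source should contain every asset, which is a simplification (a laptop will never appear in a cloud provider API).

```python
from collections import defaultdict


def reconcile(sources: dict[str, set[str]]) -> dict[str, list[str]]:
    """Return assets that are missing from at least one source.

    Maps each such asset to the sorted list of sources that do not know it.
    """
    all_sources = set(sources)
    seen_in: dict[str, set[str]] = defaultdict(set)
    for source_name, assets in sources.items():
        for asset in assets:
            # Normalise names so "Web-01 " and "web-01" count as one asset.
            seen_in[asset.strip().lower()].add(source_name)
    return {
        asset: sorted(all_sources - found)
        for asset, found in sorted(seen_in.items())
        if found != all_sources
    }


# Hypothetical inventories, as exported from each source.
inventories = {
    "cmdb": {"web-01", "db-01", "build-07"},
    "edr": {"web-01", "db-01", "laptop-113"},
    "cloud_api": {"web-01", "db-01", "build-07", "tmp-test-vm"},
}

gaps = reconcile(inventories)
for asset, missing in gaps.items():
    print(f"{asset}: missing from {', '.join(missing)}")
```

In a real programme the comparison would probably be scoped by asset type, and each gap would be routed to an owner rather than printed.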

Prioritisation beyond CVSS

CVSS scores are useful but not sufficient. A "critical" CVE on an isolated, non-internet-facing test box matters less than a "medium" on an internet-facing identity provider. Better prioritisation blends several signals:

CVSS base score – A baseline of technical severity.
EPSS (Exploit Prediction Scoring System) – Estimated probability that the CVE will be exploited in the wild in the next 30 days.
CISA KEV catalogue – Vulnerabilities known to be actively exploited; treat these as urgent regardless of CVSS.
Exposure – Internet-facing, behind authentication, internal-only, or air-gapped.
Business impact – What the affected system does, who uses it, and what data it touches.
Compensating controls – Existing mitigations (WAF, EDR, network segmentation, MFA) that reduce real-world exploitability.

The output is a short, ranked "must fix this week" list, plus a much larger backlog with realistic SLAs. Most teams find that a small fraction of findings drives almost all of the real risk.
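
One way to blend these signals is a simple multiplicative score. The sketch below is illustrative only: the weights, the exposure multipliers, the 20% discount per compensating control, and the finding identifiers are all assumptions made up for the example, not values from any standard. The one rule taken directly from the list above is that KEV-listed findings are treated as urgent regardless of CVSS.

```python
from dataclasses import dataclass

# Assumed multipliers for illustration; tune to your own environment.
EXPOSURE_WEIGHT = {
    "internet": 1.0,
    "authenticated": 0.7,
    "internal": 0.4,
    "air-gapped": 0.1,
}


@dataclass
class Finding:
    ident: str                      # placeholder identifier, not a real CVE
    cvss: float                     # CVSS base score, 0.0 to 10.0
    epss: float                     # EPSS probability, 0.0 to 1.0
    in_kev: bool                    # listed in the CISA KEV catalogue
    exposure: str                   # key of EXPOSURE_WEIGHT
    business_impact: int            # 1 (low) to 5 (critical), set by the asset owner
    compensating_controls: int = 0  # count of effective mitigations


def priority(f: Finding) -> float:
    """Return a 0-100 score; higher means fix sooner."""
    if f.in_kev:
        threat = 1.0  # known exploitation overrides CVSS and EPSS
    else:
        threat = 0.4 * (f.cvss / 10) + 0.6 * f.epss
    context = EXPOSURE_WEIGHT[f.exposure] * (f.business_impact / 5)
    discount = 0.8 ** f.compensating_controls  # assumed 20% reduction per control
    return round(100 * threat * context * discount, 1)


findings = [
    Finding("EXAMPLE-TESTBOX", cvss=9.8, epss=0.02, in_kev=False,
            exposure="internal", business_impact=1),
    Finding("EXAMPLE-IDP", cvss=6.5, epss=0.40, in_kev=False,
            exposure="internet", business_impact=5, compensating_controls=1),
    Finding("EXAMPLE-KEV", cvss=5.0, epss=0.10, in_kev=True,
            exposure="internet", business_impact=3),
]

ranked = sorted(findings, key=priority, reverse=True)
for f in ranked:
    print(f"{f.ident}: {priority(f)}")
```

With these made-up inputs the "critical" on the isolated test box ranks last, behind the "medium" on the internet-facing identity provider. That is the behaviour the section argues for. The exact numbers matter less than having the ranking logic written down and reviewable.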

Remediation and SLAs

Findings only matter if they get fixed. A working programme defines clear, written SLAs by severity and exposure: for example, internet-facing criticals patched within 7 days, internal highs within 30 days, and mediums within 60–90 days. These need executive sign-off so engineering teams cannot quietly de-prioritise them. Use your normal ticketing system and treat security tickets like any other engineering work, with owners, due dates, and visible queues. Where patching is not possible, document the compensating control or a formal risk acceptance with an expiry date, so "we'll fix it later" does not become permanent. Automate as much as possible: managed patch tooling for OS and third-party software, dependency update bots (Dependabot, Renovate) for code, and infrastructure-as-code so configuration fixes are repeatable.
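
A written SLA policy translates naturally into a lookup table that ticketing automation can use to set due dates. The sketch below is a minimal example under stated assumptions: only the 7-day, 30-day, and 60–90-day figures come from the example policy above; the remaining values, and the `due_date` and `is_overdue` helpers, are hypothetical placeholders.

```python
from datetime import date, timedelta

# (severity, internet_facing) -> days allowed to remediate.
SLA_DAYS = {
    ("critical", True): 7,    # from the example policy
    ("critical", False): 14,  # assumed
    ("high", True): 14,       # assumed
    ("high", False): 30,      # from the example policy
    ("medium", True): 60,     # lower end of the 60-90 day example
    ("medium", False): 90,    # upper end of the 60-90 day example
}


def due_date(severity: str, internet_facing: bool, detected: date) -> date:
    """Return the date by which the finding must be fixed."""
    return detected + timedelta(days=SLA_DAYS[(severity, internet_facing)])


def is_overdue(severity: str, internet_facing: bool, detected: date, today: date) -> bool:
    """True if the finding is still open past its SLA on the given day."""
    return today > due_date(severity, internet_facing, detected)


detected = date(2024, 3, 1)
print(due_date("critical", True, detected))
print(is_overdue("high", False, detected, today=date(2024, 4, 15)))
```

Keeping the table in version control alongside the policy document makes changes to the SLAs visible and reviewable, in the same way as any other configuration.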

Measuring success

A programme that cannot show whether it is improving will eventually lose budget and attention. Useful metrics focus on outcomes, not activity:

Mean time to remediate (MTTR) – Average time from detection to verified fix, broken down by severity and asset class.
SLA compliance – Percentage of findings closed within their target window.
Backlog age – How many findings are older than their SLA, broken down by team.
Coverage – Percentage of assets covered by authenticated versus unauthenticated scans; percentage with agents installed.
Exposure to KEV/actively exploited CVEs – Number of open findings for CVEs known to be exploited. Ideally zero, with time-to-zero tracked whenever a new one is published.
Recurrence – How often the same finding reappears, indicating root-cause issues rather than one-off fixes.
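
Several of these metrics fall out of the ticket data directly. The following sketch computes MTTR, SLA compliance, and the overdue backlog from a list of findings. The record layout (`severity`, `opened`, `closed`, `sla_days`) and the sample data are assumptions for illustration; a real programme would pull these fields from its ticketing system or scanner export.

```python
from datetime import date
from statistics import mean

# Hypothetical finding records; closed is None while the finding is open.
findings = [
    {"severity": "critical", "opened": date(2024, 1, 1), "closed": date(2024, 1, 5), "sla_days": 7},
    {"severity": "critical", "opened": date(2024, 1, 10), "closed": date(2024, 1, 22), "sla_days": 7},
    {"severity": "high", "opened": date(2024, 1, 3), "closed": date(2024, 1, 23), "sla_days": 30},
    {"severity": "high", "opened": date(2024, 1, 15), "closed": None, "sla_days": 30},
]


def mttr_days(records, severity):
    """Mean days from open to close for closed findings of one severity."""
    durations = [
        (r["closed"] - r["opened"]).days
        for r in records
        if r["severity"] == severity and r["closed"] is not None
    ]
    return mean(durations) if durations else None


def sla_compliance(records):
    """Percentage of closed findings that were closed within their SLA."""
    closed = [r for r in records if r["closed"] is not None]
    if not closed:
        return None
    on_time = sum((r["closed"] - r["opened"]).days <= r["sla_days"] for r in closed)
    return 100 * on_time / len(closed)


def overdue_backlog(records, today):
    """Open findings that are already older than their SLA."""
    return [
        r for r in records
        if r["closed"] is None and (today - r["opened"]).days > r["sla_days"]
    ]


print("MTTR critical (days):", mttr_days(findings, "critical"))
print("SLA compliance (%):", round(sla_compliance(findings), 1))
print("Overdue open findings:", len(overdue_backlog(findings, date(2024, 3, 1))))
```

Note that this version of SLA compliance only counts closed findings, so it should be read together with the overdue backlog. Otherwise a team could look compliant simply by leaving its late tickets open.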

Pair the numbers with narrative: which classes of vulnerability are growing, which teams are struggling, what investment would move the needle. A vulnerability management programme that gives leadership an honest, data-backed view of risk — and steadily reduces it — is one of the highest-leverage investments a security team can make.