How We Assess WordPress Security: An Engineering Pipeline, Not a Checklist

Lemorange Team 11 min read

WordPress Runs the Web, and So Does Its Attack Surface

WordPress powers more than 40 percent of all websites in existence, which makes it the single most widely deployed application platform on the internet. Our clients are software companies and enterprises rather than blogging hobbyists, but they inherit WordPress everywhere: marketing sites, campaign microsites, documentation portals, investor pages, and the properties that arrive attached to an acquisition. When we are asked to assess a client's security posture, a WordPress property is almost always somewhere in scope, and it is frequently the weakest point on the entire perimeter.

The reason WordPress carries a poor security reputation is not its core. WordPress core is mature, professionally maintained, and patched quickly, and a current install is rarely the problem. The risk lives in the ecosystem around it. The overwhelming majority of WordPress vulnerabilities, well over 90 percent in every annual dataset we track from Patchstack and Wordfence, are found in plugins and themes rather than in core. A single site can run 30 to 50 plugins, each an independent codebase of wildly varying quality, and each one a potential way in.

That structure is exactly why we do not assess WordPress with a manual checklist. The attack surface is too large, too varied, and changes too often for a person working through a document by hand. We treat a WordPress assessment as an engineering pipeline: automate the repetitive majority of the work so it runs in minutes and produces identical results every time, then reserve senior engineering time for the parts that genuinely require human judgment. This article walks through that pipeline stage by stage and explains where the efficiency actually comes from.

Mapping the Real Attack Surface

Before any tooling runs, we define the attack surface in layers, because each layer needs a different kind of test. Core is the foundation and is usually the least of the worries on a current site. Plugins and themes are the largest and most volatile layer, where most real findings live. Configuration covers the settings that decide whether a small bug becomes a full compromise. The server layer underneath, meaning the PHP version, the web server, the database, and the TLS setup, is routinely forgotten by site owners but is fully in scope for us. Identity and access cuts across all of it.

Supply chain is the layer most owners never think about. A plugin that was abandoned three years ago still runs in production with the same privileges as everything else, and it will never receive another patch. Nulled plugins, which are pirated commercial plugins distributed through unofficial channels, are worse still: they frequently ship with injected backdoors and are one of the most common root causes of compromise we encounter. Part of the job is to prove, file by file, that what is installed matches what the author actually published.

The inventory problem is the genuinely hard part. You cannot assess what you cannot see, and most WordPress estates have no accurate record of what is installed, which version it is, where it came from, or whether it is still maintained. So the first real engineering task in every assessment is to build that record automatically and completely, because everything downstream depends on it being right.

Stage One: Automated Reconnaissance and a Live Software Inventory

Reconnaissance starts passively and only then turns active. We fingerprint the WordPress version, enumerate installed plugins and themes, and identify exposed endpoints without touching anything that could disrupt a live site. WPScan is the workhorse here, complemented by Nuclei templates for fast templated checks and our own probes for the things the public tools miss. Where we have credentialed access, WP-CLI gives us an authoritative inventory read directly from the install rather than one inferred from the outside.

The output of this stage is a software bill of materials for the site: every plugin and theme, its exact version, its source, its last update date, and whether it is still actively maintained. This is the same SBOM discipline we apply to any modern software supply chain, pointed at WordPress. An accurate bill of materials turns a vague sense that the site is probably fine into a precise list of components we can actually reason about.

Doing this by hand is where traditional WordPress reviews lose days. Clicking through the admin area to record plugin versions one at a time is slow, error prone, and stale the moment an automatic update runs. Automating it means the inventory is complete, exact, and reproducible, and it can be regenerated in seconds whenever the site changes. That reproducibility is the foundation everything else is built on.

  • Fingerprint the core version, plugins, themes, and exposed endpoints with passive checks first
  • Use credentialed WP-CLI access for an exact inventory rather than an inferred one
  • Record the exact version, source, and last update date for every component
  • Flag abandoned plugins and anything installed from outside the official channels
  • Regenerate the full inventory on demand so it never goes stale
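As a minimal sketch of the inventory step, assuming credentialed access and the JSON that `wp plugin list --format=json` prints, the records could be assembled roughly like this. The plugin names, the `last_updates` lookup, and the two year staleness threshold are illustrative assumptions, not part of any real tool.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Component:
    name: str
    version: str
    status: str                   # "active", "inactive", ... as reported by WP-CLI
    last_updated: Optional[date]  # hypothetical: filled from a separate metadata lookup
    abandoned: bool = False


def build_inventory(wp_cli_json, last_updates, today, stale_after_days=730):
    """Turn `wp plugin list --format=json` output into inventory records."""
    inventory = []
    for row in json.loads(wp_cli_json):
        updated = last_updates.get(row["name"])
        # An unknown update date is flagged too, so that a person looks at it.
        abandoned = updated is None or (today - updated).days > stale_after_days
        inventory.append(
            Component(row["name"], row["version"], row["status"], updated, abandoned)
        )
    return sorted(inventory, key=lambda component: component.name)


# Hypothetical sample data standing in for real WP-CLI output.
sample = (
    '[{"name": "contact-form", "status": "active", "update": "none", "version": "5.9.8"},'
    ' {"name": "old-slider", "status": "inactive", "update": "none", "version": "1.2.0"}]'
)
updates = {"contact-form": date(2025, 1, 10)}
inventory = build_inventory(sample, updates, today=date(2025, 6, 1))
```

Because the function takes the JSON text as input rather than calling WP-CLI itself, the same code can be re-run against a fresh export whenever the site changes.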

Stage Two: Correlating the Inventory Against Vulnerability Intelligence

An inventory is only useful once it is matched against what is known to be vulnerable, and this is the stage where automation pays for itself most obviously. We correlate every component and version in the bill of materials against multiple vulnerability sources: the WPScan vulnerability database, Patchstack, Wordfence Intelligence, and the NVD for upstream CVEs. Each source has different coverage and different disclosure timing, so using several in combination catches issues that any single feed would miss.

The scale is the entire point. The combined WordPress vulnerability datasets track tens of thousands of known issues across the ecosystem, and thousands of new ones are disclosed every year. Patchstack alone added several thousand new WordPress vulnerabilities to its database in 2024. No human can hold that in their head or check it by hand, and no annual manual review can keep pace with it. A correlation engine does it in seconds and never forgets a CVE.

We do not stop at the fact that a known vulnerability exists. The output is contextual: which component, which version, the CVSS severity, whether a public exploit is available, whether the installed version actually falls inside the affected range, and whether a patched version has shipped. A critical vulnerability in a plugin that is deactivated is a very different priority from the same vulnerability in an active, internet facing one. That context is what turns a raw scanner dump into an actionable finding instead of noise.

  • Correlate every component against WPScan, Patchstack, Wordfence Intelligence, and the NVD
  • Use multiple feeds, because coverage and disclosure timing differ across sources
  • Confirm the installed version actually falls inside the affected range to cut false positives
  • Record the CVSS severity and whether a public exploit is known for each finding
  • Prioritise active, internet facing components over deactivated or isolated ones
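The range check and the ordering described above can be sketched as follows. This is an illustration under stated assumptions: the advisory fields (`introduced`, `fixed`, `cvss`, `exploit`) are a hypothetical normalised shape rather than the schema of any particular feed, and the version parser only handles purely numeric dotted versions.

```python
def parse_version(text, width=4):
    """Parse a numeric dotted version such as '5.9.8' into a padded tuple."""
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def is_affected(installed, introduced, fixed):
    """True if installed lies in [introduced, fixed); None means unbounded."""
    current = parse_version(installed)
    if introduced is not None and current < parse_version(introduced):
        return False
    return fixed is None or current < parse_version(fixed)


def correlate(installed, advisories):
    """Keep advisories whose range covers the installed version, most urgent first."""
    findings = []
    for advisory in advisories:
        if advisory["component"] not in installed:
            continue
        version, status = installed[advisory["component"]]
        if is_affected(version, advisory["introduced"], advisory["fixed"]):
            findings.append({**advisory, "version": version, "active": status == "active"})
    # Active components first, then known public exploits, then highest CVSS.
    return sorted(findings, key=lambda f: (not f["active"], not f["exploit"], -f["cvss"]))


# Hypothetical data for illustration only.
installed = {"contact-form": ("5.9.8", "active"), "old-slider": ("1.2.0", "inactive")}
advisories = [
    {"component": "old-slider", "introduced": "1.3", "fixed": "1.4.2", "cvss": 7.5, "exploit": False},
    {"component": "contact-form", "introduced": None, "fixed": "5.9.9", "cvss": 9.8, "exploit": True},
]
findings = correlate(installed, advisories)
```

Here the `old-slider` advisory is dropped because version 1.2.0 predates the affected range, which is the kind of false positive the range check is meant to remove.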

Stage Three: Configuration, Hardening, and the Server Underneath

A fully patched WordPress can still be trivially compromised through configuration, so this stage checks the settings that decide whether a minor bug becomes a breach. We review the configuration file for exposed secrets and unsafe debug flags, confirm that the security keys and salts are unique, verify file and directory permissions, and check that sensitive files are not directly reachable over the web. Directory listing left enabled, exposed database backups, and a world readable configuration file are findings we still encounter on a regular basis.
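A rough sketch of the configuration file review might look like the following. It is a line based heuristic rather than a PHP parser, it assumes constants are set with plain `define()` calls, and it will misread a value that itself contains `);`. The placeholder string is the one shipped in `wp-config-sample.php`.

```python
import re

PLACEHOLDER = "put your unique phrase here"  # default text in wp-config-sample.php
KEY_NAMES = [
    "AUTH_KEY", "SECURE_AUTH_KEY", "LOGGED_IN_KEY", "NONCE_KEY",
    "AUTH_SALT", "SECURE_AUTH_SALT", "LOGGED_IN_SALT", "NONCE_SALT",
]
# Captures the constant name and its raw value from a define() call.
DEFINE = re.compile(r"""define\(\s*['"](\w+)['"]\s*,\s*(.+?)\s*\)\s*;""")


def review_config(text):
    """Return a list of configuration issues found in wp-config.php text."""
    values = dict(DEFINE.findall(text))
    issues = []
    if values.get("WP_DEBUG", "").lower() == "true":
        issues.append("WP_DEBUG is enabled")
    keys = {name: values[name].strip("'\"") for name in KEY_NAMES if name in values}
    for name, value in keys.items():
        if value == PLACEHOLDER:
            issues.append(f"{name} still has the sample placeholder")
    if len(set(keys.values())) < len(keys):
        issues.append("security keys and salts are not unique")
    return issues


# Hypothetical excerpt for illustration.
sample = """
define('WP_DEBUG', true);
define('AUTH_KEY', 'put your unique phrase here');
define('NONCE_KEY', 'x9!f');
define('NONCE_SALT', 'x9!f');
"""
issues = review_config(sample)
```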

File integrity is one of the most valuable and most overlooked checks available. WordPress.org publishes checksums for every core release and for plugins in the official directory, and an official theme can be compared against a freshly downloaded copy of the published package. We verify the installed files against those canonical references to detect any file that has been altered, which is one of the fastest ways to surface an injected backdoor or a tampered plugin. A single modified core file is a strong signal of compromise that no version based scan would ever catch on its own.
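The comparison itself is simple once the expected values are in hand. The sketch below assumes a mapping of relative path to MD5 hex digest has already been obtained (for core, the WordPress.org checksum API returns MD5 hashes keyed by relative path); the file names and contents are made up for the demonstration, and the function does not look for extra files that are absent from the list.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(root, expected):
    """Compare files under root with expected {relative_path: md5_hex}."""
    root = Path(root)
    report = {"modified": [], "missing": []}
    for relative, checksum in sorted(expected.items()):
        target = root / relative
        if not target.is_file():
            report["missing"].append(relative)
        elif md5_of(target) != checksum:
            report["modified"].append(relative)
    return report


# Demonstration against a throwaway directory with one tampered file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "index.php").write_bytes(b"<?php // clean\n")
    (Path(tmp) / "wp-load.php").write_bytes(b"<?php eval($_POST['x']);\n")
    expected = {
        "index.php": hashlib.md5(b"<?php // clean\n").hexdigest(),
        "wp-load.php": hashlib.md5(b"<?php // original\n").hexdigest(),
        "wp-login.php": hashlib.md5(b"").hexdigest(),
    }
    report = verify_files(tmp, expected)
```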

The server layer is fully in scope, because WordPress security does not stop at the application boundary. An outdated PHP version, a misconfigured web server, weak TLS, an exposed database administration tool, or missing HTTP security headers can undermine an otherwise clean install. We check the TLS configuration, the security headers, the PHP version and its known issues, and the surrounding infrastructure, and we treat a weakness in any of them as a finding in its own right.
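One small piece of the server layer check, the response headers, can be sketched as below. The list of recommended headers is an assumption chosen for illustration rather than a fixed standard, and the function works on headers that have already been collected, so it makes no network calls itself.

```python
# Assumed baseline; a real engagement would tailor this list to the site.
RECOMMENDED = [
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Content-Type-Options",
    "Referrer-Policy",
]


def review_headers(headers):
    """Report missing security headers and PHP version disclosure."""
    # Header names are case insensitive, so normalise before comparing.
    present = {name.lower(): value for name, value in headers.items()}
    findings = [f"missing {name}" for name in RECOMMENDED if name.lower() not in present]
    powered_by = present.get("x-powered-by", "")
    if powered_by.lower().startswith("php/"):
        findings.append(f"PHP version disclosed: {powered_by}")
    return findings


# Hypothetical response headers for illustration.
sample_headers = {
    "Content-Type": "text/html; charset=UTF-8",
    "X-Powered-By": "PHP/7.4.33",
    "x-content-type-options": "nosniff",
}
header_findings = review_headers(sample_headers)
```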

  • Review the configuration file for exposed secrets, unique salts, and unsafe debug flags
  • Verify file and directory permissions and confirm sensitive files are not web reachable
  • Verify core, plugin, and theme files against published checksums to detect tampering
  • Check the TLS configuration, the HTTP security headers, and the PHP version in use
  • Confirm database tools and backups are not exposed to the public internet

Stage Four: Authentication, the REST API, and XML-RPC

Authentication is the front door, and WordPress ships with several doors people forget to lock. We test brute force resistance on the login endpoint, check whether two factor authentication is enforced for privileged accounts, and review session and cookie handling. Weak or reused administrator passwords remain one of the most common ways WordPress sites fall, and the default login page is a permanent target for automated credential stuffing.

User enumeration is a quiet but serious problem. By default the REST API will list account names through its users endpoint, and the author archive pages leak the same information. Handing an attacker a valid list of usernames removes half the work of a brute force attack before it even begins. We check every enumeration vector and confirm it is closed, because closing it is cheap and leaving it open is an open invitation.
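To make the two vectors concrete, here is a sketch of how collected responses might be interpreted. It assumes the HTTP requests have already been made elsewhere: one to the `/wp-json/wp/v2/users` route, and one to an author query such as `?author=1` whose redirect target is passed in. The usernames are invented.

```python
import json
import re

AUTHOR_PATH = re.compile(r"/author/([^/?#]+)/?")


def usernames_from_rest(status, body):
    """Slugs leaked by a users endpoint response, or [] if it is closed."""
    if status != 200:
        return []
    try:
        users = json.loads(body)
    except ValueError:
        return []
    if not isinstance(users, list):
        return []
    return [user["slug"] for user in users if isinstance(user, dict) and "slug" in user]


def username_from_redirect(location):
    """Extract the author slug from a redirect target, if one is present."""
    match = AUTHOR_PATH.search(location or "")
    return match.group(1) if match else None


# Hypothetical responses for illustration.
open_body = '[{"id": 1, "name": "Site Admin", "slug": "siteadmin"}, {"id": 2, "slug": "editor-jo"}]'
closed_body = '{"code": "rest_user_cannot_view", "data": {"status": 401}}'
leaked = usernames_from_rest(200, open_body)
closed = usernames_from_rest(401, closed_body)
from_archive = username_from_redirect("https://example.com/author/siteadmin/")
```

An empty result from both functions is the outcome we want to see after the vector has been closed.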

XML-RPC is the classic example of a feature that is useful to almost no one and dangerous to everyone. It enables brute force amplification, where hundreds of password guesses are packed into a single request, and it has been abused for pingback based distributed denial of service. Unless a site has a specific, verified need for it, we recommend disabling it entirely. The REST API deserves the same scrutiny: which endpoints are exposed, what they reveal, and whether they enforce authorisation on every single call rather than assuming it.
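A non-intrusive way to confirm that exposure is to read the reply to a single `system.listMethods` call and look for the methods behind the two abuses described above. The sketch below only parses such a reply with Python's standard `xmlrpc.client`; the sample response is generated locally, and a fault response would raise `xmlrpc.client.Fault` rather than return a list.

```python
import xmlrpc.client

# Methods tied to the abuses described above.
RISKY = {
    "system.multicall": "lets many calls, including logins, share one request",
    "pingback.ping": "can be abused to reflect traffic at third parties",
}


def risky_methods(response_xml):
    """Return the risky methods advertised in a system.listMethods reply."""
    params, _method_name = xmlrpc.client.loads(response_xml)
    advertised = set(params[0])
    return sorted(name for name in RISKY if name in advertised)


# A locally built stand-in for the server's reply.
sample_response = xmlrpc.client.dumps(
    (["system.listMethods", "system.multicall", "wp.getUsersBlogs", "pingback.ping"],),
    methodresponse=True,
)
exposed = risky_methods(sample_response)
```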

  • Test brute force resistance on the login endpoint and confirm rate limiting is in place
  • Require two factor authentication for administrator and editor accounts
  • Close user enumeration through the REST API and the author archive pages
  • Disable XML-RPC unless there is a specific, verified need for it
  • Review every exposed REST API endpoint for proper authorisation on each request

Stage Five: Reading the Code That Scanners Cannot

Automated tooling finds known vulnerabilities in known components. What it cannot find is the unknown vulnerability in the custom plugin a previous agency wrote for a client three years ago, and that custom code is exactly where the most dangerous findings tend to hide. This stage is deliberately led by engineers and supported by tooling rather than replaced by it. We run static analysis with Semgrep and PHP security rulesets across all custom plugin and theme code, then review the high signal results by hand.

The classic WordPress code flaws are well understood and still everywhere. SQL queries built with string concatenation instead of prepared statements. Missing nonce verification on state changing actions. Missing capability checks that let any logged in user perform privileged operations. Unsanitised input reflected straight back into the page. Unrestricted file uploads that allow a script to be planted on the server. We test for each of these specifically in custom code, because they are precisely the bugs that templated scanners are weakest at finding.
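Two of these patterns are easy to illustrate with a deliberately crude, line based heuristic. This is not how Semgrep works and is no substitute for it: it has no notion of data flow, it misses multi-line statements, and it will produce false positives. It only shows the shape of what a first pass looks for before a person reads the code. The PHP snippet is invented for the example.

```python
import re

RULES = [
    (
        "possible SQL built from variables without prepare()",
        re.compile(
            r"\$wpdb->(?:query|get_results|get_var|get_row|get_col)\s*\("
            r"(?!.*prepare\s*\().*\$\w+"
        ),
    ),
    (
        "request input echoed without escaping",
        re.compile(r"\becho\b(?!.*\besc_\w+\s*\().*\$_(?:GET|POST|REQUEST)\b"),
    ),
]


def scan_php(source):
    """Return (line_number, label) pairs for lines matching a heuristic rule."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for label, pattern in RULES:
            if pattern.search(line):
                hits.append((number, label))
    return hits


# Invented custom plugin code: lines 2 and 4 are unsafe, lines 3 and 5 are not.
sample_php = r'''<?php
$wpdb->query("DELETE FROM wp_items WHERE id = " . $_GET['id']);
$wpdb->query($wpdb->prepare("DELETE FROM wp_items WHERE id = %d", $id));
echo $_GET['name'];
echo esc_html($_GET['name']);
'''
hits = scan_php(sample_php)
```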

Business logic is the final frontier, and the one that only a person can properly assess. Can a subscriber escalate to an editor by manipulating a single request? Does a multi step workflow check authorisation at every step or only at the first? Can two requests race each other to bypass a limit that looks airtight in isolation? These are the chained, context dependent issues that define a serious assessment, and they are exactly where we spend the expert time that the automated stages free up.

Where the Efficiency Actually Comes From

The efficiency in our WordPress assessments is not a faster human working longer hours. It is a different division of labour. We codify the repetitive 80 percent of the work, the reconnaissance, the inventory, the vulnerability correlation, and the configuration and integrity checks, into an automated pipeline that runs in minutes and produces identical results every time it runs. The senior engineering time that would otherwise be spent clicking through an admin panel is redirected entirely to the 20 percent that needs judgment: custom code review, business logic, and exploit chaining.

Reproducibility is what makes this compound over time. Because the pipeline is automated, the same assessment can be run on day one, run again after remediation to verify the fixes, and then run continuously after that. For clients whose WordPress estates genuinely matter, we wire the inventory and correlation stages into a scheduled pipeline, so a newly disclosed plugin vulnerability is flagged within hours of public disclosure rather than at the next annual review. A vulnerability that is patched the day it is disclosed never gets the chance to become an incident.

The diagram below shows the full pipeline and, just as importantly, which stages are automated, which combine automation with expert review, and which are led by engineers. That separation is the whole point. Machines do what machines are good at, at machine speed and machine consistency, and people do what only people can do. The result is an assessment that is faster, more complete, and more repeatable than any manual review, without giving up the depth that only a human reading the code can provide.

  • Automate the repetitive majority so it runs in minutes and never varies between runs
  • Reserve senior engineering time for code review, business logic, and exploit chaining
  • Make every stage reproducible so re testing after remediation is effectively free
  • Schedule the inventory and correlation stages for continuous, near real time coverage
  • Flag newly disclosed vulnerabilities within hours rather than at the next annual review

WordPress Security Assessment Pipeline

Stage 1: Recon and Fingerprinting

Automated

Passive then active enumeration of the core version, installed plugins, themes, exposed endpoints, and users. Non disruptive against live sites. Establishes exactly what is running before anything is tested.

WPScan, Nuclei, WP CLI, Custom Probes

Stage 2: Software Inventory and SBOM

Automated

A complete bill of materials for the site. Every plugin and theme with its exact version, source, and last update date. Abandoned components and anything installed from outside the official channels are flagged automatically.

SBOM, Version Pinning, Abandonment Flags, Nulled Detection

Stage 3: Vulnerability Correlation

Automated

Every component matched against several vulnerability feeds in seconds. Installed versions confirmed inside the affected range to cut false positives. Each finding carries its severity and known exploit status.

WPScan DB, Patchstack, Wordfence, CVE/NVD

Stage 4: Configuration and Authentication

Automated and Expert

Configuration, file permissions, security headers, TLS, and the server layer reviewed. File integrity verified against published checksums. Login hardening, two factor enforcement, and REST API and XML RPC exposure all tested.

Checksums, Security Headers, REST API, XML RPC

Stage 5: Custom Code Review

Expert Led

Static analysis across all custom plugin and theme code, then human review of the high signal results. SQL injection, missing nonces, missing capability checks, unsafe uploads, and the business logic flaws that scanners cannot reach.

Semgrep, PHPCS Security, Manual Review, Business Logic

Stage 6: Scoring, Remediation, and Monitoring

Continuous

Findings scored with CVSS and ordered by real world impact. Virtual patching through a WAF where an immediate fix is not possible. Every fix verified by re running the pipeline, with inventory and correlation kept running on a schedule.

CVSS, Virtual Patch, Re Test, Monitoring

From Findings to a Hardened Estate

An assessment that ends with a list of problems is only half a deliverable. Every finding we report carries a CVSS based severity, a plain explanation of its real world impact, and a specific, prioritised remediation. We separate what must be fixed today from what should be scheduled, because treating every finding as an emergency is one of the surest ways to make remediation stall entirely. The goal is a clear, ordered path from the current state to a hardened one.
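The split between fixing today and scheduling can be expressed as a small rule. In the sketch below the severity bands follow the CVSS v3.x qualitative rating scale, while the triage rule itself (active and critical, or active and high with a public exploit) is an illustrative assumption, not a statement of the exact policy used in an engagement. The finding fields are hypothetical.

```python
def severity_band(score):
    """Map a CVSS v3.x base score to its qualitative rating."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def triage(finding):
    """Assumed rule: urgent only if the component is active and the risk is real."""
    band = severity_band(finding["cvss"])
    urgent = finding["active"] and (
        band == "Critical" or (band == "High" and finding["exploit"])
    )
    return "fix today" if urgent else "schedule"


# Hypothetical findings for illustration.
examples = [
    {"id": "A", "cvss": 9.8, "active": True, "exploit": False},
    {"id": "B", "cvss": 9.8, "active": False, "exploit": True},
    {"id": "C", "cvss": 7.5, "active": True, "exploit": True},
    {"id": "D", "cvss": 5.3, "active": True, "exploit": True},
]
decisions = {finding["id"]: triage(finding) for finding in examples}
```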

Some fixes cannot ship immediately. A vulnerable plugin may be load bearing, with no patched version available and no quick replacement to hand. For those cases we use virtual patching: a web application firewall rule, through ModSecurity with the OWASP Core Rule Set or an edge platform such as Cloudflare, that blocks the exploit path while a proper fix is planned and tested. It buys time safely rather than leaving a known hole open and hoping nobody finds it.

Then we verify. Remediation is not complete because someone says it is; it is complete when the pipeline is re run and the finding is genuinely gone. Because the assessment is automated, that re test costs almost nothing, which means we actually do it rather than trusting a status update. For estates that matter, the same pipeline keeps running on a schedule, with file integrity monitoring and continuous vulnerability correlation, so the site stays hardened long after the engagement that hardened it ends. That is the difference between a point in time report and a security posture you can rely on.

  • Score every finding with CVSS and translate it into plain real world impact
  • Prioritise ruthlessly and separate fix today from schedule for later
  • Use virtual patching through a WAF when an immediate code fix is not possible
  • Re run the pipeline to verify each fix rather than trusting a status update
  • Keep file integrity monitoring and vulnerability correlation running continuously
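Verification by re-running the pipeline comes down to comparing two sets of findings. A minimal sketch, assuming each finding can be reduced to a stable key such as a component name paired with an advisory identifier (the identifiers below are invented):

```python
def compare_runs(before, after):
    """Classify findings from two pipeline runs as fixed, remaining, or new."""
    return {
        "fixed": sorted(before - after),
        "remaining": sorted(before & after),
        "new": sorted(after - before),
    }


# Hypothetical finding keys: (component, advisory identifier).
first_run = {("contact-form", "ADV-001"), ("old-slider", "ADV-002")}
second_run = {("old-slider", "ADV-002"), ("gallery", "ADV-003")}
retest = compare_runs(first_run, second_run)
```

A fix counts as verified only when its key lands in `fixed`; anything in `new` is what the scheduled runs exist to catch.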
