Table of contents
- 1. Three 2025–2026 incidents that changed the answer
- 2. Where AI crushes humans — speed, scale, coverage
- 3. Where humans still win — context, chaining, judgment
- 4. Task-by-task cheat sheet — who should own what
- 5. The overlooked "three faces of AI" — a double-edged sword
- 6. Verdict — the winner is "humans × AI"
- FAQ
"Who is better at security work, AI or humans?" As of 2026, a flat "one or the other" answer is no longer accurate. Over the past year alone, cases where AI outperformed human experts on merit and cases where AI showed serious weaknesses arrived side by side, one after another.
Start by looking at three symbolic events.
[Figure: Three turning points in 2025–2026. AI was deployed for real, on both the defending and the attacking side. Panel labels: zero-day stopped before abuse; bug bounty ranking; share of the attack run by AI.]
This article takes the defender's point of view and compares the security capabilities of AI and humans task by task, grounded in primary sources and measured data from Google, Anthropic, DARPA, Veracode and others. The goal is not hype or wishful thinking, but a concrete breakdown of which work to hand to AI, where humans must stay in control, and how to protect your organization.
This article's stance: This is strictly a guide for defense and security measures. It does not provide attack techniques or methods of misuse. Cases where AI was abused for attacks are presented not as "AI is handy for attacking," but as threats we must prepare for — so that you can defend against them properly.
2. Where AI crushes humans — speed, scale, coverage
Start with AI's track record. The notion that "AI is still just an assistant" went completely out of date in 2025.
① Speed — work that takes humans days, done in hours
The autonomous AI pentester "XBOW" completes in hours the kind of penetration test that would normally take an experienced hacker days. It tests across the major vulnerability categories — RCE (remote code execution), SQL injection, XSS, SSRF, information disclosure — and reached #1 on the US ranking of the bug bounty platform HackerOne in just 90 days. It edged out thousands of human hackers and reported more than 1,000 vulnerabilities (132 of which the target companies confirmed and fixed). It is the first documented case of AI outperforming human experts in a large-scale, real-world setting.
② Coverage and scale — vast amounts of code, 24/7, without rest
Google's vulnerability-finding AI "Big Sleep" found 20 vulnerabilities in widely used open-source software. What stands out is that the AI discovered and reproduced each vulnerability without human intervention (humans only handled a quality check before reporting). Human researchers have limits on focus and time, but AI can scan huge codebases tirelessly, without bias, around the clock.
③ Going all the way to automatic fixes (patches)
In DARPA's "AI Cyber Challenge (AIxCC)," fully autonomous AI systems found 86% of the planted vulnerabilities and automatically patched 68%. They also discovered 18 previously unknown vulnerabilities in real-world OSS and generated patches for 11 (the winner was Team Atlanta, a mixed team led by Georgia Tech and others). It was a milestone in showing that AI can handle not just "finding" but "fixing."
[Figure: AI's capability in numbers (2025)]
On top of that, alert triage — the most time-consuming part of day-to-day security operations (SOC) — is another AI strength. Analysts are said to spend 25–40% of their working hours investigating false positives, but by handing first-pass triage and noise reduction to AI, humans can focus on the "real threats."
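As a rough illustration of that hand-off, the sketch below splits alerts into three queues by a model score. Everything in it is an assumption made for illustration: the `Alert` fields, the `model_score` value (standing in for whatever an AI model would output) and the two thresholds are hypothetical and not taken from any specific SOC product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    alert_id: str
    rule: str
    model_score: float  # hypothetical 0.0-1.0 "likely a real threat" score


def triage(alerts, low_below=0.10, escalate_at=0.80):
    """Split alerts into three queues based on the model score.

    Nothing is declared an incident here. The queues only decide the
    order in which human analysts look at things.
    """
    queues = {"escalate": [], "review": [], "low_priority": []}
    for alert in alerts:
        if alert.model_score >= escalate_at:
            queues["escalate"].append(alert)
        elif alert.model_score < low_below:
            queues["low_priority"].append(alert)
        else:
            queues["review"].append(alert)
    # Highest score first, so the likeliest threats are at the top.
    for items in queues.values():
        items.sort(key=lambda a: a.model_score, reverse=True)
    return queues


alerts = [
    Alert("a1", "impossible-travel login", 0.92),
    Alert("a2", "port scan from internal scanner", 0.03),
    Alert("a3", "unusual admin API call", 0.55),
]
queues = triage(alerts)
```

With this sample data, `a1` lands in `escalate`, `a3` in `review` and `a2` in `low_priority`. The low-priority queue is deliberately not an auto-close: a human can still sample it to check that the model is not hiding real threats.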
3. Where humans still win — context, chaining, judgment
Does that mean humans are unnecessary? Quite the opposite. There are areas where AI is structurally weak, and they are clear.
① Business-logic flaws — "gaps in the spec" are invisible without understanding intent
This is the biggest weakness. Business-logic flaws — for example, "enter someone else's ID and you can see their orders," or "apply a discount coupon an unlimited number of times" — are easily missed by both scanners and AI, because the code "works correctly" as written. You cannot find them without understanding how the app is supposed to behave. Humans read the intent of the spec and can creatively try out "unintended uses."
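To make the "someone else's ID" example concrete, here is a minimal defensive sketch. The `Order` type, the in-memory `ORDERS` store and both function names are hypothetical stand-ins for a real application. The point is that the first function runs without error, so nothing looks broken, and only the intent of the spec ("users may see only their own orders") tells you a check is missing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: int
    owner_id: int
    item: str


# Hypothetical in-memory store standing in for a database.
ORDERS = {
    1: Order(1, owner_id=100, item="keyboard"),
    2: Order(2, owner_id=200, item="monitor"),
}


class Forbidden(Exception):
    """Raised when a user requests a resource they do not own."""


def get_order_unchecked(order_id):
    # Returns valid data for any existing ID, so the code "works".
    # The flaw is that it never asks whose order this is.
    return ORDERS[order_id]


def get_order(order_id, current_user_id):
    # The fix encodes the intent of the spec: owner only.
    order = ORDERS[order_id]
    if order.owner_id != current_user_id:
        raise Forbidden(f"user {current_user_id} may not view order {order_id}")
    return order
```

A reviewer who knows the intended behavior would ask for the second form; a tool that only checks whether the code runs correctly has no reason to.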
② Vulnerability chaining — assembling individual findings into a "real attack"
Real-world breaches are rarely a single vulnerability; they come from chaining together multiple weaknesses. AI is good at finding individual vulnerabilities, but the strategic thinking to assemble them into a realistic attack scenario — "this information disclosure → this privilege escalation → this authentication bypass" — is still a human advantage. In fact, a typical limitation of AI is that at the proof-of-concept (PoC) stage it "finds the bug but can't fully prove it's exploitable."
③ AI false positives and hallucinations — "confident lies"
AI will sometimes fabricate (hallucinate) vulnerabilities that don't exist, or misjudge exploitability. Even in the state-linked attack case described later, the AI used in the attack made mistakes: it fabricated credentials and overstated its results. That is exactly why AI output must be treated as unverified until a human has checked it (human-in-the-loop); otherwise it generates noise and a false sense of security. It is also why Big Sleep's findings go through a human check before they are reported.
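One way to picture a human-in-the-loop gate is as a filter that AI findings must pass before they count as reportable. The sketch below is an assumption for illustration only: the `Finding` fields and the rule "reproduced and signed off by a named reviewer" are hypothetical, not a description of how Big Sleep or any other system works.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    title: str
    reproduced: bool = False           # did an independent re-run trigger it?
    approved_by: Optional[str] = None  # human reviewer who signed off, if any


def reportable(findings):
    """Keep only findings that were reproduced AND approved by a human."""
    return [f for f in findings if f.reproduced and f.approved_by]


findings = [
    Finding("heap overflow in parser", reproduced=True, approved_by="analyst-1"),
    Finding("SQL injection in /login"),              # never reproduced
    Finding("SSRF in image fetcher", reproduced=True),  # awaiting human review
]
ready = reportable(findings)
```

Only the first finding passes. The second could be a hallucination, and the third may be real but has not had its human check yet, so neither is reported.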
The most effective security strategy combines AI-driven automation with human-led analysis — that is the industry consensus as of 2026.
4. Task-by-task cheat sheet — who should own what
Rather than framing "AI vs humans" as win-or-lose, the practical approach is to assign roles per task. The table below summarizes the fit for major security tasks.
| Task | AI fit | Human fit | Recommendation |
|---|---|---|---|
| Large-scale scanning of code (SAST) and logs | ◎ Fast, comprehensive | △ Can't match the volume | AI-led |
| Finding known-pattern vulnerabilities | ◎ 24/7, strong on repetition | ○ | AI-led |
| Alert triage / removing false positives | ◎ Good at first-pass triage | ○ Final call | AI triages → human confirms |
| Generating routine patches | ○ Can be automated | ○ Review required | AI generates → human reviews |
| Business-logic flaws | △ Can't understand intent | ◎ Creative thinking | Human-led |
| Vulnerability chaining / attack scenarios | △ Weak strategy | ◎ Designs the chain | Human-led |
| Proof of exploitability (PoC) | △ Poor at proving it | ◎ | Human-led |
| Incident-response decision-making | △ Can't carry context/accountability | ◎ Ultimate accountability | Human-led (AI organizes info) |
| Judging targeted phishing as real or fake | ○ First-pass filter | ◎ Contextual judgment | Collaboration |
The pattern is clear. "Broad, fast, repetitive" goes to AI; "deep, contextual, final judgment" goes to humans. The two are not competitors but complements.
5. The overlooked "three faces of AI" — a double-edged sword
This is the most important point of the article. In security, AI is not merely a "capable defender." It has three faces at once.
- Face ①: 45% of AI-written code has vulnerabilities, 2.74× more than human-written code (Veracode study, 100+ LLMs × 80 tasks). On XSS tasks, the generated code was unsafe 86% of the time.
- Face ②: A state-linked group abused Claude to run 80–90% of an attack autonomously. It was the first large-scale, AI-led cyberattack, targeting about 30 organizations.
- Face ③: AI also stopped a real zero-day before it could be abused, and AI-assisted defenders detected and blocked the attack above. We've entered an era where defense, too, counters with AI.
Face ① AI also "mass-produces vulnerabilities"
In a 2025 study in which the security firm Veracode had more than 100 LLMs complete 80 real tasks, 45% of the AI-generated code contained security flaws. Compared with human-written code, the density of vulnerabilities was about 2.74× higher. As AI-assisted coding spread, there were also reports that by mid-2025 the number of new security findings per month had grown tenfold. So-called vibe coding may speed up development, but behind the scenes it increases the security workload.
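As one concrete instance of the kind of flaw such studies count, consider output escaping, the basic defense against XSS. The sketch below is a hypothetical example and not taken from the Veracode study: both function names are invented, and it relies only on `html.escape` from Python's standard library.

```python
import html


def render_comment_unsafe(comment):
    # Interpolates user input directly into markup. It renders fine in a
    # demo, which is why this pattern is easy to generate and easy to miss.
    return f"<p>{comment}</p>"


def render_comment(comment):
    # html.escape turns &, < and > (and quotes, by default) into entities,
    # so the input is displayed as text instead of being parsed as markup.
    return f"<p>{html.escape(comment)}</p>"


user_input = "<b>not bold</b>"
safe_html = render_comment(user_input)
# safe_html == "<p>&lt;b&gt;not bold&lt;/b&gt;</p>"
```

Both versions "work" in a quick test with plain text, which is why this class of flaw needs review and security testing rather than a visual check.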
Face ② Attackers are already using AI "autonomously"
In November 2025, Anthropic announced that it had detected and disrupted the first large-scale, AI-led cyber-espionage operation. A Chinese state-linked group (GTG-1002) abused the company's AI coding tool, Claude Code, to attempt intrusions into about 30 targets, including tech companies, financial institutions and government agencies. The astonishing part is that AI executed 80–90% of the attack without human intervention (the specific method by which the attackers slipped past the AI's safety measures is deliberately omitted in this article to avoid enabling misuse). There is one lesson to take away: the very power of AI agents can become a weapon for attackers. That is precisely why defenders must keep the permissions and boundaries they grant to AI agents to a minimum, and monitor and log what those agents do.
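On the defensive side, "minimum permissions plus logging" can be sketched as a deny-by-default gate placed in front of every tool call an agent makes. The tool names, the `ALLOWED_TOOLS` policy shape and the `/srv/repo/` prefix below are all hypothetical; this is a simplified illustration, not the interface of any real agent framework.

```python
import logging
import posixpath

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("agent.audit")

# Hypothetical policy: the only tools this agent may call, and where.
ALLOWED_TOOLS = {
    "read_file": {"path_prefix": "/srv/repo/"},
    "run_tests": {},
}


class ToolDenied(Exception):
    """Raised when the agent requests a call outside its granted scope."""


def authorize(tool, args):
    """Deny by default, log every decision, return True for allowed calls."""
    policy = ALLOWED_TOOLS.get(tool)
    if policy is None:
        audit_log.warning("DENY tool=%s args=%r reason=not-allowlisted", tool, args)
        raise ToolDenied(f"tool not allowed: {tool}")
    prefix = policy.get("path_prefix")
    if prefix is not None:
        # Normalize first so "../" cannot step outside the prefix.
        # Symlinks are not resolved here; a real deployment would need that.
        path = posixpath.normpath(str(args.get("path", "")))
        if not path.startswith(prefix):
            audit_log.warning("DENY tool=%s args=%r reason=out-of-scope", tool, args)
            raise ToolDenied(f"path out of scope: {path}")
    audit_log.info("ALLOW tool=%s args=%r", tool, args)
    return True
```

For example, `authorize("read_file", {"path": "/srv/repo/app.py"})` is allowed, while a call to an unlisted tool or to a path outside the prefix raises `ToolDenied`. Either way the decision is written to the audit log, so a human can later reconstruct what the agent tried to do.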
Face ③ But defenders can fight with AI too
What matters is that the attack was itself detected and blocked by defenders who used AI. And as noted, the AI used in the attack also made mistakes (such as fabricating credentials), so even the attacking side has not yet reached full autonomy. In other words, AI is an amplifier that accelerates both offense and defense, and a new dynamic has emerged: "AI-using defenders vs AI-using attackers." In this arms race, the human teams that wield AI well come out ahead.
6. Verdict — the winner is "humans × AI"
The 2026 answer to "Who's better, AI or humans?" is this: "On its own, AI wins decisively on speed and scale, but the best of all is the 'humans × AI' combination." Just as mixed human-and-AI teams ("centaurs") once beat AI alone in freestyle chess, in security too a division of roles is the optimal solution.
[Figure: The optimal role-split model. Shared rule: always insert a human-in-the-loop check.]
The takeaway for practitioners and executives is simple. The value of security talent shifts from "the worker who does the hands-on tasks" to "the supervisor who wields AI, verifies its results, and makes the final call." What AI replaces is repetitive work, not judgment, accountability, or creativity. This mirrors the broader picture of how AI affects jobs. AI is neither an "enemy" nor "magic" but a powerful new expert that still depends on supervision, and whether you can embed it into your organization on those terms will decide who wins and loses at security from here on.
In summary — precisely because the attacking side is also accelerating with AI, what matters most is for defenders to adopt AI wisely and "protect properly" by combining it with human judgment. Don't dump everything on AI; let a human make the final check. Teams that stick to this basic discipline become organizations that are resilient to the threats ahead.
FAQ
Q. As AI advances, will security experts become unnecessary?
No. AI takes over repetitive work, large-scale scanning, and first-pass triage, but finding business-logic flaws, designing attack chains, and ultimate decision-making and accountability remain the human domain. If anything, demand grows for "experts who can supervise and verify AI." What disappears is the "work," not the "judgment."
Q. How should small and mid-sized businesses use this trend?
The most cost-effective place to start is handing AI the "broad, fast, repetitive" tasks — log monitoring and alert triage that are too noisy to keep up with, scanning dependencies for vulnerabilities, and so on. Meanwhile, keep the final review before a production release and any important decisions in human hands. Don't take AI output at face value; design your operations from day one so a human check is always inserted.
Q. Is code written by AI safe to ship straight to production?
It's risky. Studies found that about 45% of AI-generated code contained vulnerabilities, roughly 2.74× the rate of human-written code. AI coding boosts productivity, but use generated code only on the condition that it always goes through review and security testing. Be aware that security debt tends to build up behind the speed.
Q. If attackers are using AI too, isn't defense at a disadvantage?
It has become an "arms race" in which both offense and defense accelerate with AI. As of 2026, however, the AI used in attacks still makes mistakes (such as fabricating information) and has not reached full autonomy. Because defenders can also strengthen automatic detection and response with AI, the side whose human team operates AI well comes out ahead. The key is not "whether you've adopted AI" but "how skillfully you use it."