"Who is better at security work — AI or humans?" As of 2026, answering "one or the other" outright is no longer accurate. In just the past year, cases where AI outperformed human experts on raw merit and cases where AI exposed fatal weaknesses happened at the same time, one after another.

Start with three emblematic events.

Three turning points in 2025–2026: AI was deployed for real, on both the defending and the attacking side

① Stopped a real zero-day before it was abused: Google's "Big Sleep" AI found SQLite's CVE-2025-6965 on its own.
② #1 on the US bug bounty ranking: the autonomous AI "XBOW" topped human hackers on HackerOne's US ranking.
③ AI ran 80–90% of the attack: the first AI-led cyberattack, a state-linked espionage operation that abused Claude.

This article takes the defender's point of view and compares the security capabilities of AI and humans task by task, grounded in primary sources and measured data from Google, Anthropic, DARPA, Veracode and others. The goal is not hype or wishful thinking, but a concrete breakdown of which work to hand to AI, where humans must stay in control, and how to protect your organization.

This article's stance: This is strictly a guide for defense and security measures. It does not provide attack techniques or methods of misuse. Cases where AI was abused for attacks are presented not as "AI is handy for attacking," but as threats we must prepare for — so that you can defend against them properly.

2. Where AI crushes humans — speed, scale, coverage

Start with AI's track record. The notion that "AI is still just an assistant" went completely out of date in 2025.

① Speed — work that takes humans days, done in hours

The autonomous AI pentester "XBOW" completes in hours the kind of penetration test that would normally take an experienced hacker days. It tests across the major vulnerability categories — RCE (remote code execution), SQL injection, XSS, SSRF, information disclosure — and reached #1 on the US ranking of the bug bounty platform HackerOne in just 90 days. It edged out thousands of human hackers and reported more than 1,000 vulnerabilities (132 of which the target companies confirmed and fixed). It is the first documented case of AI outperforming human experts in a large-scale, real-world setting.

② Coverage and scale — vast amounts of code, 24/7, without rest

Google's vulnerability-finding AI "Big Sleep" found 20 vulnerabilities in widely used open-source software. What stands out is that the AI discovered and reproduced each vulnerability without human intervention (humans only handled a quality check before reporting). Human researchers have limits on focus and time, but AI can scan huge codebases tirelessly, without bias, around the clock.

③ Going all the way to automatic fixes (patches)

In DARPA's "AI Cyber Challenge (AIxCC)," fully autonomous AI systems found 86% of the planted vulnerabilities and automatically patched 68%. They also discovered 18 previously unknown vulnerabilities in real-world OSS and generated patches for 11 (the winner was Team Atlanta, a mixed team led by Georgia Tech and others). It was a milestone in showing that AI can handle not just "finding" but "fixing."

AI's capability in numbers (2025)

90 days: time for XBOW to reach #1 on HackerOne in the US
86% / 68%: share of vulnerabilities AI found / auto-fixed at AIxCC
Hours: time for AI to complete one pen test (humans: days)

On top of that, alert triage — the most time-consuming part of day-to-day security operations (SOC) — is another AI strength. Analysts are said to spend 25–40% of their working hours investigating false positives, but by handing first-pass triage and noise reduction to AI, humans can focus on the "real threats."
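As a rough illustration of first-pass triage, the sketch below sorts alerts into three queues by a score. The `Alert` fields, the `score` value (standing in for whatever confidence an AI model would assign), and both thresholds are assumptions made for this example, not values from any real product.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    rule: str
    source_ip: str
    score: float  # hypothetical model confidence that the alert is a real threat (0.0-1.0)


def triage(alerts, dismiss_below=0.2, escalate_above=0.7):
    """Split alerts into three queues by score.

    The thresholds are illustrative assumptions, not recommended values.
    """
    queues = {"auto_closed": [], "needs_review": [], "escalated": []}
    for alert in alerts:
        if alert.score < dismiss_below:
            queues["auto_closed"].append(alert)   # likely noise
        elif alert.score > escalate_above:
            queues["escalated"].append(alert)     # goes to an analyst first
        else:
            queues["needs_review"].append(alert)  # ambiguous: a human decides
    return queues


alerts = [
    Alert("failed-login-burst", "203.0.113.7", 0.91),
    Alert("port-scan", "198.51.100.4", 0.45),
    Alert("known-backup-job", "192.0.2.10", 0.05),
]
queues = triage(alerts)
for name, items in queues.items():
    print(name, [a.rule for a in items])
```

Even in this toy version, nothing in the middle band is closed automatically; the AI only narrows down what the analyst has to look at.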

3. Where humans still win — context, chaining, judgment

Does that mean humans are unnecessary? Quite the opposite. There are areas where AI is structurally weak, and they are clear.

① Business-logic flaws — "gaps in the spec" are invisible without understanding intent

This is the biggest weakness. Business-logic flaws — for example, "enter someone else's ID and you can see their orders," or "apply a discount coupon an unlimited number of times" — are easily missed by both scanners and AI, because the code "works correctly" as written. You cannot find them without understanding how the app is supposed to behave. Humans read the intent of the spec and can creatively try out "unintended uses."
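The "someone else's ID" example can be made concrete with a minimal sketch. Everything here (the `Order` type, the in-memory `ORDERS` store, the function names) is hypothetical; the point is only that the unsafe version runs without error, which is why pattern-based tools tend to see nothing wrong with it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: int
    owner_id: int
    item: str


# Hypothetical in-memory store standing in for a database.
ORDERS = {
    1: Order(1, owner_id=100, item="keyboard"),
    2: Order(2, owner_id=200, item="monitor"),
}


class Forbidden(Exception):
    """Raised when a user asks for data they do not own."""


def get_order_unsafe(order_id):
    # Works "correctly" for any valid ID, including other people's orders.
    return ORDERS[order_id]


def get_order(order_id, current_user_id):
    order = ORDERS[order_id]
    # The rule that lives in the spec, not in the syntax:
    # a user may only read their own orders.
    if order.owner_id != current_user_id:
        raise Forbidden(f"user {current_user_id} may not read order {order_id}")
    return order


print(get_order(1, current_user_id=100).item)
try:
    get_order(2, current_user_id=100)
except Forbidden as exc:
    print("blocked:", exc)
```

The ownership check is a single line, but knowing that it is required depends on understanding what the application is for.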

② Vulnerability chaining — assembling individual findings into a "real attack"

Real-world breaches are rarely a single vulnerability; they come from chaining together multiple weaknesses. AI is good at finding individual vulnerabilities, but the strategic thinking to assemble them into a realistic attack scenario — "this information disclosure → this privilege escalation → this authentication bypass" — is still a human advantage. In fact, a typical limitation of AI is that at the proof-of-concept (PoC) stage it "finds the bug but can't fully prove it's exploitable."

③ AI false positives and hallucinations — "confident lies"

AI will sometimes fabricate (hallucinate) vulnerabilities that don't exist, or misjudge exploitability. Even in the state-linked attack case described later, the AI used in the attack made mistakes: it fabricated credentials and overstated its results. That is exactly why AI output must be verified by a human (human-in-the-loop) before anyone acts on it; unverified output creates noise and a false sense of security. It is also why Big Sleep's findings go through a human check before they are reported.
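One way to picture a human-in-the-loop gate is as a status that only a reviewer can change. The sketch below is a simplified assumption of how such a workflow might look; the `Finding` type, the `review` and `reportable` helpers, and the sample finding titles are all invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    UNVERIFIED = "unverified"
    CONFIRMED = "confirmed"
    REJECTED = "rejected"


@dataclass
class Finding:
    title: str
    status: Status = Status.UNVERIFIED  # every AI finding starts unverified
    reviewer: Optional[str] = None


def review(finding, reviewer, reproduced):
    """Record a human decision: confirmed only if the reviewer reproduced it."""
    finding.status = Status.CONFIRMED if reproduced else Status.REJECTED
    finding.reviewer = reviewer
    return finding


def reportable(findings):
    """Only human-confirmed findings may leave the team."""
    return [f for f in findings if f.status is Status.CONFIRMED]


findings = [
    Finding("SQL injection in /search"),
    Finding("Hard-coded API key in config.py"),
    Finding("Buffer overflow in parser"),
]
review(findings[0], "alice", reproduced=True)
review(findings[1], "alice", reproduced=False)  # hallucinated: rejected
# findings[2] was never reviewed, so it stays unverified.

print([f.title for f in reportable(findings)])
```

The unreviewed third finding is held back just like the rejected one, so the default for AI output is "not trusted yet."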

The most effective security strategy combines AI-driven automation with human-led analysis — that is the industry consensus as of 2026.

4. Task-by-task cheat sheet — who should own what

Rather than framing "AI vs humans" as win-or-lose, the practical approach is to assign roles per task. The table below summarizes the fit for major security tasks.

Task | AI fit | Human fit | Recommendation
Large-scale code/log scanning (SAST) | ◎ Fast, comprehensive | △ Can't match the volume | AI-led
Finding known-pattern vulnerabilities | ◎ 24/7, strong on repetition | - | AI-led
Alert triage / removing false positives | ◎ Good at first-pass triage | ○ Final call | AI triages → human confirms
Generating routine patches | ○ Can be automated | ○ Review required | AI generates → human reviews
Business-logic flaws | △ Can't understand intent | ◎ Creative thinking | Human-led
Vulnerability chaining / attack scenarios | △ Weak strategy | ◎ Designs the chain | Human-led
Proof of exploitability (PoC) | △ Poor at proving it | - | Human-led
Incident-response decision-making | △ Can't carry context/accountability | ◎ Ultimate accountability | Human-led (AI organizes info)
Judging targeted phishing as real or fake | ○ First-pass filter | ◎ Contextual judgment | Collaboration

The pattern is clear. "Broad, fast, repetitive" goes to AI; "deep, contextual, final judgment" goes to humans. The two are not competitors but complements.

5. The overlooked "three faces of AI" — a double-edged sword

This is the most important point of the article. In security, AI is not merely a "capable defender." It has three faces at once.

Face ① A source of vulnerabilities

45% of AI-written code has vulnerabilities, 2.74× more than human-written code (Veracode study, 100+ LLMs × 80 tasks). In 86% of cases, the generated code failed to defend against XSS.

Face ② A tool for attacks

A state-linked group abused Claude to run 80–90% of an attack autonomously. The first large-scale, AI-led cyberattack, targeting about 30 organizations.

Face ③ The strongest defender

AI also stopped a real zero-day before it could be abused, and AI-assisted defenders detected and blocked the attack above. We've entered an era where defense, too, counters with AI.

Face ① AI is also a side that "mass-produces vulnerabilities"

In a 2025 study, the security company Veracode had more than 100 LLMs complete 80 coding tasks, and 45% of the AI-generated code contained security flaws. Compared with human-written code, the density of vulnerabilities was about 2.74× higher. As AI-assisted coding spread, there were also reports that monthly new security findings had risen tenfold by mid-2025. So-called vibe coding may speed up development, but behind the scenes it increases the security workload.
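XSS is a useful example of what reviewers should look for in generated code: user input placed into HTML without escaping. The sketch below uses Python's standard `html.escape`; the two `render_comment` functions are hypothetical, and real applications would normally rely on a template engine with auto-escaping rather than hand-built strings.

```python
import html


def render_comment_unsafe(comment):
    # User input goes into the page as-is, so any markup in it is treated as HTML.
    return f"<p>{comment}</p>"


def render_comment(comment):
    # html.escape converts &, <, > and quotes into HTML entities,
    # so the browser displays the input as text instead of interpreting it.
    return f"<p>{html.escape(comment)}</p>"


user_input = "<script>alert(1)</script>"
print(render_comment(user_input))
# prints: <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Both functions return a string and both "work" on ordinary input, which is part of why this class of flaw slips through when generated code is merged without review.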

Face ② Attackers are already using AI "autonomously"

In November 2025, Anthropic announced that it had detected and disrupted the first large-scale, AI-led cyber-espionage operation. A Chinese state-linked group (GTG-1002) abused the company's AI coding tool, Claude Code, to attempt intrusions into about 30 targets, including tech companies, financial institutions, and government agencies. The astonishing part is that AI executed 80–90% of the attack without human intervention. (How the attackers slipped past the AI's safety measures is deliberately omitted here to avoid enabling misuse.) There is one lesson to take away: the very power of AI agents can become a weapon for attackers. That is precisely why defenders must keep the permissions and boundaries they grant to AI agents to the minimum necessary, and must monitor and log agent behavior.
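On the defensive side, "minimum permissions plus logging" for an AI agent can be sketched as an allowlist wrapper around the tools the agent may call. The `ScopedToolbox` class and the two stand-in tools below are hypothetical and do not reflect any real agent framework's API; they only show the shape of the idea.

```python
import logging

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("agent.audit")


class ToolNotAllowed(Exception):
    """Raised when the agent requests a tool outside its allowlist."""


class ScopedToolbox:
    def __init__(self, tools, allowed):
        unknown = set(allowed) - set(tools)
        if unknown:
            raise ValueError(f"allowlist names unknown tools: {sorted(unknown)}")
        self._tools = tools
        self._allowed = frozenset(allowed)

    def call(self, name, *args):
        # Every request is logged, whether or not it is permitted.
        if name not in self._allowed:
            audit_log.warning("DENIED tool=%s args=%r", name, args)
            raise ToolNotAllowed(name)
        audit_log.info("ALLOWED tool=%s args=%r", name, args)
        return self._tools[name](*args)


# Stand-in tools for the example; they do not touch the real system.
def read_file(path):
    return f"(contents of {path})"


def run_shell(command):
    return f"(ran {command})"


toolbox = ScopedToolbox(
    tools={"read_file": read_file, "run_shell": run_shell},
    allowed={"read_file"},  # this agent's task only needs read access
)

print(toolbox.call("read_file", "README.md"))
try:
    toolbox.call("run_shell", "whoami")
except ToolNotAllowed as exc:
    print("denied:", exc)
```

Denied calls are logged as well as blocked, so a human reviewing the audit trail can see what the agent tried to do, not only what it was permitted to do.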

Face ③ But defenders can fight with AI too

What matters is that the attack was also detected and blocked by a defending side that leveraged AI. And as noted, the AI used in the attack also made mistakes (such as fabricating fake credentials), so even the attacking side has not yet reached full autonomy. In other words, AI is an amplifier that accelerates both offense and defense, and a new dynamic has emerged: "AI-using defenders vs AI-using attackers." In this arms race, the human teams that wield AI well come out ahead.

6. Verdict — the winner is "humans × AI"

The 2026 answer to "Who's better, AI or humans?" is this: "On its own, AI wins decisively on speed and scale, but the best of all is the 'humans × AI' combination." Just as mixed human-and-AI teams ("centaurs") once beat AI alone at chess, in security too a division of roles is the optimal solution.

The optimal role-split model

Hand to AI
Large-scale scanning, known-pattern detection, first-pass triage, repetitive work, drafting routine patches, 24/7 monitoring
Keep with humans
Validating business logic, designing attack chains, final judgment and accountability, verifying AI output, incident-response decisions

Shared rule: always insert human-in-the-loop (a human check)

The takeaway for practitioners and executives is simple. The value of security talent shifts from "the worker who does the hands-on tasks" to "the supervisor who wields AI, verifies its results, and makes the final call." What AI replaces is repetitive work, not judgment, accountability, or creativity. This mirrors the broader picture of how AI affects jobs. Rather than treating AI as an "enemy" or "magic," whether you can embed it into your organization as a powerful but supervision-dependent new expert will decide who wins and loses at security from here on.

In summary — precisely because the attacking side is also accelerating with AI, what matters most is for defenders to adopt AI wisely and "protect properly" by combining it with human judgment. Don't dump everything on AI; let a human make the final check. Teams that stick to this basic discipline become organizations that are resilient to the threats ahead.

FAQ

Q. As AI advances, will security experts become unnecessary?

No. AI takes over repetitive work, large-scale scanning, and first-pass triage, but finding business-logic flaws, designing attack chains, and ultimate decision-making and accountability remain the human domain. If anything, demand grows for "experts who can supervise and verify AI." What disappears is the "work," not the "judgment."

Q. How should small and mid-sized businesses use this trend?

The most cost-effective place to start is handing AI the "broad, fast, repetitive" tasks — log monitoring and alert triage that are too noisy to keep up with, scanning dependencies for vulnerabilities, and so on. Meanwhile, keep the final review before a production release and any important decisions in human hands. Don't take AI output at face value; design your operations from day one so a human check is always inserted.

Q. Is code written by AI safe to ship straight to production?

It's risky. Studies found that about 45% of AI-generated code contained vulnerabilities, roughly 2.74× the rate of human-written code. AI coding boosts productivity, but ship generated code only after it has passed review and security testing, every time. Be aware that security debt tends to build up behind the speed.

Q. If attackers are using AI too, isn't defense at a disadvantage?

It has become an "arms race" in which both offense and defense accelerate with AI. As of 2026, however, the AI used in attacks still makes mistakes (such as fabricating false information) and has not reached full autonomy. Because defenders can also strengthen automatic detection and response with AI, the side that has a human team that operates AI well comes out ahead. The key is not "whether you've adopted AI" but "how skillfully you use it."