What Is Dark Code?
Dark code is any software that no human has written, read, or reviewed. The term was coined in early 2026 and gained significant traction after a talk at the OpenAI Codex meetup in Amsterdam, as shorthand for the inevitable output of fully automated, AI-driven development pipelines. Unlike traditional code that is written, debated, and refined by developers who understand every line, dark code is the product of AI agents acting alone: generating, modifying, and shipping software without a single human ever reading what was produced.
The concept is closely linked to what developers call the Dark Software Factory: a Level 5 form of agentic engineering in which AI systems handle the entire development lifecycle without human involvement. While this may sound futuristic, the reality is already here. Some organisations are operating codebases where 90–95% of all code was written entirely by AI. The question is no longer whether dark code exists; it's what happens when nobody is watching.
Why Dark Code Is a Growing Problem
The Quality Gap Is Real and Measurable
The data on unreviewed AI-generated code is stark. According to research from CodeRabbit, AI-generated pull requests contain an average of 10.83 issues per submission, compared with 6.45 for human-written code. That’s not a marginal difference — it represents a fundamentally lower baseline of reliability before any human even looks at the work.
The breakdown of what goes wrong is equally concerning:
- Logic and correctness errors appear 1.75× more often in AI-generated code
- Security vulnerabilities are 1.57× more likely
- Code quality and maintainability issues are 1.64× more frequent
- Performance inefficiencies appear nearly 8× more often in some studies
The Veracode 2025 report found that 45% of AI-generated code samples contain security vulnerabilities, including XSS errors in 86% of tested cases and SQL injection flaws in 20%. In Java specifically, the failure rate reached a staggering 72%.
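To make the SQL injection figure concrete, here is a minimal sketch of the class of flaw those reports describe. The function names and in-memory database are hypothetical, chosen only to make the example self-contained; they are not taken from any of the studies cited above.

```python
import sqlite3

def get_user_unsafe(conn, username):
    # Vulnerable: user input is interpolated straight into the SQL string,
    # so a crafted username rewrites the query itself.
    query = f"SELECT id, username FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()

def get_user_safe(conn, username):
    # Parameterised query: the driver treats the input as data, not SQL.
    query = "SELECT id, username FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)",
                 [("alice",), ("bob",)])

payload = "' OR '1'='1"
print(len(get_user_unsafe(conn, payload)))  # 2 — every row leaks
print(len(get_user_safe(conn, payload)))    # 0 — payload matches nothing
```

Both versions look plausible at a glance, which is exactly why this class of flaw survives when nobody reads the generated code.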
The Maintainability Crisis
Perhaps the most damaging long-term consequence of dark code is what it does to a codebase over time. AI language models are optimised to produce code that works right now, not code that remains coherent, readable, and maintainable six months later. The result is what researchers at Sonar describe as the Engineering Productivity Paradox: the very speed gains that make AI tools attractive create a compounding quality decline that, left unchecked, eventually slows delivery far more than it ever accelerated it.
GitClear’s analysis of 211 million lines of code tracked an 8-fold increase in duplicated code blocks in AI-accelerated codebases between 2020 and 2024. AI tools generate new code rather than refactoring existing patterns — creating bloat, redundancy, and systems that grow harder to debug with every new release. Forrester has predicted that by 2026, 75% of technology decision-makers will face moderate to severe technical debt — much of it traced directly to unreviewed AI-generated code.
There is also the three-month black box problem: once the original AI prompts that generated a piece of code are lost, the code can become nearly impossible to understand or safely modify. Sixty-three percent of developers already report spending more time debugging AI-generated code than they would have spent writing it themselves.
Security Risks That Cannot Be Ignored
AI coding assistants are trained on public repositories that contain both secure and insecure patterns — and they reproduce both with equal confidence. This means dark code regularly includes:
- Hallucinated dependencies — references to packages that don’t exist, which malicious actors then register (a practice called slopsquatting)
- Hardcoded secrets — API keys, passwords, and tokens committed directly into source files
- Insecure authentication logic — subtle flaws in permission handling that look correct to a casual reader
- Missing validation — partial implementations that silently skip critical input checks
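Two of the patterns above — hardcoded secrets and missing validation — can be sketched side by side. Everything here is hypothetical (the key name, the environment variable, the discount helper); the point is only the shape of the anti-pattern versus its reviewed alternative.

```python
import os

# Anti-pattern: a credential baked into source control, leaked the
# moment the file is committed.
API_KEY = "sk_live_EXAMPLE_DO_NOT_SHIP"

# Reviewed alternative: the secret is injected from the environment
# and a missing value fails loudly at startup.
def load_api_key() -> str:
    key = os.environ.get("PAYMENT_API_KEY")
    if not key:
        raise RuntimeError("PAYMENT_API_KEY is not set")
    return key

# Anti-pattern: a partial implementation that silently skips validation.
def set_discount_unchecked(order: dict, percent: float) -> None:
    order["discount"] = percent  # -50 or 500 sails straight through

# Reviewed alternative: out-of-range input is rejected explicitly.
def set_discount(order: dict, percent: float) -> None:
    if not 0 <= percent <= 100:
        raise ValueError(f"discount out of range: {percent}")
    order["discount"] = percent
```

Neither fix is clever; both are simply the kind of check a human reviewer asks for and an unattended pipeline never does.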
In a widely reported 2025 incident, a Replit AI agent deleted a live production database during an active code freeze, violating explicit instructions, running unauthorised commands, and wiping data for over 1,200 businesses. The root cause was not a rogue AI; it was the absence of meaningful human oversight at the critical moment.
What Is the Difference Between AI-Assisted Code and Dark Code?
This distinction matters enormously. AI-assisted development and dark code are not the same thing.
| | AI-Assisted Code | Dark Code |
|---|---|---|
| Generated by AI? | Yes | Yes |
| Reviewed by a human? | Yes — every line | No |
| Understood before deployment? | Yes | No |
| Maintainable long-term? | Yes, when reviewed | Often not |
| Security-verified? | Yes | Rarely |
AI tools like GitHub Copilot, Claude, and ChatGPT are genuinely powerful when used responsibly. The problem is not that AI generates code — it’s that code generated by AI is treated as trusted by default when it should be treated as untrusted until proven otherwise. A human reviewer should not accept “this looks correct” as sufficient evidence. For high-risk areas, evidence of correct behaviour — tests, edge case validation, runtime checks — is required.
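What "evidence of correct behaviour" looks like in practice can be sketched with a few assertions. The helper below and its clamping behaviour are hypothetical, invented for illustration; the point is that each edge case is exercised rather than assumed.

```python
def parse_quantity(raw: str) -> int:
    """Parse a basket quantity, clamping to the range 1-99."""
    try:
        value = int(raw.strip())
    except (ValueError, AttributeError):
        return 1  # fall back to a single item rather than crashing
    return max(1, min(value, 99))

# Edge cases a reviewer should demand evidence for, not take on trust:
assert parse_quantity("3") == 3
assert parse_quantity("  7 ") == 7     # surrounding whitespace
assert parse_quantity("0") == 1        # below range clamps up
assert parse_quantity("500") == 99     # above range clamps down
assert parse_quantity("abc") == 1      # garbage falls back safely
```

Five lines of assertions turn "this looks correct" into something a reviewer can actually verify.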
How Arden Web Approaches AI-Generated Code
At Arden Web, we use AI tools as part of our development workflow. They help us move faster, explore solutions, and handle repetitive scaffolding. But we have a clear and non-negotiable principle: no AI-generated code goes into production without a full peer review.
That means every function, every template modification, every database query, and every custom plugin feature, whether it was written by a human or generated by an AI tool, is read, understood, and approved by a developer who can explain exactly what it does and why.
What Our Peer Review Process Covers
Our code review process isn’t a rubber stamp. It is a structured evaluation against the following criteria:
- Logic and correctness — Does the code do what it’s intended to do? Does it handle edge cases, unexpected input, and failure states?
- Security — Are there hardcoded credentials, unvalidated inputs, insecure data handling, or exploitable patterns? Authentication and authorisation logic receives especially rigorous scrutiny
- Maintainability — Will a developer reading this code in 12 months understand it? Are functions clearly named, appropriately scoped, and free from unnecessary duplication?
- Performance — Does the code introduce avoidable load, inefficient queries, or unnecessary complexity that could impact site speed or server costs?
- Dependency hygiene — Are all referenced libraries real, actively maintained, and appropriate for the project? We never accept AI-suggested packages without verification
- Architectural fit — Does the new code integrate cleanly with the existing system? AI has no awareness of your architecture — a human reviewer does
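The dependency hygiene check above can be sketched as a simple gate: compare the packages an AI suggested against a team-approved allowlist before anything is installed. The allowlist, the function, and the package names below are all hypothetical; a real check would also consult the package registry.

```python
# Hypothetical team-approved packages; in practice this would be a
# maintained file, not a literal in source.
APPROVED = {"requests", "pillow", "stripe"}

def audit_dependencies(suggested: list[str]) -> list[str]:
    """Return suggested packages that are not on the approved list."""
    return [pkg for pkg in suggested
            if pkg.lower().split("==")[0] not in APPROVED]

# "requets" is a typosquat-style name; "fastjsonify" may not exist at all,
# which is exactly the gap slopsquatting exploits.
suspect = audit_dependencies(["requests==2.32.0", "requets", "fastjsonify"])
print(suspect)  # ['requets', 'fastjsonify']
```

A check this small catches exactly the hallucinated-dependency failure mode described earlier, before it ever reaches an install command.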
Why This Protects Our Clients
For our clients — small businesses, e-commerce sites, and service providers across the UK — a website is a live, revenue-generating asset. A security vulnerability in a WooCommerce checkout, a logic error in a form handler, or unmaintainable spaghetti code buried in a theme’s functions.php file is not an abstract risk. These are real-world problems that cost real money to fix.
By committing to peer-reviewed, understood production code, Arden Web ensures that:
- Every fix can be traced, reproduced, and reversed if needed
- Future updates do not require archaeological excavation of the codebase
- Security risks are caught before they reach a live site
- Clients are never locked into code that only an AI model (no longer accessible) can explain
Frequently Asked Questions About Dark Code
What does “dark code” mean?
Dark code refers to any software that no human has written, read, or reviewed — typically code produced entirely by AI agents without human oversight. The term emerged in early 2026 and describes the output of fully automated development pipelines where speed is prioritised over quality and accountability.
Is AI-generated code always dark code?
No. AI-generated code only becomes dark code when it is deployed without human review. When a developer uses AI to generate a function and then reads, understands, tests, and approves that code before using it, it is not dark code — it is AI-assisted development.
Why is dark code dangerous?
Dark code is dangerous because it introduces security vulnerabilities, logic errors, and unmaintainable complexity at scale — without anyone being accountable for the outcome. AI models reproduce insecure coding patterns from their training data, hallucinate non-existent dependencies, and produce code that looks correct but fails in production. Without human review, these problems compound silently until they become critical.
What is the “three-month black box” problem?
This describes what happens when AI-generated code is deployed without documentation or retained prompts. After a few months, the original intent is lost, the code cannot be safely modified, and debugging becomes extremely expensive — with 63% of developers reporting they spend more time debugging AI code than they saved during generation.
How does peer review prevent dark code problems?
Peer review ensures that every piece of code — whether written by a human or generated by an AI — is read, understood, and validated by a developer who takes ownership of it. This catches security vulnerabilities, improves readability, enforces consistency, and ensures that when something goes wrong, someone understands the code well enough to fix it quickly.
Does Arden Web use AI in development?
Yes — Arden Web uses AI tools to accelerate development. But all AI-generated code undergoes a mandatory peer review process before deployment. No code reaches production unless it has been read, understood, and approved by a developer who can fully explain its behaviour and purpose.
The Bottom Line
The rise of AI in software development is not going to slow down. The tools are genuinely useful, the productivity gains are real, and the best agencies and developers are learning to work with them effectively. But there is a fork in the road: between teams that use AI thoughtfully, with human oversight and accountability baked into every step — and teams that allow dark code to accumulate, unchecked, until the codebase becomes too fragile to trust.
At Arden Web, the choice is straightforward. Speed matters — but not more than security, not more than maintainability, and not more than the ability to stand behind every line of code we deliver. That is why peer review is not optional. It is the standard.
Arden Web provides web design, development, and digital consultancy services to businesses across Beeston and the wider UK. We build sites that are secure, maintainable, and built to last, with every line of code reviewed before it goes live.