OpenAI Built an AI That Can Hack Hardened Targets — Now They're Deciding Who Gets to Use It
GPT-5.3-Codex is the first AI model its own maker calls ‘high risk’ for cyber. The Trusted Access program is their answer. Is it enough?
Don’t Hack On Me — Signal February 15, 2026
The Story
On February 5, OpenAI released GPT-5.3-Codex — and quietly made history. It’s the first AI model that OpenAI itself classifies as “High” risk for cybersecurity under their Preparedness Framework. That classification means OpenAI believes the model can automate end-to-end cyber operations against reasonably hardened targets, or automate the discovery and exploitation of operationally relevant vulnerabilities. Read that again. The company that built it is telling you it can hack things.
The capability curve has been steep. OpenAI’s models went from a 27% success rate on capture-the-flag cybersecurity challenges (GPT-5, August 2025) to 76% (GPT-5.1-Codex-Max, November 2025). GPT-5.3-Codex pushes that further. The company also has Aardvark, an agentic security researcher in private beta that scans codebases, reasons over entire repositories, finds vulnerabilities, and proposes patches. Aardvark has already discovered and responsibly disclosed vulnerabilities in open source projects, ten of which received CVE identifiers.
So what’s OpenAI’s answer to releasing a model that can hack hardened targets? A program called Trusted Access for Cyber — an identity and trust-based framework that gates enhanced cyber capabilities behind verification. Vetted security professionals get access. Everyone else gets guardrails. Individual users can verify their identity; enterprises can request trusted access for teams. There’s also an invite-only tier for security researchers who need more permissive models. OpenAI is backing it with $10 million in API credits for defensive cyber research.
Not everyone is satisfied with the safeguards. The Midas Project, an AI safety watchdog, pointed out that GPT-5.3-Codex triggered OpenAI’s own “high risk” threshold but was deployed without the specific misalignment safeguards the Preparedness Framework calls for at that level. OpenAI’s response: those safeguards are only required when high cyber capability occurs in conjunction with long-range autonomy. The model is a powerful tool, not an autonomous agent — the distinction matters.
Source: Trusted Access for Cyber (OpenAI, February 5, 2026)
Who’s Covering This
Fortune — “Unprecedented cybersecurity risks.” Focuses on the tension between capability advancement and safety. (February 5, 2026)
OpenAI System Card — The technical system card detailing the “High” cybersecurity risk classification and mitigation approach. (February 5, 2026)
SC Media — Covers the Trusted Access launch and $10 million in API credits for defensive research. (February 2026)
OpenAI — Strengthening Cyber Resilience — OpenAI’s broader strategy post explaining how they’re planning for models that could develop zero-day exploits against well-defended systems. (February 2026)
OpenAI — Introducing Aardvark — The agentic security researcher that scans codebases and has already found 10 CVEs in open source projects. (October 2025)
If you’re in cybersecurity operations: This is a tools story, and you should think about it the way you think about every powerful tool that’s come through this industry. Cobalt Strike was supposed to be a penetration testing tool. Metasploit was supposed to be a penetration testing tool. Both ended up in the hands of threat actors. That’s going to happen with AI cyber capabilities too — it’s not a question of if, it’s a question of when. The question is whether defenders get to use these tools first.
OpenAI’s Trusted Access program is an attempt to put these capabilities in the hands of the good guys before the bad guys figure it out on their own. If you’re a security practitioner or your team does vulnerability management, pen testing, or code review — apply. Get in early. Start experimenting with what these models can do for your workflows now, because the attackers aren’t waiting for an access program. The $10 million in API credits is real money on the table for defensive research. Take advantage of it.
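If you want a concrete starting point for that experimentation, here’s a minimal sketch of wiring a model into a diff-review workflow using OpenAI’s Python SDK. The model identifier and its availability under Trusted Access are assumptions here, not confirmed details of the program; substitute whatever your verified account actually exposes.

```python
# Minimal sketch: ask a model to flag likely vulnerabilities in a diff.
# Assumption: the "gpt-5.3-codex" identifier and its availability under
# Trusted Access are hypothetical; use whatever model your account exposes.
# Requires: pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def review_diff(diff_text: str) -> str:
    """Return the model's security findings for a unified diff."""
    response = client.chat.completions.create(
        model="gpt-5.3-codex",  # hypothetical identifier, see note above
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a security reviewer. Flag likely vulnerabilities "
                    "in this diff, cite the lines involved, and classify each "
                    "finding by CWE where possible."
                ),
            },
            {"role": "user", "content": diff_text},
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    # A deliberately vulnerable snippet (SQL built via string interpolation)
    sample_diff = (
        "--- a/app/login.py\n"
        "+++ b/app/login.py\n"
        "+    query = f\"SELECT * FROM users WHERE name = '{username}'\"\n"
        "+    cursor.execute(query)\n"
    )
    print(review_diff(sample_diff))
```

The point isn’t this exact harness. It’s to get hands-on with the failure modes, false positives and missed findings alike, before you depend on model-assisted review in production.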
If you’re in leadership: The models are getting better — fast. OpenAI went from 27% on CTF challenges to 76% in three months. GPT-5.3-Codex is even better. And OpenAI isn’t the only one: Hacktron AI found the BeyondTrust variant through AI-enabled analysis just weeks ago. This isn’t theoretical anymore. AI systems are finding real vulnerabilities at production scale.
What does that mean for your program? Vulnerability volume is going to increase. AI is going to find more bugs faster, in the hands of good guys doing responsible disclosure and bad guys scanning for exploitable targets alike. Your vulnerability management program needs to be ready for a world where the rate of CVE discovery accelerates. FIRST is already projecting 50,000+ CVEs in 2026 — a record. The organizations that integrate AI into their defensive workflows early will have an advantage. The ones that don’t will be patching faster just to keep up.
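If you’d rather watch that acceleration in your own data than take a projection on faith, the public NVD 2.0 API will give you monthly publication counts. A minimal sketch, assuming unauthenticated access (which NVD rate-limits; request an API key for anything beyond ad-hoc use):

```python
# Minimal sketch: monthly CVE publication counts from the public NVD 2.0
# API, to track the discovery rate yourself.
# Assumption: unauthenticated access, which NVD rate-limits to a handful
# of requests per 30 seconds; request an API key for regular use.
import time

import requests

NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def cves_published(start_iso: str, end_iso: str) -> int:
    """Count CVEs published in the window (extended ISO-8601 timestamps)."""
    resp = requests.get(
        NVD_URL,
        params={
            "pubStartDate": start_iso,
            "pubEndDate": end_iso,
            "resultsPerPage": 1,  # only totalResults is needed, not records
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["totalResults"]


if __name__ == "__main__":
    windows = [
        ("2025-11-01", "2025-12-01"),
        ("2025-12-01", "2026-01-01"),
        ("2026-01-01", "2026-02-01"),
    ]
    for start, end in windows:
        n = cves_published(f"{start}T00:00:00.000", f"{end}T00:00:00.000")
        print(f"{start} to {end}: {n} CVEs published")
        time.sleep(6)  # stay under the unauthenticated rate limit
```

Chart that month over month and you’ll know whether your patch cadence is keeping up with the curve or falling behind it.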
The safety debate around the Midas Project’s criticism is worth watching but shouldn’t distract from the practical reality: OpenAI’s logic makes sense here. A high-capability tool without autonomous agency is still just a tool — it needs a human operator. The risk profile is fundamentally different from an autonomous agent that can chain operations independently. The real risk isn’t the model itself. It’s who gets access and what they do with it.
The bigger picture: Every generation of security tooling follows the same pattern. A powerful capability emerges. It gets built for defense. It ends up in offense. The defenders who adopted early had the advantage; the ones who waited were playing catch-up. We saw it with Metasploit, we saw it with Cobalt Strike, and we’re going to see it with AI cyber capabilities. Sometimes you don’t know what software is going to be used for — even Sam Altman has said they’ve been surprised by how their models get applied. The capability is here. The question isn’t whether AI can hack things — OpenAI just told you it can. The question is whether you’re going to use it to find the holes before someone else does.
This post was researched, drafted, and edited with AI assistance. The analysis and perspective are Marcus’s. See something wrong? Leave a comment.