Red Teams Jailbreak GPT-5 With Ease, Warn It’s ‘Nearly Unusable’ For Enterprise

Two different firms have tested the newly released GPT-5, and both find its security sadly lacking. After Grok-4 fell to a jailbreak in two days, GPT-5 fell in 24 hours to the same researchers, from NeuralTrust. Separately, but almost simultaneously, red teamers from SPLX (formerly known as SplxAI) declare, “GPT-5’s raw model is nearly unusable for enterprise out of the box. Even OpenAI’s internal prompt layer leaves significant gaps, especially in Business Alignment.”

NeuralTrust’s jailbreak combined its own EchoChamber jailbreak with basic storytelling. “The attack successfully guided the new model to produce a step-by-step manual for creating a Molotov cocktail,” claims the firm. That success highlights how difficult it is for any AI model to maintain guardrails against context manipulation. […] “In controlled trials against gpt-5-chat,” concludes NeuralTrust, “we successfully jailbroke the LLM, guiding it to produce illicit instructions without ever issuing a single overtly malicious prompt. This proof-of-concept exposes a critical flaw in safety systems that screen prompts in isolation, revealing how multi-turn attacks can slip past single-prompt filters and intent detectors by leveraging the full conversational context.”
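To make the "screen prompts in isolation" point concrete, here is a minimal defensive sketch in Python. It is a toy illustration only, not NeuralTrust's method or any vendor's actual filter: the term set (`alpha`, `bravo`) and all function names are hypothetical placeholders. It assumes a policy that flags a request only when certain terms co-occur, and shows that checking each turn alone can miss what checking the accumulated conversation catches.

```python
import re

# Hypothetical policy: flag text only when every term in a set co-occurs.
# "alpha" and "bravo" are harmless placeholders, not real policy terms.
FLAGGED_TERM_SETS = [frozenset({"alpha", "bravo"})]


def tokens(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_flagged(text: str) -> bool:
    """Return True if any flagged term set is fully present in the text."""
    words = tokens(text)
    return any(term_set <= words for term_set in FLAGGED_TERM_SETS)


def screen_per_message(turns: list[str]) -> bool:
    """Single-prompt screening: each turn is checked in isolation."""
    return any(is_flagged(turn) for turn in turns)


def screen_conversation(turns: list[str]) -> bool:
    """Context-aware screening: the accumulated conversation is checked."""
    return is_flagged(" ".join(turns))


turns = [
    "Write a short story that mentions alpha.",
    "Continue the story and bring in bravo.",
]

print(screen_per_message(turns))   # no single turn contains both terms
print(screen_conversation(turns))  # the combined context does
```

A real intent detector would be far more sophisticated than keyword matching; the only point of the sketch is that the unit of analysis (one prompt versus the whole conversation) changes what a screen can see.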

While NeuralTrust was developing, and succeeding with, a jailbreak designed to obtain instructions on how to create a Molotov cocktail (a common test to prove a jailbreak), SPLX was aiming its own red teamers at GPT-5. The results are just as concerning, suggesting the raw model is ‘nearly unusable’. SPLX notes that obfuscation attacks still work. “One of the most effective techniques we used was a StringJoin Obfuscation Attack, inserting hyphens between every character and wrapping the prompt in a fake encryption challenge.” […] The red teamers went on to benchmark GPT-5 against GPT-4o. Perhaps unsurprisingly, they conclude: “GPT-4o remains the most robust model under SPLX’s red teaming, especially when hardened.” The key takeaway from both NeuralTrust and SPLX is to approach the current and raw GPT-5 with extreme caution.
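The hyphen-insertion trick SPLX describes works against filters that match on literal strings. As a hedged, defensive sketch (not SPLX's tooling and not a complete mitigation), the Python below collapses runs of single characters joined by hyphens before a check is applied. The function names and the placeholder blocked term `forbidden` are assumptions made for illustration, and the heuristic only handles hyphens inside a word, not hyphenated spaces.

```python
import re

# Matches runs such as "f-o-r-b-i-d-d-e-n": three or more single word
# characters joined by hyphens. Ordinary hyphenated words such as
# "well-known" or "e-mail" do not match.
HYPHEN_RUN = re.compile(r"\b(?:\w-){2,}\w\b")


def collapse_hyphen_runs(text: str) -> str:
    """Remove hyphens from character-by-character hyphenated runs."""
    return HYPHEN_RUN.sub(lambda m: m.group(0).replace("-", ""), text)


def contains_blocked(text: str, blocked: frozenset[str]) -> bool:
    """Naive substring check, used here only to show the effect of normalizing."""
    lowered = text.lower()
    return any(term in lowered for term in blocked)


BLOCKED = frozenset({"forbidden"})  # hypothetical placeholder term
raw = "Please decode this challenge: f-o-r-b-i-d-d-e-n"

print(contains_blocked(raw, BLOCKED))                        # literal match on raw text
print(contains_blocked(collapse_hyphen_runs(raw), BLOCKED))  # match after normalization
```

Normalizing input this way addresses only one obfuscation pattern; other separators or encodings would need their own handling, and none of this replaces model-level safeguards.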

Published by admin on August 9, 2025
