AI Security Research & Chaos Testing

Breaking AI chatbots so you don't have to. Technical guides, open-source tools, and verified vulnerability reports for the LLM era.

Explore Research View Tools

housemonkey — zsh

$ pip install housemonkey

Successfully installed housemonkey-0.1.0

$ housemonkey run --target api.example.com \

--persona jailbreaker

[!] Jailbreaker persona active

Sending 5 escalating injection attempts...

FAIL — System prompt leaked in 12s

FAIL — PII accepted without warning

PASS — Authority escalation blocked

Results: 2 FAIL, 1 PASS | OWASP: LLM01, LLM02

Latest Research

Prompt Injection Mar 25, 2026 · 12 min read

Prompt Injection: What It Is, How It Works, and How to Test for It

Prompt injection is the #1 LLM vulnerability. Learn what prompt injection attacks are, see real examples, understand direct vs indirect types, and test your own chatbot in 90 seconds.

Read article

$ housemonkey run --owasp

> Scanning LLM01..LLM10

> 7/10 mapped to personas

> Report: owasp-audit.json

Security Standard

OWASP LLM Top 10: What Actually Matters in 2026

Separating the hype from the hazards. We break down the most critical vulnerabilities for developers building AI-powered apps.

Sept 08, 2026 8 min read

$ housemonkey run \

--target livechat.com \

--adapter livechat --headed

> Jailbreaker: FAIL (66)

Exploit Tool

How We Jailbroke LiveChat in 3 Minutes

Using our MonkeyWrench automation script to demonstrate persistent session takeover through vulnerable chat widgets.

Aug 29, 2026 4 min read

$ nvidia-smi

> GPU 0: RTX 4090 24GB

$ ollama run llama3:70b

> Loading model weights...

Hacking Lab

Setting Up Your First LLM Pentest Environment

A complete hardware and software list for building a local research rig that can handle Llama-3-70B inference.

Aug 15, 2026 15 min read

$ python extract.py \

--model gpt4 --method topk

> Found 847 PII fragments

> Confidence: 0.94

PII Leaks

Exposing the Shadow Knowledge in AI Weights

New research into extracting training data remnants through high-precision token probability analysis.

Aug 10, 2026 10 min read

$ cat report-q3.json | jq

> total_exploits: 512

> new_techniques: 38

> success_rate: 0.73

Vulnerability Report

State of Jailbreaks: Q3 2026 Industry Report

Consolidating 500+ reported exploits to identify the evolving trends in chatbot circumvention tactics.

Aug 02, 2026 20 min read

Topics

Prompt Injection Red Teaming OWASP PII Leaks Jailbreak LLM Security

The Breach Report

Weekly AI security research, vulnerability disclosures, and chaos testing results. No spam.