AI writes insecure code.
We're fixing that.
AI models are trained on static text — great at syntax, terrible at security. Muence builds reinforcement learning environments that teach models to write secure, robust code.
The Problem
Models have never been penalized for writing insecure code.
Current training pipelines reward functional correctness: does the code run? Security is never part of the reward signal. So models learn to write code that works, but ships SQL injection, broken access control, and hardcoded secrets by default.
The fix isn't more data. It's the right environment — one that gives models a penalty signal when they ship insecure code.
What We Build
Our Offerings
Two interlocking products — the data that feeds the environments, and the environments themselves.
Preference & Eval Data
Comprehensive security preference datasets for reinforcement learning and model evaluation. Human-labeled audit comparisons across real-world vulnerability patterns, exported as DPO-ready training pairs.
- Human preference labels on AI security audits
- DPO-ready JSONL export
- CWE-mapped vulnerability taxonomy
- Scalable data gathering pipeline
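To make "DPO-ready JSONL" concrete, here is a minimal sketch of one preference record. The field names and CWE tag are illustrative assumptions, not the actual export schema:

```python
import json

# Hypothetical DPO record; field names are illustrative, not the real schema.
record = {
    "prompt": "Write a handler that looks up a user by name.",
    # Preferred completion: parameterized query.
    "chosen": "cur.execute('SELECT * FROM users WHERE name = ?', (name,))",
    # Rejected completion: string-built SQL, injectable.
    "rejected": "cur.execute(f\"SELECT * FROM users WHERE name = '{name}'\")",
    "cwe": "CWE-89",  # CWE-mapped vulnerability taxonomy
}

# JSONL export: one JSON object per line.
line = json.dumps(record)
```

Each line of the export is one self-contained pair, so training frameworks can stream the file without loading it whole.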
Custom RL Environments
Sandboxed code execution environments with automated security test suites and a scoring function that rewards models for secure code and penalizes insecure code. They plug directly into post-training pipelines.
- Real-world code repos, fully containerized
- Automated security test suites
- Reward / penalty scoring function
- Plug-and-play with post-training pipelines
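The reward/penalty shape above can be sketched in a few lines. This is a toy scorer under assumed inputs (functional test counts plus a list of security findings); the weights are illustrative, not the production scoring function:

```python
# Minimal sketch of a reward/penalty scorer. Assumes each rollout yields
# functional test results plus findings from an automated security suite.
# The 0.5 penalty weight is an illustrative assumption.

def score(passed: int, total: int, vulnerabilities: list[str]) -> float:
    """Reward functional correctness, penalize each security finding."""
    if total == 0:
        return 0.0
    reward = passed / total                  # up to +1.0 for fully working code
    penalty = 0.5 * len(vulnerabilities)     # flat penalty per finding
    return reward - penalty

# Working code that ships one vulnerability scores below clean working code.
clean = score(10, 10, [])            # 1.0
flawed = score(10, 10, ["CWE-89"])   # 0.5
```

The point of the shape: a model can no longer maximize reward on functionality alone, because every shipped vulnerability subtracts from the score.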
The Pipeline
How we generate training signal
Generate
AI models generate code from real-world prompts — login systems, file uploads, payment flows — without any security guidance.
Audit
Multiple models independently audit the same code for security vulnerabilities. Results are compared side by side.
Label & Train
Human reviewers vote on audit quality. Preference pairs become DPO training data and RL reward signals.
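The label step reduces to turning a human vote over two competing audits into a preference pair. A minimal sketch, with hypothetical function and field names:

```python
# Hypothetical helper: converts a reviewer's vote over two model audits
# of the same code into a DPO-style preference pair.

def to_preference_pair(prompt: str, audit_a: str, audit_b: str, vote: str) -> dict:
    """vote is 'a' or 'b', naming the audit the human reviewer preferred."""
    chosen, rejected = (audit_a, audit_b) if vote == "a" else (audit_b, audit_a)
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

pair = to_preference_pair(
    "Audit this login handler for vulnerabilities.",
    "Flags string-built SQL as injectable (CWE-89).",
    "Reports no issues.",
    vote="a",  # reviewer preferred the audit that caught the flaw
)
```

The same pairs double as RL signal: the chosen/rejected split defines which behavior the reward model should rank higher.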
Talk to the team
Interested in the research, a data partnership, or running your model through our benchmark?