Summary
The prompt injection detector in src/openhuman/prompt_injection/detector.rs normalizes leet-speak (0→o, 1→i, 3→e, etc.) but does NOT handle:
- Cyrillic homoglyphs:
а (U+0430) for Latin a, о (U+043E) for o, etc.
- Fullwidth characters:
ignore passes through undetected
- NFKD decomposition: accented characters like
igñore evade regex rules
- Confusables from UAX#39: dozens of visually-identical characters from other scripts
Location
src/openhuman/prompt_injection/detector.rs — normalize_prompt() function
Impact
High — Trivial bypass of prompt injection detection. An attacker substitutes a single Cyrillic character in "ignore previous instructions" and the regex rules never fire.
Suggested Fix
- Apply Unicode NFKD decomposition before lowercasing
- Add confusable mapping from UAX#39 (at minimum Latin↔Cyrillic)
- Strip all characters from categories Cf, Mn, Mc that aren't essential to meaning
Summary
The prompt injection detector in
src/openhuman/prompt_injection/detector.rsnormalizes leet-speak (0→o, 1→i, 3→e, etc.) but does NOT handle:а(U+0430) for Latina,о(U+043E) foro, etc.ignorepasses through undetectedigñoreevade regex rulesLocation
src/openhuman/prompt_injection/detector.rs—normalize_prompt()functionImpact
High — Trivial bypass of prompt injection detection. An attacker substitutes a single Cyrillic character in "ignore previous instructions" and the regex rules never fire.
Suggested Fix