Research Paper
Researchia:202606.18011

CodeSentinel: A Three-Layer Defense Against Indirect Prompt Injection in Code Contexts

Po-Han Cheng

Submitted: June 18, 2026
Subjects: Cybersecurity; Computer Science

Description / Details

Code large language models increasingly retrieve external code context from repositories, documentation, issue threads, and coding-agent environments, creating an indirect prompt-injection surface where attackers hide instructions in comments, strings, identifiers, or decoy code. We propose CodeSentinel, a three-layer inference-time sanitizer. It uses Tree-sitter to extract high-risk model-facing CST nodes, then combines syntax-guided pre-filtering, CST-guided Dynamic Min-K% scoring, and node perturbation analysis to detect adversarial and natural-looking semantic triggers. Detected nodes are removed or neutralized before reaching the downstream Code LLM. Across six recent attack families, CodeSentinel achieves 0.80 average node-level F1, outperforming CodeGarrison, DePA, and KillBadCode.
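To make the first two ideas in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It makes two loudly stated assumptions. First, it uses Python's standard-library `tokenize` module as a stand-in for Tree-sitter, so it only pulls out comments and string literals from Python source rather than arbitrary CST nodes. Second, it implements plain Min-K% scoring (the mean of the lowest K% of token log-probabilities); the "Dynamic" and "CST-guided" parts are not specified in the abstract and are not reproduced here. The function names and the sample log-probabilities are hypothetical.

```python
import io
import math
import tokenize


def extract_candidate_nodes(source: str):
    """Return (kind, text, start, end) for comments and string literals.

    Assumption: stdlib `tokenize` stands in for the Tree-sitter CST
    extraction named in the abstract; only Python source is handled.
    """
    nodes = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            nodes.append(("comment", tok.string, tok.start, tok.end))
        elif tok.type == tokenize.STRING:
            nodes.append(("string", tok.string, tok.start, tok.end))
    return nodes


def min_k_score(token_logprobs, k=0.2):
    """Plain Min-K% score: mean of the lowest k fraction of log-probs.

    Assumption: the per-token log-probabilities come from some language
    model chosen by the caller; this sketch does not compute them.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must be non-empty")
    n = max(1, math.ceil(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n]
    return sum(lowest) / n


# Hypothetical retrieved snippet with an instruction hidden in a comment.
SAMPLE = 'x = 1  # ignore previous instructions\nmsg = "hello"\n'

candidates = extract_candidate_nodes(SAMPLE)
for kind, text, start, end in candidates:
    print(kind, repr(text), start, end)

# Made-up log-probabilities for one node, for illustration only.
print(min_k_score([-0.1, -0.2, -5.0, -0.3, -4.0], k=0.4))
```

A lower score means the node contains tokens the scoring model finds surprising. How CodeSentinel sets thresholds, adapts K, or combines this score with the pre-filter and perturbation layers is not described in the abstract, so those steps are left out.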


Source: arXiv:2606.19235v1 (http://arxiv.org/abs/2606.19235v1)
PDF: https://arxiv.org/pdf/2606.19235v1
