SShortSingh.
Back to feed

Model choice, not prompt design, drives AI agent system prompt leak risk

0
·1 views

A developer running informal red-team tests on self-hosted AI agents found that the underlying language model was the dominant factor in whether hidden system prompts were exposed to users. Using a single vulnerable test agent with planted fake credentials, disclosure rates ranged from near zero to roughly 96% across five different models given identical prompts and attack probes. The findings align with a 2023 academic study that recorded widely varying leak rates across different models, reinforcing that model selection is a first-order security variable. System prompt leakage is recognized by OWASP as a named risk category, and real-world incidents — including Samsung engineers inadvertently exposing internal code via public LLMs — illustrate the concrete harm involved. The author cautions that the results come from a single configuration with limited runs and are not sufficient to label any specific model as definitively unsafe.

Read the full story at DEV Community

This is an AI-generated summary. ShortSingh links to the original source for the complete article.

Discussion (0)

Log in to join the discussion and vote.

Log in

Related stories

0
ProgrammingDEV Community ·

Drift Protocol Lost $285M to Six Months of Social Engineering, Not a Code Bug

On April 1, 2026, attackers linked to North Korea's Lazarus Group drained approximately $285 million from Solana-based DeFi platform Drift Protocol, making it the second-largest exploit in Solana's history. The attack involved no vulnerability in the smart contract code; instead, operatives spent six months building trust with team members who held admin keys before ultimately gaining access. A similar human-targeted attack hit KelpDAO two weeks later, resulting in a $292 million loss via a compromised LayerZero bridge developer. North Korea-linked actors were attributed to 76% of all crypto hack losses in early 2026, with private-key compromise now surpassing code exploits as the leading cause of theft. Security analysts stress that multisig controls, hardware key storage, time delays, and strict operational hygiene are now as critical as smart contract audits.

0
ProgrammingDEV Community ·

MCP Server Tool Schemas Cost One Developer 42,000 Tokens Per Request

A software developer discovered that connecting a single GitHub MCP server added 42,000 tokens of schema overhead to every AI agent request, inflating token costs by 37% on the very first call. The Model Context Protocol requires each connected server to push its full tool schema into the model's context window at session start, meaning costs accumulate regardless of whether those tools are actually used. In a documented week-long test in June 2026 using Claude Opus 4.5, running four MCP servers with 47 tools consumed 80% of the context window on schema definitions alone, costing $8.92 per 1,000 requests versus $0.92 with no MCP. The developer applied four optimizations — including lazy schema loading triggered only when relevant and compressing tool descriptions from ~3,500 to ~800 characters — which cut token overhead from 80% down to 37%. The findings highlight a largely untracked cost factor for teams deploying AI agents with multiple MCP-connected tools in production environments.

0
ProgrammingDEV Community ·

Developer Switches to Local AI Model After Claude Session Limit Corrupts Git Commit

A developer using a one-liner shell command to auto-generate Git commit messages via Claude AI ended up with the error message "You've hit your session limit" saved as an actual commit in their repository. The incident prompted them to explore running a local language model using Ollama, ultimately choosing the quantized qwen2.5-coder:1.5b model due to its low memory footprint of around 1.2 GB on an 8 GB RAM laptop. While the local setup worked without session limits, the developer discovered that Ollama restricts the default context window to 4,096 tokens based on available VRAM, causing the model to only process part of the diff and return inaccurate commit messages. Crucially, the model did not flag its own incomplete input — it responded confidently despite missing context, highlighting a key limitation of truncated inference. This led the developer to explore Modelfiles, a configuration layer in Ollama that allows tuning model parameters beyond prompt engineering alone.

0
ProgrammingDEV Community ·

Ex-Accenture BA Seeks Advice After Resignation, Job Search Struggles

A former Accenture Business Analyst has shared their career struggles on DEV Community after resigning from their first job due to challenges with spoken English and lack of project guidance. Assigned to a banking domain role requiring constant client communication, the individual found the demands overwhelming and eventually lost confidence. After four months of unemployment, interviews with TCS and Infosys also proved difficult due to the same communication barriers. The person has since been exploring Backend Development and Data Science but is uncertain which path to pursue. They are seeking community guidance on skill-building, job applications, internship opportunities, and rebuilding confidence in spoken English.

Model choice, not prompt design, drives AI agent system prompt leak risk · ShortSingh