AgentGuard v0.5.5 Adds Cross-Function Taint Tracking to Catch Hidden LLM Vulnerabilities

Security tool AgentGuard has released version 0.5.5 with interprocedural taint analysis, addressing a key blind spot in static application security testing (SAST) scanners. Most existing scanners fail to track tainted user input when it passes through multiple function calls before reaching a large language model, producing false negatives. AgentGuard now builds a catalog of Python functions, identifies LLM sinks, and traces tainted arguments across direct calls and multi-hop chains within the same file. The update ships with 56 passing tests and a 0% false positive rate across 32 benchmark samples. Cross-file call resolution and sanitizer tracking are planned for future phases, and the tool is available via PyPI as dfx-agentguard==0.5.5.
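The cross-function tracing described above can be sketched in a few lines of Python. This is a minimal illustration built on the standard-library ast module, not AgentGuard's actual implementation. The sink name call_llm, the function sink_reaching_params, and the sample functions are all hypothetical, and the sketch only follows positional arguments passed to plainly named functions in a single file.

```python
import ast

# Hypothetical sink list for this sketch; a real scanner would recognise
# actual LLM client calls rather than one made-up function name.
LLM_SINKS = {"call_llm"}

# Hypothetical sample file: user input crosses two helpers before the sink.
SAMPLE = '''
def handle_request(user_input):
    return build_prompt(user_input)

def build_prompt(text):
    return send("Summarize: " + text)

def send(prompt):
    return call_llm(prompt)

def log_event(message):
    print(message)
'''


def sink_reaching_params(source):
    """Map each function name to the parameter positions that reach an LLM sink."""
    tree = ast.parse(source)
    # Step 1: catalog the module-level functions in the file.
    catalog = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    reaches = {name: set() for name in catalog}

    # Step 2: repeat until nothing changes, so multi-hop chains resolve.
    changed = True
    while changed:
        changed = False
        for name, func in catalog.items():
            params = [a.arg for a in func.args.args]
            for call in ast.walk(func):
                if not (isinstance(call, ast.Call) and isinstance(call.func, ast.Name)):
                    continue
                callee = call.func.id
                for pos, arg in enumerate(call.args):
                    # Skip positions that lead neither to a sink nor to a
                    # callee parameter already known to reach one.
                    if callee not in LLM_SINKS and pos not in reaches.get(callee, set()):
                        continue
                    # Any parameter mentioned inside the argument expression
                    # is treated as flowing onward.
                    for ref in ast.walk(arg):
                        if isinstance(ref, ast.Name) and ref.id in params:
                            idx = params.index(ref.id)
                            if idx not in reaches[name]:
                                reaches[name].add(idx)
                                changed = True
    return reaches
```

Calling sink_reaching_params(SAMPLE) marks parameter 0 of handle_request, build_prompt, and send as reaching the sink and leaves log_event empty. A scanner that inspects only the body of handle_request would miss this chain, which is the false negative the release targets. Local variable assignments, keyword arguments, method calls, and sanitizers are deliberately left out of the sketch.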

Read the full story at DEV Community

This is an AI-generated summary. ShortSingh links to the original source for the complete article.

Historical facts that reframe your sense of time, from Oxford to the Aztecs

A Smithsonian Magazine article highlights surprising historical comparisons that challenge common assumptions about the timeline of human civilization. One notable example is that the University of Oxford predates the founding of the Aztec Empire, illustrating how institutions can be older than entire civilizations. The piece presents several such juxtapositions to help readers gain a fresh perspective on how events in history overlap in unexpected ways. The article was shared on Hacker News, where it attracted modest attention with five points and no comments at the time of posting.

Read more at Hacker News

Programming · DEV Community

Osloq AI Agent Reproduces Bugs and Reports Root Causes Without Touching Your Code

Osloq is an AI agent designed to investigate software bugs by reading GitHub issues, tracing code paths, and reproducing problems inside an isolated sandbox environment. Unlike tools such as Devin or Sweep AI that automatically write fixes and open pull requests, Osloq focuses solely on identifying and documenting the root cause, leaving all fix decisions to the developer. Once a bug is reproduced, Osloq generates a detailed report containing screenshots, console logs, call stacks, and a plain-language explanation of what went wrong. The sandbox is destroyed after each investigation, and the tool operates with read-only access to repositories, meaning no code is stored or used for model training. Osloq positions itself as a low-risk "investigator" tool suited for QA teams, open-source maintainers, and safety-critical projects where evidence-based decision-making is a priority.

Read more at DEV Community

Programming · DEV Community

Security Scan Finds 332 Critical Flaws Across LlamaIndex, AutoGen, and CrewAI

A security audit using AgentGuard v0.6.1 uncovered 332 critical vulnerabilities across three widely used AI agent frameworks: LlamaIndex, AutoGen, and CrewAI. LlamaIndex alone accounted for 252 critical findings, including credential exposure in replay logs and unsafe trust boundary handling in its MCP host. CrewAI showed 391 medium-severity findings, with agent data flowing to external endpoints without proper constraints. All three frameworks are in active production use, with some boasting over 30,000 GitHub stars and deployments at Fortune 500 companies. The researchers note that fixes exist for all identified issues, including input validation, sandbox enforcement, and log scrubbing, representing standard application security practices not yet consistently applied to AI agent code.
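Of the fixes listed, log scrubbing is the easiest to illustrate. The following is a minimal sketch using Python's standard logging module, not code taken from any of the audited frameworks. The key names in SECRET_PATTERN and the names scrub and RedactSecrets are assumptions chosen for the example.

```python
import logging
import re

# Hypothetical pattern: the key names below are assumptions for this sketch.
# It matches "name=value" or "name: value" pairs for a few secret-like names.
SECRET_PATTERN = re.compile(
    r"\b(api[_-]?key|token|password)\b(\s*[=:]\s*)(\S+)",
    re.IGNORECASE,
)


def scrub(text):
    """Replace the value of any secret-like key with a fixed placeholder."""
    return SECRET_PATTERN.sub(r"\1\2[REDACTED]", text)


class RedactSecrets(logging.Filter):
    """Logging filter that scrubs the fully formatted message of each record."""

    def filter(self, record):
        # Format first so secrets passed as %-style arguments are also caught.
        record.msg = scrub(record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted
```

Attaching the filter with handler.addFilter(RedactSecrets()) scrubs every record that passes through that handler. A pattern-based filter like this only catches the formats it anticipates, so it complements rather than replaces keeping credentials out of log calls in the first place.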

Read more at DEV Community

Programming · DEV Community

CPMO Playbook Chapter 9: How Product Leaders Should Navigate the Scale Stage

Chapter 9 of 'The CPMO Playbook' by Ali Sadhik Shaik focuses on what scaling means for a Chief Product and Marketing Officer. The chapter outlines team structures at scale, covering roles such as PM, PMM, Growth, Brand, and Ops. It details key operational cadences including weekly reviews, monthly business reviews, and quarterly planning cycles. Core metrics discussed include Net Revenue Retention, segment win rate, and pipeline coverage. The chapter also warns against a common trap: over-optimizing the funnel while the broader product category is shifting.

Read more at DEV Community