Mis-scoped Failure Cache Caused Autonomous Pentesting Agent to Silently Skip Tools


A developer building Halo, an open-source autonomous pentesting agent powered by the Gemma 4 12B local LLM, discovered a state-management bug that caused the agent to permanently skip tools without any error or warning. The agent's failure cache in agent_cache.py used SHA-256 fingerprints keyed only by tool name and target, with no engagement-level scoping. This meant a single failed tool run against one target would globally blacklist that tool across all future, unrelated engagements. The fix involved adding an engagement_id field to the cache key, so failure records are now isolated per session and tools start fresh on new engagements. The incident highlights a broader design risk in agentic systems: caches scoped too broadly can silently degrade an agent's capabilities over time without triggering any obvious errors.
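The scoping fix described above can be illustrated with a minimal sketch. This is not Halo's actual code: the `FailureCache` class, the `failure_key` helper, the field separator, and the sample tool, target, and engagement names are all assumptions made for illustration. Only the general idea comes from the summary, namely a SHA-256 fingerprint that includes an engagement_id alongside the tool name and target.

```python
import hashlib


def failure_key(tool: str, target: str, engagement_id: str) -> str:
    """Return a SHA-256 fingerprint scoped to one engagement (hypothetical helper)."""
    # Join the fields with a separator unlikely to appear in them, so that
    # different field combinations do not collapse into the same string.
    raw = "\x1f".join((engagement_id, tool, target))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class FailureCache:
    """In-memory record of failed tool runs (illustrative, not Halo's class)."""

    def __init__(self) -> None:
        self._failed: set[str] = set()

    def record_failure(self, tool: str, target: str, engagement_id: str) -> None:
        # Store the fingerprint of the failed (engagement, tool, target) triple.
        self._failed.add(failure_key(tool, target, engagement_id))

    def should_skip(self, tool: str, target: str, engagement_id: str) -> bool:
        # A tool is skipped only if it failed within this same engagement.
        return failure_key(tool, target, engagement_id) in self._failed


if __name__ == "__main__":
    cache = FailureCache()
    cache.record_failure("nmap", "10.0.0.5", "engagement-a")
    print(cache.should_skip("nmap", "10.0.0.5", "engagement-a"))  # True
    print(cache.should_skip("nmap", "10.0.0.5", "engagement-b"))  # False
```

Because the engagement identifier is part of the hashed input, a failure recorded in one session produces a different key from the same tool and target in a later session, so the tool is tried again on a new engagement.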

Read the full story at DEV Community

This is an AI-generated summary. ShortSingh links to the original source for the complete article.


Engineer builds DIY mmWave radar system for material classification

A developer named Gauthier Lechevalier has built a millimeter-wave (mmWave) radar system capable of classifying different materials. The project is documented on his personal website, where he shares technical details of the build. mmWave radar operates at high frequencies and can detect material properties beyond simple motion or distance sensing. The project has garnered attention on Hacker News, reflecting growing hobbyist interest in advanced radar technology.

Read more at Hacker News

Programming · DEV Community

Developer Guide: Implementing Role-Based Access Control in Go with SQL

A software developer has published a technical walkthrough detailing how to build a role-based access control (RBAC) system using Go and SQL. The tutorial models a blog platform where three user roles — viewer, editor, and admin — are granted distinct permissions over post-related operations. The implementation relies on a relational database schema with linked tables for roles, permissions, users, and posts to manage access rules. The guide covers database migrations, entity setup, authorization logic, and route configuration for a working RBAC system. As a bonus, the tutorial also demonstrates adding an in-memory cache to reduce repeated database lookups when validating user permissions.

Read more at DEV Community

Programming · DEV Community

Seven AI Coding Tools Ranked on End-to-End App Development Capabilities in 2026

A 2026 analysis evaluated seven AI coding tools — including Bleenk, Replit Agent, Bolt.new, Cursor, and GitHub Copilot — on their ability to cover the full development loop from scaffolding to deployment. The review used a five-stage benchmark requiring scaffold, live preview, end-to-end testing, security audit, and one-click deploy within a single environment. According to the assessment, most tools only address one or two of these stages, forcing developers to manually connect external services for the rest. Bleenk was identified as the only tool covering all five stages natively, while Replit Agent was noted for its limited preview and absence of security auditing. The findings come as 2026 research shows 84% of developers are already using or planning to adopt AI coding tools, highlighting a gap between AI-assisted coding and production-ready app delivery.

Read more at DEV Community

Programming · DEV Community

Developer Tests 7 Hyped AI Coding Tools: 4 Saved Time, 3 Did Not

A developer conducted hands-on testing of seven widely promoted AI tools within real production workflows, not demos or sandbox environments. GitHub Copilot, Cursor, Perplexity, and Warp terminal were found to deliver genuine time savings across code generation, multi-file refactoring, research, and command-line tasks respectively. Cursor stood out for complex, cross-file architectural work, while Perplexity eliminated lengthy documentation searches by returning cited answers quickly. Amazon Q Developer performed well only within AWS-focused stacks, losing out to Copilot on general-purpose tasks. The findings suggest that AI tool value is highly dependent on workflow fit, with specialised tools outperforming generalist alternatives in their respective domains.

Read more at DEV Community