AI coding agent found just 2 of 16 key dependencies in 36,000-file Rails audit

A developer tested a Claude-based coding agent on a real maintainability task using GitLab's open-source Rails monolith, one of the largest Rails codebases, with over 36,000 tracked files. The agent was asked to audit every part of the codebase that depends on the MergeRequest model before a planned rework, and its output was scored against a hand-built gold set of 16 scattered dependents. Without a codebase map, the agent relied on grep-style token searches, which returned tens of thousands of hits that it could only partially sample within its token budget. It produced a confident, well-structured report that cited only real files, but it identified just 2 of the 16 true dependents (12.5%). It missed those linked through shared concerns such as the Issuable module, which never reference MergeRequest by name. Critically, the agent showed no awareness that its audit was incomplete, which the author identifies as the core risk of using such tools on large, convention-heavy codebases.
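The failure mode described above can be illustrated with a toy sketch. Everything in it is hypothetical: the file paths, file bodies, and gold set are invented for illustration and are not taken from GitLab or from the author's benchmark. It only shows why a name-based search misses files that depend on a model through an included concern, and how recall is computed.

```python
import re

# Hypothetical in-memory stand-in for a Rails codebase; paths and bodies are invented.
FILES = {
    "app/models/merge_request.rb": "class MergeRequest < ApplicationRecord\n  include Issuable\nend\n",
    "app/models/concerns/issuable.rb": "module Issuable\n  def open?; end\nend\n",
    "app/services/merge_service.rb": "MergeRequest.find(id).merge!\n",
    "app/helpers/sidebar_helper.rb": "def sidebar_for(record)\n  record.is_a?(Issuable)\nend\n",
    "app/models/user.rb": "class User < ApplicationRecord\nend\n",
}
MODEL = "app/models/merge_request.rb"
# Hypothetical gold set of true dependents for this toy example.
GOLD = {"app/services/merge_service.rb", "app/helpers/sidebar_helper.rb"}


def files_mentioning(token, files):
    """Grep-style pass: paths whose body contains the token as a whole word."""
    pattern = re.compile(rf"\b{re.escape(token)}\b")
    return {path for path, body in files.items() if pattern.search(body)}


def included_concerns(model_path, files):
    """Names pulled in by `include X` lines in the model file."""
    return set(re.findall(r"^\s*include\s+([A-Z]\w*)", files[model_path], flags=re.M))


def recall(found, gold):
    """Fraction of the gold set that was found."""
    return len(found & gold) / len(gold)


# Pass 1: search for the model name only, as a grep-style audit would.
name_only = files_mentioning("MergeRequest", FILES) - {MODEL}

# Pass 2: also follow the concerns the model includes.
concern_aware = set(name_only)
for concern in included_concerns(MODEL, FILES):
    concern_aware |= files_mentioning(concern, FILES) - {MODEL}

print(recall(name_only, GOLD))      # name search alone
print(recall(concern_aware, GOLD))  # after following included concerns
print(2 / 16)                       # the recall reported in the summary above
```

In this toy setup the helper never mentions the model by name, so only the second pass finds it. A real codebase would need far more than one level of concern expansion, which is part of why a codebase map matters.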

Read the full story at DEV Community

This is an AI-generated summary. ShortSingh links to the original source for the complete article.

Related stories

Programming · DEV Community

How to Fetch Job Listings from Lever's Public Postings API Using Node.js

Lever, an applicant tracking system, offers a public Postings API that allows read-only access to a company's published job listings without requiring an API key. Developers only need a company's Lever site slug, which can be found in the career page URL, to query job data in JSON format. The API supports pagination via skip and limit parameters, along with filters for team, department, location, and commitment type. Companies hosted in Europe use a separate EU-region base URL, while US-hosted companies use the standard api.lever.co endpoint. A Node.js code walkthrough demonstrates how to paginate through all postings and normalize the returned data into a structured format.
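The pagination loop described above might look roughly like the following. The original walkthrough uses Node.js; this is a Python sketch instead, not the article's code. The `/v0/postings/{site}` path, the `mode=json` parameter, the EU host name, and the posting field names (`text`, `categories`, `hostedUrl`) are assumptions to verify against Lever's documentation, and `example` is a placeholder slug.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# The summary names api.lever.co; the EU host below is an assumption.
BASE_URLS = {"us": "https://api.lever.co", "eu": "https://api.eu.lever.co"}


def postings_url(site, skip, limit, region="us"):
    """Build one page URL; the path and mode=json are assumed, not confirmed by the summary."""
    query = urlencode({"mode": "json", "skip": skip, "limit": limit})
    return f"{BASE_URLS[region]}/v0/postings/{site}?{query}"


def http_get_json(url):
    """Default fetcher: plain HTTP GET, parsed as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def fetch_all_postings(site, region="us", page_size=100, get_json=http_get_json):
    """Advance `skip` by `page_size` until a short (or empty) page signals the end."""
    postings, skip = [], 0
    while True:
        page = get_json(postings_url(site, skip, page_size, region))
        postings.extend(page)
        if len(page) < page_size:
            return postings
        skip += page_size


def normalize(posting):
    """Flatten one posting; the field names here are assumptions about the response shape."""
    categories = posting.get("categories", {})
    return {
        "id": posting.get("id"),
        "title": posting.get("text"),
        "team": categories.get("team"),
        "location": categories.get("location"),
        "commitment": categories.get("commitment"),
        "url": posting.get("hostedUrl"),
    }
```

Passing `get_json` as a parameter keeps the loop testable without network access; a call such as `fetch_all_postings("example")` would use the real HTTP fetcher.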

Programming · DEV Community

How LLM Function Calling Works: Structured Outputs via Constrained Token Generation

Large language models cannot browse the internet or natively return structured data, but function calling allows them to invoke external tools like APIs in a controlled way. Unlike plain text or basic JSON mode, function calling lets developers define an exact output schema (field names, types, enums, and required fields) that the model must follow. This works through constrained decoding, where the API restricts which tokens the model can generate at each step so that the output matches the specified schema. As a result, function calling is the most reliable of the three main LLM output methods, eliminating the fragile parsing required with free-form text responses. A single user query can trigger multiple sequential tool calls, enabling the model to orchestrate complex, multi-step answers within one interaction.
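Constrained decoding can be illustrated with a toy sketch. It is a deliberate simplification: it works on single characters and a single enum field, whereas real APIs mask model tokens against a full schema. The `toy_score` function is a hypothetical stand-in for model logits, and the sketch assumes no allowed value is a prefix of another.

```python
def allowed_next_chars(prefix, choices):
    """Characters that keep `prefix` a prefix of at least one allowed choice."""
    return {c[len(prefix)] for c in choices if c.startswith(prefix) and len(c) > len(prefix)}


def constrained_decode(score, choices):
    """Greedy decoding in which characters that would break the enum are masked out.

    `score(prefix, char)` is a hypothetical stand-in for model logits.
    Assumes no choice is a prefix of another choice.
    """
    out = ""
    while out not in choices:
        candidates = allowed_next_chars(out, choices)
        # Only schema-valid characters compete; everything else is never considered.
        out += max(sorted(candidates), key=lambda ch: score(out, ch))
    return out


def toy_score(prefix, char):
    """Invented preferences: the 'model' would like to start with 'k' (as in 'kelvin')."""
    return {"k": 3.0, "f": 2.0, "c": 1.0}.get(char, 0.0)


UNITS = ["celsius", "fahrenheit"]  # hypothetical enum from a tool schema
print(constrained_decode(toy_score, UNITS))
```

Although the toy scorer rates "k" highest, that character is never a valid continuation of any enum value, so the decoder can only produce one of the two allowed strings.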

Programming · DEV Community

How to Set Up DMARC Forensic Reports for Email Authentication Debugging

DMARC Forensic Reports (RUF) are detailed failure reports generated when individual emails fail DMARC authentication, helping postmasters diagnose specific delivery and spoofing issues. Unlike aggregate RUA reports, RUF reports include email headers, subject lines, and message snippets, making them powerful but privacy-sensitive debugging tools. Domain owners can enable RUF by adding the 'ruf' tag to their DMARC DNS TXT record, along with the 'fo' tag to control when reports are triggered. Because RUF reports contain sensitive message content, many major email providers disable or heavily redact them by default to limit privacy risks. Experts recommend starting with a 'p=none' policy, using a dedicated secure mailbox for receiving reports, and ensuring compliance with data protection regulations before enabling RUF.
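As an illustration of the record described above, the following sketch assembles a DMARC TXT value with the `ruf` and `fo` tags. The domain and mailbox addresses are placeholders, and `fo=1` (report when any underlying mechanism fails) is one possible setting chosen for the example, not a recommendation from the article.

```python
def dmarc_record(policy="none", rua=None, ruf=None, fo=None):
    """Assemble a DMARC TXT value from the tags discussed above."""
    tags = [("v", "DMARC1"), ("p", policy)]
    if rua:
        tags.append(("rua", f"mailto:{rua}"))  # aggregate reports
    if ruf:
        tags.append(("ruf", f"mailto:{ruf}"))  # forensic (failure) reports
    if fo:
        tags.append(("fo", fo))                # when failure reports are triggered
    return "; ".join(f"{key}={value}" for key, value in tags)


# Placeholder domain and mailboxes, for illustration only.
record = dmarc_record(
    policy="none",
    rua="dmarc-reports@example.com",
    ruf="dmarc-forensics@example.com",
    fo="1",
)
print(f'_dmarc.example.com. IN TXT "{record}"')
```

Whether any forensic reports actually arrive still depends on the receiving providers, many of which disable or redact them.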

Programming · DEV Community

DMARC p=reject Boosts Email Security But Cannot Guarantee Full Deliverability

DMARC's p=reject policy instructs receiving mail servers to block unauthenticated emails impersonating a domain, making it a key milestone in email security. However, its effectiveness depends on correct configuration of the underlying protocols, SPF and DKIM, both of which can fail because of email forwarding or header modifications by intermediate servers. Subdomain policy is set separately through the 'sp' tag or a subdomain's own record, and legitimate third-party senders must also be properly aligned or their messages will be blocked. Beyond authentication, sender reputation (shaped by spam complaint rates, bounce rates, and sending history) remains a critical factor that DMARC alone cannot address. Complementary standards such as BIMI, MTA-STS, and TLS Reporting are needed alongside DMARC to build a more complete email security and deliverability framework.