Kubernetes Controllers Can Silently Fail on Stale Tokens While Appearing Healthy


Kubernetes controllers built on client-go can fail silently: when their ServiceAccount token becomes invalid, they retry 401 Unauthorized errors indefinitely instead of restarting to pick up a fresh token. The issue surfaced during a Helm upgrade via ArgoCD of KAI-Scheduler, a CNCF Sandbox project, where deleting and recreating a Config CR invalidated the projected ServiceAccount tokens and left the scheduler stuck in a retry loop. Throughout the incident, the pod continued to show Running and Ready status, and ArgoCD reported the application as Synced and Healthy, masking the problem entirely. The proposed fix wraps the HTTP transport in rest.Config so that the process calls os.Exit on the first 401 response, prompting kubelet to restart the pod's container so that it starts with a valid token. While controller-runtime recently added opt-in support for custom watch error handlers, the article reports that Kubernetes controllers do not handle 401 errors by default, making this a widespread gap worth auditing.
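The transport-wrapping idea can be sketched with the standard library alone. This is a minimal illustration and not the code from the original article: the names exitOn401, wrapExitOn401, and staticStatus are hypothetical, and the exit function is injected so the behavior can be exercised without killing the process. The wrapper has the signature func(http.RoundTripper) http.RoundTripper, which is the shape that the WrapTransport field of client-go's rest.Config accepts.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// exitOn401 is a hypothetical http.RoundTripper wrapper. It forwards each
// request to next and calls exit when the server answers 401 Unauthorized.
type exitOn401 struct {
	next http.RoundTripper
	exit func(code int) // os.Exit in production; replaceable in tests
}

func (t exitOn401) RoundTrip(req *http.Request) (*http.Response, error) {
	resp, err := t.next.RoundTrip(req)
	if err == nil && resp.StatusCode == http.StatusUnauthorized {
		fmt.Fprintln(os.Stderr, "received 401 from API server; exiting so the container restarts")
		t.exit(1)
	}
	return resp, err
}

// wrapExitOn401 could be assigned to rest.Config.WrapTransport in a
// client-go based controller (assumption: not shown here to keep the
// sketch free of external dependencies).
func wrapExitOn401(rt http.RoundTripper) http.RoundTripper {
	return exitOn401{next: rt, exit: os.Exit}
}

// staticStatus is a stand-in transport for the demo: it always answers
// with the given status code and never touches the network.
type staticStatus int

func (s staticStatus) RoundTrip(*http.Request) (*http.Response, error) {
	return &http.Response{StatusCode: int(s), Body: http.NoBody}, nil
}

func main() {
	exited := false
	rt := exitOn401{next: staticStatus(401), exit: func(int) { exited = true }}
	req, _ := http.NewRequest(http.MethodGet, "https://kubernetes.default.svc/api", nil)
	resp, _ := rt.RoundTrip(req)
	fmt.Println(resp.StatusCode, exited) // prints "401 true"
}
```

Exiting on the very first 401 is a blunt policy. A real controller might prefer to count consecutive 401 responses before exiting, but that refinement is not described in the summary above.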

Read the full story at DEV Community

This is an AI-generated summary. ShortSingh links to the original source for the complete article.


React 19 useCallback Stale Closures Can Leak Tenant Data in Multi-Tenant Apps

A developer shipping a multi-tenant AI dashboard discovered that React 19's useCallback memoization can cause stale closures that leak one tenant's data into another tenant's API calls. The bug surfaced in production when rapid tenant switches occurred while async AI requests were still in-flight, resulting in Tenant B's Claude API calls inadvertently using Tenant A's system prompts. Although the dependency array correctly lists tenantId, it only guards against stale values at render time and cannot cancel or correct closures already captured during ongoing async operations. The issue is particularly hard to catch because it leaves no TypeScript errors, no failed requests, and no visible staging-environment symptoms. In SaaS applications, this silent timing flaw constitutes a data isolation violation that can affect hundreds of tenant accounts concurrently.

Read more at DEV Community

Lightchain.ai Flagged as Fraudulent Platform After User Loses $4,321

A user reported losing $4,321.95 on Lightchain.ai after a withdrawal attempt failed and their account balance disappeared. The platform is alleged to be an exit scam that uses fake AI trading bots and malicious smart contracts to block withdrawals. Victims are reportedly pressured into paying fabricated 'upgrade' or 'tax' fees under the false promise of unlocking their funds, but payments yield no results. Security experts warn that any additional fees paid to such platforms only benefit the scammers, who continue to invent new reasons to withhold payouts. Affected users are advised to report incidents to the FBI's Internet Crime Complaint Center at ic3.gov and to avoid unverified third-party recovery services.

Read more at DEV Community

Missing one config line cost a solo dev over $600 in excess AI agent spend

A solo developer auditing 34 days and roughly 26,000 traces of AI agent usage discovered that 22.6% of model-routing decisions deviated from his intended policy, amounting to $1,248 in excess list-price spend. The single largest cost cluster, $657, stemmed from a missing 'model:' line in subagent definition files, which caused cheaper mechanical tasks to silently inherit and run on the more expensive parent model. He also found that live session logs from Claude Code rewrite themselves over time, meaning usage dashboards built on those files can drift; an append-only immutable snapshot resolved the discrepancy. An attempted quality comparison between expensive and cheaper model outputs was inconclusive due to methodological flaws, including context mismatch and self-recognition bias during blind judging. The developer published the full audit, including its failures, as part of an open instrumentation SDK called traceguard, designed to track model usage locally without sending data externally.
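A check for the missing 'model:' line is easy to automate. The sketch below is illustrative only and is not part of traceguard: it assumes that a subagent definition is a text file whose header is delimited by "---" lines and holds "key: value" entries, and both the function name hasModelLine and the sample values ("formatter", "haiku") are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// hasModelLine reports whether the leading "---" delimited header of a
// definition contains a "model:" entry with a non-empty value.
// Assumption: the header must be the first non-blank content in the file.
func hasModelLine(definition string) bool {
	sc := bufio.NewScanner(strings.NewReader(definition))
	inHeader := false
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "---" {
			if inHeader {
				return false // header closed without a model entry
			}
			inHeader = true
			continue
		}
		if !inHeader {
			if line == "" {
				continue // skip blank lines before the header
			}
			return false // content before any header: nothing to inspect
		}
		if strings.HasPrefix(line, "model:") {
			return strings.TrimSpace(strings.TrimPrefix(line, "model:")) != ""
		}
	}
	return false
}

func main() {
	pinned := "---\nname: formatter\nmodel: haiku\n---\nReformat the given file.\n"
	inherited := "---\nname: formatter\n---\nReformat the given file.\n"
	fmt.Println(hasModelLine(pinned), hasModelLine(inherited)) // prints "true false"
}
```

Running such a check over every definition file in CI would flag subagents that silently inherit the parent model, which is the failure the audit describes.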

Read more at DEV Community

Why Most 'Adaptive' Apps Are Just If-Else Logic in Disguise

Many apps marketed as adaptive or personalized rely on simple heuristics (showing harder content after a correct answer and easier content after a wrong one) rather than true adaptive logic. A genuinely adaptive system maintains a continuously updated model of a hidden user variable, such as ability or preference, treating each new action as evidence rather than a standalone trigger. A developer building an exam prep platform identified three essential components for real adaptivity: estimating hidden user state from behavior, selecting the next input calibrated to that estimate, and maintaining a pipeline that keeps generating relevant content. Without all three working together, even a sophisticated statistical model cannot deliver meaningful personalization. The distinction matters because thin underlying models produce unstable, noisy, and non-generalizable experiences regardless of how they are marketed.
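The first two components, estimating a hidden state and selecting the next item against it, can be illustrated with a small sketch. The summary does not say which model the developer uses, so everything here is an assumption: a logistic (Rasch-style) response model, an Elo-like update with a fixed learning rate, and the hypothetical names pCorrect, updateAbility, and nextItem.

```go
package main

import (
	"fmt"
	"math"
)

// pCorrect is the modeled probability that a learner with the given
// ability answers an item of the given difficulty correctly.
func pCorrect(ability, difficulty float64) float64 {
	return 1 / (1 + math.Exp(-(ability - difficulty)))
}

// updateAbility treats the answer as evidence: the estimate moves in
// proportion to how surprising the outcome was under the current model.
func updateAbility(ability, difficulty float64, correct bool, rate float64) float64 {
	outcome := 0.0
	if correct {
		outcome = 1.0
	}
	return ability + rate*(outcome-pCorrect(ability, difficulty))
}

// nextItem returns the index of the item whose difficulty is closest to
// the current ability estimate. It assumes a non-empty pool.
func nextItem(ability float64, difficulties []float64) int {
	best := 0
	for i, d := range difficulties {
		if math.Abs(d-ability) < math.Abs(difficulties[best]-ability) {
			best = i
		}
	}
	return best
}

func main() {
	ability := 0.0
	pool := []float64{-2, -1, 0, 1, 2} // illustrative item difficulties
	for _, correct := range []bool{true, true, false, true} {
		i := nextItem(ability, pool)
		ability = updateAbility(ability, pool[i], correct, 0.5)
		fmt.Printf("difficulty %.0f, correct=%v, ability estimate %.3f\n", pool[i], correct, ability)
	}
}
```

Unlike an if-else rule, the size of each adjustment depends on the running estimate: a correct answer on an item the model expected the learner to miss shifts the estimate far more than a correct answer on an easy item.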

Read more at DEV Community