The Log Is the Agent

Article URL: https://arxiv.org/abs/2605.21997
Comments URL: https://news.ycombinator.com/item?id=48790912
Points: 4
Comments: 0

This is an AI-generated summary. ShortSingh links to the original source for the complete article.

Silero VAD and ONNX Runtime Detect 12 Speech Segments in 14-Second Audio Clip

A developer used the Silero VAD ONNX model with ONNX Runtime's CPU provider to detect speech in a 14.171-second two-speaker MP3 conversation. FFmpeg decoded the audio into a 16 kHz mono waveform, which was then processed in 32-millisecond chunks to generate a speech probability score per chunk. A segment opens when the probability reaches 0.5 and closes when it falls to 0.35, and segments shorter than 250 milliseconds are discarded; with these settings the system identified 12 distinct speech segments. The entire detection process completed in 0.028 seconds on a Mac Studio, a real-time factor of about 0.002x (processing time divided by audio duration). Each detected segment was saved as a separate 16-bit PCM WAV file, with the full reproducible code available in the kiarina/labs GitHub repository.
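The two-threshold rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not the code from the kiarina/labs repository: the function name `find_segments`, its parameters, and the exact closing rule (close as soon as one chunk drops below the lower threshold) are assumptions, and the per-chunk probabilities are taken as given rather than computed with ONNX Runtime.

```python
def find_segments(probs, chunk_ms=32, open_thr=0.5, close_thr=0.35, min_ms=250):
    """Turn per-chunk speech probabilities into (start_ms, end_ms) segments.

    Hypothetical helper: a segment opens when a probability reaches open_thr
    and closes at the first later chunk whose probability is below close_thr.
    Segments shorter than min_ms are dropped.
    """
    segments = []
    start = None  # index of the chunk where the current segment opened
    for i, p in enumerate(probs):
        if start is None and p >= open_thr:
            start = i
        elif start is not None and p < close_thr:
            segments.append((start * chunk_ms, i * chunk_ms))
            start = None
    if start is not None:
        # Audio ended while a segment was still open.
        segments.append((start * chunk_ms, len(probs) * chunk_ms))
    return [(s, e) for s, e in segments if e - s >= min_ms]


# Made-up probabilities, one per 32 ms chunk.
probs = [0.1, 0.6, 0.4, 0.4, 0.6, 0.7, 0.6, 0.5, 0.45, 0.4, 0.3, 0.1, 0.9, 0.2]
print(find_segments(probs))

# Real-time factor from the figures reported in the summary.
print(round(0.028 / 14.171, 3))
```

Using a lower threshold to close than to open keeps a segment from flickering on and off when the probability hovers near 0.5. In the made-up data, the dip to 0.4 stays inside the first segment, while the single-chunk spike at the end is dropped by the minimum-length filter.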

Bilateral AI provenance standard adds agent self-signing to notarized records

A new cryptographic protocol called Bilateral Signature (v0x04) addresses a gap in AI output provenance by requiring an AI agent to sign its own work hash before an independent notary counter-signs it. Previously, the v0x03 standard only proved a hash existed at a given timestamp, but could not confirm the agent actually authored the underlying content. The updated protocol fuses the agent's Ed25519 signature into the notarized record, meaning any forgery would require compromising two separate private keys instead of one. The new version maintains the same 239-byte size, $0.01 cost, and binary layout as its predecessor, with the notary automatically selecting v0x04 when an agent signature is included in the request. All nine existing mainnet records under v0x03 remain valid without migration.

React Explained: Virtual DOM, Components, and State Through a Mall Analogy

A DEV Community article uses a shopping mall metaphor to explain how React works internally, mapping core concepts like components, props, state, and the Virtual DOM to familiar real-world equivalents. Before React, every website update required directly manipulating the live DOM, much like rearranging a shop floor in front of customers — a slow and error-prone process. jQuery sped up these manual changes but did not eliminate the need to plan and execute each update individually. At Facebook's scale, with millions of simultaneous users triggering thousands of DOM updates per second, this approach became unmanageable. React introduced the Virtual DOM — a private design studio where changes are drafted, compared against the current state via a diffing process, and only the minimal necessary updates are applied to the real DOM.
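The draft-compare-apply idea can be illustrated with a toy diff over two tree snapshots. This is a minimal sketch under stated assumptions, not React's actual reconciliation algorithm: nodes are modeled as plain dictionaries with hypothetical `tag`, `props`, and `children` keys, text nodes are strings, children are compared by position only, and the patch tuple format is invented for the example.

```python
def diff(old, new, path="root"):
    """Return a list of patches that would turn the old tree into the new one.

    Hypothetical node shape: {"tag": str, "props": dict, "children": list},
    or a plain string for a text node.
    """
    patches = []
    if old is None:
        patches.append(("create", path, new))
    elif new is None:
        patches.append(("remove", path))
    elif isinstance(old, str) or isinstance(new, str):
        # At least one side is a text node: replace only if they differ.
        if old != new:
            patches.append(("replace", path, new))
    elif old["tag"] != new["tag"]:
        # Different element types: replace the whole subtree.
        patches.append(("replace", path, new))
    else:
        if old.get("props", {}) != new.get("props", {}):
            patches.append(("set_props", path, new.get("props", {})))
        old_kids = old.get("children", [])
        new_kids = new.get("children", [])
        # Walk children by index; a missing child on either side is None.
        for i in range(max(len(old_kids), len(new_kids))):
            o = old_kids[i] if i < len(old_kids) else None
            n = new_kids[i] if i < len(new_kids) else None
            patches.extend(diff(o, n, f"{path}/{i}"))
    return patches


old = {"tag": "ul", "children": [
    {"tag": "li", "children": ["Shoes"]},
    {"tag": "li", "children": ["Hats"]},
]}
new = {"tag": "ul", "children": [
    {"tag": "li", "children": ["Shoes"]},
    {"tag": "li", "children": ["Caps"]},
    {"tag": "li", "children": ["Bags"]},
]}
for patch in diff(old, new):
    print(patch)
```

Only two patches come out of this comparison: one text replacement and one new list item. The unchanged "Shoes" entry produces no work, which is the point of diffing before touching the real DOM.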

Developer proposes 'Token Clustering' theory to explain AI reasoning failures in complex tasks

A developer who has built over 20 AI applications, including a multi-agent gold trading system and a 9-agent YouTube automation pipeline, reports persistent logical breakdowns in GPT-4o and Claude Opus during multi-step reasoning tasks. The failures are not factual errors but appear as inconsistent outputs, broken logic chains, and arithmetic mistakes embedded within larger reasoning flows. The developer says the issues became more noticeable after the GPT-4o update in May 2024 and with specific Claude Opus model versions. The developer hypothesizes that pressure to increase token throughput and reduce latency may cause models to internally 'cluster' semantic groups rather than process tokens with deep sequential attention. This shortcut, termed 'reasoning-token clustering,' may prevent models from fully integrating logical dependencies across complex prompts, leading to gaps in final outputs.
