Study: Using Local AI as Free Executor Actually Raised Cloud Costs in Agentic Coding

A developer-researcher ran 40 controlled trials testing four AI configurations for automated code-repair tasks, comparing solo models against orchestrator-executor pairings. The combination of Opus 4.7 as orchestrator and a locally hosted Qwen 3.5-9B as a zero-cost executor turned out to be the most expensive cloud configuration across all three tasks tested. The higher cost was not driven by executor token usage but by the orchestrator repeatedly re-reading Qwen's returned summaries, causing Opus's input volume to balloon to 1.4–5.3 times that of Opus running alone. Among cloud-only options, the Opus plus Haiku pairing offered the best cost-performance balance, while Haiku running solo was 5.5 times cheaper than Opus solo but failed 25% of trials. The findings, published on Zenodo and GitHub, challenge the widely held assumption that offloading execution to a free local model reduces overall cloud spending in agentic pipelines.
This is an AI-generated summary. ShortSingh links to the original source for the complete article.
Discussion (0)
Log in to join the discussion and vote.
Log in