
How Trino, Spark, and DuckDB each query the same Apache Iceberg table


Apache Iceberg lets multiple query engines read the same table in object storage without duplicating data; the engines differ mainly in how they reach the table's shared metadata. Trino connects through a catalog and offers straightforward SQL for interactive queries, which makes it well suited to shared lakehouse environments. Spark requires additional session configuration, including the Iceberg extensions, but is the preferred choice when queries are part of larger data pipelines that involve transforms or batch writes. DuckDB offers the quickest path to local, read-only inspection by scanning Iceberg metadata files directly, and it can also attach a REST catalog for catalog-backed workflows. Teams building and operating lakehouse architectures benefit from understanding how all three engines interact with the same underlying table.
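To make the contrast concrete, here is a minimal Python sketch of what each engine would need in order to count rows in one shared table. It only builds the SQL strings and the Spark settings; it does not connect to anything. The `IcebergTable` class, the helper function names, the catalog name `lake`, the bucket path, and the REST endpoint are all hypothetical, and the choice of a REST catalog for Spark is an assumption, since the summary does not say which catalog type the original article uses. The Spark property keys and the DuckDB `iceberg_scan` function are the ones documented by Iceberg and DuckDB respectively.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class IcebergTable:
    """Hypothetical coordinates of one shared Iceberg table."""
    catalog: str   # catalog name as registered in Trino and Spark (assumed)
    schema: str
    table: str
    location: str  # table root in object storage, used by DuckDB (assumed)


def trino_query(t: IcebergTable) -> str:
    # Trino resolves the table through a catalog that an administrator
    # has already configured, so the query is plain three-part SQL.
    return f"SELECT count(*) FROM {t.catalog}.{t.schema}.{t.table}"


def spark_session_conf(t: IcebergTable, rest_uri: str) -> Dict[str, str]:
    # Spark needs the Iceberg SQL extensions plus a catalog definition
    # in the session config. A REST catalog is assumed here.
    prefix = f"spark.sql.catalog.{t.catalog}"
    return {
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": rest_uri,
    }


def duckdb_statements(t: IcebergTable) -> List[str]:
    # DuckDB can skip the catalog and scan the table's metadata directly
    # from its storage location, which suits read-only inspection.
    return [
        "INSTALL iceberg",
        "LOAD iceberg",
        f"SELECT count(*) FROM iceberg_scan('{t.location}')",
    ]


if __name__ == "__main__":
    orders = IcebergTable(
        catalog="lake",
        schema="sales",
        table="orders",
        location="s3://example-bucket/warehouse/sales/orders",
    )
    print(trino_query(orders))
    for key, value in spark_session_conf(orders, "http://localhost:8181").items():
        print(f"{key}={value}")
    for statement in duckdb_statements(orders):
        print(statement)
```

In a real setup the Spark settings would be passed to `SparkSession.builder.config(...)` and the DuckDB statements to a `duckdb` connection. Reading from S3 in DuckDB would also need object-store credentials, which this sketch leaves out.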

Read the full story at DEV Community

This is an AI-generated summary. ShortSingh links to the original source for the complete article.


