Hybrid Edge-Cloud AI Architecture Cuts Latency for Mobile Intent Classification

A proposed hybrid architecture aims to improve mobile app performance by running lightweight intent classification models directly on the client device rather than routing every user request to a cloud-hosted large language model. Simple, predictable commands like 'Show my leave balance' or 'Open settings' can be resolved locally, while only ambiguous or complex queries are forwarded to the cloud. This approach reduces response latency, lowers operational costs, decreases dependence on network availability, and keeps routine user data on the device. The architecture is demonstrated using Core ML on iOS but is designed to apply broadly to Android, desktop, and embedded systems. The core argument is that generative AI, despite its capabilities, is not always the appropriate tool for deterministic user commands in enterprise and consumer applications.
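
The routing idea can be sketched in a few lines. This is a minimal illustration, not the original article's Core ML implementation: the keyword table, the `classify_locally` and `route` names, and the 0.9 confidence threshold are all hypothetical stand-ins for a trained on-device classifier.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical keyword rules standing in for a trained on-device model.
LOCAL_INTENTS = {
    "show_leave_balance": {"leave", "balance"},
    "open_settings": {"open", "settings"},
}


@dataclass
class Route:
    handler: str            # "local" or "cloud"
    intent: Optional[str]   # resolved intent when handled locally
    confidence: float


def classify_locally(text: str) -> Tuple[Optional[str], float]:
    """Score each intent by the fraction of its keywords found in the text."""
    words = set(text.lower().replace("?", " ").split())
    best_intent, best_score = None, 0.0
    for intent, keywords in LOCAL_INTENTS.items():
        score = len(keywords & words) / len(keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent, best_score


def route(text: str, threshold: float = 0.9) -> Route:
    """Resolve confident matches on-device; forward everything else to the cloud."""
    intent, confidence = classify_locally(text)
    if intent is not None and confidence >= threshold:
        return Route("local", intent, confidence)
    return Route("cloud", None, confidence)
```

Under these assumptions, `route("Open settings")` is handled locally, while a query such as "Why was my leave request rejected last month?" falls below the threshold and would be forwarded to the cloud model.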

Read the full story at DEV Community

This is an AI-generated summary. ShortSingh links to the original source for the complete article.

Android Power Profiler Is Essential for Optimizing Edge AI Apps, Developers Warned

A technical guide published on DEV Community highlights a critical but often overlooked challenge in Android Edge AI development: thermal throttling and power consumption. When on-device AI models like Gemini Nano are deployed, the CPU, GPU, and NPU together draw significant energy, and sustained high utilization can cause the Android OS to reduce chip clock speeds, sharply degrading inference performance. The article argues that developers who skip the Android Studio Power Profiler are essentially guessing, since real bottlenecks often stem from data movement energy costs rather than raw compute limits. Developers are advised to navigate a trilemma between model accuracy, inference latency, and energy efficiency, aiming for a balanced configuration rather than optimizing any single factor. Google's AICore platform is presented as a major architectural improvement, allowing multiple apps to share a single in-memory copy of Gemini Nano and enabling model updates without APK changes.

0 comments Read more at DEV Community

Developer Builds Auto-Updating Script to Find Working Telegram MTProto Proxies

A developer created an automated script to scrape and verify working MTProto proxies for Telegram, eliminating the need for manual testing. The tool pulls from multiple public proxy channels, tests each one for availability, and outputs results as clean JSON alongside a live web page. GitHub Actions runs the scraper on a schedule, keeping the proxy list continuously updated without human intervention. The first run of the script returned approximately 30 functional proxies with fake TLS support. The project is publicly available on GitHub for users who want to self-host it or simply access the latest proxy list.
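
A rough sketch of the parse-and-verify step might look like the following. It is not the project's actual code: the function names are hypothetical, the link formats (`tg://proxy?...` and `https://t.me/proxy?...`) and the convention that hex secrets beginning with `ee` indicate fake TLS are assumptions not stated in the summary, and a plain TCP connect is only a reachability check, not a full MTProto handshake.

```python
import json
import socket
from dataclasses import asdict, dataclass
from typing import Callable, Iterable, Optional
from urllib.parse import parse_qs, urlparse


@dataclass
class Proxy:
    server: str
    port: int
    secret: str

    @property
    def fake_tls(self) -> bool:
        # Assumption: hex secrets prefixed with "ee" denote fake TLS mode.
        return self.secret.lower().startswith("ee")


def parse_proxy_link(link: str) -> Optional[Proxy]:
    """Parse a proxy link into a Proxy, or return None if it is not one."""
    parsed = urlparse(link.strip())
    is_tg = parsed.scheme == "tg" and parsed.netloc == "proxy"
    is_web = parsed.netloc == "t.me" and parsed.path == "/proxy"
    if not (is_tg or is_web):
        return None
    query = parse_qs(parsed.query)
    try:
        return Proxy(query["server"][0], int(query["port"][0]), query["secret"][0])
    except (KeyError, ValueError):
        return None


def is_reachable(proxy: Proxy, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the proxy opens within the timeout."""
    try:
        with socket.create_connection((proxy.server, proxy.port), timeout=timeout):
            return True
    except OSError:
        return False


def working_proxies_json(
    links: Iterable[str], check: Callable[[Proxy], bool] = is_reachable
) -> str:
    """Keep only parseable links that pass the check and emit them as JSON."""
    results = []
    for link in links:
        proxy = parse_proxy_link(link)
        if proxy is not None and check(proxy):
            results.append({**asdict(proxy), "fake_tls": proxy.fake_tls})
    return json.dumps(results, indent=2)
```

Passing the check as a parameter keeps the filter testable offline; a scheduled job such as the GitHub Actions run described above would call `working_proxies_json` with the default `is_reachable` and publish the output.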

VTEX Exposes a Free Public Catalog API That Most Developers Overlook

Every VTEX-powered store in Brazil, including major retailers like Americanas and Submarino, exposes a public REST API for product catalog data that requires no authentication or API key. The endpoint follows the pattern https://{store-domain}/api/catalog_system/pub/products/search and supports full-text search, pagination, filtering, and sorting parameters. The API returns structured JSON with product, SKU, seller, and pricing data, and it answers paginated requests with HTTP 206 (Partial Content) rather than 200, which can trip up integrators whose clients treat only 200 as success. A long-standing typo in the API names the price object commertialOffer instead of commercialOffer, and correcting it has never been possible without breaking existing integrations. Developers can use this API to build competitor price-monitoring tools by scheduling periodic queries, snapshotting price data, and comparing results to detect changes over time.
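
The snapshot-and-compare workflow could be sketched as follows. Only the endpoint path, the 206 status, and the `commertialOffer` key come from the summary; the query parameter names (`ft`, `_from`, `_to`), the `items`/`itemId`/`sellers`/`Price` field layout, and all function names are assumptions to verify against a real response.

```python
import json
from typing import Dict, List, Tuple
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(store_domain: str, term: str, start: int = 0, end: int = 49) -> str:
    """Build a catalog search URL; the parameter names are assumed, not confirmed."""
    query = urlencode({"ft": term, "_from": start, "_to": end})
    return f"https://{store_domain}/api/catalog_system/pub/products/search?{query}"


def fetch_page(url: str, timeout: float = 10.0) -> List[dict]:
    """Fetch one page, treating both 200 and 206 as success."""
    with urlopen(url, timeout=timeout) as response:
        if response.status not in (200, 206):
            raise RuntimeError(f"unexpected status {response.status}")
        return json.load(response)


def extract_prices(products: List[dict]) -> Dict[str, float]:
    """Map each SKU id to its lowest seller price (note the misspelled key)."""
    prices: Dict[str, float] = {}
    for product in products:
        for item in product.get("items", []):
            offers = [
                seller["commertialOffer"]["Price"]
                for seller in item.get("sellers", [])
                if "Price" in seller.get("commertialOffer", {})
            ]
            if offers and "itemId" in item:
                prices[item["itemId"]] = min(offers)
    return prices


def diff_prices(
    old: Dict[str, float], new: Dict[str, float]
) -> Dict[str, Tuple[float, float]]:
    """Return SKUs present in both snapshots whose price changed, as (old, new)."""
    return {
        sku: (old[sku], price)
        for sku, price in new.items()
        if sku in old and old[sku] != price
    }
```

A scheduled job would call `fetch_page(build_search_url(...))`, store the result of `extract_prices`, and run `diff_prices` against the previous snapshot to flag changes.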

MarketNow Open-Sources Security Audit Revealing Four Critical Payment Vulnerabilities

AI agent marketplace MarketNow conducted four parallel security audits two weeks after its launch, uncovering four critical vulnerabilities in its USDC payment system on the Base blockchain. The flaws included a mandate spending bypass that could allow $500 in purchases against a $10 cap, a transaction hash reuse exploit enabling unlimited free licenses, an underpayment loophole caused by a range-check error, and a missing sender-verification check that allowed transaction hijacking. All four critical issues have since been patched, with fixes including fail-closed license issuance, exact payment matching, transaction deduplication, and wallet address validation. Several medium-severity issues were also resolved, such as open CORS policies, user emails exposed through an API, and weak default secrets. The team acknowledged remaining gaps, including the absence of an independent third-party audit and a rate limiter that works per instance rather than globally, both flagged on its public roadmap.
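
The four fixes named above can be illustrated with a small in-memory sketch. It is not MarketNow's code: the `Transfer` and `LicenseIssuer` names are hypothetical, the transfer is assumed to have already been read from the chain, and a production system would persist seen hashes in shared storage rather than a Python set.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import Set

ADDRESS_PATTERN = re.compile(r"0x[0-9a-fA-F]{40}")


@dataclass(frozen=True)
class Transfer:
    tx_hash: str
    sender: str
    recipient: str
    amount: Decimal


class LicenseIssuer:
    """Hypothetical verifier: any failed or erroring check denies the license."""

    def __init__(self, merchant_wallet: str) -> None:
        self.merchant_wallet = merchant_wallet.lower()
        self.seen_hashes: Set[str] = set()

    def issue(self, transfer: Transfer, buyer_wallet: str, price: Decimal) -> bool:
        try:
            valid = (
                # Wallet address validation: well-formed address from the buyer.
                ADDRESS_PATTERN.fullmatch(transfer.sender) is not None
                and transfer.sender.lower() == buyer_wallet.lower()
                and transfer.recipient.lower() == self.merchant_wallet
                # Exact payment matching: no range check, no tolerance.
                and transfer.amount == price
                # Transaction deduplication: each hash buys one license.
                and transfer.tx_hash not in self.seen_hashes
            )
        except Exception:
            # Fail closed: an unexpected error never grants a license.
            return False
        if not valid:
            return False
        self.seen_hashes.add(transfer.tx_hash)
        return True
```

Recording the hash only after every check passes means a rejected attempt does not burn a legitimate transaction, while a second call with the same hash is refused.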
