Google cuts Spanner cache costs 5% by treating memory as rent, not sunk cost
Google Research published a paper this week detailing a production deployment of linear elastic caching in Spanner, its globally distributed database. The approach, based on the classic ski rental problem, assigns each cached page a TTL at access time rather than relying solely on traditional eviction policies like LRU or LFU. In a fleet-wide rollout, the system reduced cache memory usage by 15.5% and lowered total cache ownership cost by around 5%, while cache misses rose by only 5.5%. To meet Spanner's performance demands, the team implemented the TTL predictor as a shallow decision tree translatable into a few lines of C++, using features such as page size, miss cost, and operation type. The key insight is that memory in a large fleet carries an ongoing cost, and pages should be evicted not just when space runs out but when the cost of retaining them exceeds the cost of a future cache miss.
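To make the rent-versus-miss trade-off concrete, here is a minimal C++ sketch. It is not the paper's code. It assumes the textbook ski-rental break-even rule, under which a page is kept until the memory rent it has accrued equals the one-time cost of a miss. The shallow decision tree shows only the general shape such a predictor could take. Every name, threshold, and TTL value (`PageFeatures`, `OpType`, `kLargePageBytes`, `BreakEvenTtlSeconds`, `PredictTtlSeconds`, `ShouldEvict`) is hypothetical and not taken from Spanner.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical operation types; the summary only says "operation type".
enum class OpType { kPointRead, kScan };

// Hypothetical feature set mirroring the three features named in the summary.
struct PageFeatures {
  std::uint64_t page_bytes;  // size of the cached page in bytes
  double miss_cost;          // one-time cost of refetching the page after a miss
  OpType op;                 // operation that last touched the page
};

// Assumed size cutoff, chosen only for illustration.
constexpr std::uint64_t kLargePageBytes = 64 * 1024;

// Ski-rental break-even point: keeping the page "rents" memory at
// rent_per_byte_sec * page_bytes per second, while evicting it risks
// "buying" a miss at miss_cost. The two costs are equal after this many
// seconds without an access.
double BreakEvenTtlSeconds(const PageFeatures& f, double rent_per_byte_sec) {
  assert(f.page_bytes > 0 && rent_per_byte_sec > 0.0);
  return f.miss_cost / (rent_per_byte_sec * static_cast<double>(f.page_bytes));
}

// A shallow decision tree of depth two. The thresholds and returned TTLs
// are made-up placeholders, not learned values.
double PredictTtlSeconds(const PageFeatures& f) {
  if (f.op == OpType::kScan) {
    // Assumption: scanned pages are less likely to be reused soon.
    return f.page_bytes > kLargePageBytes ? 5.0 : 30.0;
  }
  if (f.miss_cost > 10.0) {
    return 600.0;  // expensive misses justify paying rent for longer
  }
  return f.page_bytes > kLargePageBytes ? 60.0 : 300.0;
}

// A page becomes an eviction candidate once its TTL has elapsed,
// even if the cache still has free space.
bool ShouldEvict(double seconds_since_access, double ttl_seconds) {
  return seconds_since_access >= ttl_seconds;
}
```

As a usage example, a 4-byte page with a miss cost of 8 units and a rent of 0.5 units per byte-second breaks even after 8 / (0.5 × 4) = 4 seconds. A real deployment would derive rent and miss cost from measured fleet prices and latencies, which this sketch does not model.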
This is an AI-generated summary. ShortSingh links to the original source for the complete article.