Paper Proposes Formal Definition of 'Data' to Fix Core Engineering Failures
A new academic paper published on DEV Community argues that computing's informal definition of data as 'raw facts' lacks the structural precision needed for reliable system design. The author traces the word 'datum' from its Latin origins to modern technical usage and identifies an unstated structural implication shared by all common definitions. The paper formalizes data as a function mapping an index set to a value set, subject to three conditions (totality, functionality, and codomain-consistency), and shows that this framework encompasses sequences, tuples, and multisets. Four engineering failure modes (schema drift, type confusion, unsound equality checks, and serialization ambiguity) are linked directly to violations of these formal conditions, each illustrated with a worked example. The paper concludes by distinguishing data from its representation, arguing that this separation is essential for sound reasoning about encoding, type systems, and protocols.
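To make the three conditions concrete, here is a minimal Python sketch. It is not taken from the paper: the function names, the representation of data as a collection of (index, value) pairs, and the exact reading of each condition are assumptions made for illustration, and the paper's own definitions may differ.

```python
from typing import Any, Collection, Hashable, Tuple

# Assumed representation: a candidate datum is a collection of (index, value) pairs.
Pair = Tuple[Hashable, Any]


def is_total(pairs: Collection[Pair], index_set: Collection[Hashable]) -> bool:
    """Totality (assumed reading): every index is assigned at least one value."""
    assigned = {i for i, _ in pairs}
    return all(i in assigned for i in index_set)


def is_functional(pairs: Collection[Pair]) -> bool:
    """Functionality (assumed reading): no index is assigned two different values."""
    seen = {}
    for i, v in pairs:
        if i in seen and seen[i] != v:
            return False
        seen[i] = v
    return True


def is_codomain_consistent(pairs: Collection[Pair], value_set: Collection[Any]) -> bool:
    """Codomain-consistency (assumed reading): every value lies in the value set."""
    return all(v in value_set for _, v in pairs)


def is_well_formed(
    pairs: Collection[Pair],
    index_set: Collection[Hashable],
    value_set: Collection[Any],
) -> bool:
    """All three conditions hold, and no pair uses an index outside index_set."""
    in_domain = all(i in index_set for i, _ in pairs)
    return (
        in_domain
        and is_total(pairs, index_set)
        and is_functional(pairs)
        and is_codomain_consistent(pairs, value_set)
    )


# A length-3 sequence over the value set {"a", "b"}, indexed by {0, 1, 2}.
INDEXES = {0, 1, 2}
VALUES = {"a", "b"}
sequence = [(0, "a"), (1, "b"), (2, "a")]
missing_index = [(0, "a"), (2, "a")]                   # index 1 has no value
conflicting = [(0, "a"), (0, "b"), (1, "a"), (2, "a")]  # index 0 has two values
out_of_range = [(0, "a"), (1, "z"), (2, "a")]           # "z" is not in VALUES
```

Under these assumptions, `sequence` passes `is_well_formed`, while each of the other three fails exactly one check. A record that has lost a field would fail the totality check in the same way as `missing_index`, which is one plausible way a schema-drift-style problem could surface; how the paper itself pairs failure modes with conditions is not stated in this summary.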
This is an AI-generated summary. ShortSingh links to the original source for the complete article.