Why Decision Trees Are Highly Sensitive to Changes in Training Data
Decision Trees are described as 'high variance' models because even small changes in training data can produce drastically different tree structures. This happens because the algorithm selects the best split at each step, and when two features are nearly equal in quality, a minor data shift can alter the root node entirely. Since every subsequent branch depends on earlier splits, one early change cascades through the whole tree, making the model unstable across different data samples. This sensitivity is not a flaw in accuracy but a consequence of the model's flexibility to learn complex patterns without requiring feature scaling or linear assumptions. Understanding this instability is what motivated the development of ensemble techniques like Bagging and Random Forest, which address high variance by combining multiple trees.
This is an AI-generated summary. ShortSingh links to the original source for the complete article.
Discussion (0)
Log in to join the discussion and vote.
Log in