How ResNet's Skip Connections Made 152-Layer Deep Networks Trainable in 2015
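Before 2015, adding more layers to a neural network often made it perform worse: in the ResNet paper's CIFAR-10 experiments, a 56-layer plain network showed higher training error than a 20-layer one. Because the deeper network did worse even on the training set, this was an optimization problem rather than overfitting. It is often attributed to vanishing gradients, although the authors themselves argued that this was unlikely to be the cause, since their plain networks used batch normalization. Kaiming He and colleagues at Microsoft Research introduced ResNet, which addresses the problem by having each block learn only the residual F(x) = H(x) - x rather than the full mapping H(x), then adding the original input back through a shortcut connection, so the block outputs F(x) + x. This identity path acts as a gradient highway: error signals can reach early layers directly instead of passing only through the stacked weight layers. The 2015 ResNet paper used this technique to train networks as deep as 152 layers, and an ensemble of these networks won the ImageNet (ILSVRC) classification competition that year. Skip connections have since become a foundational building block across modern architectures, including U-Nets and Transformer-based large language models.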
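The gradient-highway idea can be illustrated with a toy scalar model. The sketch below is not the paper's architecture, which uses convolutions, batch normalization and ReLU. It assumes a hypothetical residual branch F(x) = w * tanh(x) with an arbitrary weight w = 0.5, stacks 50 identical blocks, and compares the derivative of the output with respect to the input, with and without the shortcut.

```python
import math


def plain_block(x, w):
    """y = F(x): toy layer without a shortcut (F is an assumption, not from the paper)."""
    return w * math.tanh(x)


def residual_block(x, w):
    """y = F(x) + x: the same toy layer with the input added back."""
    return w * math.tanh(x) + x


def input_gradient(x, w, depth, use_skip):
    """Return d(output)/d(input) for a stack of `depth` identical blocks."""
    grad = 1.0
    for _ in range(depth):
        # Local derivative dF/dx evaluated at this block's input.
        local = w * (1.0 - math.tanh(x) ** 2)
        if use_skip:
            # d(F(x) + x)/dx = dF/dx + 1: the identity path contributes the 1.
            grad *= 1.0 + local
            x = residual_block(x, w)
        else:
            grad *= local
            x = plain_block(x, w)
    return grad


plain = input_gradient(x=0.5, w=0.5, depth=50, use_skip=False)
skip = input_gradient(x=0.5, w=0.5, depth=50, use_skip=True)
print(f"plain stack:    {plain:.3e}")
print(f"residual stack: {skip:.3e}")
```

In the plain stack every factor is at most 0.5, so the product shrinks towards zero as depth grows. In the residual stack every factor is at least 1, so the signal reaching the input does not fade. Real networks behave less cleanly than this scalar toy, but the added 1 in each block's derivative is the same mechanism.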
This is an AI-generated summary. ShortSingh links to the original source for the complete article.