  1. What is regularization in plain english? - Cross Validated

    Is regularization really ever used to reduce underfitting? In my experience, regularization is applied on a complex/sensitive model to reduce complexity/sensitivity, but never on a simple/insensitive model to …

  2. L1 & L2 double role in Regularization and Cost functions?

    Mar 19, 2023 · Regularization: penalty for the cost function, L1 as Lasso and L2 as Ridge. Cost/Loss Function: L1 as MAE (Mean Absolute Error) and L2 as MSE (Mean Squared Error). Are [1] and [2] the …

  3. What are Regularities and Regularization? - Cross Validated

    Is regularization a way to ensure regularity? i.e. capturing regularities? Why do ensembling methods like dropout, normalization methods all claim to be doing regularization?

  4. Regularization methods for logistic regression - Cross Validated

    Feb 15, 2017 · Regularization using methods such as Ridge, Lasso, ElasticNet is quite common for linear regression. I wanted to know the following: Are these methods applicable for logistic …

  5. neural networks - Why would regularization reduce training error ...

    Feb 11, 2026 · An answer on this very site states that "regularization (including L2) will increase the error on training set" so observing the obverse is certainly noteworthy.

  6. Why is the L2 regularization equivalent to Gaussian prior?

    Dec 13, 2019 · I keep reading this and intuitively I can see this but how does one go from L2 regularization to saying that this is a Gaussian Prior analytically? Same goes for saying L1 is …

  7. When will L1 regularization work better than L2 and vice versa?

    Nov 29, 2015 · Note: I know that L1 has feature selection property. I am trying to understand which one to choose when feature selection is completely irrelevant. How to decide which regularization (L1 or …

  8. How is adding noise to training data equivalent to regularization?

    Oct 18, 2021 · I've noticed that some people argue that adding noise to training data is equivalent to regularizing our predictor parameters. How is this the case? Some of the examples listed on SE …

  9. Difference between weight decay and L2 regularization

    Apr 6, 2025 · I'm reading Ilya Loshchilov's work on decoupled weight decay and regularization. The big takeaway seems to be that weight decay and $L^2$ norm regularization are the same for SGD …

  10. How does regularization reduce overfitting? - Cross Validated

    Mar 13, 2015 · A common way to reduce overfitting in a machine learning algorithm is to use a regularization term that penalizes large weights (L2) or non-sparse weights (L1) etc. How can such …