Top Mathematics discussions
Will@Recent Questions - Mathematics Stack Exchange - 25d
A recent discussion examines the probabilistic interpretation of linear regression coefficients and contrasts two lines of reasoning: one based on expected values, the other on covariances. The model is Y=aX+ε, where ε is a centered error term independent of X. Reasoning with expected values gives the formula a=E[XY]/E[X^2], built from the expected value of the product of the independent and dependent variables. Its empirical counterpart, with sample averages in place of expectations, is the ordinary least squares (OLS) coefficient for a regression without an intercept.
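The expected-value argument can be written out in a few lines. This is a sketch under the stated model assumptions (ε centered and independent of X, and E[X^2] > 0), with x_i, y_i and the hat notation introduced here for the sample version; it is not taken verbatim from the original question.

```latex
\begin{align*}
Y &= aX + \varepsilon \\
\mathbb{E}[XY] &= a\,\mathbb{E}[X^2] + \mathbb{E}[X\varepsilon] \\
\mathbb{E}[X\varepsilon] &= \mathbb{E}[X]\,\mathbb{E}[\varepsilon] = 0
  \quad \text{(independence, } \mathbb{E}[\varepsilon] = 0\text{)} \\
a &= \frac{\mathbb{E}[XY]}{\mathbb{E}[X^2]} \\
\hat{a} &= \frac{\sum_{i} x_i y_i}{\sum_{i} x_i^2}
  = \arg\min_{a} \sum_{i} (y_i - a x_i)^2
\end{align*}
```

The last line replaces expectations with sample averages (the common factor 1/n cancels), which is the OLS estimate for the model without an intercept.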
The covariance-based reasoning, however, breaks down for models without an intercept term. Although covariance is the usual ingredient for measuring correlation between variables, the formula a=cov(X,Y)/var(X) is the slope of a regression with an intercept, Y=aX+b+ε, and its empirical version does not in general match the OLS coefficient of the no-intercept model. The divergence arises because centering X and Y by their means implicitly fits an intercept; when the model has none, the covariance formula no longer applies. The discussion works through how each formula is derived and why the covariance argument fails without an intercept, and it also considers the distinction between empirical means and population means.
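A small numerical check makes the gap between the two formulas concrete. The sketch below uses hypothetical toy data and helper names (`slope_no_intercept`, `slope_with_intercept`) chosen for illustration; it is a minimal example, not code from the original discussion.

```python
def slope_no_intercept(x, y):
    """OLS slope for y ~ a*x: the sample analogue of E[XY] / E[X^2]."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def slope_with_intercept(x, y):
    """OLS slope for y ~ a*x + b: the sample analogue of cov(X, Y) / var(X)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


# Hypothetical toy data, chosen only so the arithmetic is easy to follow.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 5.0, 4.0]

a_moment = slope_no_intercept(x, y)   # 39 / 30
a_cov = slope_with_intercept(x, y)    # 4 / 5

# Normal equation of the no-intercept fit: residuals are orthogonal to x.
resid_moment = sum(xi * (yi - a_moment * xi) for xi, yi in zip(x, y))
# The covariance slope, used without an intercept, leaves this sum non-zero.
resid_cov = sum(xi * (yi - a_cov * xi) for xi, yi in zip(x, y))

print(a_moment, a_cov, resid_moment, resid_cov)
```

On this sample the two slopes differ, and only the expected-value formula satisfies the normal equation of the no-intercept least-squares problem. The covariance slope would satisfy the normal equations only once the intercept b = mean(y) - a * mean(x) is added back.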