Questions tagged [linear-regression]

For questions about linear regression, an approach for modeling the relationship between a scalar dependent variable y and one or more explanatory variables.

In statistics, linear regression is an approach for modeling the relationship between a scalar dependent variable y and one or more explanatory variables (or independent variables) denoted X. The case of one explanatory variable (independent variable) is called simple linear regression. For more than one explanatory variable (independent variable), the process is called multiple linear regression.

In linear regression, the relationships are modeled using linear predictor functions whose unknown model parameters are estimated from the data. Such models are called linear models. Most commonly, the conditional mean of y given the value of X is assumed to be an affine function of X; less commonly, the median or some other quantile of the conditional distribution of y given X is expressed as a linear function of X. Like all forms of regression analysis, linear regression focuses on the conditional probability distribution of y given X, rather than on the joint probability distribution of y and X, which is the domain of multivariate analysis.

Source: https://en.wikipedia.org/wiki/Linear_regression
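
As a compact statement of the model described above (a sketch assuming the standard setup: $n$ observations, design matrix $X$ with full column rank, and errors $\varepsilon$ with mean zero), the ordinary least squares estimate minimizes the sum of squared residuals:

$$y = X\beta + \varepsilon, \qquad \hat{\beta} = \arg\min_{\beta}\|y - X\beta\|_2^2 = (X^{\mathrm T}X)^{-1}X^{\mathrm T}y.$$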

1308 questions
20 votes, 1 answer

What is the variance of a constant matrix times a random vector?

In this video it is claimed that if the equation of errors in OLS is given by $$u = y - X\beta$$ then in the presence of heteroscedasticity the variance of $u$ will not be constant, $\sigma^2 \times I$, where $I$…
Mario GS • 313
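
A short sketch of the identity underlying this question: for a constant matrix $A$ and a random vector $u$ with covariance matrix $\operatorname{Var}(u)$,

$$\operatorname{Var}(Au) = A\operatorname{Var}(u)A^{\mathrm T},$$

so under homoscedasticity ($\operatorname{Var}(u) = \sigma^2 I$) this reduces to $\sigma^2 A A^{\mathrm T}$, while under heteroscedasticity the middle factor is a general covariance matrix rather than $\sigma^2 I$.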
19 votes, 3 answers

Proof that trace of 'hat' matrix in linear regression is rank of X

I understand that the trace of the projection matrix (also known as the "hat" matrix) $X(X'X)^{-1}X'$ in linear regression is equal to the rank of $X$. How can we prove that from first principles, i.e. without simply asserting that the trace of a…
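
A one-line sketch of the usual argument, assuming $X$ has full column rank $p$ and using the cyclic property of the trace:

$$\operatorname{tr}\bigl(X(X^{\mathrm T}X)^{-1}X^{\mathrm T}\bigr) = \operatorname{tr}\bigl((X^{\mathrm T}X)^{-1}X^{\mathrm T}X\bigr) = \operatorname{tr}(I_p) = p = \operatorname{rank}(X).$$

More generally, the hat matrix is an orthogonal projection onto the column space of $X$, and the trace of a projection equals its rank.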
14 votes, 2 answers

Why is the least squares cost function for linear regression convex?

I was looking at Andrew Ng's machine learning course and for linear regression he defined a hypothesis function to be $h(x) = \theta_0 + \theta_1x_1 + \dots + \theta_nx_n$, where $x$ is a vector of values, so the goal of linear regression is to find…
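
A sketch of the standard convexity argument (up to the positive scaling used in the course): writing the cost in matrix form, its Hessian is positive semidefinite,

$$J(\theta) = \tfrac{1}{2n}\|X\theta - y\|_2^2, \qquad \nabla^2 J(\theta) = \tfrac{1}{n}X^{\mathrm T}X \succeq 0,$$

since $v^{\mathrm T}X^{\mathrm T}Xv = \|Xv\|_2^2 \ge 0$ for every $v$, and a twice-differentiable function with positive semidefinite Hessian is convex.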
12 votes, 1 answer

Derivative of Mean Squared Error

I'm studying from a book and I'm at the linear regression part. The author shows that we have to calculate the derivative of each part of the equation that leads to the loss. But he's using the MSE to calculate the loss, and so I tried to…
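
The book's exact notation is not shown in the excerpt, but for a linear model $\hat y_i = \theta^{\mathrm T}x_i$ on $n$ examples the chain-rule computation gives

$$\operatorname{MSE}(\theta) = \frac{1}{n}\sum_{i=1}^{n}\bigl(\theta^{\mathrm T}x_i - y_i\bigr)^2, \qquad \frac{\partial \operatorname{MSE}}{\partial \theta_j} = \frac{2}{n}\sum_{i=1}^{n}\bigl(\theta^{\mathrm T}x_i - y_i\bigr)x_{ij},$$

with the convention $x_{i0} = 1$ so that $\theta_0$ is the intercept.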
9 votes, 2 answers

When can we say that $A^{\mathrm T} B = B^{\mathrm T} A$?

I was looking at the derivation of the normal equation from here. The author has used the fact that $A^{\mathrm T} B = B^{\mathrm T} A$ to reach the step shown in the image below. Can anyone explain when this is true, or…
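
A short note on when the identity holds: $A^{\mathrm T}B$ and $B^{\mathrm T}A$ are transposes of each other, so they are equal exactly when $A^{\mathrm T}B$ is symmetric. In the normal-equation derivation the relevant products are $1 \times 1$ matrices (scalars), which are trivially symmetric, e.g.

$$\beta^{\mathrm T}X^{\mathrm T}y = \bigl(\beta^{\mathrm T}X^{\mathrm T}y\bigr)^{\mathrm T} = y^{\mathrm T}X\beta.$$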
8 votes, 4 answers

Best Fit Line with 3d Points

Okay, I need to develop an algorithm to take a collection of 3D points with x, y, and z components and find a line of best fit. I found a commonly referenced item from Geometric Tools, but there doesn't seem to be a lot of information to get someone…
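
One standard approach, sketched here without claiming it is the Geometric Tools formulation: fit the line through the centroid of the points in the direction of the dominant principal component. With points $p_1, \dots, p_m \in \mathbb R^3$,

$$\bar p = \frac{1}{m}\sum_{i=1}^{m} p_i, \qquad d = \text{top right singular vector of } \begin{bmatrix} (p_1 - \bar p)^{\mathrm T} \\ \vdots \\ (p_m - \bar p)^{\mathrm T} \end{bmatrix}, \qquad \ell(t) = \bar p + t\,d,$$

which minimizes the sum of squared orthogonal distances from the points to the line (total least squares), rather than the vertical distances of ordinary regression.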
8 votes, 3 answers

Proof of Gauss-Markov theorem

Theorem: Let $Y=X\beta+\varepsilon$ where $$Y\in\mathcal M_{n\times 1}(\mathbb R),$$ $$X\in \mathcal M_{n\times p}(\mathbb R),$$ $$\beta\in\mathcal M_{p\times 1}(\mathbb R ),$$ and $$\varepsilon\in\mathcal M_{n\times 1}(\mathbb R ).$$ We suppose…
Rick • 1,757
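
The excerpt is truncated, so the exact statement is not visible; the standard conclusion of the theorem is that under $\mathrm E[\varepsilon] = 0$, $\operatorname{Var}(\varepsilon) = \sigma^2 I_n$, and $X$ of full column rank, the OLS estimator $\hat\beta = (X^{\mathrm T}X)^{-1}X^{\mathrm T}Y$ is the best linear unbiased estimator: for any other linear unbiased estimator $\tilde\beta = CY$,

$$\operatorname{Var}(\tilde\beta) - \operatorname{Var}(\hat\beta) \succeq 0 \quad \text{(positive semidefinite)}.$$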
7 votes, 1 answer

Explaining the Correlation of Error Terms in Linear Regression Models

I would like to ask for an interpretation, both mathematical and intuitive if possible, of the homoscedasticity of the variance of errors in linear regression models. If there is correlation among the error terms, how would it affect…
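
A brief sketch of the contrast the question is about: the standard model assumes $\operatorname{Var}(\varepsilon) = \sigma^2 I_n$ (uncorrelated, homoscedastic errors). If instead $\operatorname{Var}(\varepsilon) = \Sigma$ with nonzero off-diagonal entries, OLS remains unbiased but its variance changes to

$$\operatorname{Var}(\hat\beta) = (X^{\mathrm T}X)^{-1}X^{\mathrm T}\Sigma X(X^{\mathrm T}X)^{-1} \ne \sigma^2 (X^{\mathrm T}X)^{-1} \text{ in general},$$

so standard errors computed under the independence assumption are often too small and the resulting confidence intervals too narrow.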
7 votes, 4 answers

Why does $A^TAx = A^Tb$ have infinitely many solution algebraically when $A$ has dependent columns?

This is a problem from least squares approximation, where we solve the equation $A^TAx = A^Tb$ when $Ax = b$ is unsolvable. The case I am dealing with is when $A$ has dependent columns, i.e. $A$ is an $m \times n$ matrix whose rank $r$ is smaller than $n$. In…
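
A sketch of the standard argument: $A^{\mathrm T}A$ and $A$ have the same null space, since $A^{\mathrm T}Av = 0$ implies $v^{\mathrm T}A^{\mathrm T}Av = \|Av\|_2^2 = 0$, and $A^{\mathrm T}b$ always lies in the column space of $A^{\mathrm T}A$ (both equal the row space of $A$), so the system is consistent. When $\operatorname{rank}(A) = r < n$ the null space is nontrivial and the solution set is an affine subspace

$$\{x_0 + v : v \in \operatorname{null}(A)\}, \qquad \dim \operatorname{null}(A) = n - r > 0,$$

hence infinitely many solutions.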
7 votes, 0 answers

Perturbation theory for least squares for very different A, b

Consider the least squares problem $f(x;A,b) = \|Ax-b\|_2^2$ and define $x^*$ the minimizer of $f(x;\hat A,\hat b)$, and $\hat x$ the minimizer of $f(x; A_2, b_2)$. I want to put some bound on $\|x^* - \hat x\|$. Looking through Golub/Van Loan, I…
6 votes, 1 answer

Proof of $\frac{1}{n}\mathrm{E} \left[ \| \mathbf{X}\mathbf{\hat{w}} - \mathbf{X}\mathbf{w}^{*} \|^{2}_{2} \right] = \sigma^{2}\frac{d}{n}$

I am trying to find a proof for the MSE of a linear regression: \begin{gather} \frac{1}{n}\mathrm{E} \left[ \| \mathbf{X}\mathbf{\hat{w}} - \mathbf{X}\mathbf{w}^{*} \|^{2}_{2} \right] = \sigma^{2}\frac{d}{n} \end{gather} The variables are defined as…
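
A sketch of the usual derivation, assuming the fixed-design model $\mathbf y = \mathbf X\mathbf w^* + \boldsymbol\varepsilon$ with $\mathrm E[\boldsymbol\varepsilon] = 0$, $\operatorname{Var}(\boldsymbol\varepsilon) = \sigma^2 I_n$, $\mathbf X \in \mathbb R^{n \times d}$ of full column rank, and $\hat{\mathbf w}$ the OLS estimate: with the hat matrix $\mathbf H = \mathbf X(\mathbf X^{\mathrm T}\mathbf X)^{-1}\mathbf X^{\mathrm T}$,

$$\mathbf X\hat{\mathbf w} - \mathbf X\mathbf w^* = \mathbf H\boldsymbol\varepsilon, \qquad \frac{1}{n}\mathrm E\bigl[\|\mathbf H\boldsymbol\varepsilon\|_2^2\bigr] = \frac{\sigma^2}{n}\operatorname{tr}(\mathbf H) = \sigma^2\frac{d}{n},$$

using $\mathbf H^{\mathrm T}\mathbf H = \mathbf H$ and $\mathrm E[\boldsymbol\varepsilon^{\mathrm T}\mathbf H\boldsymbol\varepsilon] = \sigma^2\operatorname{tr}(\mathbf H)$.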
6 votes, 1 answer

Applying the Normal Equations to Solve the Linear Regression Problem

I am new to machine learning and I am currently studying the gradient descent method and its application to linear regression. Gradient descent is an iterative method for finding the linear…
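
For comparison, the closed-form solution the question refers to (assuming $X^{\mathrm T}X$ is invertible) alongside a typical gradient-descent update with learning rate $\alpha$:

$$\hat\theta = (X^{\mathrm T}X)^{-1}X^{\mathrm T}y, \qquad \theta \leftarrow \theta - \frac{\alpha}{n}X^{\mathrm T}(X\theta - y).$$

Both target the same least squares objective; the normal equations solve it in one step, while gradient descent approaches it iteratively.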
6 votes, 0 answers

Prime number intercept

Suppose I arrange my (infinite) list of prime numbers in the following way: $$\begin{array}{c|ccccccc}x_i&2&5&11&17&23&31&\cdots\\\hline y_i&3&7&13&19&29&37&\cdots\end{array}$$ so that if $p_k$ denotes the $k$th prime, $x_i$ contains $p_{2i-1}$ and $y_i$…
6 votes, 1 answer

Bayesian Interpretation for Ridge Regression and the Lasso

I'm working through the book "Introduction to Statistical Learning", and in Chapter 6 on "Linear Model Selection and Regularization" there is a small part about "Bayesian Interpretation for Ridge Regression and the Lasso" that I haven't understood…
Sophil • 405
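
A one-line summary of the interpretation the chapter describes: with a Gaussian likelihood, the ridge and lasso estimates coincide with posterior modes (MAP estimates) under Gaussian and Laplace priors on the coefficients, respectively,

$$\hat\beta_{\text{ridge}} = \arg\max_{\beta}\, p(\beta \mid y) \ \text{with}\ \beta_j \sim \mathcal N(0, \tau^2), \qquad \hat\beta_{\text{lasso}} = \arg\max_{\beta}\, p(\beta \mid y) \ \text{with}\ \beta_j \sim \text{Laplace}(0, b),$$

where the regularization parameter $\lambda$ is determined by the ratio of the noise variance to the prior scale.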
6 votes, 1 answer

Connection Between Orthogonal Projection onto the Unit Simplex and the Softmax Function

Referring to the papers Softmax to Sparsemax and Efficient Projections onto the L1-Ball: what is the relationship between a Euclidean projection onto the probability simplex and applying the Softmax function? Both resulting vectors $\boldsymbol{w}$ will…
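
A brief note on the two maps being compared: both produce a point of the probability simplex $\Delta = \{w : w_i \ge 0,\ \sum_i w_i = 1\}$, but they are different functions,

$$\operatorname{softmax}(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}}, \qquad \operatorname{proj}_{\Delta}(z) = \arg\min_{w \in \Delta} \|w - z\|_2^2;$$

the softmax output always has full support, whereas the Euclidean projection (the sparsemax of the first paper) can set coordinates exactly to zero.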