# Least-squares

In a least-squares, or linear regression, problem, we have measurements $$A \in \mathcal{R}^{m \times n}$$ and $$b \in \mathcal{R}^m$$ and seek a vector $$x \in \mathcal{R}^{n}$$ such that $$Ax$$ is close to $$b$$. Closeness is defined as the sum of the squared differences:

$\sum_{i=1}^m (a_i^Tx - b_i)^2,$

also known as the $$\ell_2$$-norm squared, $$\|Ax - b\|_2^2$$.
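
As a quick numerical illustration (a minimal sketch with arbitrary random data, not taken from the example below), the sum-of-squares form and the squared-norm form of the objective coincide:

```python
import numpy as np

# Arbitrary data, chosen only to illustrate the identity
# sum_i (a_i^T x - b_i)^2 == ||Ax - b||_2^2.
np.random.seed(0)
A = np.random.randn(4, 3)
b = np.random.randn(4)
x = np.random.randn(3)

sum_of_squares = np.sum((A @ x - b) ** 2)       # sum_i (a_i^T x - b_i)^2
norm_squared = np.linalg.norm(A @ x - b) ** 2   # ||Ax - b||_2^2
assert np.isclose(sum_of_squares, norm_squared)
```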

For example, we might have a dataset of $$m$$ users, each represented by $$n$$ features. Each row $$a_i^T$$ of $$A$$ holds the features for user $$i$$, while the corresponding entry $$b_i$$ of $$b$$ is the measurement we want to predict from $$a_i^T$$, such as ad spending. The prediction is given by $$a_i^Tx$$.
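
Concretely (a hypothetical sketch with made-up feature values), the predictions for all $$m$$ users at once are just the matrix-vector product $$Ax$$:

```python
import numpy as np

# Hypothetical feature matrix: 3 users, 2 features each (values are made up).
A = np.array([[1.0, 0.5],
              [0.3, 2.0],
              [4.0, 1.0]])
x = np.array([10.0, 2.0])   # one weight per feature

predictions = A @ x         # a_i^T x for every user i at once
print(predictions)          # [11.  7. 42.]
```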

We find the optimal $$x$$ by solving the optimization problem

$\begin{array}{ll} \mbox{minimize} & \|Ax - b\|_2^2. \end{array}$

Let $$x^\star$$ denote the optimal $$x$$. The quantity $$r = Ax^\star - b$$ is known as the residual. If $$\|r\|_2 = 0$$, we have a perfect fit.
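
For reference, when $$A$$ has full column rank the optimal point has a closed form: it is the unique solution of the normal equations,

$A^TA\, x^\star = A^Tb, \qquad \mbox{i.e.} \qquad x^\star = (A^TA)^{-1}A^Tb.$

CVXPY does not use this formula directly; it hands the problem to a numerical solver, and it does not require the rank assumption.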

## Example

In the following code, we solve a least-squares problem with CVXPY.

```python
# Import packages.
import cvxpy as cp
import numpy as np

# Generate data.
m = 20
n = 15
np.random.seed(1)
A = np.random.randn(m, n)
b = np.random.randn(m)

# Define and solve the CVXPY problem.
x = cp.Variable(n)
cost = cp.sum_squares(A @ x - b)
prob = cp.Problem(cp.Minimize(cost))
prob.solve()

# Print result.
print("\nThe optimal value is", prob.value)
print("The optimal x is")
print(x.value)
print("The norm of the residual is ", cp.norm(A @ x - b, p=2).value)
```

```
The optimal value is 7.005909828287484
The optimal x is
[ 0.17492418 -0.38102551  0.34732251  0.0173098  -0.0845784  -0.08134019
  0.293119    0.27019762  0.17493179 -0.23953449  0.64097935 -0.41633637
  0.12799688  0.1063942  -0.32158411]
The norm of the residual is  2.6468679280023557
```
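
As a sanity check (a hedged sketch that assumes the code above has just been run; it is not part of the original example), the reported optimal value is simply the squared residual norm, and on this unconstrained problem the CVXPY solution should agree with NumPy's dense least-squares solver up to solver tolerance:

```python
import numpy as np

# The optimal value equals the squared norm of the residual.
residual_norm = np.linalg.norm(A @ x.value - b)
assert np.isclose(prob.value, residual_norm ** 2)   # 2.6469**2 ≈ 7.0059

# With no constraints, NumPy's lstsq should find (essentially) the same x.
x_np, _, _, _ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x.value, x_np, atol=1e-4)
```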