Least-squares
In a least-squares, or linear regression, problem, we have measurements \(A \in \mathcal{R}^{m \times n}\) and \(b \in \mathcal{R}^m\) and seek a vector \(x \in \mathcal{R}^{n}\) such that \(Ax\) is close to \(b\). Closeness is defined as the sum of the squared differences:
\[ \sum_{i=1}^m (a_i^T x - b_i)^2, \]
also known as the \(\ell_2\)-norm squared, \(\|Ax - b\|_2^2\).
For example, we might have a dataset of \(m\) users, each represented by \(n\) features. Each row \(a_i^T\) of \(A\) is the features for user \(i\), while the corresponding entry \(b_i\) of \(b\) is the measurement we want to predict from \(a_i^T\), such as ad spending. The prediction is given by \(a_i^Tx\).
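The relationship between the row-wise predictions \(a_i^Tx\) and the squared \(\ell_2\)-norm can be checked numerically. The following is a minimal sketch using NumPy only; the sizes, the random data, and the candidate vector `x` are arbitrary assumptions chosen for illustration, not values from this page.

```python
import numpy as np

# Illustrative data (assumed, not from the example on this page).
rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))  # one row of features per user
b = rng.standard_normal(m)       # one measurement per user
x = rng.standard_normal(n)       # an arbitrary candidate x, not the optimal one

# Predictions for all users at once, and one row at a time.
predictions = A @ x
row_predictions = np.array([A[i] @ x for i in range(m)])

# Sum of squared differences, computed term by term and as a squared norm.
sum_sq = sum((A[i] @ x - b[i]) ** 2 for i in range(m))
norm_sq = np.linalg.norm(A @ x - b, 2) ** 2

print("Predictions agree:", np.allclose(predictions, row_predictions))
print("Objectives agree:", np.isclose(sum_sq, norm_sq))
```

Stacking the rows \(a_i^T\) into \(A\) lets a single matrix-vector product compute every prediction, which is why the objective is usually written in the compact form \(\|Ax - b\|_2^2\).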
We find the optimal \(x\) by solving the optimization problem
\[ \begin{array}{ll} \mbox{minimize} & \|Ax - b\|_2^2. \end{array} \]
Let \(x^\star\) denote the optimal \(x\). The quantity \(r = Ax^\star - b\) is known as the residual. If \(\|r\|_2 = 0\), we have a perfect fit.
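To make the residual concrete, the sketch below solves two tiny problems with NumPy's `np.linalg.lstsq` instead of CVXPY. The 3-by-2 matrix, the two right-hand sides, and the helper `fit` are hypothetical and exist only for this illustration. The first right-hand side lies in the range of \(A\), so the fit is perfect; the second does not. The final check, \(A^T r = 0\), is a standard optimality condition for least-squares that is not stated in the text above.

```python
import numpy as np

# Made-up 3x2 example: three measurements, two unknowns.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

def fit(b):
    """Hypothetical helper: return the least-squares solution and its residual."""
    x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
    r = A @ x_star - b
    return x_star, r

# b = A @ [1, 2], so a perfect fit exists.
x_exact, r_exact = fit(np.array([1.0, 2.0, 3.0]))

# Changing the last entry makes a perfect fit impossible.
x_approx, r_approx = fit(np.array([1.0, 2.0, 4.0]))

print("Perfect-fit residual norm:", np.linalg.norm(r_exact))
print("Imperfect-fit residual norm:", np.linalg.norm(r_approx))
print("A^T r at the optimum:", A.T @ r_approx)
```

In the second case the residual norm is nonzero, but the residual is still orthogonal to the columns of \(A\) up to floating-point error.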
Example
In the following code, we solve a least-squares problem with CVXPY.
# Import packages.
import cvxpy as cp
import numpy as np
# Generate data.
m = 20
n = 15
np.random.seed(1)
A = np.random.randn(m, n)
b = np.random.randn(m)
# Define and solve the CVXPY problem.
x = cp.Variable(n)
cost = cp.sum_squares(A @ x - b)
prob = cp.Problem(cp.Minimize(cost))
prob.solve()
# Print result.
print("\nThe optimal value is", prob.value)
print("The optimal x is")
print(x.value)
print("The norm of the residual is ", cp.norm(A @ x - b, p=2).value)
The optimal value is 7.005909828287484
The optimal x is
[ 0.17492418 -0.38102551 0.34732251 0.0173098 -0.0845784 -0.08134019
0.293119 0.27019762 0.17493179 -0.23953449 0.64097935 -0.41633637
0.12799688 0.1063942 -0.32158411]
The norm of the residual is 2.6468679280023557
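As an optional sanity check, the same problem can be solved without CVXPY. The sketch below regenerates the data with the same seed and calls NumPy's `np.linalg.lstsq`; it assumes the legacy `np.random.seed` stream reproduces the data used above, in which case the optimal value and residual norm should agree with the printed results up to solver tolerance.

```python
import numpy as np

# Regenerate the same data as in the CVXPY example.
m = 20
n = 15
np.random.seed(1)
A = np.random.randn(m, n)
b = np.random.randn(m)

# Solve the least-squares problem directly.
x_ls, _, rank, _ = np.linalg.lstsq(A, b, rcond=None)

# Residual and objective value at the solution.
residual = A @ x_ls - b
opt_val = residual @ residual

print("Rank of A:", rank)
print("The optimal value is", opt_val)
print("The optimal x is")
print(x_ls)
print("The norm of the residual is", np.linalg.norm(residual))
```

A direct solver like this is enough for plain least-squares; the CVXPY formulation becomes more useful once constraints or additional terms are added to the objective.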