Atomic Functions¶
This section of the tutorial describes the atomic functions that can be applied to CVXPY expressions. CVXPY uses the function information in this section and the DCP rules to mark expressions with a sign and curvature.
Operators¶
The infix operators +, -, *, /, and @ are treated as functions. The operators + and - are always affine functions. The expression expr1*expr2 is affine in CVXPY when one of the expressions is constant, and expr1/expr2 is affine when expr2 is a scalar constant.
Historically, CVXPY used expr1 * expr2 to denote matrix multiplication. This is now deprecated. Starting with Python 3.5, users can write expr1 @ expr2 for matrix multiplication and dot products. As of CVXPY version 1.1, we are adopting a new standard:
The @ operator should be used for matrix-matrix and matrix-vector multiplication.
The * operator should be used for matrix-scalar and vector-scalar multiplication.
Elementwise multiplication can be applied with the multiply function.
Indexing and slicing¶
Indexing in CVXPY follows exactly the same semantics as NumPy ndarrays. For example, if expr has shape (5,) then expr[1] gives the second entry. More generally, expr[i:j:k] selects every kth element of expr, starting at index i and stopping before index j.
If expr is a matrix, then expr[i:j:k] selects rows, while expr[i:j:k, r:s:t] selects both rows and columns. Indexing drops dimensions while slicing preserves dimensions. For example,
import cvxpy
x = cvxpy.Variable(5)
print("0 dimensional:", x[0].shape)
print("1 dimensional:", x[0:1].shape)
0 dimensional: ()
1 dimensional: (1,)
Transpose¶
The transpose of any expression can be obtained using the syntax expr.T. Transpose is an affine function.
Power¶
For any CVXPY expression expr, the power operator expr**p is equivalent to the function power(expr, p).
Scalar functions¶
A scalar function takes one or more scalars, vectors, or matrices as arguments and returns a scalar.
Function | Meaning | Domain | Sign | Curvature | Monotonicity |
---|---|---|---|---|---|
constant \(W \in \mathbf{R}^{o \times p}\) |
\(\langle sort\left(vec(X)\right), sort\left(vec(W)\right) \rangle\) |
\(X \in \mathbf{R}^{m \times n}\) |
depends on \(X\), \(W\) |
||
\(p \in \mathbf{R}^n_{+}\) \(p \neq 0\) |
\(x_1^{1/n} \cdots x_n^{1/n}\) \(\left(x_1^{p_1} \cdots x_n^{p_n}\right)^{\frac{1}{\mathbf{1}^T p}}\) |
\(x \in \mathbf{R}^n_{+}\) |
|||
\(\frac{n}{\frac{1}{x_1} + \cdots + \frac{1}{x_n}}\) |
\(x \in \mathbf{R}^n_{+}\) |
||||
\((x_1\cdots x_n)^{-1}\) |
\(x \in \mathbf{R}^n_+\) |
||||
\(\lambda_{\max}(X)\) |
\(X \in \mathbf{S}^n\) |
None |
|||
\(\lambda_{\min}(X)\) |
\(X \in \mathbf{S}^n\) |
None |
|||
\(k = 1,\ldots, n\) |
\(\text{sum of $k$ largest}\\ \text{eigenvalues of $X$}\) |
\(X \in\mathbf{S}^{n}\) |
None |
||
\(k = 1,\ldots, n\) |
\(\text{sum of $k$ smallest}\\ \text{eigenvalues of $X$}\) |
\(X \in\mathbf{S}^{n}\) |
None |
||
\(\log \left(\det (X)\right)\) |
\(X \in \mathbf{S}^n_+\) |
None |
|||
\(\log \left(\sum_{ij}e^{X_{ij}}\right)\) |
\(X \in\mathbf{R}^{m \times n}\) |
||||
\(x^T P^{-1} x\) |
\(x \in \mathbf{R}^n\) \(P \in\mathbf{S}^n_{++}\) |
None |
|||
\(\max_{ij}\left\{ X_{ij}\right\}\) |
\(X \in\mathbf{R}^{m \times n}\) |
same as X |
|||
\(\frac{1}{m n}\sum_{ij}\left\{ X_{ij}\right\}\) |
\(X \in\mathbf{R}^{m \times n}\) |
same as X |
|||
\(\min_{ij}\left\{ X_{ij}\right\}\) |
\(X \in\mathbf{R}^{m \times n}\) |
same as X |
|||
\(\left(\sum_k\left(\sum_l\lvert x_{k,l}\rvert^p\right)^{q/p}\right)^{1/q}\) |
\(X \in\mathbf{R}^{n \times n}\) |
None |
|||
norm(x, 2) |
\(\sqrt{\sum_{i} \lvert x_{i} \rvert^2 }\) |
\(x \in\mathbf{R}^{n}\) |
|||
\(\sum_{i}\lvert x_{i} \rvert\) |
\(x \in\mathbf{R}^{n}\) |
||||
\(\max_{i} \{\lvert x_{i} \rvert\}\) |
\(x \in\mathbf{R}^{n}\) |
||||
\(\sqrt{\sum_{ij}X_{ij}^2 }\) |
\(X \in\mathbf{R}^{m \times n}\) |
||||
\(\max_{j} \|X_{:,j}\|_1\) |
\(X \in\mathbf{R}^{m \times n}\) |
||||
\(\max_{i} \|X_{i,:}\|_1\) |
\(X \in\mathbf{R}^{m \times n}\) |
||||
\(\mathrm{tr}\left(\left(X^T X\right)^{1/2}\right)\) |
\(X \in\mathbf{R}^{m \times n}\) |
None |
|||
norm(X, 2) |
\(\sqrt{\lambda_{\max}\left(X^T X\right)}\) |
\(X \in\mathbf{R}^{m \times n}\) |
None |
||
\(sf(x/s)\) |
\(x \in \mathop{\bf dom} f\) \(s \geq 0\) |
same as f |
same as \(f\) |
None |
|
\(p \geq 1\) or |
\(\|X\|_p = \left(\sum_{ij} |X_{ij}|^p \right)^{1/p}\) |
\(X \in \mathbf{R}^{m \times n}\) |
|||
\(p < 1\), \(p \neq 0\) |
\(\|X\|_p = \left(\sum_{ij} X_{ij}^p \right)^{1/p}\) |
\(X \in \mathbf{R}^{m \times n}_+\) |
|||
\(\max_{ij} X_{ij} - \min_{ij} X_{ij}\) |
\(X \in \mathbf{R}^{m \times n}\) |
None |
|||
constant \(P \in \mathbf{S}^n_+\) |
\(x^T P x\) |
\(x \in \mathbf{R}^n\) |
|||
constant \(P \in \mathbf{S}^n_-\) |
\(x^T P x\) |
\(x \in \mathbf{R}^n\) |
|||
constant \(c \in \mathbf{R}^n\) |
\(c^T X c\) |
\(X \in\mathbf{R}^{n \times n}\) |
depends on c, X |
depends on c |
|
\(\left(\sum_{ij}X_{ij}^2\right)/y\) |
\(X \in \mathbf{R}^{m \times n}\) \(y > 0\) |
||||
\(\sqrt{\frac{1}{mn} \sum_{ij}\left(X_{ij} - \frac{1}{mn}\sum_{k\ell} X_{k\ell}\right)^2}\) |
\(X \in\mathbf{R}^{m \times n}\) |
None |
|||
\(\sum_{ij}X_{ij}\) |
\(X \in\mathbf{R}^{m \times n}\) |
same as X |
|||
\(k = 1,2,\ldots\) |
\(\text{sum of } k\text{ largest }X_{ij}\) |
\(X \in\mathbf{R}^{m \times n}\) |
same as X |
||
\(k = 1,2,\ldots\) |
\(\text{sum of } k\text{ smallest }X_{ij}\) |
\(X \in\mathbf{R}^{m \times n}\) |
same as X |
||
\(\sum_{ij}X_{ij}^2\) |
\(X \in\mathbf{R}^{m \times n}\) |
||||
\(\mathrm{tr}\left(X \right)\) |
\(X \in\mathbf{R}^{n \times n}\) |
same as X |
|||
\(\mathrm{tr}\left(X^{-1} \right)\) |
\(X \in\mathbf{S}^n_{++}\) |
None |
|||
\(\sum_{i}|x_{i+1} - x_i|\) |
\(x \in \mathbf{R}^n\) |
None |
|||
\(\sum_{ij}\left\| \left[\begin{matrix} X_{i+1,j} - X_{ij} \\ X_{i,j+1} -X_{ij} \end{matrix}\right] \right\|_2\) |
\(X \in \mathbf{R}^{m \times n}\) |
None |
|||
\(\sum_{ij}\left\| \left[\begin{matrix} X_{i+1,j}^{(1)} - X_{ij}^{(1)} \\ X_{i,j+1}^{(1)} -X_{ij}^{(1)} \\ \vdots \\ X_{i+1,j}^{(k)} - X_{ij}^{(k)} \\ X_{i,j+1}^{(k)} -X_{ij}^{(k)} \end{matrix}\right] \right\|_2\) |
\(X^{(i)} \in\mathbf{R}^{m \times n}\) |
None |
|||
\({\frac{1}{mn} \sum_{ij}\left(X_{ij} - \frac{1}{mn}\sum_{k\ell} X_{k\ell}\right)^2}\) |
\(X \in\mathbf{R}^{m \times n}\) |
None |
|||
\(-\operatorname{tr}(X\operatorname{logm}(X))\) |
\(X \in \mathbf{S}^{n}_+\) |
None |
Clarifications for scalar functions¶
The domain \(\mathbf{S}^n\) refers to the set of symmetric matrices. The domains \(\mathbf{S}^n_+\) and \(\mathbf{S}^n_-\) refer to the set of positive semi-definite and negative semi-definite matrices, respectively. Similarly, \(\mathbf{S}^n_{++}\) and \(\mathbf{S}^n_{--}\) refer to the set of positive definite and negative definite matrices, respectively.
For a vector expression x, norm(x) and norm(x, 2) give the Euclidean norm. For a matrix expression X, however, norm(X) and norm(X, 2) give the spectral norm.
The function norm(X, "fro") is called the Frobenius norm and norm(X, "nuc") the nuclear norm. The nuclear norm can also be defined as the sum of X’s singular values.
The functions max and min give the largest and smallest entry, respectively, in a single expression. These functions should not be confused with maximum and minimum (see Elementwise functions). Use maximum and minimum to find the max or min of a list of scalar expressions.
The CVXPY function sum sums all the entries in a single expression. The built-in Python sum should be used to add together a list of expressions. For example, the following code sums a list of three expressions:
expr_list = [expr1, expr2, expr3]
expr_sum = sum(expr_list)
Functions along an axis¶
The functions sum, norm, max, min, mean, std, var, and ptp can be applied along an axis. Given an m by n expression expr, the syntax func(expr, axis=0, keepdims=True) applies func to each column, returning a 1 by n expression. The syntax func(expr, axis=1, keepdims=True) applies func to each row, returning an m by 1 expression. By default keepdims=False, which means dimensions of length 1 are dropped. For example, the following code sums along the columns and rows of a matrix variable:
import cvxpy
X = cvxpy.Variable((5, 4))
col_sums = cvxpy.sum(X, axis=0, keepdims=True)  # Has shape (1, 4)
col_sums = cvxpy.sum(X, axis=0)  # Has shape (4,)
row_sums = cvxpy.sum(X, axis=1)  # Has shape (5,)
Elementwise functions¶
These functions operate on each element of their arguments. For example, if X is a 5 by 4 matrix variable, then abs(X) is a 5 by 4 matrix expression. abs(X)[1, 2] is equivalent to abs(X[1, 2]).
Elementwise functions that take multiple arguments, such as maximum and multiply, operate on the corresponding elements of each argument. For example, if X and Y are both 3 by 3 matrix variables, then maximum(X, Y) is a 3 by 3 matrix expression. maximum(X, Y)[2, 0] is equivalent to maximum(X[2, 0], Y[2, 0]). This means all arguments must have the same dimensions or be scalars, which are promoted.
Function | Meaning | Domain | Sign | Curvature | Monotonicity |
---|---|---|---|---|---|
\(\lvert x \rvert\) |
\(x \in \mathbf{C}\) |
||||
complex conjugate |
\(x \in \mathbf{C}\) |
None |
|||
\(-x \log (x)\) |
\(x > 0\) |
None |
|||
\(e^x\) |
\(x \in \mathbf{R}\) |
||||
\(M \geq 0\) |
\(\begin{cases}x^2 &|x| \leq M \\2M|x| - M^2&|x| >M\end{cases}\) |
\(x \in \mathbf{R}\) |
|||
imaginary part of a complex number |
\(x \in \mathbf{C}\) |
None |
|||
\(1/x\) |
\(x > 0\) |
||||
\(x \log(x/y) - x + y\) |
\(x > 0\) \(y > 0\) |
None |
|||
\(\log(x)\) |
\(x > 0\) |
||||
approximate log of the standard normal CDF |
\(x \in \mathbf{R}\) |
||||
\(\log(x+1)\) |
\(x > -1\) |
same as x |
|||
\(x > 0\) |
None |
||||
\(\log(1 + e^{x})\) |
\(x \in \mathbf{R}\) |
||||
\(\max \left\{x, y\right\}\) |
\(x,y \in \mathbf{R}\) |
depends on x,y |
|||
\(\min \left\{x, y\right\}\) |
\(x, y \in \mathbf{R}\) |
depends on x,y |
|||
\(c \in \mathbf{R}\) |
c*x |
\(x \in\mathbf{R}\) |
\(\mathrm{sign}(cx)\) |
depends on c |
|
\(\max \left\{-x, 0 \right\}\) |
\(x \in \mathbf{R}\) |
||||
\(\max \left\{x, 0 \right\}\) |
\(x \in \mathbf{R}\) |
||||
\(1\) |
\(x \in \mathbf{R}\) |
constant |
|
||
\(x\) |
\(x \in \mathbf{R}\) |
same as x |
|||
\(p = 2, 4, 8, \ldots\) |
\(x^p\) |
\(x \in \mathbf{R}\) |
|||
\(p < 0\) |
\(x^p\) |
\(x > 0\) |
|||
\(0 < p < 1\) |
\(x^p\) |
\(x \geq 0\) |
|||
\(p > 1,\ p \neq 2, 4, 8, \ldots\) |
\(x^p\) |
\(x \geq 0\) |
|||
real part of a complex number |
\(x \in \mathbf{C}\) |
||||
\(x \log(x/y)\) |
\(x > 0\) \(y > 0\) |
None in \(x\) |
|||
\(\text{alpha} \geq 0\) \(\text{beta} \geq 0\) |
\(\alpha\mathrm{pos}(x)+ \beta\mathrm{neg}(x)\) |
\(x \in \mathbf{R}\) |
|||
\(\sqrt x\) |
\(x \geq 0\) |
||||
\(x^2\) |
\(x \in \mathbf{R}\) |
||||
\(x e^x\) |
\(x \geq 0\) |
Clarifications on elementwise functions¶
The functions log_normcdf and loggamma are defined via approximations. log_normcdf has highest accuracy over the range -4 to 4, while loggamma has similar accuracy over all positive reals. See CVXPY GitHub PR #1224 and CVXPY GitHub Issue #228 for details on the approximations.
Vector/matrix functions¶
A vector/matrix function takes one or more scalars, vectors, or matrices as arguments and returns a vector or matrix.
CVXPY is conservative when it determines the sign of an Expression returned by one of these functions. If any argument to one of these functions has unknown sign, then the returned Expression will also have unknown sign. If all arguments have known sign but CVXPY can determine that the returned Expression would have different signs in different entries (for example, when stacking a positive Expression and a negative Expression) then the returned Expression will have unknown sign.
Function | Meaning | Domain | Curvature | Monotonicity |
---|---|---|---|---|
\(\left[\begin{matrix} X^{(1,1)} & \cdots & X^{(1,q)} \\ \vdots & & \vdots \\ X^{(p,1)} & \cdots & X^{(p,q)} \end{matrix}\right]\) |
\(X^{(i,j)} \in\mathbf{R}^{m_i \times n_j}\) |
|||
\(c\in\mathbf{R}^m\) |
\(c*x\) |
\(x\in \mathbf{R}^n\) |
depends on c |
|
cumulative sum along given axis. |
\(X \in \mathbf{R}^{m \times n}\) |
|||
\(\left[\begin{matrix}x_1 & & \\& \ddots & \\& & x_n\end{matrix}\right]\) |
\(x \in\mathbf{R}^{n}\) |
|||
\(\left[\begin{matrix}X_{11} \\\vdots \\X_{nn}\end{matrix}\right]\) |
\(X \in\mathbf{R}^{n \times n}\) |
|||
\(k \in 0,1,2,\ldots\) |
kth order differences along given axis |
\(X \in\mathbf{R}^{m \times n}\) |
||
\(\left[\begin{matrix}X^{(1)} \cdots X^{(k)}\end{matrix}\right]\) |
\(X^{(i)} \in\mathbf{R}^{m \times n_i}\) |
|||
constant \(X\in\mathbf{R}^{p \times q}\) |
\(\left[\begin{matrix}X_{11}Y & \cdots & X_{1q}Y \\ \vdots & & \vdots \\ X_{p1}Y & \cdots & X_{pq}Y \end{matrix}\right]\) |
\(Y \in \mathbf{R}^{m \times n}\) |
depends on \(X\) |
|
constant \(Y\in\mathbf{R}^{m \times n}\) |
\(\left[\begin{matrix}X_{11}Y & \cdots & X_{1q}Y \\ \vdots & & \vdots \\ X_{p1}Y & \cdots & X_{pq}Y \end{matrix}\right]\) |
\(X \in \mathbf{R}^{p \times q}\) |
depends on \(Y\) |
|
constant \(y \in \mathbf{R}^m\) |
\(x y^T\) |
\(x \in \mathbf{R}^n\) |
depends on \(y\) |
|
partial trace |
\(X \in\mathbf{R}^{n \times n}\) |
|||
partial transpose |
\(X \in\mathbf{R}^{n \times n}\) |
|||
\(X' \in\mathbf{R}^{m' \times n'}\) |
\(X \in\mathbf{R}^{m \times n}\) \(m'n' = mn\) |
|||
flatten the strictly upper-triangular part of \(X\) |
\(X \in \mathbf{R}^{n \times n}\) |
|||
\(x' \in\mathbf{R}^{mn}\) |
\(X \in\mathbf{R}^{m \times n}\) |
|||
vec_to_upper_tri(X, strict=False) |
\(x' \in\mathbf{R}^{n(n-1)/2}\) for strict=True, \(x' \in\mathbf{R}^{n(n+1)/2}\) for strict=False |
\(X \in\mathbf{R}^{n \times n}\) |
||
\(\left[\begin{matrix}X^{(1)} \\ \vdots \\X^{(k)}\end{matrix}\right]\) |
\(X^{(i)} \in\mathbf{R}^{m_i \times n}\) |
Clarifications on vector and matrix functions¶
The input to \(\texttt{bmat}\) is a list of lists of CVXPY expressions. It constructs a block matrix. The elements of each inner list are stacked horizontally and then the resulting block matrices are stacked vertically.
The output \(y = \mathbf{convolve}(c, x)\) has size \(n+m-1\) and is defined as \(y_k =\sum_{j=0}^{k} c[j]x[k-j]\).
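The formula can be traced by hand with a plain-Python illustration. This is not the CVXPY implementation; the helper name convolve_by_formula is hypothetical, and entries of c and x outside their index ranges are assumed to be zero.

```python
def convolve_by_formula(c, x):
    """Evaluate y_k = sum_{j=0}^{k} c[j] * x[k - j] for k = 0, ..., n + m - 2."""
    m, n = len(c), len(x)
    y = []
    for k in range(n + m - 1):
        total = 0.0
        for j in range(k + 1):
            # Skip terms whose indices fall outside c or x (treated as zero).
            if j < m and k - j < n:
                total += c[j] * x[k - j]
        y.append(total)
    return y


c = [1.0, 2.0, 3.0]
x = [0.0, 1.0, 0.5]
y = convolve_by_formula(c, x)
print(len(y), y)
```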
The output \(y = \mathbf{vec}(X)\) is the matrix \(X\) flattened in column-major order into a vector. Formally, \(y_i = X_{i \bmod{m}, \left \lfloor{i/m}\right \rfloor }\).
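The index formula can be checked against NumPy, whose order="F" option uses the same column-major convention. The sketch uses NumPy arrays only, with a small example matrix chosen here.

```python
import numpy as np

X = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
m, n = X.shape

# y_i = X[i mod m, floor(i / m)], for i = 0, ..., m*n - 1
by_formula = [int(X[i % m, i // m]) for i in range(m * n)]
column_major = X.flatten(order="F").tolist()

print(by_formula)
print(column_major)
```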
The output \(Y = \mathbf{reshape}(X, (m', n'), \text{order='F'})\) is the matrix \(X\) cast into an \(m' \times n'\) matrix. The entries are taken from \(X\) in column-major order and stored in \(Y\) in column-major order. Formally, \(Y_{ij} = \mathbf{vec}(X)_{m'j + i}\). If order='C' then \(X\) will be read in row-major order and \(Y\) will be written to in row-major order.
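As a sanity check of the formula, the sketch below rebuilds a column-major reshape entry by entry and compares it with NumPy's reshape, which accepts the same order values. It operates on NumPy arrays only; the matrix and target shape are example choices.

```python
import numpy as np

X = np.arange(6).reshape(2, 3)          # [[0, 1, 2], [3, 4, 5]]
m_new, n_new = 3, 2                     # target shape, with m'n' = mn

vec_X = X.flatten(order="F")            # column-major vectorization of X

# Y_ij = vec(X)[m' * j + i]
Y_formula = np.empty((m_new, n_new), dtype=X.dtype)
for i in range(m_new):
    for j in range(n_new):
        Y_formula[i, j] = vec_X[m_new * j + i]

Y_col_major = X.reshape((m_new, n_new), order="F")
Y_row_major = X.reshape((m_new, n_new), order="C")

print(Y_col_major)
print(Y_row_major)
```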
The output \(y = \mathbf{upper\_tri}(X)\) is formed by concatenating partial rows of \(X\). I.e., \(y = (X[0,1{:}],\, X[1, 2{:}],\, \ldots, X[n-1, n])\).
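The row-by-row ordering can be illustrated on a NumPy array. This sketch mirrors the formula rather than calling CVXPY, and the 4 by 4 matrix is example data.

```python
import numpy as np

n = 4
X = np.arange(n * n).reshape(n, n)

# Concatenate the part of each row that lies strictly above the diagonal.
partial_rows = np.concatenate([X[i, i + 1:] for i in range(n)])

# NumPy's strict upper-triangle indices follow the same row-major order.
via_indices = X[np.triu_indices(n, k=1)]

print(partial_rows)
```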