Convex sets

We begin by reviewing some basic definitions.

Affine sets

Definition

Let \(x_1\) and \(x_2\) be two points in \(\RR^N\). Points of the form

\[y = \theta x_1 + (1 - \theta) x_2 \text{ where } \theta \in \RR\]

form a line passing through \(x_1\) and \(x_2\).

  • at \(\theta=0\) we have \(y=x_2\).
  • at \(\theta=1\) we have \(y=x_1\).
  • \(\theta \in [0,1]\) corresponds to the points on the closed line segment between \(x_1\) and \(x_2\).

We can also rewrite \(y\) as

\[y = x_2 + \theta (x_1 - x_2)\]

In this definition:

  • \(x_2\) is called the base point for this line.
  • \(x_1 - x_2\) defines the direction of the line.
  • \(y\) is the sum of the base point and the direction scaled by the parameter \(\theta\).
  • As \(\theta\) increases from \(0\) to \(1\), \(y\) moves from \(x_2\) to \(x_1\).
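As a quick numeric sketch (with hypothetical points chosen only for illustration), the parametrization \(y = x_2 + \theta (x_1 - x_2)\) can be evaluated directly:

```python
def point_on_line(x1, x2, theta):
    """Return y = x2 + theta * (x1 - x2), computed componentwise:
    the base point x2 plus the direction (x1 - x2) scaled by theta."""
    return tuple(b + theta * (a - b) for a, b in zip(x1, x2))

# Hypothetical points in R^2, used only for illustration
x1 = (3.0, 1.0)
x2 = (1.0, 5.0)

# theta = 0 recovers x2, theta = 1 recovers x1; theta outside [0, 1]
# gives points on the line beyond the closed segment.
mid = point_on_line(x1, x2, 0.5)  # midpoint of the segment
```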
Definition

A set \(C \subseteq \RR^N\) is affine if the line through any two distinct points in \(C\) lies in \(C\).

In other words, for any \(x_1, x_2 \in C\), we have \(\theta x_1 + (1 - \theta) x_2 \in C\) for all \(\theta \in \RR\).

If we denote \(\alpha = \theta\) and \(\beta = (1 - \theta)\) we see that \(\alpha x_1 + \beta x_2\) represents a linear combination of points in \(C\) such that \(\alpha + \beta = 1\).

This idea can be generalized in the following way.

Definition
A point of the form \(\theta_1 x_1 + \dots + \theta_k x_k\) where \(\theta_1 + \dots + \theta_k = 1\) with \(\theta_i \in \RR\) and \(x_i \in \RR^N\), is called an affine combination of the points \(x_1,\dots,x_k\).

It can be shown easily that an affine set \(C\) contains all affine combinations of its points.

Remark
If \(C\) is an affine set, \(x_1, \dots, x_k \in C\), and \(\theta_1 + \dots + \theta_k = 1\), then the point \(y = \theta_1 x_1 + \dots + \theta_k x_k\) also belongs to \(C\).
Lemma

Let \(C\) be an affine set and \(x_0\) be any element in \(C\). Then the set

\[V = C - x_0 = \{ x - x_0 | x \in C\}\]

is a subspace of \(\RR^N\).

Proof

Let \(v_1\) and \(v_2\) be two elements in \(V\). Then by definition, there exist \(x_1\) and \(x_2\) in \(C\) such that

\[v_1 = x_1 - x_0\]

and

\[v_2 = x_2 - x_0\]

Thus

\[a v_1 + v_2 = a (x_1 - x_0) + x_2 - x_0 = (a x_1 + x_2 - a x_0 ) - x_0 \Forall a \in \RR.\]

But since the coefficients satisfy \(a + 1 - a = 1\), the point \(x_3 = a x_1 + x_2 - a x_0\) is an affine combination of points of \(C\); hence \(x_3 \in C\).

Hence \(a v_1 + v_2 = x_3 - x_0 \in V\) [by definition of \(V\)].

Thus any linear combination of elements in \(V\) belongs to \(V\). Hence \(V\) is a subspace of \(\RR^N\).

With this, we can use the following notation:

\[C = V + x_0 = \{ v + x_0 | v \in V\}\]

i.e. an affine set is a subspace with an offset.

Remark
Let \(C\) be an affine set and let \(x_1\) and \(x_2\) be two distinct elements. Let \(V_1 = C - x_1\) and \(V_2 = C - x_2\), then the subspaces \(V_1\) and \(V_2\) are identical.

Thus the subspace \(V\) associated with an affine set \(C\) doesn’t depend upon the choice of offset \(x_0\) in \(C\).

Definition
We define the affine dimension of an affine set \(C\) as the dimension of the associated subspace \(V = C - x_0\) for some \(x_0 \in C\).
Example: Solution set of linear equations

We now show that the solution set of linear equations forms an affine set.

Let \(C = \{ x | A x = b\}\) where \(A \in \RR^{M \times N}\) and \(b \in \RR^M\).

\(C\) is the set of all vectors \(x \in \RR^N\) which satisfy the system of linear equations given by \(A x = b\). Then \(C\) is an affine set.

Let \(x_1\) and \(x_2\) belong to \(C\). Then we have

\[A x_1 = b\]

and

\[A x_2 = b.\]

Thus

\[\begin{split}&\theta A x_1 + ( 1 - \theta ) A x_2 = \theta b + (1 - \theta ) b\\ &\implies A (\theta x_1 + (1 - \theta) x_2) = b\\ &\implies (\theta x_1 + (1 - \theta) x_2) \in C\end{split}\]

Thus \(C\) is an affine set.

The subspace associated with \(C\) is nothing but the null space of \(A\) denoted as \(\NullSpace(A)\).
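A small numeric sketch (with a hypothetical matrix \(A\) and vector \(b\), chosen only for illustration) confirms that an affine combination of two solutions of \(A x = b\) is again a solution, and that \(x_1 - x_2\) lies in the null space of \(A\):

```python
import numpy as np

# Hypothetical system A x = b (values are illustrative only)
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 3.0])

# Two particular solutions of A x = b
x1 = np.array([1.0, 1.0, 2.0])
x2 = np.array([2.0, 0.0, 3.0])

theta = -1.5  # theta may be any real number, not just in [0, 1]
y = theta * x1 + (1 - theta) * x2

# y solves A y = b, so the solution set is affine, and the
# difference x1 - x2 lies in the associated subspace NullSpace(A).
```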

Remark
Every affine set can be expressed as the solution set of a system of linear equations.
Example: More affine sets
  • The empty set \(\EmptySet\) is affine.
  • A singleton set containing a single point \(x_0\) is affine. Its corresponding subspace is \(\{0 \}\), of zero dimension.
  • The whole Euclidean space \(\RR^N\) is affine.
  • Any line is affine. The associated subspace is the line parallel to it that passes through the origin.
  • Any plane is affine. If it passes through the origin, it is a subspace. The associated subspace is the plane parallel to it that passes through the origin.
Definition

The set of all affine combinations of points in some arbitrary set \(S \subseteq \RR^N\) is called the affine hull of \(S\) and denoted as \(\AffineHull(S)\):

\[\AffineHull(S) = \{\theta_1 x_1 + \dots + \theta_k x_k | x_1, \dots, x_k \in S \text{ and } \theta_1 + \dots + \theta_k = 1\}.\]
Remark
The affine hull is the smallest affine set containing \(S\). In other words, let \(C\) be any affine set with \(S \subseteq C\). Then \(\AffineHull(S) \subseteq C\).
Definition
A set of vectors \(v_0, v_1, \dots, v_K \in \RR^N\) is called affine independent, if the vectors \(v_1 - v_0, \dots, v_K - v_0\) are linearly independent.

Essentially the difference vectors \(v_k - v_0\) belong to the associated subspace.

If the associated subspace has dimension \(L\) then a maximum of \(L\) vectors can be linearly independent in it. Hence a maximum of \(L+1\) vectors can be affine independent for the affine set.

Convex sets

Definition

A set \(C\) is convex if the line segment between any two points in \(C\) lies in \(C\), i.e.

\[\theta x_1 + (1 - \theta) x_2 \in C \Forall x_1, x_2 \in C \text{ and } 0 \leq \theta \leq 1.\]
Definition

We call a point of the form \(\theta_1 x_1 + \dots + \theta_k x_k\), where \(\theta_1 + \dots + \theta_k = 1\) and \(\theta_i \geq 0, i=1,\dots,k\), a convex combination of the points \(x_1, \dots, x_k\).

It is like a weighted average of the points \(x_i\).

Remark
A set is convex if and only if it contains all convex combinations of its points.
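The weighted-average view of a convex combination can be sketched as follows (the points and weights are hypothetical, chosen only for illustration):

```python
import numpy as np

def convex_combination(points, weights):
    """Weighted average: weights must be nonnegative and sum to one."""
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    return sum(wi * np.asarray(p, dtype=float) for wi, p in zip(w, points))

# Illustrative points: vertices of a triangle in R^2
pts = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
y = convex_combination(pts, [0.25, 0.25, 0.5])
```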
Example: Convex sets
  • A line segment is convex.
  • A closed disk (a circle together with its interior) is convex.
  • A ray is defined as \(\{ x_0 + \theta v | \theta \geq 0 \}\), where \(v \neq 0\) indicates the direction of the ray and \(x_0\) is its base (origin). A ray is convex but not affine.
  • Any affine set is convex.
Definition

The convex hull of an arbitrary set \(S \subseteq \RR^N\), denoted \(\ConvexHull(S)\), is the set of all convex combinations of points in \(S\).

\[\ConvexHull(S) = \{ \theta_1 x_1 + \dots + \theta_k x_k | x_i \in S, \theta_i \geq 0, i = 1,\dots, k, \theta_1 + \dots + \theta_k = 1\}.\]
Remark
The convex hull \(\ConvexHull(S)\) of a set \(S\) is always convex.
Remark
The convex hull of a set \(S\) is the smallest convex set containing it. In other words, let \(C\) be any convex set such that \(S \subseteq C\). Then \(\ConvexHull(S) \subseteq C\).

We can generalize convex combinations to include infinite sums.

Lemma

Let \(\theta_1, \theta_2, \dots\) satisfy

\[\theta_i \geq 0, i = 1,2,\dots, \quad \sum_{i=1}^{\infty} \theta_i = 1,\]

and let \(x_1, x_2, \dots \in C\), where \(C \subseteq \RR^N\) is convex. Then

\[\sum_{i=1}^{\infty} \theta_i x_i \in C,\]

if the series converges.

We can generalize it further to density functions.

Lemma

Let \(p : \RR^N \to \RR\) satisfy \(p(x) \geq 0\) for all \(x \in C\) and

\[\int_{C} p(x) d x = 1\]

Then

\[\int_{C} p(x) x d x \in C\]

provided the integral exists.

Note that \(p\) above can be treated as a probability density function if we define \(p(x) = 0 \Forall x \in \RR^N \setminus C\).

Cones

Definition
A set \(C\) is called a cone or nonnegative homogeneous, if for every \(x \in C\) and \(\theta \geq 0\), we have \(\theta x \in C\).

In particular, taking \(\theta = 0\) shows that \(0 \in C\) whenever \(C\) is nonempty.

Definition

A set \(C\) is called a convex cone if it is convex and a cone. In other words, for every \(x_1, x_2 \in C\) and \(\theta_1, \theta_2 \geq 0\), we have

\[\theta_1 x_1 + \theta_2 x_2 \in C\]
Definition
A point of the form \(\theta_1 x_1 + \dots + \theta_k x_k\) with \(\theta_1 , \dots, \theta_k \geq 0\) is called a conic combination (or a non-negative linear combination) of \(x_1,\dots, x_k\).
Remark

Let \(C\) be a convex cone. Then for every \(x_1, \dots, x_k \in C\), a conic combination \(\theta_1 x_1 + \dots + \theta_k x_k\) with \(\theta_i \geq 0\) belongs to \(C\).

Conversely, if a set \(C\) contains all conic combinations of its points, then it is a convex cone.

The idea of conic combinations can be generalized to infinite sums and integrals.

Definition

The conic hull of a set \(S\) is the set of all conic combinations of points in \(S\). i.e.

\[\{\theta_1 x_1 + \dots + \theta_k x_k | x_i \in S, \theta_i \geq 0, i = 1, \dots, k \}\]
Remark
The conic hull of a set \(S\) is the smallest convex cone that contains \(S\).
Example: Convex cones
  • A ray with its base at the origin is a convex cone.
  • A line passing through the origin is a convex cone.
  • A plane passing through the origin is a convex cone.
  • Any subspace is a convex cone.
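As a numeric sketch, we can verify that conic combinations of points in the nonnegative orthant \(\RR^3_+\) (a convex cone) stay inside it; the points and coefficients below are randomly generated for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random points in the nonnegative orthant R^3_+ (illustrative)
X = rng.uniform(0.0, 1.0, size=(5, 3))  # five points, all components >= 0
theta = rng.uniform(0.0, 2.0, size=5)   # conic weights: nonnegative, need not sum to 1

y = theta @ X  # the conic combination sum_i theta_i * x_i

# Every component of y is nonnegative, so y remains in the cone.
```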

We now look at some more important convex sets one by one.

Hyperplanes and half spaces

Definition

A hyperplane is a set of the form

\[H = \{ x : a^T x = b \}\]

where \(a \in \RR^N, a \neq 0\) and \(b \in \RR\).

The vector \(a\) is called the normal vector to the hyperplane.

  • Analytically it is a solution set of a nontrivial linear equation. Thus it is an affine set.
  • Geometrically it is a set of points with a constant inner product to a given vector \(a\).

Now let \(x_0\) be an arbitrary element in \(H\). Then

\[\begin{split} &a^T x_0 = b\\ \implies &a^T x = a^T x_0 \Forall x \in H\\ \implies &a^T (x - x_0) = 0 \Forall x \in H\\ \implies &H = \{ x | a^T(x-x_0) = 0\}\end{split}\]

Now consider the orthogonal complement of \(a\) defined as

\[a^{\bot} = \{ v | a^T v = 0\}\]

i.e. the set of all vectors that are orthogonal to \(a\).

Now consider the set

\[S = x_0 + a^{\bot}\]

Clearly for every \(x \in S\), \(a^T x = a^T x_0 = b\).

Thus we can say that

\[H = \{ x | a^T(x-x_0) = 0\} = x_0 + a^{\bot}\]

Thus the hyperplane consists of an offset \(x_0\) plus all vectors orthogonal to the normal vector \(a\).
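The decomposition \(H = x_0 + a^{\bot}\) can be checked numerically; the normal \(a\), offset \(b\), and vectors below are hypothetical values for illustration:

```python
import numpy as np

a = np.array([1.0, 2.0])    # normal vector (illustrative)
b = 4.0
x0 = np.array([0.0, 2.0])   # a point on the hyperplane: a^T x0 = b

v = np.array([-2.0, 1.0])   # v lies in a-perp: a^T v = 0
x = x0 + 3.0 * v            # offset plus a vector orthogonal to a

# x stays on the hyperplane: a^T x = b.
```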

Definition

A hyperplane divides \(\RR^N\) into two halfspaces. The two (closed) halfspaces are given by

\[H_+ = \{ x : a^T x \geq b \}\]

and

\[H_- = \{ x : a^T x \leq b \}\]

The halfspace \(H_+\) extends in the direction of \(a\) while \(H_-\) extends in the direction of \(-a\).

  • A halfspace is the solution set of one (nontrivial) linear inequality.

  • A halfspace is convex but not affine.

  • The halfspace can be written alternatively as

    \[\begin{split}H_+ = \{ x | a^T (x - x_0) \geq 0\}\\ H_- = \{ x | a^T (x - x_0) \leq 0\}\end{split}\]

    where \(x_0\) is any point in the associated hyperplane \(H\).

  • Geometrically, for a point \(x \in H_+\) the vector \(x - x_0\) makes an acute angle with \(a\), while for \(x \in H_-\) the vector \(x - x_0\) makes an obtuse angle with \(a\).
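A minimal sketch of the halfspace membership test (with a hypothetical normal and offset):

```python
import numpy as np

a = np.array([1.0, 1.0])  # hyperplane normal (illustrative)
b = 2.0

def side(x):
    """Return '+' if x lies in H+, '-' if in H-, and '0' if x is on
    the hyperplane a^T x = b itself (which belongs to both)."""
    s = a @ x - b
    if np.isclose(s, 0.0):
        return "0"
    return "+" if s > 0 else "-"
```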

Definition

The sets given by

\[\begin{split}\Interior{H_+} = \{ x | a^T x > b\}\\ \Interior{H_-} = \{ x | a^T x < b\}\end{split}\]

are called open halfspaces. They are the interiors of the corresponding closed halfspaces.

Euclidean balls and ellipsoids

Definition

A Euclidean closed ball (or just ball) in \(\RR^N\) has the form

\[B = \{ x | \| x - x_c\|_2 \leq r \} = \{x | (x - x_c)^T (x - x_c) \leq r^2 \},\]

where \(r > 0\) and \(\| \cdot \|_2\) denotes the Euclidean norm.

\(x_c\) is the center of the ball.

\(r\) is the radius of the ball.

An equivalent definition is given by

\[B = \{x_c + r u | \| u \|_2 \leq 1 \}.\]
Remark
A Euclidean ball is a convex set.
Proof

Let \(x_1, x_2\) be any two points in \(B\). We have

\[\| x_1 - x_c\|_2 \leq r\]

and

\[\| x_2 - x_c\|_2 \leq r\]

Let \(\theta \in [0,1]\) and consider the point \(x = \theta x_1 + (1 - \theta) x_2\). Then

\[\begin{split}\| x - x_c \|_2 &= \| \theta x_1 + (1 - \theta) x_2 - x_c\|_2\\ &= \| \theta (x_1 - x_c) + (1 - \theta) (x_2 - x_c) \|_2\\ &\leq \theta \| (x_1 - x_c)\|_2 + (1 - \theta)\| (x_2 - x_c)\|_2\\ &\leq \theta r + (1 - \theta) r\\ &= r\end{split}\]

Thus \(x \in B\), hence \(B\) is a convex set.

Definition

An ellipsoid is a set of the form

\[\xi = \{x | (x - x_c)^T P^{-1} (x - x_c) \leq 1\}\]

where \(P = P^T \succ 0\) i.e. \(P\) is symmetric and positive definite.

The vector \(x_c \in \RR^N\) is the center of the ellipsoid.

The eigenvalues of the matrix \(P\) (which are all positive) determine how far the ellipsoid extends in each direction from \(x_c\).

The lengths of the semi-axes of \(\xi\) are given by \(\sqrt{\lambda_i}\), where \(\lambda_i\) are the eigenvalues of \(P\).

Remark
A ball is an ellipsoid with \(P = r^2 I\).

An alternative representation of an ellipsoid is given by

\[\xi = \{x_c + A u | \| u\|_2 \leq 1 \}\]

where \(A\) is a square and nonsingular matrix.

To show the equivalence of the two definitions, we proceed as follows.

Let \(P = A A^T\). Let \(x\) be any arbitrary element in \(\xi\).

Then \(x - x_c = A u\) for some \(u\) such that \(\| u \|_2 \leq 1\).

Thus

\[\begin{split}&(x - x_c)^T P^{-1} (x - x_c) = (A u)^T (A A^T)^{-1} (A u)\\ &= u^T A^T (A^T)^{-1} A^{-1} A u = u^T u \\ &= \| u \|_2^2 \leq 1\end{split}\]

The two representations of an ellipsoid are therefore equivalent.
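The equivalence can be checked numerically. Below, \(A\) is a hypothetical nonsingular matrix and \(u\) a unit-norm vector; the quadratic form evaluates to \(\| u \|_2^2\), exactly as in the derivation:

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [1.0, 1.0]])      # nonsingular (illustrative)
P = A @ A.T                     # symmetric positive definite
xc = np.array([1.0, -1.0])      # center (illustrative)

u = np.array([0.6, 0.8])        # ||u||_2 = 1, a boundary point
x = xc + A @ u                  # point in the second representation

# Quadratic form (x - xc)^T P^{-1} (x - xc); equals u^T u = 1 here.
q = (x - xc) @ np.linalg.solve(P, x - xc)
```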

Remark
An ellipsoid is a convex set.

Norm balls and norm cones

Definition

Let \(\| \cdot \| : \RR^N \to \RR\) be any norm on \(\RR^N\). A norm ball with radius \(r\) and center \(x_c\) is given by

\[B = \{ x | \| x - x_c \| \leq r \}\]
Remark
A norm ball is convex.
Definition

Let \(\| \cdot \| : \RR^N \to \RR\) be any norm on \(\RR^N\). The norm cone associated with the norm \(\| \cdot \|\) is given by the set

\[C = \{ (x,t) | \| x \| \leq t \} \subseteq \RR^{N+1}\]
Remark
A norm cone is convex. Moreover it is a convex cone.
Example: Second order cone

The second order cone is the norm cone for the Euclidean norm, i.e.

\[C = \{(x,t) | \| x \|_2 \leq t \} \subseteq \RR^{N+1}\]

This can be rewritten as

\[\begin{split}C = \left \{ \begin{bmatrix} x \\ t \end{bmatrix} \middle | \begin{bmatrix} x \\ t \end{bmatrix}^T \begin{bmatrix} I & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} x \\ t \end{bmatrix} \leq 0 , t \geq 0 \right \}\end{split}\]
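Membership in the second order cone can be tested either via the norm or via the quadratic form above; a small sketch:

```python
import numpy as np

def in_second_order_cone(x, t):
    """Check ||x||_2 <= t directly."""
    return bool(np.linalg.norm(x) <= t)

def in_soc_quadratic(x, t):
    """Equivalent test: x^T x - t^2 <= 0 together with t >= 0."""
    return bool(x @ x - t * t <= 0.0 and t >= 0.0)
```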

Polyhedra

Definition

A polyhedron is defined as the solution set of a finite number of linear inequalities.

\[P = \{ x | a_j^T x \leq b_j, j = 1, \dots, M, c_k^T x = d_k, k = 1, \dots, P\}\]

A polyhedron is thus the intersection of a finite number (\(M\)) of halfspaces and a finite number (\(P\)) of hyperplanes.

Example: Polyhedra
  • Affine sets (subspaces, hyperplanes, lines)
  • Rays
  • Line segments
  • Halfspaces
Remark
A polyhedron is a convex set.
Definition
A bounded polyhedron is known as a polytope.

We can collect the inequalities and equalities into matrix form.

\[P = \{ x | A x \preceq b, C x = d\}\]

where

\[\begin{split}&A = \begin{bmatrix} a_1^T \\ \vdots \\ a_M^T \end{bmatrix} , b = \begin{bmatrix} b_1 \\ \vdots \\ b_M \end{bmatrix}\\ &C = \begin{bmatrix} c_1^T \\ \vdots\\ c_P^T \end{bmatrix} , d = \begin{bmatrix} d_1 \\ \vdots \\ d_P \end{bmatrix}\end{split}\]

and the symbol \(\preceq\) means vector inequality or component wise inequality in \(\RR^M\) i.e. \(u \preceq v\) means \(u_i \leq v_i\) for \(i = 1, \dots, M\).

Note that \(b \in \RR^M\), \(A \in \RR^{M \times N}\), \(A x \in \RR^M\), \(d \in \RR^P\), \(C \in \RR^{P \times N}\) and \(C x \in \RR^P\).
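A sketch of the componentwise membership test \(A x \preceq b\), using the unit square in \(\RR^2\) as a hypothetical polyhedron (four inequalities, no equalities):

```python
import numpy as np

# The unit square [0, 1]^2 written as A x <= b (illustrative choice)
A = np.array([[ 1.0,  0.0],
              [-1.0,  0.0],
              [ 0.0,  1.0],
              [ 0.0, -1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])

def in_polyhedron(x):
    """Componentwise check of the vector inequality A x <= b."""
    return bool(np.all(A @ x <= b + 1e-12))
```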

Example: Set of nonnegative numbers
Let \(\RR_+ = \{ x \in \RR | x \geq 0\}\). \(\RR_+\) is a polyhedron (the solution set of a single linear inequality); hence it is a convex set. Moreover, it is both a ray and a convex cone.
Example: Non-negative orthant

We can generalize \(\RR_+\) as follows. Define

\[\RR_+^N = \{ x \in \RR^N | x_i \geq 0 , i = 1, \dots , N\} = \{x \in \RR^N | x \succeq 0 \}.\]

\(\RR_+^N\) is called the nonnegative orthant. It is a polyhedron (the solution set of \(N\) linear inequalities). It is also a convex cone.

Definition

Let \(K+1\) points \(v_0, \dots, v_K \in \RR^N\) be affine independent.

The simplex determined by them is given by

\[C = \ConvexHull \{ v_0, \dots, v_K\} = \{ \theta_0 v_0 + \dots + \theta_K v_K | \theta \succeq 0, 1^T \theta = 1\}\]

where \(\theta = [\theta_0, \theta_1, \dots, \theta_K]^T\) and \(1\) denotes a vector of size \(K+1\) with all entries one.

In other words, \(C\) is the convex hull of the set \(\{v_0, \dots, v_K\}\).

The positive semidefinite cone

Definition

We define the set of symmetric \(N\times N\) matrices as

\[S^N = \{X \in \RR^{N \times N} | X = X^T\}.\]
Lemma
\(S^N\) is a vector space with dimension \(\frac{N(N+1)}{2}\).
Definition

We define the set of symmetric positive semidefinite matrices as

\[S_+^N = \{X \in S^N | X \succeq 0 \}.\]

The notation \(X \succeq 0\) means \(v^T X v \geq 0 \Forall v \in \RR^N\).

Definition

We define the set of symmetric positive definite matrices as

\[S_{++}^N = \{X \in S^N | X \succ 0 \}.\]

The notation \(X \succ 0\) means \(v^T X v > 0 \Forall v \in \RR^N\).

Lemma
The set \(S_+^N\) is a convex cone.
Proof

Let \(A, B \in S_+^N\) and \(\theta_1, \theta_2 \geq 0\). We have to show that \(\theta_1 A + \theta_2 B \in S_+^N\).

\[A \in S_+^N \implies v^T A v \geq 0 \Forall v \in \RR^N.\]
\[B \in S_+^N \implies v^T B v \geq 0 \Forall v \in \RR^N.\]

Now

\[v^T (\theta_1 A + \theta_2 B) v = \theta_1 v^T A v + \theta_2 v^T B v \geq 0 \Forall v \in \RR^N.\]

Hence \(\theta_1 A + \theta_2 B \in S_+^N\).
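The lemma can be illustrated numerically: build two PSD matrices as Gram matrices (with hypothetical values), form a conic combination, and check that its eigenvalues are nonnegative:

```python
import numpy as np

# Gram matrices M M^T are always symmetric PSD (illustrative M's)
M1 = np.array([[1.0, 2.0], [0.0, 1.0]])
M2 = np.array([[0.0, 1.0], [1.0, 1.0]])
A = M1 @ M1.T
B = M2 @ M2.T

theta1, theta2 = 0.3, 1.7  # arbitrary nonnegative coefficients
C = theta1 * A + theta2 * B

# C is symmetric with nonnegative eigenvalues, hence PSD.
eigs = np.linalg.eigvalsh(C)
```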

Operations that preserve convexity

In the following, we will discuss several operations which transform a convex set into another convex set, and thus preserve convexity.

Understanding these operations is useful for determining the convexity of a wide variety of sets.

Usually it is easier to show that a set is convex by exhibiting it as the result of a convexity preserving operation applied to a convex set than by directly verifying the defining property, i.e.

\[\theta x_1 + (1 - \theta) x_2 \in C \Forall x_1, x_2 \in C, \theta \in [0,1].\]

Intersection

Lemma
If \(S_1\) and \(S_2\) are convex sets then \(S_1 \cap S_2\) is convex.
Proof

Let \(x_1, x_2 \in S_1 \cap S_2\). We have to show that

\[\theta x_1 + (1 - \theta) x_2 \in S_1 \cap S_2, \Forall \theta \in [0,1].\]

Since \(S_1\) is convex and \(x_1, x_2 \in S_1\), hence

\[\theta x_1 + (1 - \theta) x_2 \in S_1, \Forall \theta \in [0,1].\]

Similarly

\[\theta x_1 + (1 - \theta) x_2 \in S_2, \Forall \theta \in [0,1].\]

Thus

\[\theta x_1 + (1 - \theta) x_2 \in S_1 \cap S_2, \Forall \theta \in [0,1].\]

which completes the proof.

We can generalize it further.

Lemma
Let \(\{ A_i\}_{i \in I}\) be a family of sets such that \(A_i\) is convex for all \(i \in I\). Then \(\cap_{i \in I} A_i\) is convex.
Proof

Let \(x_1, x_2\) be any two arbitrary elements in \(\cap_{i \in I} A_i\).

\[\begin{split}&x_1, x_2 \in \cap_{i \in I} A_i\\ \implies & x_1, x_2 \in A_i \Forall i \in I\\ \implies &\theta x_1 + (1 - \theta) x_2 \in A_i \Forall \theta \in [0,1] \Forall i \in I \text{ since $A_i$ is convex }\\ \implies &\theta x_1 + (1 - \theta) x_2 \in \cap_{i \in I} A_i\end{split}\]

Hence \(\cap_{i \in I} A_i\) is convex.

Affine functions

Definition

A function \(f : \RR^N \to \RR^M\) is affine if it is the sum of a linear function and a constant, i.e.

\[f(x) = A x + b\]

where \(A \in \RR^{M \times N}\) and \(b \in \RR^M\).

Lemma

Let \(S \subseteq \RR^N\) be convex and \(f : \RR^N \to \RR^M\) be an affine function. Then the image of \(S\) under \(f\) given by

\[f(S) = \{ f(x) | x \in S\}\]

is a convex set.

It applies in the reverse direction also.

Lemma

Let \(f : \RR^K \to \RR^N\) be affine and \(S \subseteq \RR^N\) be convex. Then the inverse image of \(S\) under \(f\) given by

\[f^{-1}(S) = \{ x \in \RR^K | f(x) \in S\}\]

is convex.

Example: Affine functions preserving convexity

Let \(S \subseteq \RR^N\) be convex.

  1. For some \(\alpha \in \RR\) , \(\alpha S\) given by

    \[\alpha S = \{\alpha x | x \in S\}\]

    is convex. This is the scaling operation.

  2. For some \(a \in \RR^N\), \(S + a\) given by

    \[S + a = \{x + a | x \in S\}\]

    is convex. This is the translation operation.

  3. Let \(N = M + K\) where \(M, K \in \Nat\). Thus let \(\RR^N = \RR^M \times \RR^K\). A vector \(x \in S\) can be written as \(x = (x_1, x_2)\) where \(x_1 \in \RR^M\) and \(x_2 \in \RR^K\). Then

    \[T = \{ x_1 \in \RR^M | (x_1, x_2) \in S \text{ for some } x_2 \in \RR^K\}\]

    is convex. This is the projection operation.

Definition

Let \(S_1\) and \(S_2\) be two arbitrary subsets of \(\RR^N\). Then their sum is defined as

\[S_1 + S_2 = \{ x + y | x \in S_1 , y \in S_2\}.\]
Lemma
Let \(S_1\) and \(S_2\) be two convex subsets of \(\RR^N\). Then \(S_1 + S_2\) is convex.

Proper cones and generalized inequalities

Definition

A cone \(K \subseteq \RR^N\) is called a proper cone if it satisfies the following:

  • \(K\) is convex.
  • \(K\) is closed.
  • \(K\) is solid, i.e. it has a nonempty interior.
  • \(K\) is pointed i.e. it contains no line. In other words
\[x \in K, -x \in K \implies x = 0.\]

A proper cone \(K\) can be used to define a generalized inequality, which is a partial ordering on \(\RR^N\).

Definition

Let \(K \subseteq \RR^N\) be a proper cone. A partial ordering on \(\RR^N\) associated with the proper cone \(K\) is defined as

\[x \preceq_{K} y \iff y - x \in K.\]

We also write \(x \succeq_K y\) if \(y \preceq_K x\). This is also known as a generalized inequality.

A strict partial ordering on \(\RR^N\) associated with the proper cone \(K\) is defined as

\[x \prec_{K} y \iff y - x \in \Interior{K}.\]

where \(\Interior{K}\) is the interior of \(K\). We also write \(x \succ_K y\) if \(y \prec_K x\). This is also known as a strict generalized inequality.

When \(K = \RR_+\), \(\preceq_K\) is the usual \(\leq\) operator and \(\prec_K\) is the usual \(<\) operator on \(\RR\).

Example: Nonnegative orthant and component-wise inequality

The nonnegative orthant \(K=\RR_+^N\) is a proper cone. The associated generalized inequality \(\preceq_{K}\) means that

\[x \preceq_K y \iff (y-x) \in \RR_+^N \iff x_i \leq y_i \Forall i= 1,\dots,N.\]

This is known as component-wise inequality and is usually denoted \(x \preceq y\).
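A sketch of the component-wise test \(x \preceq_K y \iff y - x \in \RR^N_+\) (the helper name `leq_orthant` is ours):

```python
import numpy as np

def leq_orthant(x, y):
    """Generalized inequality x <=_K y for K = R^N_+ : check y - x in K,
    i.e. every component of y - x is nonnegative."""
    d = np.asarray(y, dtype=float) - np.asarray(x, dtype=float)
    return bool(np.all(d >= 0))
```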

Example: Positive semidefinite cone and matrix inequality

The positive semidefinite cone \(S_+^N \subseteq S^N\) is a proper cone in the vector space \(S^N\).

The associated generalized inequality means

\[X \preceq_{S_+^N} Y \iff Y - X \in S_+^N\]

i.e. \(Y - X\) is positive semidefinite. This is also usually denoted as \(X \preceq Y\).

Minimum and minimal elements

The generalized inequalities (\(\preceq_K, \prec_K\)) w.r.t. the proper cone \(K \subset \RR^N\) define a partial ordering over any arbitrary set \(S \subseteq \RR^N\).

But since they may not enforce a total ordering on \(S\), not every pair of elements \(x, y\in S\) may be related by \(\preceq_K\) or \(\prec_K\).

Example: Partial ordering with the nonnegative orthant cone

Let \(K = \RR^2_+ \subset \RR^2\). Let \(x_1 = (2,3), x_2 = (4, 5), x_3=(-3, 5)\). Then we have

  • \(x_1 \prec x_2\), \(x_2 \succ x_1\) and \(x_3 \preceq x_2\).
  • But neither \(x_1 \preceq x_3\) nor \(x_1 \succeq x_3\) holds.
  • In general, for any \(x , y \in \RR^2\), \(x \preceq y\) if and only if \(y\) lies above and to the right of \(x\) in the \(\RR^2\) plane.
  • If \(y\) is to the right of \(x\) but below it, or above \(x\) but to its left, then no ordering holds.
Definition
We say that \(x \in S \subseteq \RR^N\) is the minimum element of \(S\) w.r.t. the generalized inequality \(\preceq_K\) if for every \(y \in S\) we have \(x \preceq y\).
  • \(x\) must belong to \(S\).
  • It is quite possible that \(S\) has no minimum element.
  • If a set \(S\) has a minimum element, then by definition it is unique (Prove it!).
Definition
We say that \(x \in S \subseteq \RR^N\) is the maximum element of \(S\) w.r.t. the generalized inequality \(\preceq_K\) if for every \(y \in S\) we have \(y \preceq x\).
  • \(x\) must belong to \(S\).
  • It is quite possible that \(S\) has no maximum element.
  • If a set \(S\) has a maximum element, then by definition it is unique.
Example: Minimum element
Consider \(K = \RR^N_+\) and \(S = \RR^N_+\). Then \(0 \in S\) is the minimum element since \(0 \preceq x \Forall x \in \RR^N_+\).
Example: Maximum element
Consider \(K = \RR^N_+\) and \(S = \{x | x_i \leq 0 \Forall i=1,\dots,N\}\). Then \(0 \in S\) is the maximum element since \(x \preceq 0 \Forall x \in S\).

There are many sets for which no minimum element exists. In this context we can define a slightly weaker concept known as minimal element.

Definition
An element \(x\in S\) is called a minimal element of \(S\) w.r.t. the generalized inequality \(\preceq_K\) if there is no element \(y \in S\) distinct from \(x\) such that \(y \preceq_K x\). In other words \(y \preceq_K x \implies y = x\).
Definition
An element \(x\in S\) is called a maximal element of \(S\) w.r.t. the generalized inequality \(\preceq_K\) if there is no element \(y \in S\) distinct from \(x\) such that \(x \preceq_K y\). In other words \(x \preceq_K y \implies y = x\).
  • The minimal or maximal element \(x\) must belong to \(S\).
  • It is quite possible that \(S\) has no minimal or maximal element.
  • Minimal or maximal element need not be unique. A set may have many minimal or maximal elements.
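For a finite set under \(\preceq_{\RR^2_+}\), minimal elements can be found by direct search; the set below is hypothetical, and it has two incomparable minimal elements, illustrating non-uniqueness:

```python
import numpy as np

def is_minimal(x, S):
    """x in S is minimal w.r.t. the nonnegative-orthant ordering if the
    only y in S with y <=_K x (componentwise) is x itself."""
    xv = np.asarray(x)
    for y in S:
        yv = np.asarray(y)
        if np.all(yv <= xv) and not np.array_equal(yv, xv):
            return False
    return True

# Hypothetical finite set in R^2
S = [(1, 3), (2, 1), (2, 3), (4, 4)]
minimals = [p for p in S if is_minimal(p, S)]
# (1, 3) and (2, 1) are incomparable, so both are minimal.
```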
Lemma

A point \(x \in S\) is the minimum element of \(S\) if and only if

\[S \subseteq x + K\]
Proof

Let \(x \in S\) be the minimum element. Then by definition \(x \preceq_K y \Forall y \in S\). Thus

\[\begin{split}& y - x \in K \Forall y \in S \\ \implies & \text{for every } y \in S \text{ there exists } k \in K \text{ such that } y = x + k\\ \implies & y \in x + K \Forall y \in S\\ \implies & S \subseteq x + K.\end{split}\]

Note that \(k \in K\) would be distinct for each \(y \in S\).

Now let us prove the converse.

Let \(S \subseteq x + K\) where \(x \in S\). Thus

\[\begin{split}& \text{for every } y \in S \text{ there exists } k \in K \text{ such that } y = x + k\\ \implies & y - x = k \in K \Forall y \in S\\ \implies & x \preceq_K y \Forall y \in S.\end{split}\]

Thus \(x\) is the minimum element of \(S\).

\(x + K\) denotes all the points that are comparable to \(x\) and greater than or equal to \(x\) according to \(\preceq_K\).

Lemma

A point \(x \in S\) is a minimal point if and only if

\[(x - K) \cap S = \{ x \}.\]
Proof

Let \(x \in S\) be a minimal element of \(S\). Thus there is no element \(y \in S\) distinct from \(x\) such that \(y \preceq_K x\).

Consider the set \(R = x - K = \{x - k | k \in K \}\).

\[r \in R \iff r = x - k \text { for some } k \in K \iff x - r \in K \iff r \preceq_K x.\]

Thus \(x - K\) consists of all points \(r \in \RR^N\) satisfying \(r \preceq_K x\). By minimality, the only such point in \(S\) is \(x\) itself. Hence

\[(x - K) \cap S = \{ x \}.\]

Now assume that \((x - K) \cap S = \{ x \}\). Then the only point \(y \in S\) satisfying \(y \preceq_K x\) is \(x\) itself. Hence \(x\) is a minimal element of \(S\).

\(x - K\) represents the set of points that are comparable to \(x\) and are less than or equal to \(x\) according to \(\preceq_K\).