Proximity Operator of $\lambda\|Ax\|_2$

Given a closed, convex and proper function $f\colon \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$, its proximity operator is defined as

$$\operatorname{prox}_f(y) = \operatorname*{argmin}_{x \in \mathbb{R}^n}\; f(x) + \tfrac{1}{2}\|x - y\|_2^2.$$

The scaled Euclidean norm $f(x) = \lambda\|x\|_2$ with $\lambda > 0$ has a closed-form proximity operator given by

$$\operatorname{prox}_{\lambda\|\cdot\|_2}(y) = \begin{cases} \left(1 - \dfrac{\lambda}{\|y\|_2}\right) y, & \text{if } \|y\|_2 > \lambda, \\[6pt] 0, & \text{otherwise.} \end{cases}$$

This can be derived using the Moreau identity or by applying optimality conditions directly.
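For readers who want to compute with this formula, here is a minimal NumPy sketch of the block soft-thresholding rule above; the function name `prox_scaled_norm` is an illustrative choice, not a standard API.

```python
import numpy as np

def prox_scaled_norm(y, lam):
    """Proximity operator of lam * ||.||_2 (block soft-thresholding)."""
    norm_y = np.linalg.norm(y)
    if norm_y <= lam:
        # Below the threshold, the prox collapses y to zero.
        return np.zeros_like(y)
    # Otherwise shrink y radially by lam.
    return (1.0 - lam / norm_y) * y
```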

We now consider a generalization of this result. Suppose $A \in \mathbb{R}^{m \times n}$ is any real-valued matrix. There is a closed-form expression for the proximity operator of $\lambda\|Ax\|_2$, provided we interpret “closed-form” liberally to allow the solution of a one-dimensional secular equation. Stack Exchange user River Li posted a similar formula for diagonal $A$ in 2020, but the result extends to arbitrary matrices $A$.

Theorem: Proximity Operator of $\lambda\|Ax\|_2$

Let $A \in \mathbb{R}^{m \times n}$ and $\lambda > 0$. Denote the positive singular values of $A$ by $\sigma_1 \ge \cdots \ge \sigma_r > 0$ and the corresponding right-singular vectors by $v_1, \dots, v_r$. Then, for every $y \in \mathbb{R}^n$,

$$\operatorname{prox}_{\lambda\|A(\cdot)\|_2}(y) = \begin{cases} P_{\ker A}\, y, & \text{if } \|(A^\dagger)^T y\|_2 \le \lambda, \\[6pt] \left(I + \dfrac{\lambda}{\mu} A^T A\right)^{-1} y, & \text{otherwise,} \end{cases} \tag{1}$$

where $P_{\ker A}$ is the projection onto the kernel (also known as the null space) of $A$, $A^\dagger$ denotes the Moore–Penrose pseudoinverse of $A$, and $\mu > 0$ is the unique positive solution to the equation

$$\sum_{i=1}^{r} \frac{\sigma_i^2\,(v_i^T y)^2}{(\mu + \lambda\sigma_i^2)^2} = 1. \tag{2}$$

Before demonstrating this result, we make several observations about its structure. The norm threshold condition $\|(A^\dagger)^T y\|_2 \le \lambda$ can be equivalently written as $\|\Sigma_1^{-1} V_1^T y\|_2 \le \lambda$, where $\Sigma_1 = \operatorname{diag}(\sigma_1, \dots, \sigma_r)$ and $V_1 = [\,v_1 \ \cdots \ v_r\,]$, using the fact that the Euclidean norm is invariant under orthogonal transformations. Equation (2) generally yields a polynomial equation in $\mu$ of degree $2r$, which means it does not admit a closed-form solution for $r \ge 3$. When $\|(A^\dagger)^T y\|_2 > \lambda$, however, equation (2) has a unique positive solution. The expression on the left-hand side is strictly decreasing for $\mu \ge 0$, taking the value $\|(A^\dagger)^T y\|_2^2/\lambda^2 > 1$ at $\mu = 0$ and decreasing to zero as $\mu \to \infty$. These monotonicity properties guarantee the existence and uniqueness of the solution.
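These observations already suggest a numerical recipe: compute an SVD of $A$ once, test the threshold condition, and otherwise solve the one-dimensional secular equation (2) with a bracketed root-finder. The sketch below follows that recipe under the assumption that NumPy and SciPy are available; the function name `prox_norm_Ax`, the rank tolerance, and the bracketing upper bound for the root are my own illustrative choices, not part of the theorem.

```python
import numpy as np
from scipy.optimize import brentq

def prox_norm_Ax(y, A, lam):
    """Prox of x -> lam * ||A x||_2 at y, following equations (1)-(2)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=True)
    tol = max(A.shape) * np.finfo(float).eps * (s[0] if s.size else 0.0)
    r = int(np.sum(s > tol))                    # numerical rank
    if r == 0:
        return y.copy()                         # A = 0: the prox is the identity
    sr = s[:r]                                  # positive singular values sigma_1..sigma_r
    V1 = Vt[:r].T                               # right-singular vectors v_1..v_r as columns
    z = V1.T @ y                                # coordinates of y in range(A^T)

    # Threshold test: ||(A^+)^T y||_2 <= lam  <=>  ||Sigma_1^{-1} z||_2 <= lam.
    if np.linalg.norm(z / sr) <= lam:
        return y - V1 @ z                       # projection of y onto ker(A)

    # Secular equation (2): sum_i sigma_i^2 z_i^2 / (mu + lam sigma_i^2)^2 = 1.
    phi = lambda mu: np.sum((sr * z) ** 2 / (mu + lam * sr ** 2) ** 2) - 1.0
    hi = np.linalg.norm(sr * z)                 # phi(hi) < 0, so the root lies in (0, hi)
    mu = brentq(phi, 0.0, hi)

    # Second branch of (1): (I + (lam/mu) A^T A)^{-1} y, evaluated in the SVD basis.
    w = (mu * z) / (mu + lam * sr ** 2)
    return (y - V1 @ z) + V1 @ w
```

Working in the SVD basis is equivalent to forming $(I + \tfrac{\lambda}{\mu} A^T A)^{-1} y$ directly, but it avoids building and factoring $A^T A$.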

Proof

We provide a proof using the SVD. Our goal is to compute $x^\star = \operatorname{prox}_{\lambda\|A(\cdot)\|_2}(y)$, the optimal solution to the problem

$$\text{minimize} \quad \lambda\|Ax\|_2 + \tfrac{1}{2}\|x - y\|_2^2$$

with variable $x \in \mathbb{R}^n$.

Subspace decomposition. The key insight is that this problem decomposes naturally with respect to the fundamental subspaces of $A$. Let $A = U\Sigma V^T$ be the SVD of $A$, with rank $r = \operatorname{rank}(A)$. We write

$$U = \begin{bmatrix} U_1 & U_2 \end{bmatrix}, \qquad V = \begin{bmatrix} V_1 & V_2 \end{bmatrix} = \begin{bmatrix} v_1 & \cdots & v_r & v_{r+1} & \cdots & v_n \end{bmatrix}, \qquad \Sigma_1 = \operatorname{diag}(\sigma_1, \dots, \sigma_r),$$

where $V_1$ spans the image of $A^T$, $V_2$ spans the kernel of $A$, and $U_1$ holds the first $r$ columns of $U$. We decompose $\mathbb{R}^n = \operatorname{range}(A^T) \oplus \ker(A)$, and we correspondingly decompose $x$ and $y$ as $x = x_1 + x_2$ and $y = y_1 + y_2$, where the subscripts $1$ and $2$ denote the components in $\operatorname{range}(A^T)$ and $\ker(A)$, respectively. Note that $Ax = Ax_1$.

The objective function decouples because $Ax = Ax_1$ and the components $x_1 - y_1 \in \operatorname{range}(A^T)$ and $x_2 - y_2 \in \ker(A)$ are orthogonal, giving

$$\lambda\|Ax_1\|_2 + \tfrac{1}{2}\|x_1 - y_1\|_2^2 + \tfrac{1}{2}\|x_2 - y_2\|_2^2.$$

Minimizing over $x_2$ immediately yields $x_2^\star = y_2 = P_{\ker A}\, y$.

We now focus on minimizing over $x_1$. We parameterize $x_1 = V_1 w$ and $y_1 = V_1 z$, where $z = V_1^T y \in \mathbb{R}^r$. Note that $V_1$ has orthonormal columns, so $\|x_1 - y_1\|_2 = \|w - z\|_2$. We have $Ax_1 = U\Sigma V^T V_1 w = U_1 \Sigma_1 w$ and $\|Ax_1\|_2 = \|\Sigma_1 w\|_2$. The problem reduces to

$$\text{minimize} \quad \lambda\|\Sigma_1 w\|_2 + \tfrac{1}{2}\|w - z\|_2^2$$

with variable $w \in \mathbb{R}^r$.
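As a quick numerical sanity check of this reduction (not needed for the proof), the following throwaway NumPy snippet, with randomly generated data and illustrative variable names, confirms that $\|Ax_1\|_2 = \|\Sigma_1 w\|_2$ and $\|x_1 - y_1\|_2 = \|w - z\|_2$ when $x_1 = V_1 w$ and $y_1 = V_1 z$:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 7))
U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10 * s[0]))
V1 = Vt[:r].T                      # columns are the right-singular vectors v_1..v_r

w = rng.standard_normal(r)
z = rng.standard_normal(r)
x1, y1 = V1 @ w, V1 @ z

assert np.isclose(np.linalg.norm(A @ x1), np.linalg.norm(s[:r] * w))   # ||A x_1|| = ||Sigma_1 w||
assert np.isclose(np.linalg.norm(x1 - y1), np.linalg.norm(w - z))      # ||x_1 - y_1|| = ||w - z||
```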

Optimality condition. The optimality condition for the minimizer $w^\star$ can be written as $0 \in \partial h(w^\star)$, where $h(w) = \lambda\|\Sigma_1 w\|_2 + \tfrac{1}{2}\|w - z\|_2^2$. Since $\Sigma_1$ is invertible, we can write the subdifferential as $\partial h(w) = \lambda\, \Sigma_1\, \partial\|\cdot\|_2(\Sigma_1 w) + (w - z)$, where $\partial\|\cdot\|_2(u)$ is the unit ball $\{g : \|g\|_2 \le 1\}$ if $u = 0$ and the singleton $\{u/\|u\|_2\}$ otherwise. Thus, we require some $g \in \partial\|\cdot\|_2(\Sigma_1 w^\star)$ such that

$$z - w^\star = \lambda\, \Sigma_1 g.$$

The structure of the solution depends on whether the optimal point is zero or nonzero.

The case $w^\star = 0$. Suppose that $w^\star = 0$. Then the optimality condition requires $z = \lambda \Sigma_1 g$ with $\|g\|_2 \le 1$, which means $g = \tfrac{1}{\lambda}\Sigma_1^{-1} z$. This implies $\|\Sigma_1^{-1} z\|_2 \le \lambda$, so the condition becomes $\|\Sigma_1^{-1} V_1^T y\|_2 \le \lambda$. To get the condition provided in equation (1), we note the identity

$$\|(A^\dagger)^T y\|_2 = \|U_1 \Sigma_1^{-1} V_1^T y\|_2 = \|\Sigma_1^{-1} V_1^T y\|_2,$$

which follows from $A^\dagger = V_1 \Sigma_1^{-1} U_1^T$ and the fact that $U_1$ has orthonormal columns.

Thus, $w^\star = 0$ if and only if $\|(A^\dagger)^T y\|_2 \le \lambda$. In this case, $x_1^\star = 0$, and the solution is $x^\star = P_{\ker A}\, y$.
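The identity above is easy to spot-check numerically. The snippet below, assuming NumPy (whose `np.linalg.pinv` computes the Moore–Penrose pseudoinverse), compares the two sides on a random rank-deficient instance:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 8))   # rank-3 matrix in R^{5x8}
y = rng.standard_normal(8)

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10 * s[0]))                                # numerical rank
lhs = np.linalg.norm(np.linalg.pinv(A, rcond=1e-10).T @ y)       # ||(A^+)^T y||_2
rhs = np.linalg.norm((Vt[:r] @ y) / s[:r])                       # ||Sigma_1^{-1} V_1^T y||_2
assert np.isclose(lhs, rhs)
```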

The case $w^\star \ne 0$. Suppose now that $w^\star \ne 0$. Since $\Sigma_1$ is positive definite, we have $\Sigma_1 w^\star \ne 0$. The subdifferential of $\|\cdot\|_2$ at a nonzero point reduces to a singleton, so $g$ is unique and given by $g = \Sigma_1 w^\star / \|\Sigma_1 w^\star\|_2$. Let $\mu = \|\Sigma_1 w^\star\|_2 > 0$. Substituting and rearranging, the optimality condition becomes

$$\left(I + \frac{\lambda}{\mu}\,\Sigma_1^2\right) w^\star = z.$$

Solving for $w^\star$ yields

$$w^\star = \left(I + \frac{\lambda}{\mu}\,\Sigma_1^2\right)^{-1} z, \qquad \text{i.e.,} \quad w_i^\star = \frac{\mu\, z_i}{\mu + \lambda\sigma_i^2}, \quad i = 1, \dots, r.$$

We determine $\mu$ using its definition,

$$\mu^2 = \|\Sigma_1 w^\star\|_2^2 = \sum_{i=1}^{r} \frac{\sigma_i^2\, \mu^2\, z_i^2}{(\mu + \lambda\sigma_i^2)^2}.$$

Dividing by $\mu^2$ (since $\mu > 0$) gives

$$\sum_{i=1}^{r} \frac{\sigma_i^2\, z_i^2}{(\mu + \lambda\sigma_i^2)^2} = 1.$$

The function $\phi(\mu) = \sum_{i=1}^{r} \sigma_i^2 z_i^2 / (\mu + \lambda\sigma_i^2)^2$ is strictly decreasing for $\mu \ge 0$, with $\phi(\mu) \to 0$ as $\mu \to \infty$. The value at $\mu = 0$ is

$$\phi(0) = \sum_{i=1}^{r} \frac{z_i^2}{\lambda^2\sigma_i^2} = \frac{\|\Sigma_1^{-1} z\|_2^2}{\lambda^2}.$$

A unique positive solution exists if and only if $\phi(0) > 1$, which is equivalent to $\|(A^\dagger)^T y\|_2 > \lambda$. In this case, we take $\mu$ to be this unique positive solution. Since $z = V_1^T y$, we have $z_i = v_i^T y$ for $i = 1, \dots, r$. Substituting into the equation yields

$$\sum_{i=1}^{r} \frac{\sigma_i^2\, (v_i^T y)^2}{(\mu + \lambda\sigma_i^2)^2} = 1.$$

This confirms equation (2), and $\mu$ is the unique positive solution.

We verify the form of $x^\star$ given in equation (1). Let $\hat{x} = \left(I + \frac{\lambda}{\mu} A^T A\right)^{-1} y$. We want to show $\hat{x} = x^\star$. Using the SVD representation $A^T A = V \Sigma^T \Sigma V^T$, we have

$$\hat{x} = V \left(I + \frac{\lambda}{\mu}\,\Sigma^T\Sigma\right)^{-1} V^T y.$$

The matrix $\left(I + \frac{\lambda}{\mu}\Sigma^T\Sigma\right)^{-1}$ is diagonal, with diagonal entries $\frac{\mu}{\mu + \lambda\sigma_i^2}$ for $i \le r$ and $1$ for $i > r$, giving

$$\hat{x} = V \operatorname{diag}\!\left(\frac{\mu}{\mu + \lambda\sigma_1^2}, \dots, \frac{\mu}{\mu + \lambda\sigma_r^2}, 1, \dots, 1\right) V^T y.$$

Decomposing $V^T y$ into its first $r$ components $V_1^T y$ and the remaining $n - r$ components $V_2^T y$, we obtain

$$\hat{x} = V_1\left(I + \frac{\lambda}{\mu}\,\Sigma_1^2\right)^{-1} V_1^T y + V_2 V_2^T y.$$

The first term is $V_1 w^\star = x_1^\star$, since $w^\star = \left(I + \frac{\lambda}{\mu}\Sigma_1^2\right)^{-1} z$ with $z = V_1^T y$. The second term is $V_2 V_2^T y = P_{\ker A}\, y = x_2^\star$. Thus, $\hat{x} = x_1^\star + x_2^\star = x^\star$.

This completes the proof of the theorem.
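As an end-to-end check, the closed form can be compared against a generic convex solver applied directly to the defining minimization problem. The sketch below assumes the `prox_norm_Ax` helper sketched after the theorem statement and that CVXPY is installed; it is a numerical spot check, not part of the proof.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 6))
y = rng.standard_normal(6)
lam = 0.5

# Closed form from the theorem (prox_norm_Ax as sketched earlier).
x_closed = prox_norm_Ax(y, A, lam)

# Direct minimization of lam*||A x||_2 + 0.5*||x - y||_2^2 with CVXPY.
x = cp.Variable(6)
objective = cp.Minimize(lam * cp.norm(A @ x, 2) + 0.5 * cp.sum_squares(x - y))
cp.Problem(objective).solve()

print(np.linalg.norm(x_closed - x.value))   # should be close to zero
```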