Projected Dynamical Systems on Irregular, Non-Euclidean Domains for Nonlinear Optimization

Continuous-time projected dynamical systems are an elementary class of discontinuous dynamical systems with trajectories that remain in a feasible domain by means of projecting outward-pointing vector fields. They are essential when modeling physical saturation, constraints of motion, as well as studying projection-based numerical optimization algorithms. Motivated by the emerging application of feedback-based continuous-time optimization schemes that rely on the physical system to enforce nonlinear hard constraints, we study the fundamental properties of these dynamics on general locally-Euclidean sets. Among others, we propose the use of Krasovskii solutions, show their existence on nonconvex, irregular subsets of low-regularity Riemannian manifolds, and investigate how they relate to conventional Carathéodory solutions. Furthermore, we establish conditions for uniqueness, thereby introducing a generalized definition of prox-regularity which is suitable for non-flat domains. Finally, we use these results to study the stability and convergence of projected gradient flows as an illustrative application of our framework. We provide simple counter-examples for our main results to illustrate the necessity of our already weak assumptions.


1. Introduction. An important class of discontinuous dynamical systems are projected dynamical systems, whose trajectories remain in a domain X by projecting outward-pointing portions of a vector field f at the boundary of X, thereby preventing a trajectory from leaving the domain. This qualitative behavior is illustrated in Fig. 1a.
Even though projected dynamical systems have a long history in different contexts such as the study of variational inequalities or differential inclusions, new compelling applications in real-time, nonlinear optimization require a holistic study in a more general setting. Hence, this paper is primarily motivated by the renewed interest in dynamical systems that solve optimization problems. Early works in this spirit such as [10] have designed continuous-time systems to solve computational problems such as diagonalizing matrices or solving linear programs. This has further resulted in the study of optimization algorithms over manifolds [2]. Recently, interest has shifted towards analyzing existing iterative schemes with tools from dynamical systems including Lyapunov theory [47] and integral quadratic constraints [19,31]. Most of these have considered unconstrained optimization problems [44] and algorithms that can be modelled with a standard ODE [30] or with variational tools [46]. With this paper we hope to pave the way for the analysis of algorithms for constrained optimization whose continuous-time limits are discontinuous.
The need for substantial generalization of the existent theory on projected dynamical systems is particularly visible in the recent trend to design nonlinear feedback controllers that steer a physical system to the solution of an optimization problem [38,48]. Precursors of this idea have been used in the analysis of congestion control in communication networks [29,34]. More recently, the concept has been widely applied to power systems [18,21,25,32,36,45]. This context is particularly challenging, because the physical laws of power flow, saturating components, and other constraints define a highly non-linear, nonconvex feasible domain over which to optimize.
New features to consider for projected dynamical systems include, for example, irregular feasible domains (Fig. 1b) for which traditional Carathéodory solutions can fail to exist or may not be unique. Furthermore, non-orthogonal projections occur in non-Euclidean spaces and may alter the dynamics. Finally, coordinate-free definitions are required to study projected dynamical systems on subsets of manifolds (Fig. 1c).
Literature review. Different approaches have been reviewed and explored to establish the results in this paper. One of the earliest formulations of projected dynamical systems goes back to [26], which establishes the existence of Carathéodory solutions on closed convex domains. In [16] this requirement is relaxed to X being Clarke regular (for existence) and prox-regular (for uniqueness). In the larger context of differential inclusions and viability theory [5,7], projected dynamical systems are often presented as specific examples of more general differential inclusions, but without substantially generalizing the results of [16,26]. In the context of variational inequalities, [37] provides alternative proofs of existence and uniqueness of Carathéodory solutions when the domain X is convex by using techniques from stochastic analysis. In [11] various equivalence results between the different formulations are established for convex X. Finally, [15] provides generalized existence results for projected dynamical systems defined on closed convex subsets of a Hilbert space.
Projected dynamical systems are naturally part of the more general class of hybrid systems. Many approaches towards existence and uniqueness of hybrid trajectories have been developed [22,33,35], even on non-Euclidean state spaces [43]. The high level of generality nevertheless comes at the expense of less concise and self-contained statements than those specialized to projected dynamical systems.
Special cases of projected dynamical systems are subgradient and saddle-point flows arising in non-smooth and constrained optimization. Whereas projection-based algorithms and subgradients are ubiquitous in the analysis of iterative algorithms, work on their continuous-time counterparts is far less prominent and has only been carried out with limited generality [4,12,17,24], e.g., restricted to convex problems.
Contributions. In this paper, we study projected dynamical systems in finite dimensions without making a priori assumptions on the regularity of the feasible domain X and the vector field f . We also introduce and study oblique projection directions by means of a (possibly non-differentiable) metric g defined on the feasible set.
Our main contribution is the development of a self-contained theory for this general setup. We provide weak requirements on the feasible set X , the vector field f , the metric g and the differentiable structure of the underlying manifold that guarantee existence and uniqueness of trajectories, as well as other properties. Table 1 at the end of the paper concisely summarizes these results.
We initially consider the notion of so-called Krasovskii solutions that are a weaker notion than the classical Carathéodory solutions for discontinuous dynamical systems and therefore exist in more general situations. We establish existence of such solutions and continuity with respect to initial conditions and parameters under minimal regularity requirements which are mostly of topological nature. Under the slightly stronger assumptions involving continuity and Clarke regularity, we show that Krasovskii solutions coincide with the classical Carathéodory solutions. Finally, we lay out the requirements for uniqueness of solutions which are based on Lipschitz-continuity and prox-regularity. Our already weak regularity conditions are sharp in the sense that counter-examples can be constructed to show that requirements cannot be violated individually without the respective result failing to hold.
A major appeal of our analysis framework is its geometric nature: All of our notions are preserved by sufficiently regular coordinate transformations, which allows us to extend all of our results to constrained subsets of differential manifolds. A noteworthy by-product of this analysis is the fact that prox-regularity (required for the uniqueness of trajectories) is an intrinsic property of subsets of C 1,1 manifolds, i.e., independent of the metric even though the traditional definition (on R n ) suggests that prox-regularity depends on the choice of metric.
Through a series of examples, we demonstrate the application of our framework to general (nonlinear and nonconvex) optimization problems and study the stability and convergence of projected gradient dynamics under very weak regularity assumptions.
Thus, we believe that our results are not only of interest within the context of discontinuous dynamical systems, but we also envision their use in the analysis of algorithms for nonlinear, nonconvex optimization problems, possibly on manifolds. The properties developed in the present paper also form a solid foundation for constrained feedback control and online optimization in various contexts. Some preliminary results for online optimization in power systems can be found in [24,25].
Paper organization. After introducing notation and preliminary definitions in Sections 2 and 3, we establish the existence of Krasovskii solutions to projected dynamical systems on R n in Section 4. In Section 5 we consider Krasovskii solutions of projected gradient systems on irregular domains and study their convergence and stability. Section 6 establishes equivalence of Krasovskii and Carathéodory solutions under Clarke regularity. Furthermore, we point out the connection to related work and to continuous-time subgradient flows. In Section 7, we elaborate on the requirements for uniqueness. Finally, in Section 8 we define projected dynamical systems on low-regularity Riemannian manifolds and establish the requirements on the differentiable structure that guarantee existence and uniqueness. Throughout the paper we illustrate our theoretical developments with insightful examples. Finally, Section 9 concisely summarizes our results in the form of Table 1 and concludes the paper. The appendix includes technical definitions and results that are used in proofs but are not required to understand the main results of the paper.

Preliminaries.
2.1. Notation. We only consider finite-dimensional spaces. Unless explicitly noted otherwise, we work in the usual Euclidean setup for $\mathbb{R}^n$ with inner product $\langle \cdot, \cdot \rangle$ and 2-norm $\|\cdot\|$. Whenever it is informative, we make a formal distinction between $\mathbb{R}^n$ and its tangent space $T_x\mathbb{R}^n$ at $x \in \mathbb{R}^n$, even though they are isomorphic. For a set $A \subset \mathbb{R}^n$ we use the notation $\|A\| := \sup_{v \in A} \|v\|$. The closure, convex hull and closed convex hull of $A$ are denoted by $\operatorname{cl} A$, $\operatorname{co} A$, and $\overline{\operatorname{co}}\, A$, respectively. The set $A$ is locally compact if it is the intersection of a closed and an open set. A neighborhood $U \subset A$ of $x \in A$ is understood to be a relative neighborhood, i.e., with respect to the subspace topology on $A$. Given a convergent sequence $\{x_k\}$, the notation $x_k \xrightarrow{A} x$ implies that $x_k \in A$ for all $k$. If $x_k \in \mathbb{R}$, the notation $x_k \to 0^+$ means that $x_k > 0$ for all $k$ and $x_k$ converges to 0.
Let $V$ and $W$ be vector spaces and let $A \subset V$. Continuous maps $\Phi : A \to W$ are denoted by $C^0$. The map $\Phi$ is (locally) Lipschitz (denoted by $C^{0,1}$) if for every $x \in A$ there exists $L > 0$ such that for all $z, y \in A$ in a neighborhood of $x$ it holds that
\[ \|\Phi(z) - \Phi(y)\| \le L \|z - y\|. \tag{2.1} \]
The map $\Phi$ is globally Lipschitz if (2.1) holds for the same $L$ for all $z, y \in A$. Differentiability is understood in the sense of Fréchet. Namely, if $V$ and $W$ are endowed with norms $\|\cdot\|_V$ and $\|\cdot\|_W$, respectively, and $A$ is open, then the map $\Phi$ is differentiable at $x \in A$ if there exists a linear map $D_x\Phi : V \to W$ such that
\[ \lim_{y \to x} \frac{\|\Phi(y) - \Phi(x) - D_x\Phi(y - x)\|_W}{\|y - x\|_V} = 0. \]
In our context, a set-valued map $F : A \rightrightarrows \mathbb{R}^n$, where $A \subset \mathbb{R}^n$, is a map that assigns to every point $x \in A$ a set $F(x) \subset T_x\mathbb{R}^n$. The set-valued map $F$ is non-empty, closed, convex, or compact if for every $x \in A$ the set $F(x)$ is non-empty, closed, convex, or compact, respectively. It is locally bounded if for every $x \in A$ there exists $L > 0$ such that $\|F(y)\| \le L$ for all $y \in A$ in a neighborhood of $x$. The same definition also applies to single-valued functions. The map $F$ is bounded if there exists $L > 0$ such that $\|F(x)\| \le L$ for all $x \in A$.
The inner and outer limits of $F$ at $x$ are denoted by $\liminf_{y \to x} F(y)$ and $\limsup_{y \to x} F(y)$, respectively (see the appendix for a formal definition and a summary of continuity concepts which are required for certain proofs only).
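Definition 2.1. Given a set $\mathcal{X} \subset \mathbb{R}^n$ and $x \in \mathcal{X}$, a vector $v \in T_x\mathbb{R}^n$ is a tangent vector of $\mathcal{X}$ at $x$ if there exist sequences $x_k \xrightarrow{\mathcal{X}} x$ and $\delta_k \to 0^+$ such that $\frac{x_k - x}{\delta_k} \to v$. The set of all tangent vectors is the tangent cone of $\mathcal{X}$ at $x$ and is denoted by $T_x\mathcal{X}$.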
The tangent cone T x X (also known as (Bouligand's) contingent cone [14] ) is closed and non-empty (namely, 0 ∈ T x X ) for any x ∈ X .
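The sequence-based definition of a tangent vector can be illustrated numerically. The following minimal sketch assumes a hypothetical example set (the closed unit disk in the plane, which does not appear in the text) and evaluates the difference quotients $(x_k - x)/\delta_k$ along points of the set that approach a boundary point.

```python
import numpy as np

def difference_quotients(x, curve, deltas):
    """Return the quotients (x_k - x) / delta_k for x_k = curve(delta_k)."""
    return np.array([(curve(d) - x) / d for d in deltas])

# Hypothetical example: closed unit disk, base point x on its boundary.
x = np.array([1.0, 0.0])

def boundary_point(d):
    # Points on the unit circle; they belong to the disk and tend to x as d -> 0+.
    return np.array([np.cos(d), np.sin(d)])

deltas = 10.0 ** -np.arange(1, 7)           # delta_k -> 0+
quotients = difference_quotients(x, boundary_point, deltas)
v = quotients[-1]                            # numerical approximation of a tangent vector
```

In this sketch the quotients approach the direction (0, 1), which is therefore a tangent vector of the disk at x, even though no straight line segment in that direction stays inside the set.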
In the following definition of Clarke regularity and in most of the paper we limit ourselves to locally compact subsets of R n . In our context, a more general definition of Clarke regularity does not improve our results and only adds to the technicalities.
Definition 2.2. For a locally compact set $\mathcal{X} \subset \mathbb{R}^n$ the Clarke tangent cone at $x \in \mathcal{X}$ is defined as the inner limit of the tangent cones, i.e., $T^C_x\mathcal{X} := \liminf_{y \to x} T_y\mathcal{X}$.
By definition of the inner limit, we have T C x X ⊆ T x X . Furthermore, T C x X is closed, convex and non-empty for all x ∈ X [41, Thm 6.26].
Definition 2.3. We call a set X ⊂ R n Clarke regular at x if it is locally compact and T x X = T C x X . The set X is Clarke regular if it is Clarke regular for all x ∈ X .  Figure 2a illustrates the definition of a tangent vector by a sequence {x k } that approaches x in a tangent direction. Figure 2b shows a set that is not Clarke regular.
The following example illustrates that, under standard constraint qualifications as used in optimization theory, sets defined by C 1 inequality constraints are Clarke regular. Such sets are generally encountered in nonlinear programming.
Example 2.4 (sets defined by inequality constraints). Let $h : \mathbb{R}^n \to \mathbb{R}^m$ be $C^1$ such that $\nabla h(x)$ has full rank for all $x$. Then, the set $\mathcal{X} := \{x \mid h(x) \le 0\}$ is Clarke regular [41, Thm 6.31]. In particular, let $h$ be expressed componentwise as $h(x) = [h_1(x), \ldots, h_m(x)]^T$, let $I(x) := \{i \mid h_i(x) = 0\}$ denote the set of active constraints at $x \in \mathcal{X}$, and define $h_{I(x)} := [h_i(x)]_{i \in I(x)}$ as the function obtained from stacking the active constraint functions. Then, the (Clarke) tangent cone at $x$ in the canonical basis is given by
\[ T_x\mathcal{X} = \{ v \in \mathbb{R}^n \mid \nabla h_{I(x)}(x)\, v \le 0 \}. \]
2.3. Low-regularity Riemannian metrics. A natural extension for projected dynamical systems are oblique projection directions. These are conveniently defined via a (Riemannian) metric which defines a variable inner product on $T_x\mathbb{R}^n$ as a function of $x$. Furthermore, the notion of a Riemannian metric is essential to define projected dynamical systems in a coordinate-free setup on manifolds.
We quickly review the definition of bilinear forms and inner products. Let $L^n_2$ denote the space of bilinear forms on $\mathbb{R}^n$, i.e., every $g \in L^n_2$ is a map $g : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ such that for every $u, v, w \in \mathbb{R}^n$ and $\lambda \in \mathbb{R}$ it holds that $g(u + v, w) = g(u, w) + g(v, w)$ and $g(u, v + w) = g(u, v) + g(u, w)$ as well as $g(\lambda v, w) = \lambda g(v, w) = g(v, \lambda w)$. Given the canonical basis of $\mathbb{R}^n$, $g$ can be written in matrix form as $g(u, v) = u^T G v$ where $G \in \mathbb{R}^{n \times n}$. In particular, $L^n_2$ is itself an $n^2$-dimensional space isomorphic to $\mathbb{R}^{n \times n}$. An inner product $g \in L^n_2$ is a symmetric, positive-definite bilinear form, that is, for all $u, v \in \mathbb{R}^n$ we have $g(u, v) = g(v, u)$ and $g(u, u) \ge 0$, and $g(u, u) = 0$ holds if and only if $u = 0$. If $g$ is an inner product we use the notation $\langle u, v \rangle_g := g(u, v)$. In matrix form, we can write $\langle u, v \rangle_g = u^T G v$ where $G$ is symmetric positive definite.
We write $\|\cdot\|_g$, given by $\|v\|_g := \sqrt{\langle v, v \rangle_g}$, to denote the 2-norm induced by $g$. The maximum and minimum eigenvalues of $g$ are denoted by $\lambda^{\max}_g := \max\{\|v\|_g \mid \|v\| = 1\}$ and $\lambda^{\min}_g := \min\{\|v\|_g \mid \|v\| = 1\}$, respectively, and the condition number is defined as $\kappa_g := \lambda^{\max}_g / \lambda^{\min}_g$.
In this context, also recall that the 2-norms induced by any two inner products on a finite-dimensional vector space are equivalent, that is, for a vector space $V$ with norms $\|\cdot\|_a$ and $\|\cdot\|_b$ there are constants $\ell > 0$ and $L > 0$ such that for every $v \in V$ it holds that $\ell \|v\|_a \le \|v\|_b \le L \|v\|_a$. For instance, $\ell = \lambda^{\min}_b / \lambda^{\max}_a$ and $L = \lambda^{\max}_b / \lambda^{\min}_a$. Hence, we can define a metric as a variable inner product over a given set.
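The quantities above are straightforward to evaluate for a metric given in matrix form. The following sketch uses two hypothetical matrices `G_a` and `G_b` (not taken from the text) and relies on the fact that, with the definition used here, the extreme values of $\|v\|_g$ on the Euclidean unit sphere are the square roots of the extreme eigenvalues of $G$. It then checks the norm-equivalence constants on random samples.

```python
import numpy as np

def metric_norm(G, v):
    """Norm induced by the inner product <u, v>_g = u^T G v."""
    return float(np.sqrt(v @ G @ v))

def extreme_values(G):
    """Return (lambda_max, lambda_min): extremes of ||v||_g over ||v|| = 1.

    These are the square roots of the extreme eigenvalues of the matrix G.
    """
    eig = np.linalg.eigvalsh(G)              # ascending order, G symmetric
    return float(np.sqrt(eig[-1])), float(np.sqrt(eig[0]))

def condition_number(G):
    lam_max, lam_min = extreme_values(G)
    return lam_max / lam_min

# Hypothetical inner products a and b on R^2 (illustrative values only).
G_a = np.diag([4.0, 1.0])
G_b = np.array([[2.0, 1.0], [1.0, 2.0]])

max_a, min_a = extreme_values(G_a)
max_b, min_b = extreme_values(G_b)
ell, L = min_b / max_a, max_b / min_a        # equivalence constants from the text

rng = np.random.default_rng(0)
samples = rng.standard_normal((1000, 2))
equivalence_holds = all(
    ell * metric_norm(G_a, v) <= metric_norm(G_b, v) + 1e-12
    and metric_norm(G_b, v) <= L * metric_norm(G_a, v) + 1e-12
    for v in samples
)
```

The constants obtained this way are valid but not necessarily tight; they only use the extreme values of each norm relative to the Euclidean norm.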
Definition 2.5. Given a set $\mathcal{X} \subset \mathbb{R}^n$, a (Riemannian) metric is a map $g : \mathcal{X} \to L^n_2$ that assigns to every point $x \in \mathcal{X}$ an inner product $\langle \cdot, \cdot \rangle_{g(x)}$. A metric is (Lipschitz) continuous if it is (Lipschitz) continuous as a map from $\mathcal{X}$ to $L^n_2$. If it is clear from the context at which point $x$ the metric $g$ is applied, we drop the argument in the subscript and write $\langle \cdot, \cdot \rangle_g$ or $\|\cdot\|_g$. We always retain the subscript $g$ in order to distinguish the induced norm from the Euclidean norm $\|\cdot\|$.
Since g is positive definite for all x by definition, it follows that λ max g(x) , λ min g(x) and κ g(x) are well-defined for all x. However, κ g(x) is not necessarily locally bounded (even if g is bounded as a map). In particular, λ min g(x) might not be bounded below, away from 0. Hence, for metrics we require the following definition of local boundedness.
Definition 2.6. A metric $g$ on $\mathcal{X}$ is locally weakly bounded if for every $x \in \mathcal{X}$ there exist $\ell, L > 0$ such that $\ell \le \kappa_{g(y)} \le L$ holds for all $y \in \mathcal{X}$ in a neighborhood of $x$. It is weakly bounded if $\ell \le \kappa_{g(x)} \le L$ holds for all $x \in \mathcal{X}$.
A metric $g$ can be locally weakly bounded even if it is not locally bounded as a map $\mathcal{X} \to L^n_2$. Furthermore, since the maximum and minimum eigenvalues (and hence the condition number) are continuous functions of a metric (or of the representing matrix), it follows that a continuous metric is always locally weakly bounded.
Remark 2.7. In the following, we will continue to use the Euclidean norm as a distance function on R n and use any Riemannian metric only in the context of projection directions. Thereby, we avoid the notational complexity introduced by Riemannian geometry, and more importantly we do not need to make an a priori assumption on the differentiability on the metric g (which is a prerequisite for many Riemannian constructs to exist), thus preserving a high degree of generality.
2.4. Normal Cones. Given a metric g, we can define normal cones.
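Definition 2.8. Let $\mathcal{X} \subset \mathbb{R}^n$ be Clarke regular and let $g$ be a metric on $\mathcal{X}$. Then the normal cone at $x \in \mathcal{X}$ with respect to $g$ is defined as the polar cone of $T^C_x\mathcal{X}$ with respect to the metric $g$, i.e.,
\[ N^g_x\mathcal{X} := \{ \eta \in T_x\mathbb{R}^n \mid \langle v, \eta \rangle_{g(x)} \le 0 \ \ \forall v \in T^C_x\mathcal{X} \}. \tag{2.3} \]
The normal cone with respect to the Euclidean metric is simply denoted by $N_x\mathcal{X}$.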
Remark 2.9. For simplicity, we will use the notion of normal cone only in the context of Clarke regular sets. If X is not Clarke regular, one needs to distinguish between the regular, general and Clarke normal cones [41].
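Example 2.10 (normal cone to constraint-defined sets). As in Example 2.4, consider $\mathcal{X} := \{x \mid h(x) \le 0\}$ where $h : \mathbb{R}^n \to \mathbb{R}^m$ is $C^1$ and $\nabla h(x)$ has full rank for all $x$. Further, let $g$ denote a metric on $\mathcal{X}$ represented by $G(x) \in \mathbb{R}^{n \times n}$. Then, the normal cone of $\mathcal{X}$ at $x$ is given by
\[ N^g_x\mathcal{X} = \Big\{ \eta \ \Big|\ \eta = \sum_{i \in I(x)} \alpha_i\, G^{-1}(x) \nabla h_i(x)^T, \ \alpha_i \ge 0 \Big\}, \]
which can be derived by inserting any such $\eta$ into (2.3) and using the expression for $T_x\mathcal{X}$ in Example 2.4.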
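For a constraint-defined set, the polar-cone definition implies that the normal cone with respect to g is generated by the vectors $G^{-1}(x)\nabla h_i(x)^T$ of the active constraints. The sketch below checks the polarity condition numerically. The corner of the nonpositive orthant and the matrix `G` are hypothetical choices made only for illustration.

```python
import numpy as np

def normal_cone_generators(G, grad_h_active):
    """Columns G^{-1} grad h_i(x)^T for the active constraints i in I(x)."""
    return np.linalg.solve(G, grad_h_active.T)

# Hypothetical data: X = {x | x_1 <= 0, x_2 <= 0}, evaluated at the corner x = 0.
grad_h_active = np.array([[1.0, 0.0], [0.0, 1.0]])   # rows are grad h_i(x)
G = np.array([[2.0, 1.0], [1.0, 2.0]])                # metric at x (assumed)
N = normal_cone_generators(G, grad_h_active)

rng = np.random.default_rng(1)
# Tangent vectors satisfy grad_h_active @ v <= 0, here the nonpositive orthant.
tangents = -np.abs(rng.standard_normal((200, 2)))
# Normal vectors are nonnegative combinations of the generators.
alphas = np.abs(rng.standard_normal((200, 2)))
etas = alphas @ N.T
# Pairings <v, eta>_g = v^T G eta for matching rows of tangents and etas.
pairings = np.einsum("ij,jk,ik->i", tangents, G, etas)
```

Each pairing reduces to $\alpha^T \nabla h_{I(x)}(x)\, v$, which is nonpositive for every tangent vector, so the metric only changes the representation of the normal directions and not the polarity relation itself.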

Projected Dynamical Systems.
With the above notions we can now formally define our main object of study.
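Definition 3.1. Given a set $\mathcal{X} \subset \mathbb{R}^n$, a metric $g$ on $\mathcal{X}$, and a vector field $f : \mathcal{X} \to \mathbb{R}^n$, the projected vector field of $f$ is defined as the set-valued map
\[ \Pi^g_{\mathcal{X}} f : \mathcal{X} \rightrightarrows \mathbb{R}^n, \qquad x \mapsto \arg\min_{v \in T_x\mathcal{X}} \|v - f(x)\|^2_{g(x)}. \tag{3.1} \]
For simplicity, we call $\Pi^g_{\mathcal{X}} f$ a vector field even though $\Pi^g_{\mathcal{X}} f(x)$ might not be a singleton. We will write $\Pi f$ whenever $\mathcal{X}$ and $g$ are clear from the context.
Example 3.2. As in Examples 2.4 and 2.10, consider $\mathcal{X} := \{x \mid h(x) \le 0\}$ where $h : \mathbb{R}^n \to \mathbb{R}^m$ is $C^1$ and $\nabla h(x)$ has full rank for all $x$, and let $g$ denote a metric on $\mathcal{X}$ represented by $G(x) \in \mathbb{R}^{n \times n}$. Furthermore, consider a vector field $f : \mathcal{X} \to \mathbb{R}^n$. Then, the projected vector field $\Pi^g_{\mathcal{X}} f(x)$ at $x \in \mathcal{X}$ is given as the solution of the convex quadratic program
\[ \underset{v \in \mathbb{R}^n}{\text{minimize}} \ \ (v - f(x))^T G(x) (v - f(x)) \qquad \text{subject to} \quad \nabla h_{I(x)}(x)\, v \le 0. \]
Note that $x$ is not an optimization variable. Hence, the properties of $f$ and $g$ as functions of $x$ are irrelevant when doing a pointwise evaluation of $\Pi^g_{\mathcal{X}} f(x)$.
Since $T_x\mathcal{X}$ is non-empty and closed, a minimum norm projection exists, and therefore $\Pi^g_{\mathcal{X}} f(x)$ is non-empty for all $x \in \mathcal{X}$. Hence, a projected dynamical system is described by the initial value problem
\[ \dot{x} \in \Pi^g_{\mathcal{X}} f(x), \qquad x(0) = x_0, \tag{3.2} \]
where $x_0 \in \mathcal{X}$. If $\Pi^g_{\mathcal{X}} f(x)$ is a singleton for all $x \in \mathcal{X}$ (e.g., whenever $T_x\mathcal{X}$ is convex, since the objective in (3.1) is always strictly convex as a function of $v$), it induces an ordinary vector field. In this case we will slightly abuse notation and not distinguish between the set-valued map and its induced vector field, i.e., instead of (3.2) we simply write $\dot{x} = \Pi^g_{\mathcal{X}} f(x)$, $x(0) = x_0$.
An absolutely continuous function $x : [0, T) \to \mathcal{X}$ with $T > 0$ and $x(0) = x_0$ that satisfies $\dot{x} \in \Pi^g_{\mathcal{X}} f(x)$ almost everywhere (i.e., for all $t \in [0, T)$ except on a subset of Lebesgue measure zero) is called a Carathéodory solution to (3.2).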
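The convex quadratic program that defines the projected vector field on a constraint-defined set can be evaluated pointwise. The following minimal sketch enumerates candidate active sets and returns the point satisfying the KKT conditions; this enumeration is an illustrative choice for a handful of active constraints, not a method taken from the text, and the function name and test data are hypothetical.

```python
import itertools
import numpy as np

def projected_vector_field(f, G, A, tol=1e-10):
    """Solve  min_v (v - f)^T G (v - f)  subject to  A v <= 0.

    f : value f(x) of the vector field at x
    G : symmetric positive definite matrix of the metric at x
    A : Jacobian of the active constraints at x (full row rank assumed)
    """
    m = A.shape[0]
    for size in range(m + 1):
        for subset in itertools.combinations(range(m), size):
            S = list(subset)
            if S:
                A_S = A[S]
                GinvAT = np.linalg.solve(G, A_S.T)            # G^{-1} A_S^T
                mu = np.linalg.solve(A_S @ GinvAT, A_S @ f)   # multipliers
                v = f - GinvAT @ mu                           # A_S v = 0
            else:
                mu = np.zeros(0)
                v = f.copy()
            # KKT conditions: nonnegative multipliers and feasibility.
            if np.all(mu >= -tol) and np.all(A @ v <= tol):
                return v
    raise RuntimeError("no KKT point found; check the rank assumption on A")

# Hypothetical data: X = {x | -x_2 <= 0} (upper half-plane), x on the boundary.
A = np.array([[0.0, -1.0]])
f = np.array([1.0, -1.0])
v_euclidean = projected_vector_field(f, np.eye(2), A)
v_oblique = projected_vector_field(f, np.array([[2.0, 1.0], [1.0, 2.0]]), A)
```

In this example the Euclidean metric yields the orthogonal projection (1, 0), whereas the non-diagonal metric yields (0.5, 0): both vectors are tangent to the boundary, but the oblique projection changes the speed along it.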
Remark 3.3. The class of systems (3.2) can be generalized to f being set-valued, i.e., f : R n ⇒ R n . This avenue has been explored in [5,7,16,26], albeit only for g Euclidean and X Clarke regular. In order not to overload our contributions with technicalities we assume that f is single-valued, although an extension is possible.
As the following example shows, Carathéodory solutions to (3.2) can fail to exist unless various regularity assumptions on X , f and g hold. Hence, in the next section we propose the use of Krasovskii solutions, which exist in more general settings. Furthermore, we will show that the Krasovskii solutions reduce to Carathéodory solutions under the same assumptions that guarantee the existence of the latter.
Example 3.4 (non-existence of Carathéodory solution). Consider R 2 with the Euclidean metric, the uniform "vertical" vector field f = (0, 1), and the self-similar closed set X illustrated in Figure 3 and defined by The tangent cone at 0 is given by It is not "derivable", that is, there are no differentiable curves leaving 0 in a tangent direction and remaining in X . However, by definition there is a sequence of points in X approaching 0 in the direction of any tangent vector. At 0 the projection of f on the tangent cone is not unique as seen in Figure 3a, namely To see this, we can argue that any solution starting at 0 can neither stay at 0 nor leave 0. More precisely, on one hand the constant curve On the other hand, the points p k = ± 2 3 1+2k , 2 3 1+2k illustrated in Figure 3b are locally asymptotically stable equilibria of the system. Namely there is an equilibrium point arbitrarily close to 0. Thus, loosely speaking, any solution leaving 0 would need to converge to an equilibrium arbitrarily close to 0.

Existence of Krasovskii solutions.
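The pathology in Example 3.4 can be resolved either by placing additional assumptions on the feasible set $\mathcal{X}$ or by relaxing the notion of a solution. In this section we focus on the latter.
Definition 4.1. Given a set-valued map $F : \mathcal{X} \rightrightarrows \mathbb{R}^n$, its Krasovskii regularization is defined as the set-valued map given by
\[ K[F] : \mathcal{X} \rightrightarrows \mathbb{R}^n, \qquad x \mapsto \overline{\operatorname{co}} \limsup_{y \to x} F(y). \]
Given a set-valued map $F : \mathcal{X} \rightrightarrows \mathbb{R}^n$, an absolutely continuous function $x : [0, T) \to \mathcal{X}$ with $T > 0$ and $x(0) = x_0$ that satisfies $\dot{x} \in K[F](x)$ almost everywhere is called a Krasovskii solution of the inclusion $\dot{x} \in F(x)$, $x(0) = x_0$.
Theorem 4.2. Let $\mathcal{X} \subset \mathbb{R}^n$ be a locally compact set, $f : \mathcal{X} \to \mathbb{R}^n$ a locally bounded vector field and $g$ a locally weakly bounded metric defined on $\mathcal{X}$. Then, for any $x_0 \in \mathcal{X}$ there exists a Krasovskii solution $x : [0, T) \to \mathcal{X}$ for some $T > 0$ to
\[ \dot{x} \in \Pi^g_{\mathcal{X}} f(x), \qquad x(0) = x_0. \tag{4.1} \]
In addition, for $r > 0$ such that $U_r := \{x \in \mathcal{X} \mid \|x - x_0\| \le r\}$ is closed and $L = \max_{y \in U_r} \|K[\Pi^g_{\mathcal{X}} f](y)\|$ exists, the solution is $C^{0,1}$ and exists for $T > r/L$.
Proof. We show that the general existence result [23, Cor 1.1] (Proposition A.7) is applicable to Krasovskii regularized projected vector fields. Namely, we need to verify that $K[\Pi^g_{\mathcal{X}} f]$ is convex, compact, non-empty, upper semicontinuous (usc), and [...]. For this, we first introduce an auxiliary metric $\hat{g}$ defined as $\hat{g}(x) := g(x)/\lambda^{\max}_{g(x)}$, that is, we scale the metric at every $x \in \mathcal{X}$ by dividing it by its maximum eigenvalue at that point. This [...] for all $x \in \mathcal{X}$, and consequently $\hat{g}$ is locally weakly bounded since $g$ is locally weakly bounded.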
Given any for all v ∈ T y R n and all y ∈ X in a neighborhood of x. Combining these arguments, there exist L , L > 0 such that for every y ∈ X in a neighborhood of x it holds that Thus, since gph Π g X f | U is bounded, gph F | U is compact, and consequently F (y) is locally bounded for every y ∈ U . In particular, since F (x) is compact, and the closed convex hull of a bounded set is compact [27,Thm 1.4 For this, note that the map F is outer semicontinuous (osc) and closed by definition. Furthermore, it is locally bounded (as shown above). Consequently, by Lemma A.2, F is also usc. Hence, X f ] satisfies the conditions for Proposition A.7 to be applicable, and therefore the existence of Krasovskii solution to (4.1) is guaranteed for all x 0 ∈ X .
Besides weaker requirements for existence, the choice to consider Krasovskii solutions is also motivated by their inherent "robustness" towards perturbations, i.e., solutions to a perturbed system still approximate the solutions of the nominal systems [22,Chap 4]. In the same spirit, one can also establish results about the continuous dependence of solutions on initial values and problem parameters [20].
The existence of solutions for t → ∞ is guaranteed under the following conditions.
Corollary 4.3. If
(i) X is closed, f is bounded, and g is weakly bounded, or
(ii) X is compact, and f and g are continuous, or
(iii) X is closed, f is globally Lipschitz, and g is weakly bounded,
then for every x 0 ∈ X every Krasovskii solution to (4.1) can be extended to T → ∞.
Proof. (i) If $f$ is bounded and $g$ is weakly bounded, then the local boundedness argument of the proof of Theorem 4.2 can be applied globally, i.e., (4.3) holds for all $y \in \mathcal{X}$ for the same $L', L''$, and hence $K[\Pi^g_{\mathcal{X}} f]$ is bounded. Hence, in Theorem 4.2 the constant $L > 0$ exists for $r \to \infty$ and consequently $T \to \infty$.
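(ii) Since $f$ is continuous, it only takes bounded values on a compact set. Furthermore, continuity of $g$ implies local weak boundedness, i.e., for every $x \in \mathcal{X}$ there exist $\ell_x, L_x > 0$ such that $\ell_x < \kappa_{g(y)} < L_x$ for all $y \in \mathcal{X}$ in a neighborhood of $x$. Since $\mathcal{X}$ is compact, finitely many of these neighborhoods cover $\mathcal{X}$. Taking $\ell$ as the minimum of the corresponding $\ell_x$ and $L$ as the maximum of the corresponding $L_x$, it follows that (4.3) holds for all $y \in \mathcal{X}$. Hence, $g$ is weakly bounded. Then, the same arguments as for (i) apply.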
(iii) Assume without loss of generality that $0 \in \mathcal{X}$ (possibly after a translation). Global Lipschitz continuity of $f$ implies the existence of $L' > 0$ such that $\|f(x)\| \le L'(\|x\| + 1)$ for all $x \in \mathcal{X}$ (linear growth property [5]). To see this, recall that by the reverse triangle inequality and the definition of Lipschitz continuity there exists $L > 0$ such that
\[ \|f(x)\| - \|f(0)\| \le \|f(x) - f(0)\| \le L \|x\|, \]
and hence $L'$ can be chosen as the maximum of $L$ and $\|f(0)\|$ to yield the linear growth property.
Since g is weakly bounded, the same arguments used for (4.3) can be used to establish that there exists L > 0 such that for all x ∈ X it holds that It follows by the same arguments as in the proof of Theorem 4.2 that K[Π g X f ](x) ≤ L( x + 1) where L = L /L , i.e., the linear growth condition applies to K[Π g X f ]. Hence using standard bounds [5, p. 100], one can conclude that any Krasovskii solution to (4.1) satisfies x(t) ≤ ( x 0 + 1)e Lt . Namely, define u(t) := L( x(t) + 1) and note thatu(t) = L d dt holds for all t whereẋ(t) exists. Hence, Gronwall's inequality (for discontinuous ODEs) implies the desired bound. It immediately follows that x(t) cannot have finite escape time and therefore can be extended to t → ∞, completing the proof of (iii).  Figure 3c. It is the convex hull of five limiting vectors: the two vectors in Πf (0), the projected vector field at the arbitrarily close-by equilibria p k which is Πf (p k ) = 0 and the projected vectors at the ascending and descending slopes.
Note that the map x(t) = 0 for all t ∈ [0, ∞) is a valid solution to the differential inclusionẋ ∈ K[Πf ](x) with initial point 0 and hence a Krasovskii solution to the projected dynamical system, but not a Carathéodory solution.
. If in addition X is Clarke regular at x, then Π g X f (x) is a singleton and there isη ∈ N g x X such that the following equivalent statements hold: , v g(x) = 0 holds. This proves the first part. The second part follows from Moreau's Theorem [27,Thm 3.2.5] since T x X is convex by Clarke regularity.
Lemma 4.6. Consider X ⊂ R n , let g be a continuous metric on X and f a continuous vector field on X . Then, for . If in addition X is Clarke regular, then forη : . By definition of the outer limit, there exist sequences g(x k ) holds for every k by Lemma 4.5. Since f and g are continuous the equality holds in the limit, , and α i ≥ 0 and i α i = 1, it must hold that η ∈ N g x X , which completes the proof.
5. Stability & Projected Gradient Descent.
To illustrate how established stability concepts seamlessly apply to Krasovskii solutions of projected dynamical systems, we consider projected gradient systems, i.e., projected dynamical systems for which the vector field is the gradient of a function. Naturally, these systems are of prime interest for constrained optimization. The same techniques can also be used to assess the stability of equilibria of other vector fields ranging from saddle-point flows [12] to momentum methods [47]. In what follows, we will establish convergence and stability results that generalize our work in [24].
For simplicity, we consider systems defined on a subset of R n . Extensions to subsets of manifolds are straightforward using the results of the forthcoming Section 8.

Preliminaries and LaSalle Invariance. In this section, we only consider projected dynamical systems with complete Krasovskii solutions (see Corollary 4.3).
Assumption 5.1. For a feasible set $\mathcal{X}$, a metric $g$ and a vector field $f$, both defined on $\mathcal{X}$, we assume that for every $x_0 \in \mathcal{X}$ every Krasovskii solution $x : [0, T) \to \mathcal{X}$ of
\[ \dot{x} \in \Pi^g_{\mathcal{X}} f(x), \qquad x(0) = x_0 \tag{5.1} \]
can be extended to $T \to \infty$.
We use the usual notions for the limiting behavior of trajectories of discontinuous dynamical systems [17]. Namely, a set $S \subset \mathcal{X}$ is weakly invariant if for every $x_0 \in S$ there exists a solution starting at $x_0$ and remaining in $S$ for all $t \in [0, \infty)$. A set $S$ is strongly invariant if all solutions starting at $x_0$ for any $x_0 \in S$ remain in $S$ for all $t \in [0, \infty)$. The union of weakly (strongly) invariant subsets is again weakly (strongly) invariant, hence the notion of the largest weakly (strongly) invariant set is well-defined.
A point $\hat{x} \in \mathcal{X}$ is a limit point for a solution $x$ of (5.1) if there exists a sequence $t_k \to \infty$ such that $x(t_k) \to \hat{x}$. The set of all limit points of $x$ is called the ω-limit set and denoted by $\Omega(x)$. Note that $\Omega(x)$ is always weakly invariant. Furthermore, if $x$ is bounded, then $x(t)$ converges to $\Omega(x)$ for $t \to \infty$ [20, §12.4]. The point $\hat{x}$ is a weak equilibrium if the constant function $x(t) = \hat{x}$ for all $t \ge 0$ is a solution of the dynamical system (but possibly not the unique one). Similarly, $\hat{x}$ is a strong equilibrium if $x(t) = \hat{x}$ is the only solution starting at $\hat{x}$.
A set S is strongly stable if for every neighborhood U of S there exists another neighborhood V ⊂ U of S such that every solution starting in V remains in U for all t ∈ [0, ∞). The set S is strongly asymptotically stable if it is strongly stable and every trajectory starting in V converges to S.
Given a C 1 scalar-valued function Ψ defined on an open neighborhood of X , the set-valued Lie derivative of Ψ with respect to a map F : X ⇒ R n is defined on X as Hence, the following invariance principle for projected dynamical system is an adaptation of [8,Thm 3] in so far as it requires the dynamical system to be defined only on a (possibly closed) subset of R n . We provide a proof for completeness.
Theorem 5.2 (invariance principle for projected dynamical systems). Consider a projected dynamical system (5.1) satisfying Assumption 5.1. Furthermore, let $\Psi : V \to \mathbb{R}$ be a $C^1$ function defined on an open neighborhood $V$ of $\mathcal{X}$ such that for every $\ell \in \mathbb{R}$ the set $S_\ell := \{x \mid \Psi(x) \le \ell\} \cap \mathcal{X}$ is compact. If $\max \mathcal{L}_{\Pi^g_{\mathcal{X}} f} \Psi(x) \le 0$ for all $x \in \mathcal{X}$, then every solution to (5.1) starting at $x_0 \in S_\ell$ will converge to the largest weakly invariant subset of $\operatorname{cl}\{x \in V \mid 0 \in \mathcal{L}_{\Pi^g_{\mathcal{X}} f} \Psi(x)\} \cap S_\ell$.
Proof. First, we verify that if $x_0 \in S_\ell$, then any solution $x$ of (5.1) remains in $S_\ell$, i.e., $S_\ell$ is strongly invariant and $x$ is bounded (since $S_\ell$ is compact). Clearly, by definition $x(t) \in \mathcal{X}$ for all $t$. Further, assume that there exists $\tau$ such that $x(\tau) \notin \{x \mid \Psi(x) \le \ell\}$. This, however, contradicts the fact that $\Psi$ and $x$ are continuous and $\max \mathcal{L}_{\Pi^g_{\mathcal{X}} f} \Psi(x(t)) \le 0$ holds for almost all $t$. Namely, we must have
\[ \Psi(x(\tau)) = \Psi(x(0)) + \int_0^\tau \frac{d}{dt} (\Psi \circ x)(t)\, dt \le \Psi(x_0) \le \ell. \]
Second, we show that $\Psi$ is constant on $\Omega(x)$ where $x$ is a given trajectory. Namely, $\Psi \circ x$ is continuous and bounded below since $\Psi$ is continuous and $x$ is bounded. Further, $\Psi \circ x$ is non-increasing, and therefore $\lim_{t \to \infty} (\Psi \circ x)(t) = c$ exists. Furthermore, for any limit point $\hat{x} \in \Omega(x)$ for which $t_k \to \infty$ and $x(t_k) \to \hat{x}$ it must hold that $\Psi(\hat{x}) = c$, where $c$ depends on the trajectory $x$ in general.
Third, we prove that Ω(x) ⊂ cl Z, where Z := {x ∈ V | 0 ∈ L_{Π^g_X f} Ψ(x)}. Since Ω(x) is weakly invariant, for every x̂ ∈ Ω(x) there exists a solution x′ to (5.1) with x′(0) = x̂ and x′(t) ∈ Ω(x) for all t ∈ [0, ∞). Since Ψ ∘ x′ = c, it follows that d/dt (Ψ ∘ x′)(t) = 0 for all t and therefore D_{x′(t)}Ψ(ẋ′(t)) = 0 for almost all t. This implies that 0 ∈ L_{Π^g_X f} Ψ(x′(t)) and therefore x′(t) ∈ Z for almost all t. Taking a sequence t_k → 0 such that x′(t_k) ∈ Z, and noting that x′(t_k) → x̂, shows that x̂ ∈ cl Z.
Finally, recall that Ω(x) is weakly invariant for every solution x of (5.1), and x converges to Ω(x) since x is bounded. Hence, every solution converges to the union of all ω-limit sets, and hence to the largest weakly invariant subset of cl Z.
Remark 5.3. The function Ψ needs to be defined on a neighborhood of X solely to guarantee that its derivative is well-defined everywhere on X. For convenience, we thus depart slightly from our principle that projected dynamical systems need only be defined on the feasible set X. This minor limitation can be avoided by resorting to more general differentiability concepts, e.g., along the same lines as in [8].

Stability of Projected Gradient Descent.
We turn to the specific case of projected gradient descent. Given a C^1 potential function Ψ : V → R defined on an open set V, we define the gradient of Ψ at x ∈ V with respect to a metric g as the unique element grad_g Ψ(x) ∈ T_x R^n that satisfies
⟨grad_g Ψ(x), w⟩_{g(x)} = D_x Ψ(w) for all w ∈ T_x R^n.
In matrix notation we may equivalently write grad_g Ψ(x) = G^{-1}(x) ∇Ψ(x)^T. Hence, in the following we consider projected gradient systems of the form
ẋ ∈ Π^g_X[−grad_g Ψ](x). (5.2)
Such systems serve to find local solutions to the optimization problem: minimize Ψ(x) subject to x ∈ X.
It is natural, but important to note, that in general the metric that defines the gradient has to be the same metric that defines the projection.
We use Theorem 5.2 to derive the following stability result for trajectories of (5.2).
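The matrix form of the gradient can be checked numerically. The following is a minimal sketch, assuming a hypothetical potential Ψ(x) = ½‖x‖² and a hypothetical constant metric matrix G (neither is taken from the paper); it verifies the defining identity ⟨grad_g Ψ(x), w⟩_{g(x)} = D_x Ψ(w) on a few random directions w.

```python
import numpy as np

def metric_gradient(G, dPsi):
    """Solve G @ grad = dPsi, i.e. grad_g Psi = G^{-1} dPsi^T for an SPD matrix G."""
    return np.linalg.solve(G, dPsi)

# Hypothetical data (not from the paper): Psi(x) = 0.5 * ||x||^2 at x = (1, 2).
x = np.array([1.0, 2.0])
dPsi = x.copy()                          # Euclidean derivative of Psi at x
G = np.array([[2.0, 0.5], [0.5, 1.0]])   # assumed metric matrix G(x), symmetric positive definite

grad_g = metric_gradient(G, dPsi)

# Defining identity: <grad_g Psi, w>_g = D_x Psi(w) for every direction w.
rng = np.random.default_rng(0)
W = rng.standard_normal((5, 2))          # five random directions w (rows)
lhs = W @ (G @ grad_g)                   # w^T G grad_g
rhs = W @ dPsi                           # dPsi . w
print(np.allclose(lhs, rhs))
```

In this sketch grad_g differs from the Euclidean gradient, which illustrates why the gradient and the projection should be taken with respect to the same metric.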
Proposition 5.4. Consider X ⊂ R^n, a metric g defined on X, and a C^1 function Ψ : V → R defined on a neighborhood V of X such that for every ℓ ∈ R the set S_ℓ := {x | Ψ(x) ≤ ℓ} ∩ X is compact. Let Assumption 5.1 be satisfied for the system (5.2). Then, every complete Krasovskii solution of (5.2) converges to the set of weak equilibrium points.
Proof. Let F(x) := K[Π^g_X(−grad_g Ψ)](x). In order to apply Theorem 5.2, we first need to show that max L_F Ψ(x) ≤ 0 for all x ∈ X. For this, we first note that for every a ∈ L_F Ψ(x) there exists w ∈ F(x) such that, by definition of the gradient,
a = D_x Ψ(w) = ⟨grad_g Ψ(x), w⟩_{g(x)}. (5.3)
Using Lemma 4.6, we have ⟨grad_g Ψ(x), w⟩_{g(x)} ≤ 0 for any w ∈ F(x), and consequently max L_F Ψ(x) ≤ 0.
Finally, we need to show that 0 ∈ L_F Ψ(x) implies that 0 ∈ F(x), and therefore x is a weak equilibrium point. For this, note that according to (5.3), 0 ∈ L_F Ψ(x) is equivalent to ⟨grad_g Ψ(x), w⟩_{g(x)} = 0 for some w ∈ F(x). Using Lemma 4.6, this implies that either grad_g Ψ(x) = 0 or w = 0. Both imply that 0 ∈ F(x). Finally, from 0 ∈ F(x) it follows that x is a weak equilibrium since the constant trajectory starting at x is a solution to ẋ ∈ F(x).
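The convergence statement can be illustrated on a toy problem. The following is a minimal sketch, assuming the Euclidean metric, the hypothetical feasible set X = {x ∈ R² | x ≥ 0}, and the hypothetical potential Ψ(x) = ½‖x − c‖²; the continuous-time flow is replaced by a clipped forward Euler discretization, which is an approximation chosen here for illustration and not a scheme from the paper.

```python
import numpy as np

def project_tangent_orthant(x, v, tol=1e-12):
    """Euclidean projection of v onto the tangent cone of {x >= 0} at x."""
    w = v.copy()
    active = x <= tol                        # coordinates sitting on the boundary
    w[active] = np.maximum(w[active], 0.0)   # remove outward-pointing components
    return w

def simulate(x0, c, h=1e-2, steps=2000):
    """Clipped forward Euler for x' = Pi_X[-grad Psi](x) with Psi(x) = 0.5*||x - c||^2."""
    x = np.array(x0, dtype=float)
    values = []
    for _ in range(steps):
        values.append(0.5 * np.sum((x - c) ** 2))
        v = project_tangent_orthant(x, -(x - c))
        x = np.maximum(x + h * v, 0.0)       # clipping keeps the iterate feasible
    return x, np.array(values)

c = np.array([1.0, -1.0])                    # unconstrained minimizer lies outside X
x_final, values = simulate([2.0, 2.0], c)
print(x_final)                               # approaches the constrained minimizer (1, 0)
print(bool(np.all(np.diff(values) <= 1e-12)))  # Psi is non-increasing along the iterates
```

In this sketch the second coordinate reaches the boundary and stays there, because the projected vector field vanishes in that direction, while Ψ decreases monotonically along the iterates.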
It is not a priori clear whether equilibria of (5.2) are minimizers of Ψ in X . Hence, the following result connects the two concepts.
Theorem 5.5 (stability of minimizers for projected gradient flows). Let X, g and Ψ be defined as in Proposition 5.4 and let Assumption 5.1 be satisfied. In addition, assume that X has a non-empty interior. Then, the following statements hold: (i) If x̂ ∈ X is a strongly asymptotically stable equilibrium of (5.2), then it is a strict local minimum of Ψ on X. (ii) If x̂ ∈ X is a strict local minimum of Ψ on X, then it is a strongly stable equilibrium of (5.2).
It may seem plausible that strict minimizers are strongly asymptotically stable. This, however, is not true in general (even in the unconstrained case) as the counterexample in [1] shows. Similarly, minimizers are not guaranteed to be stable and stable equilibria are not in general minimizers. This can only be guaranteed under additional assumptions, e.g., minimizers being isolated [24] or Ψ being analytic (in the unconstrained case [1]).
Proof. To show (i), let V ⊂ X be a neighborhood of x̂ such that any solution x(t) of (5.2) with x_0 ∈ V converges to x̂. Since Ψ is C^1 and x is absolutely continuous, Ψ ∘ x is absolutely continuous, and we may write
Ψ(x(t)) = Ψ(x_0) + ∫_0^t D_{x(τ)}Ψ(ẋ(τ)) dτ.
Since D_{x(t)}Ψ(ẋ(t)) ≤ 0 almost everywhere, it follows that ∫_0^{+∞} D_{x(t)}Ψ(ẋ(t)) dt ≤ 0 and hence Ψ(x̂) ≤ Ψ(x(t)) ≤ Ψ(x_0) for all t ≥ 0. Since this reasoning applies to all x_0 in the region of attraction of x̂, it follows that x̂ is a local minimizer of Ψ.
To see that x̂ is a strict minimizer, assume for the sake of contradiction that for some x ≠ x̂ in the region of attraction U of x̂ it holds that Ψ(x) ≤ Ψ(x̂). Every solution y(t) to (5.2) with y(0) = x nevertheless converges to x̂ by assumption. Therefore, it must hold that ∫_0^{+∞} D_{y(t)}Ψ(ẏ(t)) dt = 0, and since D_{y(t)}Ψ(ẏ(t)) ≤ 0, it follows that D_{y(t)}Ψ(ẏ(t)) = 0 for almost all t ≥ 0. But as a consequence of Proposition 5.4, all points x with 0 ∈ L_F Ψ(x) are weak equilibrium points; this holds in particular for the initial point x. Consequently, x̂ cannot be strongly asymptotically stable in the neighborhood U.
For (ii), note that since X has non-empty interior, every (relative) neighborhood of a point x ∈ X has non-empty interior. Hence, consider a neighborhood U ⊂ X of x̂, and let U′ ⊆ U be a compact neighborhood of x̂ in which x̂ is a strict minimizer. Since X has non-empty interior, it follows that U′ has non-empty interior. Next, we construct a neighborhood V ⊂ U′ such that all trajectories starting in V remain in U′.
Let α be such that Ψ(x̂) < α < min_{x∈∂U′} Ψ(x), where ∂U′ is the boundary of U′. Define V := {x ∈ U′ | Ψ(x) ≤ α} ⊆ U′, which has a non-empty interior because Ψ(x̂) < α. Since for any trajectory we have D_{x(τ)}Ψ(ẋ(τ)) ≤ 0, we conclude that V is strongly invariant. Consequently, every trajectory starting in V remains in U, thus establishing strong stability.
Example 5.6 (Constrained Newton Flow). Let X ⊂ R^n be closed, and let Ψ : R^n → R be strongly convex, globally Lipschitz continuous, and twice differentiable. In particular, the Hessian of Ψ (denoted by ∇²Ψ) is continuous and its eigenvalues are bounded from below and above. Hence, we may use ∇²Ψ to define the weakly bounded metric ⟨u, v⟩_{ψ(x)} := u^T ∇²Ψ(x) v for u, v ∈ T_x R^n. The projected gradient flow
ẋ ∈ Π^ψ_X[−grad_ψ Ψ](x),
where grad_ψ Ψ(x) = (∇²Ψ(x))^{-1} ∇Ψ(x)^T, is a constrained form of a Newton flow, i.e., the continuous-time limit of the well-known Newton method for optimization.
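To see why the Hessian metric yields a Newton flow, consider the unconstrained case X = R^n, in which the projection is the identity. The following is a minimal sketch, assuming a hypothetical quadratic potential Ψ(x) = ½ xᵀQx − bᵀx with Q and b chosen for illustration. In this special case grad_ψ Ψ(x) = x − x*, so the flow contracts toward the minimizer x* at unit rate regardless of the conditioning of Q.

```python
import numpy as np

# Hypothetical quadratic potential Psi(x) = 0.5 x^T Q x - b^T x (Q symmetric positive definite).
Q = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_star = np.linalg.solve(Q, b)           # unconstrained minimizer

def newton_direction(x):
    """-grad_psi Psi(x) = -(Hessian)^{-1} (Q x - b); the Hessian of Psi is Q."""
    return -np.linalg.solve(Q, Q @ x - b)

x0 = np.array([3.0, -2.0])

# Forward Euler integration of x' = -grad_psi Psi(x) up to t = 1.
h, steps = 1e-3, 1000
x = x0.copy()
for _ in range(steps):
    x = x + h * newton_direction(x)

# Closed-form solution of x' = -(x - x_star) at t = 1.
exact = x_star + np.exp(-1.0) * (x0 - x_star)
print(np.linalg.norm(x - exact))         # small discretization error
```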

Equivalence of Krasovskii and Carathéodory Solutions.
In this section we study the relation between Carathéodory and Krasovskii solutions. In particular, we show that the solutions are equivalent if the metric is continuous and the feasible domain is Clarke regular. Further, we establish the connection to related work [5,7,16] and highlight the relation between projected gradient flows, as defined in the previous section, and continuous-time subgradient flows for Clarke regular sets [14,17].
Definition 6.1. Consider a set X ⊂ R n , a metric g and a vector field f , both defined on X . The sets of Carathéodory and Krasovskii solutions of (3.2) with initial condition x 0 ∈ X are respectively given by where a.e. means almost everywhere and C A denotes absolutely continuous functions. Since , it is clear that every Carathéodory solution of (3.2) is also a Krasovskii solution, i.e., S C (x 0 ) ⊂ S K (x 0 ) for all x 0 ∈ X . An pointwise condition for the equivalence of the solution sets is given as follows: Proposition 6.2. Given any set X , metric g and vector field ∩ T x(t) X almost everywhere, and therefore, by assumption,ẋ(t) ∈ Π g X f (x(t)). The proof of the next result follows ideas from [16]. The requirement that g and f need to be continuous deserves particular attention. Theorem 6.3 (equivalence of solution sets). If X is Clarke regular, g is a continuous metric on X , and f is continuous on X , then S C (x 0 ) = S K (x 0 ) for all x 0 ∈ X .
Proof. It suffices to show that under the proposed assumptions Proposition 6.2 is applicable. By definition of Π g where the second inequality is due to Cauchy-Schwarz, and therefore v −η g(x) ≤ f (x) − η g(x) holds for all η ∈ N g x X . However, according to Lemma 4.5 the fact that η = arg min Note that Examples 3.4 and 4.4 show a case where the conclusion of Theorem 6.3 fails to hold because X is not Clarke regular at the origin. Hence, our sufficient characterization in terms of Clarke regularity is also a sharp one.
Assuming the standard Euclidean metric, the requirements of Theorem 6.3 coincide with the requirements for the existence of Carathéodory solutions given in [16], in particular Clarke regularity of X. Uniqueness, however, requires additional assumptions as will be shown in Section 7. In particular, uniqueness of the projection Π^g_X f(x) does not imply uniqueness of the trajectory (see forthcoming Remark 7.9).
6.1. Alternative formulations and subgradient flows. Next, we point out an important connection to other works on projected dynamical systems [5,7,16] and to subgradients, which are ubiquitous in non-smooth optimization [14,41]. Namely, under Clarke regularity of the feasible set X we may define an alternative differential inclusion given by the initial value problem
ẋ ∈ f(x) − N^g_x X, x(0) = x_0. (6.1)
In [5,7,16] and other works, inclusions of the form (6.1) appear under the name of "projected dynamical systems". To avoid confusion we will refer to (6.1) as a normal cone inclusion and denote the set of its solutions with initial condition x_0 by S_N(x_0).

The next result is an adaptation of [16, Thm 2.3] to arbitrary metrics.
Corollary 6.4 (equivalence for normal cone inclusions). Consider a Clarke regular set X ⊂ R n , a continuous vector field f , and a continuous metric g, both defined on X . Then, S N (x 0 ) = S C (x 0 ) holds for systems of the form (3.2) and (6.1), and for all x 0 ∈ X .
In short, any solution to (6.1) is a Carathéodory solution of (3.2) and vice versa. However, Corollary 6.4 makes no statement about existence of solutions.
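The relation between the projected vector field and the normal cone can be made concrete in the simplest case. The following is a minimal sketch, assuming a hypothetical half-space X = {x | aᵀx ≤ 0}, a boundary point x, and a hypothetical metric matrix G at x. Under these assumptions the tangent cone is T_x X = {v | aᵀv ≤ 0}, the normal cone with respect to g is spanned by G⁻¹a, and the projection decomposes as Π f = f − η with η normal and g-orthogonal to Π f.

```python
import numpy as np

def project_halfspace_tangent(G, a, f):
    """g-projection of f onto {v | a^T v <= 0}; returns (Pi f, eta) with Pi f = f - eta."""
    slack = a @ f
    if slack <= 0.0:                     # f already points into the tangent cone
        return f.copy(), np.zeros_like(f)
    Ginv_a = np.linalg.solve(G, a)       # generator of the normal cone w.r.t. g
    eta = (slack / (a @ Ginv_a)) * Ginv_a
    return f - eta, eta

# Hypothetical data, chosen for illustration only.
G = np.array([[2.0, 0.5], [0.5, 1.0]])   # assumed metric matrix at x
a = np.array([1.0, 0.0])                 # outward normal direction of the half-space
f = np.array([1.0, 1.0])                 # vector field value pointing outward

pf, eta = project_halfspace_tangent(G, a, f)
print(pf, eta)
print(a @ pf)                            # the projected vector is tangent to the boundary
print(pf @ G @ eta)                      # <f - eta, eta>_g vanishes
```

Note that η is not parallel to a unless G is a multiple of the identity, so the projection direction is oblique in Euclidean terms.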
Consequently, ⟨f(x(t)) − η(x(t)), η(x(t))⟩_{g(x(t))} = 0, and using Lemma 4.5 it follows that ẋ(t) = Π^g_X f(x(t)).
Remark 6.5. Defining normal cone inclusions of the form (6.1) for a set X that is not Clarke regular is possible but technical, since one would need to distinguish between different types of normal cones (Remark 2.9). Furthermore, depending on the choice of normal cone, the resulting set of solutions can be overly relaxed or too restrictive.
Assuming that f is the gradient field of a potential function and X is Clarke regular, we can establish the connection between projected gradients and so-called subgradients [14,17]. For this, recall that Ψ : Definition 6.6. Let g be a metric on V where V ⊂ R n is open, and let Ψ : In particular, if Ψ is differentiable at x, then ∂Ψ(x) = {grad g Ψ(x)}. Further, if X ⊂ V is Clarke regular and I X : V → R denotes its indicator function, then ∂I X (x) = N g x X . Note that in contrast to standard definitions, we use the straightforward generalization to non-Euclidean metrics.
The next result is a direct combination of [41,Ex 8.14] and [41,Cor 10.9].
Proposition 6.7. Let Ψ̃ := Ψ + I_X where Ψ : V → R is a C^1 function and I_X is the indicator function of a Clarke regular set X ⊂ V where V ⊂ R^n is open. Then, for all x ∈ X one has
∂Ψ̃(x) = grad_g Ψ(x) + N^g_x X.
It follows immediately from Corollary 6.4 that under the appropriate assumptions trajectories of projected gradient flows are also solutions to subgradient flows.
Corollary 6.8 (equivalence with subgradient flows). Let X be Clarke regular, let g be a continuous metric on X, and let Ψ be a C^1 potential function on an open neighborhood of X. Then, for any x_0 ∈ X there exists a Carathéodory solution x : [0, T) → X to the subgradient flow
ẋ ∈ −∂Ψ̃(x), x(0) = x_0.
Furthermore, x is a solution if and only if it is a Carathéodory (and Krasovskii) solution to the projected gradient descent (5.2).
7. Prox-regularity and Uniqueness of Solutions. Next, we define prox-regular sets on non-Euclidean spaces and show their significance for the uniqueness of solutions of projected dynamical systems. In the Euclidean setting prox-regularity is well-known to be a sufficient condition on the feasible domain X for uniqueness [16].
The key issue of this section is thus to generalize the definition of prox-regular sets and identify the requirements that lead to unique solutions. By doing so, we also show that prox-regularity of a set is independent of the choice of metric. In the subsequent section this allows us to state that prox-regularity is preserved under C^{1,1} coordinate transformations and hence well-defined on C^{1,1} manifolds.
7.1. Prox-regularity on non-Euclidean spaces. For illustration, we first recall and discuss one definition of prox-regularity in Euclidean space. Our treatment of the topic is deliberately kept limited. For a more general overview see [3,40].
Definition 7.1. A Clarke regular set X ⊂ R^n is prox-regular at x ∈ X if there is L > 0 such that for every z, y ∈ X in a neighborhood of x and η ∈ N_y X we have
⟨η, z − y⟩ ≤ L ‖η‖ ‖z − y‖².
The set X is prox-regular if it is prox-regular at every x ∈ X.
One of the key features of a prox-regular set X is that for every point in a neighborhood of X there exists a unique projection on the set [3, Def 2.1, Thm 2.2].
Example 7.2 (Prox-regularity in Euclidean spaces). Consider the parametric set
X_α := {(x_1, x_2) | |x_2| ≥ max{0, x_1}^α},
where 0 < α < 1, which is illustrated in Figure 4. For α ≤ 0.5 the set is prox-regular everywhere. In particular, for the origin, a ball with non-zero radius can be placed tangentially such that it only intersects the set at 0. For α > 0.5, on the other hand, the set is not prox-regular at the origin. In fact, all points on the positive axis have a non-unique projection on X_α as illustrated in Figure 4c.
Definition 7.1 cannot be directly generalized to non-Euclidean spaces since it requires the distance ‖y − x‖ between two points in X. Hence, in [28] prox-regularity is defined on smooth (i.e., C^∞) Riemannian manifolds resorting to geodesic distances. For our purposes we can avoid the notational complexity of Riemannian geometry, yet preserve a higher degree of generality. Thus, we introduce the following definitions.
Definition 7.3. Given a Clarke regular set X ⊂ R^n and a metric g, a normal vector η ∈ N^g_x X at x ∈ X is L-proximal with respect to g for L ≥ 0 if for all y ∈ X in a neighborhood of x we have
⟨η, y − x⟩_{g(x)} ≤ L ‖η‖_{g(x)} ‖y − x‖²_{g(x)}. (7.3)
The cone of all L-proximal normal vectors at x with respect to g is denoted by N̄^{g,L}_x X.
A crucial detail in (7.3) is the fact that g is evaluated at x and is used as an inner product on R^n (which is a slight abuse of notation). In other words, we exploit the canonical isomorphism between R^n and T_x R^n to use g(x) as an inner product on R^n.
Definition 7.4. A Clarke regular set X ⊂ R^n with a metric g is L-prox-regular at x ∈ X with respect to g if N̄^{g,L}_y X = N^g_y X for all y ∈ X in a neighborhood of x. The set X is prox-regular with respect to g if for every x ∈ X there exists L > 0 such that X is L-prox-regular at x with respect to g.
Note that if g is the Euclidean metric, Definition 7.4 reduces to Definition 7.1. The following result shows that prox-regularity is in fact independent of the metric. This is the first step towards a coordinate-free definition of prox-regularity.
Proposition 7.5. Let X ⊂ R n be Clarke regular. If X is prox-regular with respect to a C 0 metric g, then it is prox-regular with respect to any other C 0 metric.
In particular if X is prox-regular with respect to the Euclidean metric, i.e., according to Definition 7.1, then it is prox-regular in any other continuous metric on R n . For the proof of Proposition 7.5 we require the following lemma.
Lemma 7.6. Let X ⊂ R n be Clarke regular and consider to metrics g, g defined on X . If for x ∈ X there is L > 0 such thatN g,L Proof. First note that for every x ∈ X the two metrics g and g induce a bijection between N g x X and N g x X . Namely, we define q : T x R n → T x R n as the unique element q(v) that satisfies by v, w g(x) = q(v), w g (x) for all w ∈ T x R n . To clarify, in matrix notation we can write v T G(x)w = q(v) T G (x)w and since G(x), G (x) are symmetric It follows that if η ∈ N g x X (hence, by definition η, w g(x) ≤ 0 for all w ∈ T x R n ), then q(η) ∈ N g x X . Furthermore, omitting the argument x, we have q(η) g = η T GG −1 Gη ≥ 1/λ max g Gη and η g = η T GG −1 Gη ≤ 1/λ min g Gη , and therefore q(η) g (x) ≥ λ min g(x) /λ max g (x) η g(x) . Hence, let η ∈ N g x X \ {0} be a L-proximal normal vector, then Finally, using the equivalence of norms, we have x X which completes the proof. Proof of Proposition 7.5. Since g and g are continuous it follows that κ g(x) and κ g (x) are continuous in x and therefore locally bounded. Given any x ∈ X and using the pointwise result in Lemma 7.6, we can choose L > 0 such that (7.4) is satisfied for all y ∈ X in a neighborhood of x.
We conclude this section by showing that feasible domains defined by C 1,1 constraint functions are prox-regular under the usual constraint qualifications.
Example 7.7 (prox-regularity of constraint-defined sets). As in Examples 2.4 and 2.10 let h : R n → R m be C 1 and ∇h(x) have full rank for all x and consider X := {x | h(x) ≤ 0}. If in addition, h is a C 1,1 map, then X := {x | h(x) ≤ 0} is prox-regular with respect to any C 0 metric g on R n .
To see this, we consider the Euclidean case without loss of generality as a consequence of Proposition 7.5. We first analyze the sets X i := {x | h i (x) ≤ 0} and then show prox-regularity of their intersection. For this, we only need to consider points x ∈ ∂X i on the boundary of X i since for allx / ∈ ∂X i we have NxX i = {0} and proxregularity is trivially satisfied. Hence, using the Descent Lemma A.6, for all z, y ∈ R n in a neighborhood of x and all i = 1, . . . , m there exists L i > 0 such that In particular, for z ∈ X i (i.e., h i (z) ≤ 0) and y ∈ ∂X i (i.e., h i (y) = 0) in a neighborhood of x we have For the set X = m i=1 X i recall from Example 2.10 that for x ∈ X we have Consider z ∈ X and y ∈ ∂X in a small enough neighborhood of x. Note that y ∈ ∂X implies that y ∈ ∂X i for all i ∈ I(y). Using (7.5), for all η ∈ N y X with η = i∈I(y) α i ∇h T i (y)/ ∇h i (y) we have η, z − y = i∈I(y) The first inequality can be shown by taking the square and proceeding by induction.
Since the final bound is with respect to all h_i, it is continuous in y in a neighborhood of x. Consequently, we can choose L̄ such that L̄ ≥ L(y) for all y ∈ X in a neighborhood of x, and therefore ⟨η, z − y⟩ ≤ L̄ ‖η‖ ‖z − y‖² for z ∈ X in a neighborhood of y. This proves L̄-prox-regularity at x, and prox-regularity follows accordingly.

7.2. Uniqueness of solutions to projected dynamical systems. Before formulating our main uniqueness result, we present an example that illustrates the impact of prox-regularity on the uniqueness of solutions.
Example 7.8 (prox-regularity and uniqueness of solutions). We consider the set X α := {(x 1 , x 2 ) | |x 2 | ≥ max{0, x 1 } α } for 0 < α < 1, as in Example 7.2. We study how the value of α affects the uniqueness of solutions of the projected dynamical system defined by the uniform "horizontal" vector field f (x) = (1, 0) for all x ∈ X and the initial condition x(0) = 0 as illustrated in Figure 5.
Since X_α is Clarke regular and closed, since the vector field is uniform, and since we use the Euclidean metric, the existence of Krasovskii solutions for all t ≥ 0 and their equivalence with Carathéodory solutions are guaranteed by Corollary 4.3 and Theorem 6.3, respectively. The prox-regularity of X_α at the origin is, however, only guaranteed for 0 < α ≤ 1/2 (Example 7.2). A formal analysis reveals that for 0 < α ≤ 1/2 the origin is a strong equilibrium, i.e., the constant solution x(t) = 0 is the unique solution to the projected dynamical system. For 1/2 < α < 1, however, the origin is only a weak equilibrium point. Namely, a solution may remain at 0 for an arbitrary amount of time before leaving 0 into either the upper or the lower half-plane, and thus uniqueness is not guaranteed.
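The threshold α = 1/2 can also be observed through the Euclidean projection onto X_α, which is the characterization of prox-regularity recalled before Example 7.2. The following is a minimal sketch, assuming a hypothetical test point (p, 0) with p = 0.1 on the positive axis and a simple grid search over the upper boundary branch, parametrized as (t^{1/α}, t). If the minimizing parameter is t = 0, the origin is the unique nearest point; if it is t > 0, the mirrored point on the lower branch is equally close and the projection is not unique.

```python
import numpy as np

def nearest_boundary_parameter(alpha, p, n=200001):
    """Grid-minimize the squared distance from (p, 0) to (t**(1/alpha), t) for t in [0, 1]."""
    t = np.linspace(0.0, 1.0, n)
    d2 = (t ** (1.0 / alpha) - p) ** 2 + t ** 2
    k = int(np.argmin(d2))
    return t[k], d2[k]

p = 0.1                                   # hypothetical test point (p, 0) on the positive axis
results = {alpha: nearest_boundary_parameter(alpha, p) for alpha in (0.25, 0.5, 0.75)}
for alpha, (t_min, d2_min) in results.items():
    unique = t_min == 0.0                 # t = 0 means the origin is the nearest point
    print(alpha, t_min, d2_min, "unique" if unique else "two symmetric nearest points")
```

The grid search is only a numerical probe for one test point and does not replace the formal analysis referred to in the example.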
Remark 7.9. In general, whether Π f(x_0) is a singleton is unrelated to the uniqueness of solutions starting from x_0. For instance, in Example 7.8, if α > 1/2, multiple solutions exist even though Π f(x) is a singleton. Conversely, Example 4.4 shows that even if Π f(x_0) is not a singleton, the (Krasovskii) solution starting from x_0 can be unique.
For the proof of uniqueness under prox-regularity, we require the following lemma.
Lemma 7.10. Let X be L-prox-regular at x with respect to a C 0,1 metric g. Then, there existL > 0 such that for all y ∈ X in a neighborhood of x and all η ∈ N g,L y with η g(y) = 1 we have η, x − y g(x) ≤L y − x 2 g(x) . Proof. We know that η, y − x g(y) ≤ L y − x 2 g(y) for y close enough to x because η is a L-proximal normal vector at y with respect to g. Furthermore, by the equivalence of norms there exists L > 0 sucht that η, y − x g(y) ≤ L y − x 2 g(x) . Next, we show that | η, x − y g(y) − η, x − y g(x) | ≤ M y − x 2 g(x) for some M > 0. Since L n 2 is a vector space, we may write which is a slight abuse of notation since ·, · g(y)−g(x) is not necessarily positive definite and therefore not a metric. Nevertheless, any map of the form (u, v, g) → u, w g where g ∈ L n 2 is linear in u, v and in g (e.g., (u, v, g) → u, w λg = λ u, w g for any λ ∈ R). Therefore, there exist M , M > 0 such that where · L n 2 denotes any norm on the vector space L n 2 , and the second inequality follows directly from the Lipschitz continuity of g. Hence, we can conclude that that . Next, we can show the following Lipschitz-type property of projected vector fields.
Proposition 7.11. Let f be a C 0,1 field on X . If g is a C 0,1 metric and X is prox-regular, then for every x ∈ X there exists L > 0 such that for all y ∈ X in a neighborhood of x we have . Proof. As a consequence of Lemma 4.5, we can write where η y ∈ N g y X =N g,L y X and η x ∈ N g x X =N g,L x for some L > 0. For the first term, we get . by applying Cauchy-Schwarz. Since f is Lipschitz and using the equivalence of norms there exists L a > 0 such that . For the second and third term in (7.6) we have g(x) η x g(x) by Lemma 7.10 and the definition of a L-proximal normal vector, respectively. By Lemma 4.5 we know that η y g(y) ≤ f (y) g(y) and η x g(x) ≤ f (x) g(x) . Since g and f are continuous we can choose M > 0 such that f (z) g(z) ≤ M for all z ∈ X in a neighborhood of x. Therefore, (7.6) can be bounded by which completes the proof.
Hence, we can state our main result on the uniqueness of solutions. In this context, uniqueness is understood in the sense that any two solutions are equal on the interval on which they are both defined.
Theorem 7.12 (uniqueness of solutions). Let f be a C 0,1 vector field on X . If g is a C 0,1 metric and X is prox-regular, then for every x 0 ∈ X there exists T > 0 such that the initial value problemẋ ∈ Π g X f (x) with x(0) = x 0 has a unique Carathéodory solution x : [0, T ) → X (which is also the unique Krasovskii solution).
Proof of Theorem 7.12. The proof follows standard contraction ideas [20]. Let x(t) and y(t) be two solutions of the same initial value problem ẋ ∈ Π^g_X f(x) with x(0) = x_0 ∈ X, both defined on a non-empty interval [0, T). Using Proposition 7.11, there exist M > 0 and a neighborhood V of x_0 such that
d/dt (1/2 ‖y(t) − x(t)‖²_{g(x_0)}) = ⟨ẏ(t) − ẋ(t), y(t) − x(t)⟩_{g(x_0)} ≤ M ‖y(t) − x(t)‖²_{g(x_0)} (7.7)
for almost all t in some non-empty subinterval [0, T′) ⊂ [0, T) on which x(t) and y(t) remain in V. Next, consider the non-negative, absolutely continuous function q : [0, T′) → R defined as q(t) := 1/2 ‖y(t) − x(t)‖²_{g(x_0)} e^{−2Mt}. Note that q(0) = 0 since y(0) = x(0). Furthermore, using (7.7) and applying the product rule we have
d/dt q(t) = (d/dt 1/2 ‖y(t) − x(t)‖²_{g(x_0)}) e^{−2Mt} − M ‖y(t) − x(t)‖²_{g(x_0)} e^{−2Mt} ≤ 0
for almost all t ≥ 0. Since q is non-negative and absolutely continuous with q(0) = 0, we conclude that q vanishes identically and hence x(t) = y(t) for all t ∈ [0, T′), thus finishing the proof of uniqueness.
Combining all the insights so far, we arrive at the following ready-to-use result.
Example 7.13 (Existence and uniqueness on constraint-defined sets). As in Example 7.7 consider a set X := {x ∈ R^n | h(x) ≤ 0} where h : R^n → R^m is of class C^{1,1} and ∇h(x) has full rank for all x ∈ R^n. Further, consider a globally Lipschitz continuous vector field f : R^n → R^n. Then, for every x_0 ∈ X there exists a unique and complete Carathéodory solution x : [0, ∞) → X to the initial value problem ẋ = Π^g_X f(x) with x(0) = x_0, where g is any weakly bounded C^{0,1} metric on X.
8. Existence and Uniqueness on low-regularity Riemannian Manifolds. The major appeal of Theorems 4.2, 6.3, and 7.12 is their geometric nature. Namely, as we will show next, their assumptions are preserved by sufficiently regular coordinate transformations which allows us to give a coordinate-free definition of projected dynamical system on manifolds with minimal degree of differentiability.
Recall that for open sets V, W ⊂ R^n a map Φ : V → W is a C^k diffeomorphism if it is a C^k bijection with a C^k inverse where, for our purposes, C^k stands for either C^1 or C^{1,1}. We employ the usual definition of a C^k manifold as a locally Euclidean, second countable Hausdorff space endowed with a C^k differentiable structure. In particular, for a point p on an n-dimensional manifold M there exists a chart (U, φ) where U ⊂ M is open and φ : U → R^n is a homeomorphism onto its image. For any two charts (U, φ) and (V, ψ) with U ∩ V ≠ ∅, the transition map ψ ∘ φ^{-1} : φ(U ∩ V) → ψ(U ∩ V) is a C^k diffeomorphism.
A C^k (Riemannian) metric g is a map that assigns to every point p ∈ M an inner product on the tangent space T_p M such that in local coordinates (U, φ) the metric g(φ^{-1}(x)) is a C^k metric for x ∈ φ(U) according to Definition 2.5. A vector field defined on M is locally bounded at x if it is locally bounded in any local coordinate domain for x. Similarly, a metric is locally weakly bounded at x if it is locally weakly bounded in local coordinates. Given a C^k manifold M with k ≥ 1, a curve γ : [0, T) → M is absolutely continuous if it is absolutely continuous in any chart domain where it is defined.
The next lemma shows that a C^1 diffeomorphism maps (Clarke) tangent cones to (Clarke) tangent cones. Hence, Clarke regularity is preserved by C^1 diffeomorphisms.
Lemma 8.1. Let V, W ⊂ R^n be open and consider a C^1 diffeomorphism Φ : V → W. Given X ⊂ R^n and X̃ := X ∩ V, for every x ∈ X̃ it holds that
T_{Φ(x)} Φ(X̃) = D_x Φ(T_x X̃), (8.1)
T^C_{Φ(x)} Φ(X̃) = D_x Φ(T^C_x X̃). (8.2)
Hence, Φ(X̃) is Clarke regular at Φ(x) if and only if X̃ is Clarke regular at x ∈ X̃.
Proof. We only need to show that T Φ(x) Φ(X ) ⊂ D x Φ(T xX ). Since Φ is a C 1 diffeomorphism the other direction follows by applying the same arguments to Φ −1 .
Let v ∈ T xX . Then, by definition there exist x k → x with x k ∈X and δ k → 0 + such that ( According to the definition of the derivative of Φ, for the same sequence {x k } we have lim Since the limit of the element-wise product of convergent sequences equals the product of its limits we can write which, using the fact that D x Φ is linear, simplifies to This implies that (Φ(x k )−Φ(x))/δ k → D x Φ(v), and hence D x Φ(v) is a tangent vector of Φ(X ) at Φ(x). This proves (8.1).
To show (8.2) we use (8.1) together with the definition of the Clarke tangent cone as the inner limit of the surrounding tangent cones (Definition 2.2). We can write . Again, since Φ is a diffeomorphism, the opposite inclusion holds by applying the same argument to Φ −1 . This shows (8.2) and completes the proof.
Hence, the notions of (Clarke) tangent cone and Clarke regularity are independent of the coordinate representation on a C 1 manifold.
Definition 8.2. Let M be a C^1 manifold with a metric g and consider a subset X ⊂ M. The (Clarke) tangent cone of X at x ∈ X is, in local coordinates, the (Clarke) tangent cone of φ(X ∩ U) at φ(x) for any coordinate chart (U, φ) defined at x. The set X is Clarke regular at x ∈ X if it is Clarke regular in any local coordinate domain defined at x.
The next key result establishes that solutions of projected dynamical systems remain solutions of projected dynamical systems under C^1 coordinate transformations.
Proposition 8.3. Let V, W ⊂ R^n be open and consider a C^1 diffeomorphism Φ : V → W. Let X ⊂ R^n be locally compact and X̃ := X ∩ V. Further, let g be a locally weakly bounded metric on W and let Φ*g denote the pull-back metric along Φ, i.e.,
⟨v, w⟩_{Φ*g(x)} := ⟨D_x Φ(v), D_x Φ(w)⟩_{g(Φ(x))} (8.3)
for all x ∈ V and v, w ∈ T_x R^n. Further, let f : X̃ → R^n be a locally bounded vector field. If x : [0, T) → X̃ for some T > 0 is a Krasovskii (respectively, Carathéodory) solution to the initial value problem
ẋ ∈ Π^{Φ*g}_{X̃} f(x), x(0) = x_0, (8.4)
then Φ ∘ x is a Krasovskii (respectively, Carathéodory) solution to
ẏ ∈ Π^g_{Φ(X̃)} f̃(y), y(0) = y_0, (8.5)
where y_0 := Φ(x_0) and f̃(y) := D_{Φ^{-1}(y)} Φ(f(Φ^{-1}(y))) is the pushforward vector field of f along Φ^{-1}.
Proof. First, note that since x is absolutely continuous and Φ is differentiable, Φ • x is absolutely continuous [42,Ex 6.44]. Second, it holds that y(t) ∈ Φ(X ) for all t ∈ [0, T ). Third, using (8.1) we can write for every x ∈X and y := Φ(x) that where for the last equality we introduce the transformation w : Hence, using the definition of the pullback metric (8.3) we continue with Consequently, if x(·) is a Carathéodory solution of (8.4) and henceẋ(t) ∈ Π Φ * g X f (x(t)) holds almost everywhere, then Φ • x(·) satisfies almost everywhere and hence Φ • x(·) is a Carathéodory solution to (8.5).
It remains to prove the statement is also true for Krasovskii solutions. For this, we need to show that . Expanding the definition of the Krasovskii regularization we get where the last equation is due to the fact that D x Φ is continuous in x. Next, with Lemma A.1 we can write where the equation follows from the fact that D x Φ is a linear map and hence commutes with taking the convex closure. To conclude we can proceed similar to the case of Carathéodory solutions. Let x(·) be a Krasovskii solution to (8.4) and y(·) := Φ • x(·). Then, for almost all t ∈ [0, T ) and we have thaṫ for almost all t ∈ [0, T ), and thus y is a Krasovskii solution of (8.5).
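The pointwise identity behind this proof, namely that the differential of Φ maps the projection with respect to the pull-back metric to the projection of the pushforward vector field, can be checked numerically. The following is a minimal sketch, assuming a hypothetical diffeomorphism Φ(x) = (x_1, x_2 + x_1²), a hypothetical metric on the target space, and a half-space tangent cone {v | aᵀv ≤ 0} standing in for T_x X̃; none of these choices come from the paper.

```python
import numpy as np

def project(G, a, f):
    """Projection of f onto {v | a^T v <= 0} with respect to the inner product u^T G v."""
    slack = a @ f
    if slack <= 0.0:
        return f.copy()
    Ginv_a = np.linalg.solve(G, a)
    return f - (slack / (a @ Ginv_a)) * Ginv_a

# Hypothetical diffeomorphism Phi(x) = (x1, x2 + x1^2) and its Jacobian at x.
x = np.array([0.5, -1.0])
y = np.array([x[0], x[1] + x[0] ** 2])
J = np.array([[1.0, 0.0], [2.0 * x[0], 1.0]])

G = np.diag([1.0 + y[0] ** 2, 2.0])      # assumed metric g evaluated at y = Phi(x)
G_pull = J.T @ G @ J                     # matrix of the pull-back metric at x

a = np.array([1.0, 1.0])                 # tangent cone at x: {v | a^T v <= 0}
f = np.array([2.0, 1.0])                 # vector field value at x (points outward)

# Left: project in the pull-back metric, then push forward with D_x Phi.
lhs = J @ project(G_pull, a, f)
# Right: push forward the field and the cone, then project in the metric g.
a_push = np.linalg.solve(J.T, a)         # image cone: {u | (J^{-T} a)^T u <= 0}
rhs = project(G, a_push, J @ f)
print(np.allclose(lhs, rhs))
```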
Hence, Theorems 4.2 and 6.3 combined with Proposition 8.3 give rise to our main result on the existence of Krasovskii (Carathéodory) solutions on manifolds.
Theorem 8.4 (existence on manifolds). Let M be a C^1 manifold, g a locally weakly bounded Riemannian metric, X ⊂ M locally compact, and f a locally bounded vector field on X. Then for every x_0 ∈ X there exists a Krasovskii solution x : [0, T) → X for some T > 0 that solves ẋ(t) ∈ Π^g_X f(x(t)) with x(0) = x_0. Furthermore, if X is Clarke regular, and if f and g are continuous, then every Krasovskii solution is a Carathéodory solution and vice versa.
Similarly, Proposition 8.3 directly implies that other results such as Corollary 4.3 extend to C 1 manifolds. For instance, if M is compact and f and g are continuous, every initial condition admits a complete trajectory. However, to extend our uniqueness results, we require stronger conditions. Proposition 8.5. Let V, W ⊂ R n be open and Φ : V → W a C 1,1 diffeomorphism. Let X ⊂ R n be locally compact and considerX := X ∩ V . IfX is prox-regular then Φ(X ) is prox-regular.
Proof. By Proposition 7.5 it suffices to show prox-regularity with respect to a single metric on V and W respectively. Hence, let W be endowed with the Euclidean metric, and let e * denote its pullback metric on V along Φ, i.e., v, w e * (x) := D x Φ(v), D x Φ(w) . Similarly to Lemma 8.1, we show that (proximal) normal cones are preserved by C 1 coordinate transformations, i.e., for some L , L > 0 where N x ⊂X is a neighborhood of x. Since Φ is a diffeomorphism it suffices to show one direction only.
Hence, consider η ∈ N e * xX . By Definition 2.8 and using (8.1) we have We conclude that D x Φ(η) ∈ N Φ(x) Φ(X ) and (8.6) holds. For (8.7) we consider y ∈X in a neighborhood of x and η ∈N e * ,L yX such that holds for all z ∈X in a neighborhood of y. However, we need to show that for some L > 0 we have Hence, we define the C 1,1 function ψ(z) := D y Φ(η), Φ(z) and note that by linearity we have D z ψ(v) := D y Φ(η), D z Φ(v) . This enables us to apply the Desent Lemma A.6 and state that for some M > 0 it holds that This bound can be used to establish Finally note that z − y 2 ≤ L Φ(z) − Φ(y) 2 for some L since Φ −1 is Lipschitz continuous. Hence, (8.8) and therefore (8.7) holds for L = L (L + M ).
Apart from Proposition 8.5, we note that Lipschitz continuity of a metric and of vector fields is preserved under C 1,1 coordinate transformations. This allows us to generalize Theorem 7.12 to the following uniqueness result on manifolds.
Theorem 8.6 (uniqueness on manifolds). Let $M$ be a $C^{1,1}$ manifold, $g$ a $C^{0,1}$ Riemannian metric, $X \subset M$ prox-regular, and $f$ a $C^{0,1}$ vector field on $X$. Then, for every $x_0 \in X$ there exists a unique Carathéodory solution $x : [0, T) \to X$ for some $T > 0$ that solves $\dot{x}(t) \in \Pi^g_X f(x(t))$ with $x(0) = x_0$.
9. Conclusion. We have provided a holistic study of projected dynamical systems on irregular subsets of manifolds, including the model of oblique projection directions. We have carved out sharp regularity requirements on the feasible domain, vector field, metric and differentiable structure that are required for the existence, uniqueness and other properties of solution trajectories. Table 1 summarizes these results. In the process, we have established auxiliary findings, such as the fact that prox-regularity is an intrinsic property of subsets of $C^{1,1}$ manifolds and independent of the choice of Riemannian metric.
While we believe these results are of general interest in the context of discontinuous dynamical systems, they particularly provide a solid foundation for the study of continuous-time constrained optimization algorithms for nonlinear, nonconvex problems. To illustrate this point, we have included a study of the stability and convergence of Krasovskii solutions to projected gradient descent, arguably the most prototypical continuous-time constrained optimization algorithm.
For a comprehensive treatment of the following definitions and results see [7,27,39,41]. Given a sequence $\{x_k\}$ and a set $X$, the notation $x_k \xrightarrow[X]{\mathrm{sub}} x$ denotes the existence of a subsequence $\{x_{k_m}\}$ that converges to $x$ with $x_{k_m} \in X$ for all $m$. Similarly, $x_k \xrightarrow[X]{\mathrm{ev}} x$ implies that $x_k \in X$ holds eventually, i.e., for all $k$ larger than some $K$, and that $\{x_k\}$ converges to $x$. Given a sequence of sets $\{C_k\}$ in $\mathbb{R}^n$, its outer limit and inner limit are given as
$$\limsup_{k \to \infty} C_k := \{ x \mid \exists\, k_1 < k_2 < \cdots \text{ and } x_{k_m} \in C_{k_m} \text{ with } x_{k_m} \to x \}, \qquad \liminf_{k \to \infty} C_k := \{ x \mid \exists\, K \text{ and } x_k \in C_k \text{ for all } k \ge K \text{ with } x_k \to x \},$$
respectively. As a pedagogical example to distinguish between inner and outer limits, consider an alternating sequence of sets given by $C_{2m} := A$ and $C_{2m+1} := B$ with closed sets $A, B \subset \mathbb{R}^n$. Then, we have $\limsup_{k \to \infty} C_k = A \cup B$ and $\liminf_{k \to \infty} C_k = A \cap B$. On the one hand, any constant sequence $\{x_k\}$ with $x_k = c \in A \cap B$ for all $k$ satisfies the requirement, so that $c \in \liminf_{k \to \infty} C_k$. On the other hand, any sequence $\{x_k\}$ with $x_{2m} = a \in A$ for $m \in \mathbb{N}$ has a trivial (constant) subsequence converging to $a \in A$ and hence $a \in \limsup_{k \to \infty} C_k$. The following result relates the image of an outer (inner) limit to the outer (inner) limit of images of a map $f$.
For a set-valued map $F : V \rightrightarrows W$ with $V \subset \mathbb{R}^n$ and $W \subset \mathbb{R}^m$ its outer limit and inner limit at $x$ are defined respectively as
$$\limsup_{y \to x} F(y) := \bigcup_{x_k \to x} \limsup_{k \to \infty} F(x_k), \qquad \liminf_{y \to x} F(y) := \bigcap_{x_k \to x} \liminf_{k \to \infty} F(x_k).$$
The following result is a generalization of [41, Prop 6.5] to the case of a continuous metric instead of the standard Euclidean metric:
Lemma A.4. Let $X$ be Clarke regular. If the metric $g$ on $X$ is continuous, then the set-valued map $x \mapsto N^g_x X$ is outer semi-continuous.
Proof.
Consider any two sequences $x_k \to x$ with $x_k \in X$ and $\eta_k \to \eta$ with $\eta_k \in N^g_{x_k} X$. To complete the proof we need to show that $\eta \in N^g_x X$. By definition of $N^g_{x_k} X$ we have $\langle v, \eta_k \rangle_{g(x_k)} \le 0$ for all $v \in T^C_{x_k} X$. Furthermore, by continuity of $g$ we have $\langle v, \eta \rangle_{g(x)} \le 0$ for all $v \in \limsup_{x_k \to x} T^C_{x_k} X$. (Namely, we must have $\langle v_k, \eta_k \rangle_{g(x_k)} \le 0$ for every sequence $v_k \to v$ with $v_k \in T^C_{x_k} X$, hence the use of $\limsup$.) By definition of the Clarke tangent cone, we note that $\langle v, \eta \rangle_{g(x)} \le 0$ holds for all $v \in T^C_x X = \liminf_{x_k \to x} T^C_{x_k} X$, and therefore $\eta \in N^g_x X$.
Lemma A.5. Given a set $X \subset \mathbb{R}^n$, for any absolutely continuous function $x : [0, T) \to X$ with $T > 0$ it holds that $\dot{x}(t) \in T_{x(t)} X \cap -T_{x(t)} X$ almost everywhere on $[0, T)$, where $-T_{x(t)} X := \{ v \mid -v \in T_{x(t)} X \}$.
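Proof. Let $t \in [0, T)$ be such that $\dot{x}(t)$ exists. This implies that by definition
$$\dot{x}(t) = \lim_{\tau \to 0^+} \frac{x(t+\tau) - x(t)}{\tau} = \lim_{\tau \to 0^+} \frac{-x(t-\tau) + x(t)}{\tau}.$$
Thus, by choosing any sequence $\tau_k \to 0$ with $\tau_k > 0$, the sequence $\frac{x(t+\tau_k) - x(t)}{\tau_k}$ converges to a tangent vector and $\frac{-x(t-\tau_k) + x(t)}{\tau_k}$ converges to a vector in $-T_{x(t)} X$ by definition of $T_{x(t)} X$ and the fact that $x(t) \in X$ for all $t \in [0, T)$. Since both sequences converge to $\dot{x}(t)$, the claim follows.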
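The two difference quotients in this proof can be checked numerically on a simple example. The following is a minimal sketch under assumptions that are not part of the results above: we take $X = [0, \infty) \subset \mathbb{R}$ and the hypothetical trajectory $x(t) = (t - 1)^2$, which touches the boundary point $0$ at $t = 1$. There, $T_0 X = [0, \infty)$ and $T_0 X \cap -T_0 X = \{0\}$, so Lemma A.5 requires $\dot{x}(1) = 0$.

```python
def x(t):
    # Hypothetical trajectory in X = [0, inf); it touches the boundary point 0 at t = 1.
    return (t - 1.0) ** 2

def forward_quotient(t, tau):
    # (x(t + tau) - x(t)) / tau, whose limits are tangent vectors of X at x(t).
    return (x(t + tau) - x(t)) / tau

def backward_quotient(t, tau):
    # (-x(t - tau) + x(t)) / tau, whose limits lie in -T_{x(t)} X.
    return (-x(t - tau) + x(t)) / tau

t_touch = 1.0
taus = [10.0 ** (-k) for k in range(1, 7)]
forward = [forward_quotient(t_touch, tau) for tau in taus]
backward = [backward_quotient(t_touch, tau) for tau in taus]

for tau, fq, bq in zip(taus, forward, backward):
    print(f"tau={tau:.0e}  forward={fq:+.2e}  backward={bq:+.2e}")
```

In this example the forward quotients are nonnegative and the backward quotients are nonpositive, and both shrink to zero with $\tau$, which is consistent with the common limit lying in $T_0 X \cap -T_0 X = \{0\}$.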
The following is a local version of [39,Lem 1.30].
Lemma A.6 (Descent Lemma). Let $\Phi : V \to \mathbb{R}$ be a $C^{1,1}$ map where $V \subset \mathbb{R}^n$ is open. Given $x \in V$ there exists $L > 0$ such that for all $z, y \in V$ in a neighborhood of $x$ it holds that $|\Phi(z) - \Phi(y) - D_y\Phi(z - y)| \le L \|z - y\|^2$.
The following general existence and viability theorem goes back to [23]. Similar results can also be found in [6,13,22].
Proposition A.7 ([23, Cor 1.1, Rem 3]). Let $X$ be a locally compact subset of $\mathbb{R}^n$ and $F : X \rightrightarrows \mathbb{R}^n$ an usc, non-empty, convex and compact set-valued map. Then, for any $x_0 \in X$ there exists $T > 0$ and a Lipschitz continuous function $x : [0, T] \to X$ such that $x(0) = x_0$ and $\dot{x}(t) \in F(x(t))$ almost everywhere in $[0, T]$ if and only if the condition $F(x) \cap T_x X \neq \emptyset$ holds for all $x \in X$. Furthermore, for $r > 0$ such that $U_r := \{ x \in X \mid \|x - x_0\| \le r \}$ is closed and $L = \max_{y \in U_r} \|F(y)\|$ exists, the solution is Lipschitz and exists for $T > r/L$.