Constrained optimization and KKT Conditions

Understanding KKT conditions

Necessity Vs Sufficiency

Proving something requires proving both the necessity and sufficiency. So lets first understand what they mean.

Lets take \(A\)and \(B\).

Necessity: \(A \implies B\)

Sufficiency: \(A \impliedby B\)

Necessary and sufficient: \(A \iff B\)

For more check, Source 1.


Primal and Dual

Lets first establish some common notations before we prove anything The unconstrained optimisation problem is,

\[\min_x\;f(x)\] \[s.t.\; h_{i(x)}\leq 0, \forall i \quad [eqn (4)]\] \[l_{j(x)}= 0, \forall j \quad [eqn (3)]\]

This above is our primal.

The dual for it is,

\[\max_{u,v}\;\min_{x}\;f(x)+\sum\limits_{i=1}^mu_ih_i(x)+\sum\limits_{j=1}^rv_jl_j(x)\] \[s.t.\quad u\geq0 \quad [eqn (5)]\]

For the sake of simplicity lets define,

Lagrangian: \(L(x,u,v)=f(x)+\sum\limits_{i=1}^mu_ih_i(x)+\sum\limits_{j=1}^rv_jl_j(x) \quad [eqn (1)]\)

\[g(u,v)=\min_x\;L(x,u,v) \quad [eqn (2)]\]

Dual: \(\max_{u,v}\; g(u,v)\)

Solution of primal : \(x^{*}\)

Solution of dual : \(u^{*}, v^{*}\)


KKT Conditions

The KKT conditions are as follows,

  1. Stationary condition
\[0\in\partial\left(f(x)+\sum\limits_{i=1}^mu_ih_i(x)+\sum\limits_{j=1}^rv_jl_j(x)\right)\]
  1. Complementary slackness

    \[u_{i}.h_{i}(x)=0,\quad \forall i\in {1,\dots,m}\]

    Why is there no complementary slackness for \(v_jl_j\)? Because it is a equality constraint. At optimal \(x\) it will always be \(0\).

  2. Primal feasibility

    \[h_{i(x)}\leq 0, \forall i\in {1,\dots,m}\] \[l_{j(x)}= 0, \forall j\in {1,\dots,r}\]
  3. Dual feasibility

    \[u_{i(x)}\geq 0, \forall i\in {1,\dots,m}\]

    Why is there no dual feasibility for \(v_j\)? I dont know the answer yet. Will update once I understand it.

We are now going to prove \(\text{Optimality} \iff \text{KKT}\). To do that, like we saw earlier we need to prove both necessity and sufficiency.


Necessity

\[\text{Optimality} \implies \text{KKT}\]

We are also assuming strong duality (Look into slater’s condition) and as a result there is no duality gap, that is \(f(x^{*})-g(u^{*}, v^{*})=0\)

Lets first not worry about the KKT at all and solve as if we are solving a generic equation.

\[f(x^{*})= g(u^{*}, v^{*})\] \[= \min_{x}\;f(x)+\sum\limits_{i=1}^mu^{*}_ih_i(x)+\sum\limits_{j=1}^rv^{*}_jl_j(x)\] \[\leq f(x^*)+\sum\limits_{i=1}^mu^{*}_ih_i(x^*)+\sum\limits_{j=1}^rv^{*}_jl_j(x^*)\] \[\leq f(x^*)\]

Lets go through the above now step by step.

Now to prove necessity we have to prove that the inequalities in step 3 and step 4 are actually equalities. Now to do that we need the KKT conditions.

We have now shown that both the KKT condition 1 and 2 are required to prove necessity. The conditions 3 and 4 are by default required to enforce feasibility of the primal and the dual problem hence the condition names.

Hence necessity is proved.


Sufficieny

\[\text{Optimality} \impliedby \text{KKT}\]

Now lets try to prove the other direction. Compared to the necessity, the sufficiency here is much easier to prove and understand. So here we assume KKT conditions are true, now we need to prove optimality.

Say \(x^{*}, u^{*}, v^{*}\) satisfy the KKT conditions then we prove have to prove that they are the optimal values for primal and dual. That is we have to show \(g(u^{*}, v^{*})=f(x^{*})\).

\[g(u^{*}, v^{*}) = \min_{x}\;f(x)+\sum\limits_{i=1}^mu^{*}_ih_i(x)+\sum\limits_{j=1}^rv^{*}_jl_j(x)\] \[= f(x^{*})+\sum\limits_{i=1}^mu^{*}_ih_i(x^{*})+\sum\limits_{j=1}^rv^{*}_jl_j(x^{*})\] \[=f(x^{*})\]

Lets go through the above now step by step,

Hence sufficiency is proved.


Conclusion

We have now effectively proved both directions by proving necessity and sufficiency. To sum it up,

Acknowledgements

I heavily borrowed from Ryan Tibshrani’s video lecture and notes for understanding KKT conditions. There are other lectures of Ryan explaining KKT conditions but I personally felt this one was much more easier to understand.

Thank you Piyushi for helping me understand this topic and proof-reading this blog!