Controllers

pykal.algorithm_library.controllers

Control algorithms from the robotics and control theory literature.

Available Modules:: pid: PID controller and variants (standard, anti-windup, etc.) lqr: Linear Quadratic Regulator for optimal state feedback control mpc: Model Predictive Control for constrained optimal control

Control algorithms from the robotics and control theory literature.

PID Controller

class pykal.algorithm_library.controllers.pid.PID[source]

Bases: object

static f(ck, rk, xhatk)[source]

Perform one state update step for a discrete-time PID controller.

Parameters:

ck (Tuple[float, float, float]) –
A tuple (ek_prev, Ik_prev, ek_prev_prev) containing:
- ek_prev : the current error from the previous iteration
- Ik_prev : the accumulated integral of error
- ek_prev_prev : the error from two iterations ago (unused)
rk (float) – The reference (setpoint) value at time k.
xhatk (float) – The current state estimate (measured or estimated output) at time k.

Returns:

(ek, Ik, ek_prev) –

The updated controller state:

ek : current error at time k
Ik : updated integral of error
ek_prev : previous error (for derivative calculation)

Return type:

Tuple[float, float, float]

Notes

This function updates the internal state of the PID controller following the discrete-time formulation:

Error calculation:: ek = rk - xhatk
Integral update:: Ik = Ik_prev + ek
State propagation:: The current error becomes the previous error for the next iteration, enabling the derivative term calculation in standard_h.
The state tuple always maintains the structure:: (current_error, integral, previous_error)

static h(ck, KP, KI, KD)[source]

Compute the PID control output from the current controller state.

Parameters:

ck (Tuple[float, float, float]) –
A tuple (ek, Ik, ek_prev) containing:
- ek : current error at time k
- Ik : accumulated integral of error
- ek_prev : previous error for derivative calculation
KP (float) – Proportional gain coefficient.
KI (float) – Integral gain coefficient.
KD (float) – Derivative gain coefficient.

Returns:

uk – The control output signal at time k.

Return type:

float

Notes

This function computes the standard discrete-time PID control law:

PID control equation:

uk = KP * ek + KI * Ik + KD * (ek - ek_prev)

where:

Proportional term: KP * ek provides immediate response to current error
Integral term: KI * Ik eliminates steady-state error
Derivative term: KD * (ek - ek_prev) provides damping and anticipates future error

The parameters rk and xhat_k are included in the signature for compatibility with the DynamicalSystem framework but are not used in the computation, as the error is already available in the state tuple ck.

pykal.algorithm_library.controllers.pid.f(ck, rk, xhatk)

Perform one state update step for a discrete-time PID controller.

Parameters:

ck (Tuple[float, float, float]) –
A tuple (ek_prev, Ik_prev, ek_prev_prev) containing:
- ek_prev : the current error from the previous iteration
- Ik_prev : the accumulated integral of error
- ek_prev_prev : the error from two iterations ago (unused)
rk (float) – The reference (setpoint) value at time k.
xhatk (float) – The current state estimate (measured or estimated output) at time k.

Returns:

(ek, Ik, ek_prev) –

The updated controller state:

ek : current error at time k
Ik : updated integral of error
ek_prev : previous error (for derivative calculation)

Return type:

Tuple[float, float, float]

Notes

This function updates the internal state of the PID controller following the discrete-time formulation:

Error calculation:: ek = rk - xhatk
Integral update:: Ik = Ik_prev + ek
State propagation:: The current error becomes the previous error for the next iteration, enabling the derivative term calculation in standard_h.
The state tuple always maintains the structure:: (current_error, integral, previous_error)

pykal.algorithm_library.controllers.pid.h(ck, KP, KI, KD)

Compute the PID control output from the current controller state.

Parameters:

ck (Tuple[float, float, float]) –
A tuple (ek, Ik, ek_prev) containing:
- ek : current error at time k
- Ik : accumulated integral of error
- ek_prev : previous error for derivative calculation
KP (float) – Proportional gain coefficient.
KI (float) – Integral gain coefficient.
KD (float) – Derivative gain coefficient.

Returns:

uk – The control output signal at time k.

Return type:

float

Notes

This function computes the standard discrete-time PID control law:

PID control equation:

uk = KP * ek + KI * Ik + KD * (ek - ek_prev)

where:

Proportional term: KP * ek provides immediate response to current error
Integral term: KI * Ik eliminates steady-state error
Derivative term: KD * (ek - ek_prev) provides damping and anticipates future error

The parameters rk and xhat_k are included in the signature for compatibility with the DynamicalSystem framework but are not used in the computation, as the error is already available in the state tuple ck.

LQR Controller

class pykal.algorithm_library.controllers.lqr.LQR[source]

Bases: object

Linear Quadratic Regulator (LQR) for optimal state feedback control.

Based on Kalman (1960), “Contributions to the theory of optimal control”, Boletin de la Sociedad Matematica Mexicana, Vol. 5, No. 2, pp. 102-119.

The LQR controller computes the optimal state feedback gain that minimizes a quadratic cost function for linear discrete-time systems.

static compute_gain(A, B, Q, R)[source]

Compute the optimal LQR feedback gain by solving the Discrete-Time Algebraic Riccati Equation (DARE).

Parameters:

A (NDArray) – State transition matrix, shape (n, n)
B (NDArray) – Control input matrix, shape (n, m)
Q (NDArray) – State cost matrix (positive semi-definite), shape (n, n) Penalizes deviation from reference state
R (NDArray) – Control cost matrix (positive definite), shape (m, m) Penalizes control effort

Return type:

Tuple[ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]], ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]]]

Returns:

K (NDArray) – Optimal feedback gain matrix, shape (m, n)
P (NDArray) – Solution to the DARE, shape (n, n)

Notes

The LQR controller minimizes the infinite-horizon quadratic cost:

\[J = \sum_{k=0}^{\infty} \left( x_k^T Q x_k + u_k^T R u_k \right)\]

subject to the discrete-time linear dynamics:

\[x_{k+1} = A x_k + B u_k\]

The optimal control law is:

\[u_k = -K (x_k - x_{\text{ref}})\]

where the gain \(K\) is computed by solving the DARE:

\[A^T P A - P - A^T P B (B^T P B + R)^{-1} B^T P A + Q = 0\]

and then:

\[K = (B^T P B + R)^{-1} B^T P A\]

Ridge Regularization: To ensure numerical stability, a small ridge term \(\epsilon I\) is added to \(R\) if it is near-singular.

Examples

>>> import numpy as np
>>> # Simple double integrator: x[k+1] = [[1, 0.1], [0, 1]] x[k] + [[0], [0.1]] u[k]
>>> A = np.array([[1.0, 0.1], [0.0, 1.0]])
>>> B = np.array([[0.0], [0.1]])
>>> Q = np.eye(2)  # Equal weight on position and velocity
>>> R = np.array([[1.0]])  # Control effort penalty
>>> K, P = LQR.compute_gain(A, B, Q, R)
>>> K.shape
(1, 2)
>>> P.shape
(2, 2)
>>> # Verify closed-loop stability (all eigenvalues inside unit circle)
>>> np.all(np.abs(np.linalg.eigvals(A - B @ K)) < 1.0)
True

static f(lqr_state, xhat_k, xref_k)[source]

Perform one state update step for the LQR controller.

For time-invariant LQR, the controller state (optimal gain K and DARE solution P) remains constant across time steps.

Parameters:

lqr_state (Tuple[NDArray, NDArray]) –
A tuple (K, P) containing:
- K : the optimal feedback gain matrix, shape (m, n)
- P : the DARE solution matrix, shape (n, n)
xhat_k (NDArray) – The current state estimate at time k, shape (n,) or (n, 1)
xref_k (NDArray) – The reference state at time k, shape (n,) or (n, 1)

Returns:

(K, P) – The unchanged controller state (time-invariant for infinite-horizon LQR)

Return type:

Tuple[NDArray, NDArray]

Notes

For time-invariant infinite-horizon LQR, the optimal gain \(K\) is constant and computed once via compute_gain(). This function simply passes through the state unchanged.

For time-varying or finite-horizon LQR (not implemented here), this function would update \(K_k\) and \(P_k\) at each time step using backward Riccati recursion.

The parameters xhat_k and xref_k are included in the signature for compatibility with the DynamicalSystem framework but are not used in the state update for time-invariant LQR.

Examples

>>> import numpy as np
>>> A = np.array([[1.0, 0.1], [0.0, 1.0]])
>>> B = np.array([[0.0], [0.1]])
>>> Q = np.eye(2)
>>> R = np.array([[1.0]])
>>> K, P = LQR.compute_gain(A, B, Q, R)
>>> lqr_state = (K, P)
>>> xhat_k = np.array([1.0, 0.5])
>>> xref_k = np.array([0.0, 0.0])
>>> K_new, P_new = LQR.standard_f(lqr_state, xhat_k, xref_k)
>>> np.allclose(K_new, K)  # State unchanged for time-invariant LQR
True
>>> np.allclose(P_new, P)
True

static h(lqr_state, xhat_k, xref_k)[source]

Compute the optimal control input from the current controller state.

Parameters:

lqr_state (Tuple[NDArray, NDArray]) –
A tuple (K, P) containing:
- K : the optimal feedback gain matrix, shape (m, n)
- P : the DARE solution matrix (unused in output computation)
xhat_k (NDArray) – The current state estimate at time k, shape (n,) or (n, 1)
xref_k (NDArray) – The reference state at time k, shape (n,) or (n, 1)

Returns:

uk – The control input at time k, shape (m,) or (m, 1)

Return type:

NDArray

Notes

This function computes the optimal LQR control law:

\[u_k = -K (x_k - x_{\text{ref},k})\]

where:

\(K\) is the optimal feedback gain matrix
\(x_k\) is the current state estimate
\(x_{\text{ref},k}\) is the reference (setpoint) state

Negative feedback: The negative sign ensures that the control input opposes the error \((x_k - x_{\text{ref},k})\), driving the state toward the reference.

Reference tracking: When \(x_{\text{ref},k} = 0\), this reduces to the standard regulator form \(u_k = -K x_k\). For non-zero references, the controller tracks \(x_{\text{ref},k}\).

Examples

>>> import numpy as np
>>> A = np.array([[1.0, 0.1], [0.0, 1.0]])
>>> B = np.array([[0.0], [0.1]])
>>> Q = np.eye(2)
>>> R = np.array([[1.0]])
>>> K, P = LQR.compute_gain(A, B, Q, R)
>>> lqr_state = (K, P)
>>> xhat_k = np.array([1.0, 0.5])  # Current state
>>> xref_k = np.array([0.0, 0.0])  # Regulate to origin
>>> uk = LQR.standard_h(lqr_state, xhat_k, xref_k)
>>> uk.shape
(1,)
>>> # Control should oppose the error (negative of K @ error)
>>> error = xhat_k - xref_k
>>> np.allclose(uk, -K @ error)
True

pykal.algorithm_library.controllers.lqr.compute_gain(A, B, Q, R)

Compute the optimal LQR feedback gain by solving the Discrete-Time Algebraic Riccati Equation (DARE).

Parameters:

A (NDArray) – State transition matrix, shape (n, n)
B (NDArray) – Control input matrix, shape (n, m)
Q (NDArray) – State cost matrix (positive semi-definite), shape (n, n) Penalizes deviation from reference state
R (NDArray) – Control cost matrix (positive definite), shape (m, m) Penalizes control effort

Return type:

Tuple[ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]], ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]]]

Returns:

K (NDArray) – Optimal feedback gain matrix, shape (m, n)
P (NDArray) – Solution to the DARE, shape (n, n)

Notes

The LQR controller minimizes the infinite-horizon quadratic cost:

\[J = \sum_{k=0}^{\infty} \left( x_k^T Q x_k + u_k^T R u_k \right)\]

subject to the discrete-time linear dynamics:

\[x_{k+1} = A x_k + B u_k\]

The optimal control law is:

\[u_k = -K (x_k - x_{\text{ref}})\]

where the gain \(K\) is computed by solving the DARE:

\[A^T P A - P - A^T P B (B^T P B + R)^{-1} B^T P A + Q = 0\]

and then:

\[K = (B^T P B + R)^{-1} B^T P A\]

Ridge Regularization: To ensure numerical stability, a small ridge term \(\epsilon I\) is added to \(R\) if it is near-singular.

Examples

>>> import numpy as np
>>> # Simple double integrator: x[k+1] = [[1, 0.1], [0, 1]] x[k] + [[0], [0.1]] u[k]
>>> A = np.array([[1.0, 0.1], [0.0, 1.0]])
>>> B = np.array([[0.0], [0.1]])
>>> Q = np.eye(2)  # Equal weight on position and velocity
>>> R = np.array([[1.0]])  # Control effort penalty
>>> K, P = LQR.compute_gain(A, B, Q, R)
>>> K.shape
(1, 2)
>>> P.shape
(2, 2)
>>> # Verify closed-loop stability (all eigenvalues inside unit circle)
>>> np.all(np.abs(np.linalg.eigvals(A - B @ K)) < 1.0)
True

pykal.algorithm_library.controllers.lqr.f(lqr_state, xhat_k, xref_k)

Perform one state update step for the LQR controller.

For time-invariant LQR, the controller state (optimal gain K and DARE solution P) remains constant across time steps.

Parameters:

lqr_state (Tuple[NDArray, NDArray]) –
A tuple (K, P) containing:
- K : the optimal feedback gain matrix, shape (m, n)
- P : the DARE solution matrix, shape (n, n)
xhat_k (NDArray) – The current state estimate at time k, shape (n,) or (n, 1)
xref_k (NDArray) – The reference state at time k, shape (n,) or (n, 1)

Returns:

(K, P) – The unchanged controller state (time-invariant for infinite-horizon LQR)

Return type:

Tuple[NDArray, NDArray]

Notes

For time-invariant infinite-horizon LQR, the optimal gain \(K\) is constant and computed once via compute_gain(). This function simply passes through the state unchanged.

For time-varying or finite-horizon LQR (not implemented here), this function would update \(K_k\) and \(P_k\) at each time step using backward Riccati recursion.

The parameters xhat_k and xref_k are included in the signature for compatibility with the DynamicalSystem framework but are not used in the state update for time-invariant LQR.

Examples

>>> import numpy as np
>>> A = np.array([[1.0, 0.1], [0.0, 1.0]])
>>> B = np.array([[0.0], [0.1]])
>>> Q = np.eye(2)
>>> R = np.array([[1.0]])
>>> K, P = LQR.compute_gain(A, B, Q, R)
>>> lqr_state = (K, P)
>>> xhat_k = np.array([1.0, 0.5])
>>> xref_k = np.array([0.0, 0.0])
>>> K_new, P_new = LQR.standard_f(lqr_state, xhat_k, xref_k)
>>> np.allclose(K_new, K)  # State unchanged for time-invariant LQR
True
>>> np.allclose(P_new, P)
True

pykal.algorithm_library.controllers.lqr.h(lqr_state, xhat_k, xref_k)

Compute the optimal control input from the current controller state.

Parameters:

lqr_state (Tuple[NDArray, NDArray]) –
A tuple (K, P) containing:
- K : the optimal feedback gain matrix, shape (m, n)
- P : the DARE solution matrix (unused in output computation)
xhat_k (NDArray) – The current state estimate at time k, shape (n,) or (n, 1)
xref_k (NDArray) – The reference state at time k, shape (n,) or (n, 1)

Returns:

uk – The control input at time k, shape (m,) or (m, 1)

Return type:

NDArray

Notes

This function computes the optimal LQR control law:

\[u_k = -K (x_k - x_{\text{ref},k})\]

where:

\(K\) is the optimal feedback gain matrix
\(x_k\) is the current state estimate
\(x_{\text{ref},k}\) is the reference (setpoint) state

Negative feedback: The negative sign ensures that the control input opposes the error \((x_k - x_{\text{ref},k})\), driving the state toward the reference.

Reference tracking: When \(x_{\text{ref},k} = 0\), this reduces to the standard regulator form \(u_k = -K x_k\). For non-zero references, the controller tracks \(x_{\text{ref},k}\).

Examples

>>> import numpy as np
>>> A = np.array([[1.0, 0.1], [0.0, 1.0]])
>>> B = np.array([[0.0], [0.1]])
>>> Q = np.eye(2)
>>> R = np.array([[1.0]])
>>> K, P = LQR.compute_gain(A, B, Q, R)
>>> lqr_state = (K, P)
>>> xhat_k = np.array([1.0, 0.5])  # Current state
>>> xref_k = np.array([0.0, 0.0])  # Regulate to origin
>>> uk = LQR.standard_h(lqr_state, xhat_k, xref_k)
>>> uk.shape
(1,)
>>> # Control should oppose the error (negative of K @ error)
>>> error = xhat_k - xref_k
>>> np.allclose(uk, -K @ error)
True

MPC Controller

class pykal.algorithm_library.controllers.mpc.MPC[source]

Bases: object

Model Predictive Control (MPC) for constrained optimal control.

Based on Mayne et al. (2000), “Constrained model predictive control: Stability and optimality”, Automatica, Vol. 36, No. 6, pp. 789-814.

MPC solves a finite-horizon optimal control problem at each time step, applying only the first control input and resolving (receding horizon).

static f(*, mpc_state, xhat_k, xref_traj, A, B, Q, R, P, N, u_min=None, u_max=None, x_min=None, x_max=None)[source]

Solve the MPC optimization problem for linear discrete-time systems.

Parameters:

mpc_state (Tuple[NDArray, NDArray, float]) –
Previous MPC solution (u_opt, x_pred, cost) containing:
- u_opt : optimal control sequence from last solve, shape (N, m)
- x_pred : predicted state trajectory, shape (N+1, n)
- cost : optimal cost value from last solve
xhat_k (NDArray) – Current state estimate, shape (n,)
xref_traj (NDArray) – Reference state trajectory, shape (N+1, n)
A (NDArray) – State transition matrix, shape (n, n)
B (NDArray) – Control input matrix, shape (n, m)
Q (NDArray) – Stage cost state weight (positive semi-definite), shape (n, n)
R (NDArray) – Stage cost control weight (positive definite), shape (m, m)
P (NDArray) – Terminal cost weight (positive semi-definite), shape (n, n) Typically set to LQR DARE solution for stability
N (int) – Prediction horizon length
u_min (NDArray, optional) – Lower bound on control inputs, shape (m,)
u_max (NDArray, optional) – Upper bound on control inputs, shape (m,)
x_min (NDArray, optional) – Lower bound on states, shape (n,)
x_max (NDArray, optional) – Upper bound on states, shape (n,)

Return type:

Tuple[ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]], ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]], float]

Returns:

u_opt (NDArray) – Optimal control sequence over horizon, shape (N, m)
x_pred (NDArray) – Predicted state trajectory, shape (N+1, n)
cost (float) – Optimal cost value

Notes

The MPC optimization problem at time k is:

\[\min_{u_0, ..., u_{N-1}} \sum_{i=0}^{N-1} \left( ||x_i - x_{ref,i}||_Q^2 + ||u_i||_R^2 \right) + ||x_N - x_{ref,N}||_P^2\]

subject to:

\[\begin{split}x_{i+1} &= A x_i + B u_i, \quad i = 0, ..., N-1 \\ x_0 &= \hat{x}_k \quad \text{(initial condition)} \\ u_{min} \leq u_i &\leq u_{max}, \quad i = 0, ..., N-1 \quad \text{(input constraints)} \\ x_{min} \leq x_i &\leq x_{max}, \quad i = 1, ..., N \quad \text{(state constraints)}\end{split}\]

Receding Horizon Principle: Only the first control input u_0 is applied. At the next time step, the optimization is resolved with updated state and reference.

Stability: For unconstrained MPC with terminal cost P equal to the LQR DARE solution, the MPC controller is equivalent to LQR and guarantees closed-loop stability. With constraints, stability can be ensured via terminal constraint sets.

Solver: Uses CVXPY with OSQP backend for fast quadratic programming.

static h(mpc_state)[source]

Extract the first control input (receding horizon principle).

Parameters:: mpc_state (Tuple[NDArray, NDArray, float]) – MPC solution (u_opt, x_pred, cost)
Returns:: uk – Control input to apply at current time, shape (m,)
Return type:: NDArray

Notes

The receding horizon principle: - Solve optimization over N-step horizon - Apply only first control input u_0 - At next time step, shift horizon and resolve

This is the key to MPC’s ability to handle constraints and time-varying references/disturbances.

Examples

>>> import numpy as np
>>> u_opt = np.array([[1.0], [0.8], [0.6]])  # 3-step horizon
>>> x_pred = np.zeros((4, 2))  # Predicted trajectory
>>> cost = 5.2
>>> mpc_state = (u_opt, x_pred, cost)
>>> uk = MPC.h(mpc_state)
>>> uk
array([1.])

static simple_f(*, xk, xref, A, B, Q, R, P, N, u_min=None, u_max=None, x_min=None, x_max=None, mpc_output=None)[source]

Simplified stateless MPC for DynamicalSystem integration.

This is a wrapper around standard_f that handles state initialization. Returns (u_opt, x_pred) for compatibility with DynamicalSystem.

Parameters are same as solve_single_step.

Return type:

Tuple[ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]], ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]]]

Parameters:

xk (ndarray[tuple[Any, ...], dtype[_ScalarT]])
xref (ndarray[tuple[Any, ...], dtype[_ScalarT]])
A (ndarray[tuple[Any, ...], dtype[_ScalarT]])
B (ndarray[tuple[Any, ...], dtype[_ScalarT]])
Q (ndarray[tuple[Any, ...], dtype[_ScalarT]])
R (ndarray[tuple[Any, ...], dtype[_ScalarT]])
P (ndarray[tuple[Any, ...], dtype[_ScalarT]])
N (int)
u_min (ndarray[tuple[Any, ...], dtype[_ScalarT]] | None)
u_max (ndarray[tuple[Any, ...], dtype[_ScalarT]] | None)
x_min (ndarray[tuple[Any, ...], dtype[_ScalarT]] | None)
x_max (ndarray[tuple[Any, ...], dtype[_ScalarT]] | None)
mpc_output (Tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]]] | None)

static simple_h(mpc_output)[source]

Extract first control from simple_f output.

Return type:: ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]]
Parameters:: mpc_output (Tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]]])

static solve_single_step(*, xk, xref, A, B, Q, R, P, N, u_min=None, u_max=None, x_min=None, x_max=None, warm_start=None)[source]

Simplified MPC interface for single-step solving.

Parameters:

xk (NDArray) – Current state, shape (n,)
xref (NDArray) – Reference state (constant over horizon), shape (n,)
A (NDArray) – System matrices and cost weights
B (NDArray) – System matrices and cost weights
Q (NDArray) – System matrices and cost weights
R (NDArray) – System matrices and cost weights
P (NDArray) – System matrices and cost weights
N (int) – Prediction horizon
u_min (NDArray, optional) – Constraint bounds
u_max (NDArray, optional) – Constraint bounds
x_min (NDArray, optional) – Constraint bounds
x_max (NDArray, optional) – Constraint bounds
warm_start (Tuple, optional) – Previous MPC solution for warm-starting

Return type:

Tuple[ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]], ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]]]

Returns:

u_opt (NDArray) – Optimal control sequence, shape (N, m) or (N,) for scalar
x_pred (NDArray) – Predicted trajectory, shape (N+1, n)

pykal.algorithm_library.controllers.mpc.f(*, mpc_state, xhat_k, xref_traj, A, B, Q, R, P, N, u_min=None, u_max=None, x_min=None, x_max=None)

Solve the MPC optimization problem for linear discrete-time systems.

Parameters:

mpc_state (Tuple[NDArray, NDArray, float]) –
Previous MPC solution (u_opt, x_pred, cost) containing:
- u_opt : optimal control sequence from last solve, shape (N, m)
- x_pred : predicted state trajectory, shape (N+1, n)
- cost : optimal cost value from last solve
xhat_k (NDArray) – Current state estimate, shape (n,)
xref_traj (NDArray) – Reference state trajectory, shape (N+1, n)
A (NDArray) – State transition matrix, shape (n, n)
B (NDArray) – Control input matrix, shape (n, m)
Q (NDArray) – Stage cost state weight (positive semi-definite), shape (n, n)
R (NDArray) – Stage cost control weight (positive definite), shape (m, m)
P (NDArray) – Terminal cost weight (positive semi-definite), shape (n, n) Typically set to LQR DARE solution for stability
N (int) – Prediction horizon length
u_min (NDArray, optional) – Lower bound on control inputs, shape (m,)
u_max (NDArray, optional) – Upper bound on control inputs, shape (m,)
x_min (NDArray, optional) – Lower bound on states, shape (n,)
x_max (NDArray, optional) – Upper bound on states, shape (n,)

Return type:

Tuple[ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]], ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]], float]

Returns:

u_opt (NDArray) – Optimal control sequence over horizon, shape (N, m)
x_pred (NDArray) – Predicted state trajectory, shape (N+1, n)
cost (float) – Optimal cost value

Notes

The MPC optimization problem at time k is:

\[\min_{u_0, ..., u_{N-1}} \sum_{i=0}^{N-1} \left( ||x_i - x_{ref,i}||_Q^2 + ||u_i||_R^2 \right) + ||x_N - x_{ref,N}||_P^2\]

subject to:

\[\begin{split}x_{i+1} &= A x_i + B u_i, \quad i = 0, ..., N-1 \\ x_0 &= \hat{x}_k \quad \text{(initial condition)} \\ u_{min} \leq u_i &\leq u_{max}, \quad i = 0, ..., N-1 \quad \text{(input constraints)} \\ x_{min} \leq x_i &\leq x_{max}, \quad i = 1, ..., N \quad \text{(state constraints)}\end{split}\]

Receding Horizon Principle: Only the first control input u_0 is applied. At the next time step, the optimization is resolved with updated state and reference.

Stability: For unconstrained MPC with terminal cost P equal to the LQR DARE solution, the MPC controller is equivalent to LQR and guarantees closed-loop stability. With constraints, stability can be ensured via terminal constraint sets.

Solver: Uses CVXPY with OSQP backend for fast quadratic programming.

pykal.algorithm_library.controllers.mpc.h(mpc_state)

Extract the first control input (receding horizon principle).

Parameters:: mpc_state (Tuple[NDArray, NDArray, float]) – MPC solution (u_opt, x_pred, cost)
Returns:: uk – Control input to apply at current time, shape (m,)
Return type:: NDArray

Notes

The receding horizon principle: - Solve optimization over N-step horizon - Apply only first control input u_0 - At next time step, shift horizon and resolve

This is the key to MPC’s ability to handle constraints and time-varying references/disturbances.

Examples

>>> import numpy as np
>>> u_opt = np.array([[1.0], [0.8], [0.6]])  # 3-step horizon
>>> x_pred = np.zeros((4, 2))  # Predicted trajectory
>>> cost = 5.2
>>> mpc_state = (u_opt, x_pred, cost)
>>> uk = MPC.h(mpc_state)
>>> uk
array([1.])

pykal.algorithm_library.controllers.mpc.simple_f(*, xk, xref, A, B, Q, R, P, N, u_min=None, u_max=None, x_min=None, x_max=None, mpc_output=None)

Simplified stateless MPC for DynamicalSystem integration.

This is a wrapper around standard_f that handles state initialization. Returns (u_opt, x_pred) for compatibility with DynamicalSystem.

Parameters are same as solve_single_step.

Return type:

Tuple[ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]], ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]]]

Parameters:

xk (ndarray[tuple[Any, ...], dtype[_ScalarT]])
xref (ndarray[tuple[Any, ...], dtype[_ScalarT]])
A (ndarray[tuple[Any, ...], dtype[_ScalarT]])
B (ndarray[tuple[Any, ...], dtype[_ScalarT]])
Q (ndarray[tuple[Any, ...], dtype[_ScalarT]])
R (ndarray[tuple[Any, ...], dtype[_ScalarT]])
P (ndarray[tuple[Any, ...], dtype[_ScalarT]])
N (int)
u_min (ndarray[tuple[Any, ...], dtype[_ScalarT]] | None)
u_max (ndarray[tuple[Any, ...], dtype[_ScalarT]] | None)
x_min (ndarray[tuple[Any, ...], dtype[_ScalarT]] | None)
x_max (ndarray[tuple[Any, ...], dtype[_ScalarT]] | None)
mpc_output (Tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]]] | None)

pykal.algorithm_library.controllers.mpc.simple_h(mpc_output)

Extract first control from simple_f output.

Return type:: ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]]
Parameters:: mpc_output (Tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]]])

pykal.algorithm_library.controllers.mpc.solve_single_step(*, xk, xref, A, B, Q, R, P, N, u_min=None, u_max=None, x_min=None, x_max=None, warm_start=None)

Simplified MPC interface for single-step solving.

Parameters:

xk (NDArray) – Current state, shape (n,)
xref (NDArray) – Reference state (constant over horizon), shape (n,)
A (NDArray) – System matrices and cost weights
B (NDArray) – System matrices and cost weights
Q (NDArray) – System matrices and cost weights
R (NDArray) – System matrices and cost weights
P (NDArray) – System matrices and cost weights
N (int) – Prediction horizon
u_min (NDArray, optional) – Constraint bounds
u_max (NDArray, optional) – Constraint bounds
x_min (NDArray, optional) – Constraint bounds
x_max (NDArray, optional) – Constraint bounds
warm_start (Tuple, optional) – Previous MPC solution for warm-starting

Return type:

Tuple[ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]], ndarray[tuple[Any, ...], dtype[TypeVar(_ScalarT, bound= generic)]]]

Returns:

u_opt (NDArray) – Optimal control sequence, shape (N, m) or (N,) for scalar
x_pred (NDArray) – Predicted trajectory, shape (N+1, n)