EigenSafe is a novel operator-theoretic framework designed to assess the safety of stochastic robotic systems.
The framework derives a linear operator that governs the evolution of the safety probability over long time horizons.
The dominant eigenpair of this operator is used as a calibrated safety critic: the dominant eigenvalue serves as a global safety metric for the policy, and the dominant eigenfunction acts as a state-wise safety score.
The approach is validated in two distinct domains: enforcing safety constraints in reinforcement learning (RL) for continuous control tasks and providing test-time safety filtering for imitation learning (IL) in robotic manipulation.
The Spectral Framework
The Dominant Eigenfunction as a Calibrated Safety Critic
The evolution of safety probability follows a dynamic programming principle: the safety probability at the current time step is the expectation of the next-step safety probability. We frame this recursive update as a linear operator acting on functions of state-action pairs. The system's asymptotic safety behavior is then described by the operator's dominant eigenpair, yielding a metric that is calibrated to the true safety probability.
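To make the spectral idea concrete, here is a toy numerical sketch (not the paper's learning algorithm): on a finite chain, the safety backup is multiplication by a substochastic matrix whose rows only cover transitions that stay safe, so the leaked mass is the per-step failure probability. Power iteration recovers the dominant eigenvalue $\gamma$ (the asymptotic per-step survival rate) and eigenfunction $\psi$ (the state-wise safety score). The matrix values below are invented for illustration.

```python
import numpy as np

# Hypothetical 4-state chain: rows give transition probabilities among
# *safe* states only; the missing mass in each row is the per-step
# probability of leaving the safe set (a failure).
P_safe = np.array([
    [0.75, 0.20, 0.03, 0.00],   # row sum 0.98: safest state
    [0.10, 0.60, 0.25, 0.00],   # row sum 0.95
    [0.00, 0.30, 0.55, 0.05],   # row sum 0.90
    [0.00, 0.00, 0.20, 0.60],   # row sum 0.80: riskiest state
])

def dominant_eigenpair(A, iters=500):
    """Power iteration; returns (gamma, psi) normalized so max(psi) == 1."""
    psi = np.ones(A.shape[0])
    for _ in range(iters):
        psi = A @ psi
        gamma = psi.max()        # at convergence this is the eigenvalue
        psi = psi / gamma        # keep the maximum value pinned to 1
    return gamma, psi

gamma, psi = dominant_eigenpair(P_safe)
# gamma approximates the asymptotic per-step survival probability;
# psi ranks states by long-horizon safety (1.0 at the safest state).
```

Because the matrix is substochastic, $\gamma < 1$, and the long-horizon safety probability from each state decays like $\gamma^t \psi(s)$, which is exactly what makes the eigenpair usable as a calibrated critic.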
The eigenpair loss term encourages the learned eigenfunction to satisfy the eigenvalue equation, while the normalization loss prevents trivial solutions by constraining the maximum value of $\psi$ to be 1. The learned eigenpair can then be used as a calibrated safety critic for policy optimization or test-time safety filtering.
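The two losses can be sketched as follows. This is a minimal reading of the text, not the paper's exact objective: we assume the eigenvalue-equation residual is penalized in mean squared error on sampled transitions, with a safety indicator zeroing the backup target after a failure, and the normalization term pins the maximum of $\psi$ to 1. All function and argument names here are our own.

```python
import numpy as np

def eigenpair_losses(psi_sa, psi_next, gamma, is_safe):
    """Sketch of the two training losses for the eigenpair (gamma, psi).

    psi_sa:   psi(s, a) evaluated on sampled state-action pairs, shape (N,)
    psi_next: psi(s', a') at the next step under the policy, shape (N,)
    gamma:    current scalar eigenvalue estimate
    is_safe:  1.0 if the transition stayed safe, else 0.0
    """
    # Eigenvalue equation: gamma * psi(s, a) = E[ 1{safe} * psi(s', a') ]
    target = is_safe * psi_next
    eig_loss = np.mean((gamma * psi_sa - target) ** 2)
    # Normalization: constrain the maximum value of psi to be 1,
    # ruling out the trivial solution psi == 0.
    norm_loss = (np.max(psi_sa) - 1.0) ** 2
    return eig_loss, norm_loss
```

When $\psi$ and $\gamma$ exactly satisfy the eigenvalue equation with max value 1, both losses vanish, which is the sanity check used below.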
Application 1: Safe Reinforcement Learning
We optimize a policy that maximizes the expected discounted sum of rewards subject to a spectral constraint on the eigenvalue ($\gamma_\pi \geq \gamma_{target}$), ensuring the policy remains safer than a specified threshold. We validate this approach in four customized Gym environments: CheetahLow, HopperHigh, LunarLanderHard, and AntBall.
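One common way to handle such a constraint (an assumption on our part; the paper may use a different mechanism) is Lagrangian dual ascent: a multiplier $\lambda \geq 0$ scales the constraint violation $\gamma_{target} - \gamma_\pi$ into the policy objective and is itself updated by gradient ascent on the violation. A minimal sketch:

```python
def lagrangian_and_dual_step(reward_obj, gamma_pi, gamma_target, lam, lr=1e-2):
    """One dual-ascent step for the spectral constraint gamma_pi >= gamma_target.

    reward_obj:   current estimate of the expected discounted return
    gamma_pi:     current eigenvalue estimate for the policy
    gamma_target: required safety threshold
    lam:          Lagrange multiplier (nonnegative)
    """
    # The policy maximizes this penalized objective.
    lagrangian = reward_obj - lam * (gamma_target - gamma_pi)
    # The multiplier grows while the constraint is violated and
    # decays toward zero once the policy is safe enough.
    lam_new = max(0.0, lam + lr * (gamma_target - gamma_pi))
    return lagrangian, lam_new
```

Under this scheme, a violated constraint steadily increases the weight on safety until $\gamma_\pi$ clears the threshold.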
CheetahLow
The CheetahLow environment is a modified version of the standard HalfCheetah environment. The agent receives reward based on its forward velocity only, and the safety constraint is defined as maintaining the height of the torso below a certain threshold.
Constraint satisfaction is indicated by the color of the floating ball.
EigenSafe (Proposed method)
(1) Vanilla SAC (Haarnoja et al., 2018)
(2) RESPO (Ganai et al., 2023)
(3) EFPPO (So et al., 2023)
HopperHigh
The HopperHigh environment is a modified version of the standard Hopper environment. The agent receives reward based on its forward velocity only, and the constraint is to maintain the height of the torso above a certain threshold.
Constraint satisfaction is indicated by the color of the floating ball.
EigenSafe (Proposed method)
(1) Vanilla SAC (Haarnoja et al., 2018)
(2) RESPO (Ganai et al., 2023)
(3) EFPPO (So et al., 2023)
LunarLanderHard
The LunarLanderHard environment is a more challenging variant of the standard LunarLander environment, with diversified initial reset states. The reward is given only for successful landing, and the safety constraints are imposed on landing velocity, body orientation, and horizontal position.
Constraint satisfaction is indicated by the color of the video frame.
EigenSafe (Proposed method)
(1) Vanilla SAC (Haarnoja et al., 2018)
(2) RESPO (Ganai et al., 2023)
(3) EFPPO (So et al., 2023)
AntBall
The AntBall environment is a modified version of the standard MuJoCo Ant environment. The agent receives reward based on its forward velocity only, and the safety constraint is to not drop the ball from the plate attached on the torso.
Constraint satisfaction is indicated by the color of the floating ball.
EigenSafe (Proposed method)
(1) Vanilla SAC (Haarnoja et al., 2018)
(2) RESPO (Ganai et al., 2023)
(3) EFPPO (So et al., 2023)
The proposed method consistently achieves higher reward and better constraint satisfaction than the baselines across all environments.
Figure. Baseline comparison results for safe RL using EigenSafe. The horizontal axis denotes the number of timesteps taken until a safety failure occurs or the agent reaches the maximum episode length, and the vertical axis denotes the total undiscounted episode reward. The error bars denote minimum, average, and maximum values over four random seeds. Gray vertical bars denote the maximum episode length.
Application 2: Safety-Filtered Imitation Learning
We apply EigenSafe to a stochastic behavior cloning policy on a UR3 robot in a food preparation task. At test time, we sample multiple ($n$) action candidates and select the one with the $k$-th highest eigenfunction value (i.e., a higher safety critic value).
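The test-time filter itself is a simple ranking rule and can be sketched as follows (a minimal illustration; the function name and interface are ours, and in practice the candidates would come from the cloned policy and $\psi$ from the learned critic):

```python
import numpy as np

def safety_filter(candidates, psi_values, k):
    """Select the candidate action with the k-th highest safety score psi.

    candidates: array of n sampled action candidates
    psi_values: learned eigenfunction psi(s, a) for each candidate, shape (n,)
    k:          rank to pick (k=1 is the single highest-psi action)
    """
    order = np.argsort(psi_values)[::-1]   # indices sorted by psi, descending
    return candidates[order[k - 1]]
```

Picking $k > 1$ rather than the single top-scoring action trades a little safety margin for staying closer to the policy's action distribution, which matches the observation below that $k=1$ risks selecting out-of-distribution actions.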
Video. Safety-filtered IL using EigenSafe. ($k=10$, $n=50$)
Video. Baseline IL (flow policy model) without safety filtering.
Figure. Success/safety rates in safety-filtered IL ($n=50$). There is a positive correlation between selecting actions with higher $\psi_\pi$ values and the success/safety rates, except for the $k=1$ case, where the selected action is at risk of being out of distribution.
Abstract
We present EigenSafe, an operator-theoretic framework for safety assessment of learning-enabled stochastic systems. In many robotic applications, the dynamics are inherently stochastic due to factors such as sensing noise and environmental disturbances, and it is challenging for conventional methods such as Hamilton-Jacobi reachability and control barrier functions to provide a well-calibrated safety critic that is tied to the actual safety probability. We derive a linear operator that governs the dynamic programming principle for safety probability, and find that its dominant eigenpair provides critical safety information for both individual state-action pairs and the overall closed-loop system. The proposed framework learns this dominant eigenpair, which can be used to either inform or constrain policy updates. We demonstrate that the learned eigenpair effectively facilitates safe reinforcement learning. Further, we validate its applicability in enhancing the safety of learned policies from imitation learning through robot manipulation experiments using a UR3 robotic arm in a food preparation task.
BibTeX
@article{jang2026eigensafe,
title={EigenSafe: A Spectral Framework for Learning-Based Probabilistic Safety Assessment},
author={Jang, Inkyu and Park, Jonghae and Cho, Sihyun and Mballo, Chams E and Tomlin, Claire J and Kim, H Jin},
journal={arXiv preprint arXiv:2509.17750},
year={2026}
}