Our research focuses on artificial intelligence for controlling complex systems. We usually focus on autonomous aerospace systems, from space probes to drones to telescopes for tracking space debris, but we have also worked on other systems including ecological communities and autonomous cars. The theme that unites all of our research is uncertainty. This could be uncertainty in the system’s parameters or states, or in how other people, systems, or the environment will interact with it. We approach our research from many perspectives, ranging from pure mathematical theory to numerical simulation to hardware experiments. The mathematical formalisms that we most often use are the partially observable Markov decision process (POMDP) for stochastic uncertainty and various game formalisms when the uncertainty is worst case or introduced by other rational agents. One of our most important specialties is developing online tree search algorithms for POMDPs, and we are one of the world’s leading centers of development for this approach.

The best way to view our most up-to-date research is to look at our Publications page. Slides and brief descriptions of some of our projects can be found below. (If you cannot see the slides, make sure to disable all ad blockers.)

Theoretical Foundations

Optimality of POMDP Approximations

[Figure: Particle Belief MDP]

Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems. However, POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid, which is often the case for physical systems. We present a general theory characterizing the approximation error of the practically effective particle filtering techniques that many recent online sampling-based POMDP algorithms use.

POMDP Algorithms with Theoretical Guarantees

[Figure: VPW]

Recent online sampling-based algorithms that use techniques such as observation likelihood weighting have shown unprecedented effectiveness in domains with continuous observation and action spaces. This line of work offers theoretical justification for these techniques, proving that our new algorithms that use them estimate Q-values accurately with high probability and can be made to perform arbitrarily close to the optimal solution by increasing computational power.
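As a rough illustration of observation likelihood weighting, the sketch below performs one weighted particle belief update. It is a minimal example under stated assumptions, not the algorithm analyzed in this work: the 1-D toy model, the noise levels, and the names `weighted_belief_update`, `transition`, and `obs_likelihood` are all hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def weighted_belief_update(particles, weights, action, observation,
                           transition, obs_likelihood, rng):
    """Propagate each particle, then reweight it by the observation likelihood."""
    new_particles = []
    new_weights = []
    for s, w in zip(particles, weights):
        sp = transition(s, action, rng)          # sample a next state
        new_particles.append(sp)
        new_weights.append(w * obs_likelihood(observation, sp))
    total = sum(new_weights)
    if total == 0.0:
        raise ValueError("all particles have zero observation likelihood")
    return new_particles, [w / total for w in new_weights]


# Hypothetical toy model: the state moves by the action plus small noise,
# and the observation is the state plus Gaussian noise.
def transition(s, a, rng):
    return s + a + rng.gauss(0.0, 0.1)


def obs_likelihood(o, sp):
    return gaussian_pdf(o, sp, 0.5)


rng = random.Random(0)
particles = [rng.uniform(-5.0, 5.0) for _ in range(500)]
weights = [1.0 / len(particles)] * len(particles)

particles, weights = weighted_belief_update(
    particles, weights, 1.0, 2.0, transition, obs_likelihood, rng)

# Weighted mean of the updated belief; it should sit near the observation.
belief_mean = sum(w * s for s, w in zip(particles, weights))
```

Because the particles are reweighted rather than rejected, the same update can be applied when observations are continuous and an exact match between a simulated and a received observation would essentially never occur.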

Algorithmic Developments

Practical POMDP algorithms

[Figure: POMCPOW Tree]

Leading online partially observable Markov decision process (POMDP) solvers such as POMCP and DESPOT can handle continuous state spaces, but they still struggle with continuous action and observation spaces. In fact, it can be shown analytically that they will converge to suboptimal solutions for some POMDPs with continuous action spaces regardless of the amount of computation. In this line of work, we propose novel POMDP algorithms that overcome these problems using techniques such as progressive widening and weighted particle filtering.
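To give a sense of how progressive widening limits branching in a continuous action space, here is a minimal sketch. It assumes the commonly used rule of adding a new child only while the number of children is at most k·N^α, where N is the node's visit count; the constants `k` and `alpha`, the uniform action sampler, and the function names are illustrative assumptions rather than the settings used in our solvers.

```python
import random


def should_widen(num_children, num_visits, k, alpha):
    """Return True if the node may add a new child under the k * N**alpha rule."""
    return num_children <= k * num_visits ** alpha


def simulate_widening(total_visits, k, alpha, rng):
    """Visit one node repeatedly and record which sampled actions become children."""
    children = []
    for n in range(1, total_visits + 1):
        if should_widen(len(children), n, k, alpha):
            # Hypothetical continuous action space: the interval [-1, 1].
            children.append(rng.uniform(-1.0, 1.0))
        # Otherwise a tree search would pick among the existing children,
        # for example with an upper-confidence-bound rule.
    return children


rng = random.Random(0)
children = simulate_widening(total_visits=100, k=2.0, alpha=0.5, rng=rng)
# Far fewer children than visits, so each child is revisited many times.
```

Without such a limit, every visit to a node would sample a fresh action from the continuous space, no child would ever be visited twice, and the tree could not grow beyond a single layer of depth.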

[Figures: Light-dark bad (left), Light-dark good (right)]

For example, in the light-dark example above, POMCP (left) cannot decide to localize in the light region, while our new algorithm POMCPOW (right) can, allowing it to hit the target at the origin much more quickly.

Applications and Extensions

Ecological Navigation

[Figure: Ecology]

Ecological management problems often involve navigating from an initial to a desired community state. We showed that navigation between states is an equivalent problem to searching for lowest-cost sequences of actions that comprise direct and shortcut paths. Shortcuts can be obtained by using small sequential abundance perturbations (e.g. low-density introductions) and environment perturbations to nudge communities between states. Our work suggests that brute-force approaches to navigation like antibiotics or clearcutting may have realistic and less impactful alternatives.
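The idea of navigation as a search for lowest-cost action sequences can be sketched with a standard uniform-cost (Dijkstra) search over a graph of community states. This is only an illustration: the state labels, action names, and costs below are invented, and the search method is a textbook choice that may differ from the one used in the study.

```python
import heapq


def cheapest_action_sequence(graph, start, goal):
    """Uniform-cost search; returns (total_cost, actions) or None if unreachable."""
    frontier = [(0.0, start, [])]
    best = {start: 0.0}
    while frontier:
        cost, state, actions = heapq.heappop(frontier)
        if state == goal:
            return cost, actions
        if cost > best.get(state, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for action, next_state, step_cost in graph.get(state, []):
            new_cost = cost + step_cost
            if new_cost < best.get(next_state, float("inf")):
                best[next_state] = new_cost
                heapq.heappush(
                    frontier, (new_cost, next_state, actions + [action]))
    return None


# Hypothetical community states A, B, C. Each edge is (action, next state, cost).
graph = {
    "A": [
        ("large direct perturbation", "C", 10.0),
        ("low-density introduction", "B", 1.0),
    ],
    "B": [
        ("small environment perturbation", "C", 2.0),
    ],
}

result = cheapest_action_sequence(graph, "A", "C")
cost, actions = result
```

In this made-up graph, the two-step shortcut through state B costs less than the single direct perturbation, which mirrors the contrast between shortcut and direct paths described above.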

Behavior-Aware Autonomous Driving

[Figures: Internal States, Safety-Efficiency Tradeoff]

In autonomous driving, there is an inherent tradeoff between safety and efficiency, especially time-efficiency. A self-driving car that must be perfectly safe can never enter the road, while a car with no safety constraints can be as fast as possible. This tradeoff results in the Pareto curves shown in the figure. But the performance also depends on the model. We showed that by modeling the latent internal states of the other drivers on the road, safety and efficiency can be improved simultaneously (this corresponds to moving the Pareto curve). In computational tests in a highway driving scenario, internal state modeling allowed the autonomous vehicle to perform a multiple-lane change maneuver nearly twice as fast with the same level of safety.

Autonomous Autorotation

In 2013, Professor Sunberg (as an MS student) and collaborators used autorotation to successfully land a small autonomous helicopter without power on repeated attempts. The video below contains footage of one of the landings from a nose-mounted camera. Note that the pilot releases control as he turns off the motor.