Paul N. Beuchat


Publications 1 - 10 of 17
  • Beuchat, Paul N. (2019)
    Many problems in science and engineering can be cast as discrete time, continuous space, infinite horizon stochastic optimal control problems. The so-called value function and Q-function both characterise the solution of such problems, but are both intractable to compute in all but a few special cases. This thesis focuses on methods that approximate the value function and Q-function. We consider the linear programming approach to approximate dynamic programming, which computes approximate value functions and Q-functions that are point-wise under-estimators of the optimal functions by using the so-called Bellman inequality. For this approximation method we provide theoretical guarantees on the value function and Q-function approximation error, and also for the sub-optimality of a policy generated using such lower-bounding approximations. In particular, the online performance guarantee is obtained by analysing an iterated version of the greedy policy, and the fitting error guarantee by analysing an iterated version of the Bellman inequality. These guarantees complement the existing bounds that appear in the literature. Given a collection of lower-bounding approximate value functions, an improved approximation can be constructed by taking the point-wise maximum over the collection; the challenge, however, is how to compute the collection itself. To address this challenge, we introduce a novel formulation, referred to as the point-wise maximum approach to approximate dynamic programming, and use this to propose algorithms that iteratively construct a collection of lower-bounding value functions with the objective of maximising the point-wise maximum of the collection. We empirically demonstrate the advantages of the proposed algorithms through a range of numerical examples that indicate classes of problems where the proposed algorithms improve upon state-of-the-art methods.
A key result from the numerical studies is that the proposed algorithm can provide practically useful sub-optimality bounds for the online performance of any policy, even when the collection of approximate value functions is itself impractical to use for a policy.
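The point-wise maximum construction described in this abstract can be made concrete with a minimal sketch. The quadratic lower bounds below are invented for illustration and are not taken from the thesis; the point is that the maximum over a collection of point-wise under-estimators is a tighter function that is still an under-estimator.

```python
# Toy example: three lower bounds on a true value function V*(x) = x^2,
# each valid point-wise everywhere (all coefficients are made up).
lower_bounds = [
    lambda x: 0.5 * x**2,            # loose everywhere
    lambda x: x**2 - 0.1,            # exact shape, shifted down
    lambda x: 0.9 * x**2 - 0.05,     # tighter for small |x|
]

def v_hat(x):
    """Point-wise maximum over the collection: still a valid lower bound,
    and at least as tight as every individual element."""
    return max(v(x) for v in lower_bounds)

x = 1.0
assert v_hat(x) <= x**2                              # still an under-estimator
assert all(v(x) <= v_hat(x) for v in lower_bounds)   # dominates each element
```

The algorithms in the thesis then aim to choose the collection so that this maximum is as large (tight) as possible.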
  • Warrington, Joseph; Beuchat, Paul N.; Lygeros, John (2019)
    IEEE Transactions on Automatic Control
    We describe a nonlinear generalization of dual dynamic programming (DP) theory and its application to value function estimation for deterministic control problems over continuous state and action spaces, in a discrete-time infinite horizon setting. We prove, using a Benders-type argument leveraging the monotonicity of the Bellman operator, that the result of a one-stage policy evaluation can be used to produce nonlinear lower bounds on the optimal value function that are valid over the entire state space. These bounds contain terms reflecting the functional form of the system's costs, dynamics, and constraints. We provide an iterative algorithm that produces successively better approximations of the optimal value function, and prove under certain assumptions that it achieves an arbitrarily low desired Bellman optimality tolerance at preselected points in the state space, in a finite number of iterations. We also describe means of certifying the quality of the approximate value function generated. We demonstrate the efficacy of the approach on systems whose dimensions are too large for conventional DP approaches to be practical.
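The monotonicity argument at the heart of this paper can be sketched on a toy one-dimensional problem. The dynamics, cost, discount factor, and input grid below are assumptions chosen for illustration, not the paper's setting: if v under-estimates the optimal value function everywhere, then by monotonicity so does its one-stage Bellman image, so each pass can append a valid new element to a growing collection of lower bounds.

```python
import numpy as np

# Toy 1-D deterministic problem: x+ = 0.9x + u, stage cost x^2 + u^2,
# inputs restricted to a grid so the one-stage problem is a simple min.
gamma = 0.95
U = np.linspace(-1.0, 1.0, 41)

def f(x, u):
    return 0.9 * x + u

def c(x, u):
    return x**2 + u**2

def bellman(v):
    """One-stage Bellman operator T. If v <= V* point-wise, then by
    monotonicity (T v) <= V* point-wise as well, so T v is a valid new
    element of the collection of lower bounds."""
    return lambda x: min(c(x, u) + gamma * v(f(x, u)) for u in U)

collection = [lambda x: 0.0]          # trivial lower bound (all costs >= 0)
for _ in range(3):                    # each pass tightens the bound
    v_hat = lambda x, col=tuple(collection): max(v(x) for v in col)
    collection.append(bellman(v_hat))

# Successive elements never decrease at any fixed state:
values = [v(0.5) for v in collection]
```

This lazy closure-based representation is exponentially expensive and only viable for a sketch; the paper's contribution is representing such bounds with tractable functional forms.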
  • Romero Aguilar, Ángel; Beuchat, Paul N.; Stürz, Yvonne R.; et al. (2019)
    Proceedings of the 18th European Control Conference (ECC 2019)
    While Approximate Dynamic Programming has successfully been used in many applications involving discrete states and inputs such as playing the games of Tetris or chess, it has not been used in many continuous state and input space applications. In this paper, we combine Approximate Dynamic Programming techniques and apply them to the continuous, non-linear and high dimensional dynamics of a quadcopter vehicle. We use a polynomial approximation of the dynamics and sum-of-squares programming techniques to compute a family of polynomial value function approximations for different tuning parameters. The resulting approximations to the optimal value function are combined in a point-wise maximum approach, which is used to compute the online policy. The success of the method is demonstrated in both simulations and experiments on a quadcopter. The control performance is compared to a linear time-varying Model Predictive Controller. The two methods are then combined to keep the computational benefits of a short horizon MPC and the long term performance benefits of the Approximate Dynamic Programming value function as the terminal cost.
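The combination described at the end of this abstract, a short-horizon controller with an ADP value function as terminal cost, can be sketched under toy assumptions: scalar linear dynamics, a gridded input set, and an invented terminal cost `v_terminal` standing in for the point-wise maximum approximation.

```python
import itertools
import numpy as np

# Toy 1-D stand-in (not the quadcopter model): a short-horizon controller
# whose terminal cost is an ADP value-function approximation, so an
# N-step lookahead inherits long-run performance from v_terminal.
gamma = 0.95
U = np.linspace(-1.0, 1.0, 9)              # coarse input grid
f = lambda x, u: 0.9 * x + u               # toy linear dynamics
c = lambda x, u: x**2 + u**2               # stage cost
v_terminal = lambda x: 2.0 * x**2          # assumed ADP approximation

def short_horizon_policy(x0, N=3):
    """Enumerate input sequences over N steps and return the first input
    of the cheapest one; v_terminal supplies the cost beyond the horizon."""
    best_cost, best_u0 = float("inf"), 0.0
    for seq in itertools.product(U, repeat=N):
        x, total = x0, 0.0
        for k, u in enumerate(seq):
            total += gamma**k * c(x, u)
            x = f(x, u)
        total += gamma**N * v_terminal(x)
        if total < best_cost:
            best_cost, best_u0 = total, seq[0]
    return best_u0
```

Enumeration is only feasible for this tiny grid; the paper instead solves the short-horizon problem with MPC machinery, with the ADP approximation supplying the terminal cost.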
  • Beuchat, Paul N.; Lygeros, John (2017)
    IFAC-PapersOnLine ~ 20th IFAC World Congress. Proceedings
    In this paper, we propose a novel formulation for encoding state constraints into the Linear Programming approach to Approximate Dynamic Programming via the use of penalty functions. To maintain tractability of the resulting optimization problem, we suggest a penalty function constructed as a point-wise maximum taken over a family of low-order polynomials. Once the penalty functions are designed, no additional approximations are introduced by the proposed formulation. The effectiveness and numerical stability of the formulation are demonstrated through examples.
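A minimal sketch of the penalty idea, with invented numbers: a state constraint such as |x| <= 1 can be encoded as a point-wise maximum of low-order (here affine) polynomials that vanish inside the feasible set and grow outside it, so adding the penalty to the stage cost discourages violation while keeping the point-wise-max structure.

```python
# Toy encoding of the constraint |x| <= 1 (slope 10.0 is an arbitrary
# choice): each piece is a low-order polynomial, and the penalty is
# their point-wise maximum.
pieces = [
    lambda x: 0.0,                # active inside the feasible set
    lambda x: 10.0 * (x - 1.0),   # activates for x > 1
    lambda x: 10.0 * (-x - 1.0),  # activates for x < -1
]

penalty = lambda x: max(p(x) for p in pieces)

assert penalty(0.0) == 0.0        # no penalty inside the constraint
assert penalty(1.5) == 5.0        # grows linearly beyond the boundary
assert penalty(-2.0) == 10.0
```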
  • Beuchat, Paul N.; Hesse, Henrik; Domahidi, Alexander; et al. (2018)
    2018 IEEE 4th World Forum on Internet of Things (WF-IoT)
  • Beuchat, Paul N.; Stürz, Yvonne R.; Lygeros, John (2019)
    IFAC-PapersOnLine ~ 12th IFAC Symposium on Advances in Control Education, ACE 2019. Proceedings
    Quadcopters, popular in consumer and commercial applications, are a perfect testing ground for both basic and advanced control techniques. This makes them an ideal platform for offering experiment-based control courses. We present the software system developed to teach hands-on quadcopter control, which we have used to offer a short course to groups of 32 undergraduate electrical engineering students. Our system allows students to work simultaneously, each with a separate quadcopter. The software makes the hands-on class easy to manage, allowing the instructor to focus on the pedagogic aspects.
  • Beuchat, Paul N.; Georghiou, Angelos; Lygeros, John (2020)
    IEEE Transactions on Automatic Control
    We study both the value function and Q-function formulation of the Linear Programming approach to Approximate Dynamic Programming. The approach is model-based and optimizes over a restricted function space to approximate the value function or Q-function. Working in the discrete time, continuous space setting, we provide guarantees for the fitting error and online performance of the policy. In particular, the online performance guarantee is obtained by analyzing an iterated version of the greedy policy, and the fitting error guarantee by analyzing an iterated version of the Bellman inequality. These guarantees complement the existing bounds that appear in the literature. The Q-function formulation offers benefits, for example, in decentralized controller design, however it can lead to computationally demanding optimization problems. To alleviate this drawback, we provide a condition that simplifies the formulation, resulting in improved computational times.
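The Bellman-inequality LP underlying this line of work can be written down directly for a small finite MDP. The costs and transition probabilities below are invented for illustration: every feasible V is a point-wise under-estimator of V*, and with no restriction on the function space the LP optimum recovers V* exactly.

```python
import numpy as np
from scipy.optimize import linprog

# Toy finite MDP: maximize sum(V) subject to the Bellman inequality
#   V(x) <= cost(x,u) + gamma * E[V(x') | x,u]   for every (x,u).
gamma = 0.9
cost = np.array([[1.0, 2.0],      # cost[x, u]
                 [0.5, 1.5]])
P = np.array([[[0.8, 0.2],        # P[x, u, x'] transition probabilities
               [0.3, 0.7]],
              [[0.6, 0.4],
               [0.1, 0.9]]])
n_states, n_actions = cost.shape

A_ub, b_ub = [], []
for x in range(n_states):
    for u in range(n_actions):
        row = -gamma * P[x, u]    # -gamma * sum_x' P(x'|x,u) V(x')
        row[x] += 1.0             # ... plus V(x), bounded by cost(x,u)
        A_ub.append(row)
        b_ub.append(cost[x, u])

res = linprog(c=-np.ones(n_states),             # maximize sum(V)
              A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * n_states)
V = res.x
```

In the continuous-space setting of the paper, V is restricted to a finite-dimensional function class and the expectation constraints are handled with further machinery, so the LP yields a lower bound rather than V* itself.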
  • Beuchat, Paul N.; Georghiou, Angelos; Lygeros, John (2016)
    2016 European Control Conference (ECC)
  • Karaca, Orçun; Darivianakis, Georgios; Beuchat, Paul N.; et al. (2017)
    IFAC-PapersOnLine ~ 20th IFAC World Congress. Proceedings
    Polynomial optimization is an active field of research with a broad range of applications, including the synthesis of control policies for non-linear systems and solution methods such as approximate dynamic programming. Finding the optimal solution of a generic polynomial optimization problem remains computationally intractable. Several studies in the literature resort to hierarchical schemes that converge to the optimal solution by employing appropriate convex relaxations of the original problem. In this direction, sum of squares methods have been shown to be effective in addressing problems of low degree and dimension, with numerous MATLAB toolboxes allowing for efficient implementation. An alternative solution method is to cast the problem as a signomial optimization and solve it using a hierarchy of relative entropy relaxations. In contrast to sum of squares, this method can tackle problems involving polynomials of high degree and dimension. In this paper, we develop the publicly available REPOP toolbox to address polynomial optimization problems using relative entropy relaxations. The toolbox is equipped with appropriate pre-processing routines that considerably reduce the size of the resulting optimization problem. In addition, we propose a convergent hierarchy which combines aspects from sum of squares and relative entropy relaxations. The proposed method offers computational advantages over both methods.
  • Beuchat, Paul N.; Coulson, Jeremy; Lygeros, John (2020)
    Many institutes use quadcopters to demonstrate their research on advanced control techniques, and some students have the opportunity to further their control education through individual projects. A handful of setups, including our previous work, have been developed to offer hands-on education to bachelor students in a classroom-like setting. However, the use of a motion capture system for state feedback and the need for specific software configuration limit the scalability of such a setup. In this work we investigate the feasibility of a new software framework that scales gracefully in terms of cost and flexibility from a class size of 1 student up to tens or hundreds of students.