controllers package

Submodules

controllers.Controller module

class controllers.Controller.Controller(*args, **kwargs)

Bases: object

Skeleton of a controller class. It defines the interface that concrete controllers such as MPC implement.

act(obs, t, get_pred_cost=False): Performs an action.

dump_logs(primary_logdir, iter_logdir): Dumps logs into the primary log directory and the per-training-iteration log directory.

reset(): Resets this controller.

train(obs_trajs, acs_trajs, rews_trajs): Trains this controller using lists of trajectories.
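The interface above can be illustrated with a small subclass. This is a minimal sketch only: `Controller` here is a local stand-in that mirrors the documented method signatures rather than the real base class, and `RandomController`, `ac_dim`, `seed`, and `num_trajs_seen` are hypothetical names introduced for illustration.

```python
import random


class Controller:
    """Local stand-in mirroring the documented interface (assumption)."""

    def train(self, obs_trajs, acs_trajs, rews_trajs):
        raise NotImplementedError

    def reset(self):
        raise NotImplementedError

    def act(self, obs, t, get_pred_cost=False):
        raise NotImplementedError

    def dump_logs(self, primary_logdir, iter_logdir):
        raise NotImplementedError


class RandomController(Controller):
    """Hypothetical controller that samples uniform actions in [-1, 1]."""

    def __init__(self, ac_dim, seed=0):
        self.ac_dim = ac_dim
        self.rng = random.Random(seed)
        self.num_trajs_seen = 0

    def train(self, obs_trajs, acs_trajs, rews_trajs):
        # A random policy has nothing to fit; only count the trajectories received.
        self.num_trajs_seen += len(obs_trajs)

    def reset(self):
        # A random policy keeps no plan between episodes, so nothing is cleared.
        pass

    def act(self, obs, t, get_pred_cost=False):
        # get_pred_cost is ignored here because no cost is ever predicted.
        return [self.rng.uniform(-1.0, 1.0) for _ in range(self.ac_dim)]

    def dump_logs(self, primary_logdir, iter_logdir):
        # Nothing to log in this sketch.
        pass


controller = RandomController(ac_dim=2)
controller.reset()
action = controller.act(obs=[0.0, 0.0, 0.0], t=0)
```

Any object exposing these four methods could presumably be swapped in wherever a controller is expected.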

controllers.MPC module

class controllers.MPC.MPC(params)

Bases: controllers.Controller.Controller

Parameters:

params (DotMap) – Configuration parameters.

.env (gym.env):

Environment for which this controller will be used.

.update_fns (list<func>):

A list of functions that will be invoked (possibly with a TensorFlow session) every time this controller is reset.

.ac_ub (np.ndarray): (optional)

An array of action upper bounds. Defaults to environment action upper bounds.

.ac_lb (np.ndarray): (optional)

An array of action lower bounds. Defaults to environment action lower bounds.

.per (int): (optional)

Determines how often the action sequence will be optimized. Defaults to 1 (reoptimizes at every call to act()).

.prop_cfg

.model_init_cfg (DotMap): A DotMap of initialization parameters for the model.
.model_constructor (func): A function which constructs an instance of this model, given model_init_cfg.
.model_train_cfg (dict): (optional) A DotMap of training parameters that will be passed into the model every time it is trained. Defaults to an empty dict.
.model_pretrained (bool): (optional) If True, assumes that the model has been trained upon construction.
.mode (str): Propagation method. Choose between [E, DS, TSinf, TS1, MM]. See https://arxiv.org/abs/1805.12114 for details.
.npart (int): Number of particles used for DS, TSinf, TS1, and MM propagation methods.
.ign_var (bool): (optional) Determines whether or not the variance output of the model will be ignored. Defaults to False unless deterministic propagation is being used.
.obs_preproc (func): (optional) A function which modifies observations (in a 2D matrix) before they are passed into the model. Defaults to lambda obs: obs. Note: Must be able to process both NumPy and TensorFlow arrays.
.obs_postproc (func): (optional) A function which returns vectors calculated from the previous observations and model predictions, which will then be passed into the provided cost function on observations. Defaults to lambda obs, model_out: model_out. Note: Must be able to process both NumPy and TensorFlow arrays.
.obs_postproc2 (func): (optional) A function which takes the vectors returned by obs_postproc and (possibly) modifies them into the predicted observations for the next time step. Defaults to lambda obs: obs. Note: Must be able to process both NumPy and TensorFlow arrays.
.targ_proc (func): (optional) A function which takes current observations and next observations and returns the array of targets (so that the model learns the mapping obs -> targ_proc(obs, next_obs)). Defaults to lambda obs, next_obs: next_obs. Note: Only needs to process NumPy arrays.
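One plausible use of these hooks is to have the model predict the change in observation rather than the next observation itself. The sketch below assumes that setup; `delta_targ_proc` and `delta_obs_postproc` are hypothetical names, and NumPy is used only to demonstrate the round trip. Both functions rely solely on arithmetic operators, which NumPy arrays and TensorFlow tensors both support.

```python
import numpy as np


def delta_targ_proc(obs, next_obs):
    # Training target: the change in observation between consecutive steps.
    return next_obs - obs


def delta_obs_postproc(obs, model_out):
    # Undo the target transform: next observation = obs + predicted change.
    return obs + model_out


# Two observations of dimension 2, stacked as rows of a 2D matrix.
obs = np.array([[0.0, 1.0], [2.0, 3.0]])
next_obs = np.array([[0.5, 1.5], [2.0, 2.0]])

targets = delta_targ_proc(obs, next_obs)
# A perfect model would output exactly the targets,
# in which case postprocessing recovers next_obs.
recovered = delta_obs_postproc(obs, targets)
```

Under this assumption the two functions would be supplied as prop_cfg.targ_proc and prop_cfg.obs_postproc, and they must be chosen as a matching pair: whatever transform targ_proc applies, obs_postproc has to invert.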

.opt_cfg

.mode (str): Internal optimizer that will be used. Choose between [CEM, Random].
.cfg (DotMap): A map of optimizer initializer parameters.
.plan_hor (int): The planning horizon that will be used in optimization.
.obs_cost_fn (func): A function which computes the cost of every observation in a 2D matrix. Note: Must be able to process both NumPy and TensorFlow arrays.
.ac_cost_fn (func): A function which computes the cost of every action in a 2D matrix.
.constrains (np.array): An array of optimisation constraints, constrains = [[lb, ub], [lc1, uc1], [lc2, uc2]], such that if u = [v, q], then lb <= u <= ub, lc1 <= q/v <= uc1, and lc2 <= q/sqrt(v) <= uc2. Overwrites ac_lb and ac_ub if constrains[0] is not None.
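The constraint layout can be read as a feasibility test on a single action u = [v, q], with each row of the array bounding one quantity. The following is a minimal sketch of that reading, not the controller's own code: `satisfies_constraints` is a hypothetical helper, lb and ub are assumed to be scalars, and v is assumed to be positive so that q/v and q/sqrt(v) are defined.

```python
import math


def satisfies_constraints(u, constrains):
    """Check u = [v, q] against [[lb, ub], [lc1, uc1], [lc2, uc2]]."""
    v, q = u
    (lb, ub), (lc1, uc1), (lc2, uc2) = constrains
    box_ok = all(lb <= x <= ub for x in u)      # lb <= u <= ub, elementwise
    ratio_ok = lc1 <= q / v <= uc1              # bounds on q/v
    sqrt_ok = lc2 <= q / math.sqrt(v) <= uc2    # bounds on q/sqrt(v)
    return box_ok and ratio_ok and sqrt_ok


# Illustrative bounds only; real values depend on the system being controlled.
constrains = [[0.1, 10.0], [0.5, 2.0], [0.5, 3.0]]

# q/v = 1 and q/sqrt(v) = 2, both inside their bounds.
feasible = satisfies_constraints([4.0, 4.0], constrains)
# q/v = 5 exceeds the upper bound of 2 on the ratio.
infeasible = satisfies_constraints([1.0, 5.0], constrains)
```

Here the box bounds alone would accept both actions; it is the ratio constraint that rejects the second one.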

.log_cfg

.save_all_models (bool): (optional) If True, saves models at every iteration. Defaults to False (only the most recent model is saved). Warning: Can be very memory-intensive.
.log_traj_preds (bool): (optional) If True, saves the mean and variance of predicted particle trajectories. Defaults to False.
.log_particles (bool): (optional) If True, saves all predicted particle trajectories. Defaults to False. Note: Takes precedence over log_traj_preds. Warning: Can be very memory-intensive.
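Putting the fields above together, a configuration might be laid out as follows. This is only a sketch of the nesting: plain dicts stand in for DotMap, every value is a placeholder, `make_model` and the two cost functions are hypothetical, and model_init_cfg and opt_cfg.cfg are left empty because this page does not list their keys.

```python
def make_model(model_init_cfg):
    # Hypothetical constructor; a real one would build the dynamics model.
    return {"init_cfg": model_init_cfg}


def obs_cost_fn(obs):
    # Placeholder working on nested lists; a real cost function must
    # accept NumPy and TensorFlow arrays.
    return [row[0] ** 2 for row in obs]


def ac_cost_fn(acs):
    # Placeholder: small quadratic penalty on action magnitude.
    return [0.01 * sum(a ** 2 for a in row) for row in acs]


params = {
    "env": None,            # a gym environment would go here
    "update_fns": [],
    "per": 1,               # reoptimize at every call to act()
    "prop_cfg": {
        "model_init_cfg": {},
        "model_constructor": make_model,
        "model_train_cfg": {},
        "model_pretrained": False,
        "mode": "TSinf",    # one of E, DS, TSinf, TS1, MM
        "npart": 20,
    },
    "opt_cfg": {
        "mode": "CEM",      # one of CEM, Random
        "cfg": {},
        "plan_hor": 25,
        "obs_cost_fn": obs_cost_fn,
        "ac_cost_fn": ac_cost_fn,
    },
    "log_cfg": {
        "save_all_models": False,
        "log_traj_preds": False,
        "log_particles": False,
    },
}
```

Optional fields such as ac_ub, ac_lb, and the pre/postprocessing functions are omitted here so that their documented defaults would apply.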

act(obs, t, get_pred_cost=False)

Returns the action that this controller would take at time t given observation obs.

Parameters:
obs – The current observation.
t – The current timestep.
get_pred_cost – If True, also returns the predicted cost for the action sequence found by the internal optimizer.

Returns: An action (and possibly the predicted cost)
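A control loop would typically call act() once per timestep and feed the resulting action to the environment. The sketch below shows that calling pattern under stated assumptions: `ProportionalController` is a hypothetical stand-in exposing the documented reset() and act() signatures, and `step` is a toy one-dimensional dynamics function, not a gym environment.

```python
class ProportionalController:
    """Hypothetical stand-in exposing the documented reset()/act() methods."""

    def reset(self):
        pass

    def act(self, obs, t, get_pred_cost=False):
        # Push the 1-D state halfway toward zero; a real MPC would plan here.
        return [-0.5 * obs[0]]


def step(obs, action):
    # Toy dynamics (assumption): the state moves by exactly the action.
    return [obs[0] + action[0]]


controller = ProportionalController()
controller.reset()              # start of an episode
obs = [8.0]
for t in range(3):
    action = controller.act(obs, t)
    obs = step(obs, action)     # state goes 8 -> 4 -> 2 -> 1
```

The timestep t is passed on every call, as the signature requires, even though this stand-in does not use it.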

changePlanHor(T, freq=None, change_over=False)

Dynamically changes the planning horizon of the MPC algorithm.

Parameters:	T (int) – New planning horizon.

changeTargetCost(target)

dump_logs(primary_logdir, iter_logdir)

Saves logs to either a primary log directory or another iteration-specific directory. See __init__ documentation to see what is being logged.

Parameters:
primary_logdir (str) – A directory path. This controller assumes that this directory does not change every iteration.
iter_logdir (str) – A directory path. This controller assumes that this directory changes every time dump_logs is called.

Returns: None

optimizers = {'CEM': <class 'dmbrl.misc.optimizers.cem.CEMOptimizer'>, 'Random': <class 'dmbrl.misc.optimizers.random.RandomOptimizer'>}

reset()

Resets this controller (clears previous solution, calls all update functions).

Returns: None

train(obs_trajs, obs_prime_trajs, acs_trajs)

Trains the internal model of this controller. Once trained, this controller switches from applying random actions to using MPC.

Parameters:
obs_trajs – (N, nS) observations.
obs_prime_trajs – (N, nS) observations at the next time step.
acs_trajs – (N, nU) actions.

Returns: None.
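One way to build such arrays from a recorded rollout is to shift the observation sequence by one step, so that each row pairs an observation with its successor and the action taken in between. This is a sketch under that assumption: `to_transitions` is a hypothetical helper, and nested lists stand in for the arrays that would be passed to train().

```python
def to_transitions(observations, actions):
    """Split one rollout into aligned (obs, next_obs, action) rows.

    observations must hold one more entry than actions.
    """
    if len(observations) != len(actions) + 1:
        raise ValueError("expected one more observation than actions")
    obs_trajs = observations[:-1]        # (N, nS)
    obs_prime_trajs = observations[1:]   # (N, nS), shifted by one step
    acs_trajs = list(actions)            # (N, nU)
    return obs_trajs, obs_prime_trajs, acs_trajs


# A rollout with 4 observations (nS = 2) and 3 actions (nU = 1).
observations = [[0.0, 0.0], [0.1, 0.0], [0.3, 0.1], [0.6, 0.3]]
actions = [[1.0], [2.0], [3.0]]
obs_trajs, obs_prime_trajs, acs_trajs = to_transitions(observations, actions)
```

All three outputs share the same first dimension N, here 3, which matches the shapes listed above.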
