controllers package
Submodules
controllers.Controller module
-
class controllers.Controller.Controller(*args, **kwargs)
Bases: object
Framework of a controller class.
-
act(obs, t, get_pred_cost=False) Performs an action.
-
dump_logs(primary_logdir, iter_logdir) Dumps logs into primary log directory and per-train iteration log directory.
-
reset() Resets this controller.
-
train(obs_trajs, acs_trajs, rews_trajs) Trains this controller using lists of trajectories.
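As a rough illustration of the interface above, the following sketch implements the four methods in a trivial subclass. The `Controller` class here is a local stand-in written for this example (it is assumed, not confirmed, that the real base class leaves these methods unimplemented), and `RandomController`, its `lb`/`ub`/`seed` arguments, and the placeholder predicted cost of 0.0 are all hypothetical.

```python
import random


class Controller:
    """Local stand-in for controllers.Controller.Controller (interface only).

    Assumption: the real base class leaves these methods unimplemented.
    """

    def __init__(self, *args, **kwargs):
        pass

    def train(self, obs_trajs, acs_trajs, rews_trajs):
        raise NotImplementedError

    def reset(self):
        raise NotImplementedError

    def act(self, obs, t, get_pred_cost=False):
        raise NotImplementedError

    def dump_logs(self, primary_logdir, iter_logdir):
        raise NotImplementedError


class RandomController(Controller):
    """Hypothetical controller: uniform random scalar actions in [lb, ub]."""

    def __init__(self, lb, ub, seed=0):
        super().__init__()
        self.lb, self.ub, self.seed = lb, ub, seed
        self.rng = random.Random(seed)
        self.n_train_calls = 0

    def train(self, obs_trajs, acs_trajs, rews_trajs):
        # Nothing to learn for a random policy; only count the calls.
        self.n_train_calls += 1

    def reset(self):
        # Restart the random stream so every episode replays the same actions.
        self.rng = random.Random(self.seed)

    def act(self, obs, t, get_pred_cost=False):
        action = self.rng.uniform(self.lb, self.ub)
        # There is no model here, so the "predicted cost" is a 0.0 placeholder.
        return (action, 0.0) if get_pred_cost else action

    def dump_logs(self, primary_logdir, iter_logdir):
        # This sketch keeps no logs.
        pass


ctrl = RandomController(-1.0, 1.0)
first = ctrl.act(obs=None, t=0)
ctrl.reset()
again = ctrl.act(obs=None, t=0)
with_cost = ctrl.act(obs=None, t=1, get_pred_cost=True)
```

Because `reset()` re-seeds the generator, `first` and `again` are equal; a real controller would more likely clear its stored plan at this point.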
-
controllers.MPC module
-
class controllers.MPC.MPC(params)
Bases: controllers.Controller.Controller
Parameters: params (DotMap) – Configuration parameters.
- .env (gym.env):
- Environment for which this controller will be used.
- .update_fns (list<func>):
- A list of functions that will be invoked (possibly with a TensorFlow session) every time this controller is reset.
- .ac_ub (np.ndarray): (optional)
- An array of action upper bounds. Defaults to environment action upper bounds.
- .ac_lb (np.ndarray): (optional)
- An array of action lower bounds. Defaults to environment action lower bounds.
- .per (int): (optional)
- Determines how often the action sequence will be optimized. Defaults to 1 (reoptimizes at every call to act()).
- .prop_cfg
- .model_init_cfg (DotMap):
- A DotMap of initialization parameters for the model.
- .model_constructor (func):
- A function which constructs an instance of this model, given model_init_cfg.
- .model_train_cfg (dict): (optional)
- A DotMap of training parameters that will be passed into the model every time it is trained. Defaults to an empty dict.
- .model_pretrained (bool): (optional)
- If True, assumes that the model has been trained upon construction.
- .mode (str):
- Propagation method. Choose between [E, DS, TSinf, TS1, MM]. See https://arxiv.org/abs/1805.12114 for details.
- .npart (int):
- Number of particles used for DS, TSinf, TS1, and MM propagation methods.
- .ign_var (bool): (optional)
- Determines whether or not variance output of the model will be ignored. Defaults to False unless deterministic propagation is being used.
- .obs_preproc (func): (optional)
- A function which modifies observations (in a 2D matrix) before they are passed into the model. Defaults to lambda obs: obs. Note: Must be able to process both NumPy and TensorFlow arrays.
- .obs_postproc (func): (optional)
- A function which returns vectors calculated from the previous observations and model predictions, which will then be passed into the provided cost function on observations. Defaults to lambda obs, model_out: model_out. Note: Must be able to process both NumPy and TensorFlow arrays.
- .obs_postproc2 (func): (optional)
- A function which takes the vectors returned by obs_postproc and (possibly) modifies them into the predicted observations for the next time step. Defaults to lambda obs: obs. Note: Must be able to process both NumPy and TensorFlow arrays.
- .targ_proc (func): (optional)
- A function which takes current observations and next observations and returns the array of targets (so that the model learns the mapping obs -> targ_proc(obs, next_obs)). Defaults to lambda obs, next_obs: next_obs. Note: Only needs to process NumPy arrays.
- .opt_cfg
- .mode (str):
- Internal optimizer that will be used. Choose between [CEM, Random].
- .cfg (DotMap):
- A map of optimizer initializer parameters.
- .plan_hor (int):
- The planning horizon that will be used in optimization.
- .obs_cost_fn (func):
- A function which computes the cost of every observation in a 2D matrix. Note: Must be able to process both NumPy and TensorFlow arrays.
- .ac_cost_fn (func):
- A function which computes the cost of every action in a 2D matrix.
- .constrains (np.array):
- An array of optimisation constraints, [[lb, ub], [lc1, uc1], [lc2, uc2]], such that for u = [v, q]: lb <= u <= ub, lc1 <= q/v <= uc1, and lc2 <= q/sqrt(v) <= uc2. Overrides ac_lb and ac_ub if constrains[0] is not None.
- .log_cfg
- .save_all_models (bool): (optional)
- If True, saves models at every iteration. Defaults to False (only most recent model is saved). Warning: Can be very memory-intensive.
- .log_traj_preds (bool): (optional)
- If True, saves the mean and variance of predicted particle trajectories. Defaults to False.
- .log_particles (bool): (optional)
- If True, saves all predicted particle trajectories. Defaults to False. Note: Takes precedence over log_traj_preds. Warning: Can be very memory-intensive.
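With the default targ_proc and obs_postproc, the model predicts the next observation directly. One common alternative, sketched here as an assumption rather than as this package's prescribed setup, is to train on observation differences and add them back at prediction time. The arrays and values below are made up for illustration, and only NumPy is exercised, although the same arithmetic should also apply to TensorFlow tensors.

```python
import numpy as np


def obs_preproc(obs):
    # Identity, matching the documented default.
    return obs


def targ_proc(obs, next_obs):
    # Training target: the change in observation, not the next observation.
    return next_obs - obs


def obs_postproc(obs, model_out):
    # Undo targ_proc: add the predicted change back onto the current observation.
    return obs + model_out


# Two made-up observations with two state dimensions each (rows = samples).
obs = np.array([[0.0, 1.0], [2.0, 3.0]])
next_obs = np.array([[0.5, 1.5], [2.0, 2.0]])

targets = targ_proc(obs_preproc(obs), next_obs)

# A perfect model would output exactly `targets`, so post-processing
# should then recover `next_obs`.
recovered = obs_postproc(obs, targets)
```

The point of the pairing is that obs_postproc must invert whatever targ_proc does; if the two are changed independently, the predicted observations passed to the cost function no longer correspond to real next states.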
-
act(obs, t, get_pred_cost=False) Returns the action that this controller would take at time t given observation obs.
Parameters: - obs – The current observation
- t – The current timestep
- get_pred_cost – If True, returns the predicted cost for the action sequence found by the internal optimizer.
Returns: An action (and possibly the predicted cost)
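The interaction between per and act() can be pictured as a small action buffer: the optimizer runs only when the buffer is empty, and the first per actions of the new plan are queued. The following is a simplified sketch of that idea, not the MPC class itself; `RecedingHorizonSketch`, `plan_fn`, and `proportional_plan` are hypothetical names, and scalar observations and actions are assumed.

```python
class RecedingHorizonSketch:
    """Hypothetical illustration of re-planning every `per` calls to act()."""

    def __init__(self, plan_fn, plan_hor, per=1):
        self.plan_fn = plan_fn    # stand-in for the internal optimizer
        self.plan_hor = plan_hor  # number of actions in each plan
        self.per = per            # actions executed before re-planning
        self.ac_buf = []          # actions queued from the latest plan
        self.n_plans = 0          # how many times the optimizer was run

    def reset(self):
        # Drop any queued actions so the next act() call re-plans.
        self.ac_buf = []

    def act(self, obs, t):
        if not self.ac_buf:
            plan = self.plan_fn(obs, self.plan_hor)
            self.n_plans += 1
            # Keep only the first `per` actions; the rest are discarded.
            self.ac_buf = list(plan[: self.per])
        return self.ac_buf.pop(0)


def proportional_plan(obs, plan_hor):
    # Made-up "optimizer": repeat one proportional action over the horizon.
    return [-0.5 * obs] * plan_hor


ctrl = RecedingHorizonSketch(proportional_plan, plan_hor=5, per=2)
actions = [ctrl.act(obs=float(t + 1), t=t) for t in range(5)]
```

With per=2 the optimizer runs on calls 0, 2, and 4 only, so five calls to `act` trigger three plans; with per=1 every call would re-plan.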
-
changePlanHor(T, freq=None, change_over=False) Dynamically changes the planning horizon of the MPC algorithm.
Parameters: T (int) – New planning horizon.
-
changeTargetCost(target)
-
dump_logs(primary_logdir, iter_logdir) Saves logs to either a primary log directory or another iteration-specific directory. See the __init__ documentation for what is logged.
Parameters: - primary_logdir (str) – A directory path. This controller assumes that this directory does not change every iteration.
- iter_logdir (str) – A directory path. This controller assumes that this directory changes every time dump_logs is called.
Returns: None
-
optimizers = {'CEM': <class 'dmbrl.misc.optimizers.cem.CEMOptimizer'>, 'Random': <class 'dmbrl.misc.optimizers.random.RandomOptimizer'>}
-
reset() Resets this controller (clears previous solution, calls all update functions).
Returns: None
-
train(obs_trajs, obs_prime_trajs, acs_trajs) Trains the internal model of this controller. Once trained, this controller switches from applying random actions to using MPC.
Parameters: - obs_trajs – (N, nS)
- obs_prime_trajs – (N, nS) observations at next time step
- acs_trajs – (N, nU)
Returns: None.
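The three arguments share the leading dimension N, which suggests one row per transition. A minimal sketch of how a single rollout might be split into such aligned rows is given below. The helper `to_transitions` is hypothetical, nested lists stand in for arrays, and it is assumed that a rollout stores one more observation than actions.

```python
def to_transitions(obs_traj, acs_traj):
    """Split one rollout into aligned (obs, obs_prime, acs) rows.

    Assumes T + 1 observations and T actions, giving N = T transitions.
    """
    if len(obs_traj) != len(acs_traj) + 1:
        raise ValueError("expected one more observation than actions")
    obs = obs_traj[:-1]       # observation before each action, N rows of nS
    obs_prime = obs_traj[1:]  # observation after each action, N rows of nS
    return obs, obs_prime, acs_traj


# Made-up rollout with nS = 2, nU = 1, and three steps.
obs_traj = [[0.0, 0.0], [0.1, 1.0], [0.3, 2.0], [0.6, 3.0]]
acs_traj = [[1.0], [1.0], [1.0]]

obs, obs_prime, acs = to_transitions(obs_traj, acs_traj)
```

Row i of `obs_prime` is row i + 1 of the original rollout, so each row of the three outputs describes the same transition.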