a2c
- a2c.calculate_discounted_returns(rewards: array, discounts: array, n_workers: int = 1) → array
Calculate the discounted returns from the episode rewards
- Parameters
rewards – The list of rewards
discounts – The array of discount factors
n_workers – The number of workers
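As an illustration of the computation this function describes, here is a minimal pure-Python sketch for the single-worker case. It assumes that `discounts[k]` holds the k-th power of the discount factor and that the return at step t is the discounted sum of the rewards from step t onward. The name `discounted_returns_sketch` is hypothetical, and this is not the library's implementation.

```python
from typing import List, Sequence


def discounted_returns_sketch(rewards: Sequence[float],
                              discounts: Sequence[float]) -> List[float]:
    """Return G_t = sum_k discounts[k] * rewards[t + k] for every step t.

    Hypothetical helper: assumes discounts[k] == gamma ** k and
    len(discounts) >= len(rewards).
    """
    n_steps = len(rewards)
    returns = []
    for t in range(n_steps):
        # Pair the rewards from step t onward with discounts[0], discounts[1], ...
        # zip stops at the shorter sequence, i.e. at the end of the episode.
        g_t = sum(d * r for d, r in zip(discounts, rewards[t:]))
        returns.append(g_t)
    return returns


rewards = [1.0, 1.0, 1.0]
discounts = [1.0, 0.5, 0.25]  # gamma = 0.5, chosen so the arithmetic is exact
returns = discounted_returns_sketch(rewards, discounts)
```

The sketch ignores `n_workers`; with several workers the same sum would presumably be taken per worker along the time axis.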
- a2c.create_discounts_array(end: int, base: float, start=0, endpoint=False)
Create an array of floating point numbers in [start, end) with the given base
- Parameters
end –
base –
start –
endpoint –
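One plausible reading of the description above is that the array holds the powers of `base` for the integer exponents in [start, end). The sketch below assumes that interpretation. The name `discount_powers_sketch` is hypothetical, and the `endpoint` handling (including `end` itself when true) is likewise an assumption, not documented behaviour.

```python
from typing import List


def discount_powers_sketch(end: int, base: float, start: int = 0,
                           endpoint: bool = False) -> List[float]:
    """Return [base**start, ..., base**(end - 1)], plus base**end if endpoint.

    Hypothetical helper illustrating one interpretation of the parameters.
    """
    stop = end + 1 if endpoint else end
    return [base ** k for k in range(start, stop)]


powers = discount_powers_sketch(end=4, base=0.5)
```

Under this reading, calling the function with `base` set to the discount factor and `end` set to the episode length would yield the discounts for a whole episode.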
- class a2c.A2CConfig(gamma: float = 0.99, tau: float = 0.1, beta: Optional[float] = None, policy_loss_weight: float = 1.0, value_loss_weight: float = 1.0, max_grad_norm: float = 1.0, n_iterations_per_episode: int = 100, n_workers: int = 1, batch_size: int = 0, normalize_advantages: bool = True, device: str = 'cpu', action_sampler: Optional[Callable] = None, a2cnet: Optional[Module] = None, save_model_path: Optional[Path] = None, optimizer_config: Optional[PyTorchOptimizerConfig] = None)
Configuration for the A2C algorithm
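The sketch below shows how fields with these names are commonly used in an advantage actor-critic update. It reflects typical A2C practice, not necessarily this class: the function `a2c_loss_sketch` and its inputs are hypothetical, and it treats `beta` as an entropy-bonus coefficient, which the source does not confirm.

```python
from statistics import fmean, pstdev
from typing import Optional, Sequence


def a2c_loss_sketch(logprobs: Sequence[float], values: Sequence[float],
                    returns: Sequence[float], entropies: Sequence[float],
                    policy_loss_weight: float = 1.0,
                    value_loss_weight: float = 1.0,
                    beta: Optional[float] = None,
                    normalize_advantages: bool = True) -> float:
    """Combine policy, value and (optional) entropy terms into one scalar."""
    # Advantage: how much better the observed return was than the critic's estimate.
    advantages = [g - v for g, v in zip(returns, values)]

    # Critic term: mean squared error between returns and value estimates.
    value_loss = fmean([a * a for a in advantages])

    # Actor term: optionally standardise the advantages first. In an autograd
    # framework the advantages here would be treated as constants.
    policy_adv = advantages
    if normalize_advantages:
        mean, std = fmean(advantages), pstdev(advantages)
        policy_adv = [(a - mean) / (std + 1e-8) for a in advantages]
    policy_loss = -fmean([lp * a for lp, a in zip(logprobs, policy_adv)])

    loss = policy_loss_weight * policy_loss + value_loss_weight * value_loss
    if beta is not None:
        # Assumed role of beta: reward higher entropy to encourage exploration.
        loss -= beta * fmean(entropies)
    return loss


loss = a2c_loss_sketch(logprobs=[-1.0, -1.0], values=[0.0, 0.0],
                       returns=[1.0, 3.0], entropies=[0.5, 0.5],
                       beta=0.1, normalize_advantages=False)
```

In a PyTorch training loop, `max_grad_norm` would typically be passed to a gradient-clipping call before the optimizer step; whether this class does so is not stated in the source.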
- class a2c._ActResult(logprobs: torch.Tensor, values: torch.Tensor, actions: torch.Tensor, entropies: torch.Tensor)
- class a2c.A2C(config: A2CConfig)
- _do_train(env: Env, episode_idx: int, **options) → EpisodeInfo
Train the algorithm on the episode. In practice, this method only plays the environment to collect batches.
- Parameters
env – The environment to train on
episode_idx – The index of the training episode
options – Any keyword-based options passed by the client code
- Return type
An instance of EpisodeInfo
- classmethod from_path(config: A2CConfig, path: Path)
Load the A2C model parameters from the given path
- Parameters
config – The configuration of the algorithm
path – The path to load the parameters from
- Return type
An instance of the A2C class
- on_episode(env: Env, episode_idx: int, **options) → EpisodeInfo
Train the algorithm on the episode
- Parameters
env – The environment to train on
episode_idx – The index of the training episode
options – Any keyword-based options passed by the client code
- Return type
An instance of EpisodeInfo
- parameters() → Any
The parameters of the underlying model
- Return type
An array with the model parameters