replay_buffer

class replay_buffer.ReplayBuffer(buffer_size: int)

The ReplayBuffer class models a fixed-size replay buffer. The buffer is backed by a deque from the collections module of Python's standard library, which behaves like a list that can be given a maximum length. When a new element is appended to a deque that is already full, the element at the front is discarded and the new element is added at the end, so new experiences replace the oldest ones. The experiences themselves are tuples of (state1, reward, action, state2, done) that are appended to the deque and are represented by the named tuple ExperienceTuple.
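The eviction behaviour described above can be illustrated with a bare deque. This is a minimal sketch, not the library's implementation: the ExperienceTuple defined here is a hypothetical stand-in whose field names are assumed from the parameters of add(), and the real named tuple may order or name its fields differently.

```python
from collections import deque, namedtuple

# Hypothetical stand-in for the library's ExperienceTuple.
# Field names are an assumption taken from the add() parameters.
ExperienceTuple = namedtuple(
    "ExperienceTuple",
    ["state", "action", "reward", "next_state", "done", "info"],
)

# A deque with maxlen=3 holds at most three experiences.
memory = deque(maxlen=3)

# Append five experiences; the two oldest are discarded automatically.
for step in range(5):
    memory.append(ExperienceTuple(step, 0, 1.0, step + 1, False, {}))

oldest_state = memory[0].state    # front of the deque
newest_state = memory[-1].state   # most recently appended experience
```

After the loop the deque holds only the experiences for steps 2, 3 and 4, which is the "new replaces oldest" behaviour the class description relies on.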

__getitem__(name_attr: str) → List

Return the full batch of the name_attr attribute

Parameters

name_attr (str) – The name of the attribute whose batch values are collected

Return type

A list
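One plausible way to collect a single attribute across all stored experiences is a list comprehension over the deque. The sketch below uses a hypothetical AttributeBuffer class and a hypothetical Experience tuple; both are illustrative assumptions and are not taken from the library's source.

```python
from collections import deque, namedtuple
from typing import Any, List

# Hypothetical experience record; the field names are assumptions.
Experience = namedtuple(
    "Experience", ["state", "action", "reward", "next_state", "done"]
)


class AttributeBuffer:
    """Hypothetical minimal buffer used only to illustrate __getitem__."""

    def __init__(self, buffer_size: int) -> None:
        self._memory = deque(maxlen=buffer_size)

    def add(self, *fields: Any) -> None:
        # Pack the positional fields into one named tuple and store it.
        self._memory.append(Experience(*fields))

    def __len__(self) -> int:
        return len(self._memory)

    def __getitem__(self, name_attr: str) -> List[Any]:
        # Collect the named attribute from every stored experience,
        # oldest first.
        return [getattr(exp, name_attr) for exp in self._memory]


buf = AttributeBuffer(buffer_size=10)
buf.add(0, 1, 0.5, 1, False)
buf.add(1, 0, -1.0, 2, True)
rewards = buf["reward"]
```

With this design, buf["reward"] reads like a column lookup, and an unknown attribute name raises AttributeError from getattr.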

__init__(buffer_size: int)

Constructor

Parameters

buffer_size (int) – The maximum capacity of the buffer

__len__() → int

Return the current size of the internal memory.

add(state: Any, action: Any, reward: Any, next_state: Any, done: Any, info: dict = {}) → None

Add a new experience tuple to the buffer

Parameters
  • state (Any) – The current state

  • action (Any) – The action taken

  • reward (Any) – The reward observed

  • next_state (Any) – The next state observed

  • done (Any) – Whether the episode is done

  • info (dict) – Any other info needed

Return type

None

get_item_as_torch_tensor(name_attr: str) → Tensor

Return a torch.Tensor representation of the named item

Parameters

name_attr (str) – The name of the attribute

Return type

An instance of torch.Tensor

reinitialize() → None

Reinitialize the internal buffer

Return type

None

sample(batch_size: int) → List[ExperienceTuple]

Randomly sample a batch of experiences from memory.

Parameters

batch_size (int) – The number of experiences to sample

Return type

A list of ExperienceTuple
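Random sampling from a deque-backed memory can be sketched with random.sample from the standard library. This is an illustration under assumptions: the documentation does not say whether the library samples with or without replacement, so the hypothetical sample_batch helper below assumes uniform sampling without replacement, and plain integers stand in for ExperienceTuple instances.

```python
import random
from collections import deque
from typing import Any, List


def sample_batch(memory: deque, batch_size: int) -> List[Any]:
    """Hypothetical helper: draw batch_size items uniformly, without replacement.

    random.sample raises ValueError if batch_size exceeds len(memory).
    """
    # Copy to a list first so indexing is cheap for large buffers.
    return random.sample(list(memory), batch_size)


# Integers stand in for stored experiences in this sketch.
memory = deque(range(100), maxlen=100)

random.seed(0)  # fixed seed only to make the example repeatable
batch = sample_batch(memory, 8)
```

Because the draw is without replacement, a caller would typically wait until len(buffer) is at least batch_size before sampling.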