replay_buffer

class replay_buffer.ReplayBuffer(buffer_size: int)

Models a fixed-size replay buffer. The buffer is backed by a deque from Python’s built-in collections library, created with a maximum length. When a new element is appended to a deque that is already full, the oldest element (the one at the front) is discarded and the new element is added at the end, so new experiences replace the oldest ones.

Each experience bundles the state, action, reward, next state and done flag, and is stored in the deque as an instance of the named tuple ExperienceTuple.
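The eviction behaviour described above can be illustrated with the standard library alone. This is a minimal sketch, not the class's actual implementation: the field names and field order of ExperienceTuple below are assumptions borrowed from the arguments of add, and the real named tuple may differ.

```python
from collections import deque, namedtuple

# Assumed field layout; the real ExperienceTuple may be defined differently.
ExperienceTuple = namedtuple(
    "ExperienceTuple",
    ["state", "action", "reward", "next_state", "done", "info"],
)

# A deque created with maxlen=3 drops its oldest entry once a fourth is appended.
memory = deque(maxlen=3)
for step in range(5):
    memory.append(
        ExperienceTuple(
            state=step,
            action=0,
            reward=1.0,
            next_state=step + 1,
            done=False,
            info={},
        )
    )

oldest_state = memory[0].state   # state of the oldest surviving experience
newest_state = memory[-1].state  # state of the most recent experience
print(len(memory), oldest_state, newest_state)
```

After five appends only the last three experiences remain, which is the same replace-the-oldest policy the buffer relies on.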

__getitem__(name_attr: str) → List

Return the full batch of the name_attr attribute

Parameters

name_attr (The name of the attribute whose batch values are collected) –

Return type

A list
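One plausible way to gather a single attribute across all stored experiences is a getattr loop over the deque. The sketch below is only an illustration under that assumption: collect_attribute and the three-field Experience tuple are hypothetical names, not part of the replay_buffer module.

```python
from collections import deque, namedtuple

# Hypothetical stand-in for the library's ExperienceTuple.
Experience = namedtuple("Experience", ["state", "action", "reward"])


def collect_attribute(memory, name_attr):
    """Return the value of field `name_attr` for every stored experience."""
    return [getattr(experience, name_attr) for experience in memory]


memory = deque(maxlen=10)
memory.append(Experience(state=0, action=1, reward=0.5))
memory.append(Experience(state=1, action=0, reward=-1.0))

rewards = collect_attribute(memory, "reward")
actions = collect_attribute(memory, "action")
print(rewards, actions)
```

The returned lists preserve insertion order, so the i-th reward and the i-th action belong to the same experience.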

__init__(buffer_size: int)

Constructor

Parameters: buffer_size (The maximum capacity of the buffer) –

__len__() → int

Return the current size of the internal memory.

add(state: Any, action: Any, reward: Any, next_state: Any, done: Any, info: dict = {}) → None

Add a new experience tuple in the buffer

Parameters

state (The current state) –
action (The action taken) –
reward (The reward observed) –
next_state (The next state observed) –
done (Whether the episode is done) –
info (Any other info needed) –

Return type

None

get_item_as_torch_tensor(name_attr: str) → Tensor

Returns a torch.Tensor representation of the named item.

Parameters: name_attr (The name of the attribute) –
Return type: An instance of torch.Tensor

reinitialize() → None

Reinitialize the internal buffer

Return type: None

sample(batch_size: int) → List[ExperienceTuple]

Randomly sample a batch of experiences from memory.

Parameters: batch_size (The batch size we want to sample) –
Return type: A list of ExperienceTuple
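Random sampling from a deque-backed memory can be sketched with random.sample. The documentation does not say whether the real method samples with or without replacement, or how it handles a batch size larger than the buffer, so both choices below (no replacement, raise ValueError) are assumptions, and sample_batch is a hypothetical helper name.

```python
import random
from collections import deque


def sample_batch(memory, batch_size):
    """Draw `batch_size` distinct experiences uniformly at random."""
    if batch_size > len(memory):
        raise ValueError("batch_size exceeds the number of stored experiences")
    # Copy to a list so random.sample receives a plain sequence.
    return random.sample(list(memory), batch_size)


# Integers stand in for experience tuples to keep the sketch short.
memory = deque(range(100), maxlen=100)
random.seed(0)  # seeded only so repeated runs draw the same batch
batch = sample_batch(memory, 8)
print(batch)
```

Sampling without replacement keeps every experience in a batch distinct, which is a common choice for replay buffers but not something this page confirms.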