Reverb Replay Datasets for TensorFlow

Functions for making TensorFlow datasets for sampling from Reverb replay.

The functions implemented by this module closely resemble acme.datasets.make_reverb_dataset(). Thus, most of the code and docstrings were copied from that function.

Unlike the original, make_reverb_fifo_sampler_dataset() ensures that there is only one worker per iterator for the ReplayDataset. This can matter for agents using a queue, where the order of elements drawn from the dataset is relevant.

Furthermore, make_reverb_rnn_sequence_fifo_sampler_dataset() implements a dataset for recurrent agents that train on unrolled sequences but only require the first recurrent state of each sequence. The function therefore configures the dataset so that each sequence includes only its first recurrent state, which can save a considerable amount of RAM, especially with recurrent cores that have large states, such as a DNC memory.
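To make the saving concrete, the following pure-Python sketch compares the shapes a sampled sequence would have when the recurrent state is stored once per sequence versus once per step. It is only an illustration: sequence_shapes is a hypothetical helper, the per-step shapes and the sequence length of 100 are made-up values, and the module's actual spec manipulation may differ in detail.

```python
from math import prod


def sequence_shapes(step_shapes, sequence_length, first_state_only_keys=("core_state",)):
    """Add a leading time dimension to every entry except those kept once per sequence."""
    return {
        name: tuple(shape) if name in first_state_only_keys else (sequence_length, *shape)
        for name, shape in step_shapes.items()
    }


# Hypothetical per-step shapes; the sizes are assumptions chosen for illustration.
step_shapes = {"observation": (84, 84, 3), "action": (), "core_state": (256, 64)}

# Only the first core_state is kept, so it gets no time dimension.
shapes = sequence_shapes(step_shapes, sequence_length=100)

# For comparison: every entry, including core_state, stored at every step.
naive = sequence_shapes(step_shapes, sequence_length=100, first_state_only_keys=())

# Number of state elements that no longer need to be held per sampled sequence.
saved_elements = prod(naive["core_state"]) - prod(shapes["core_state"])
print(shapes["core_state"], naive["core_state"], saved_elements)
```

Under these assumed sizes, the per-sequence storage for the recurrent state shrinks by a factor equal to the sequence length, while observations and actions keep their leading time dimension.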

Edits made to the original script:

Added the first paragraph of the docstring for make_reverb_fifo_sampler_dataset()
Added the first paragraph of the docstring for make_reverb_rnn_sequence_fifo_sampler_dataset()
Set the default value of the num_parallel_calls argument to 1, to save computational resources (threads).
Passed num_workers_per_iterator=1 to the ReplayDataset constructor in both functions implemented by this module, to ensure compatibility with agents using a queue.
Exposed the deterministic argument of tf.data.Dataset.interleave() as an argument to both functions implemented by this module, with a default value of True. Setting it to False may improve performance at the cost of determinism.
Passed sequence_length=None and emit_timesteps=True to the ReplayDataset constructor in make_reverb_rnn_sequence_fifo_sampler_dataset(), and adjusted the shape of the core_state spec, so that each sequence contains only its first core_state.
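As a rough illustration of what deterministic interleaving guarantees, the pure-Python sketch below mimics round-robin interleaving with a block length of one. It is not how tf.data implements interleaving; replay_stream and round_robin_interleave are hypothetical names, and the endless string streams merely stand in for ReplayDatasets.

```python
from itertools import count, islice


def replay_stream(name):
    """Hypothetical stand-in for one endless ReplayDataset."""
    for step in count():
        yield f"{name}:{step}"


def round_robin_interleave(sources):
    """Yield one element from each source in turn, always in the same source order."""
    iterators = [iter(source) for source in sources]
    while True:
        for iterator in iterators:
            try:
                yield next(iterator)
            except StopIteration:
                # Stop cleanly if a source is unexpectedly finite.
                return


streams = [replay_stream("a"), replay_stream("b")]
first_six = list(islice(round_robin_interleave(streams), 6))
print(first_six)  # ['a:0', 'b:0', 'a:1', 'b:1', 'a:2', 'b:2']
```

With a fixed visiting order, the output sequence is reproducible from run to run. A non-deterministic interleave may instead emit whichever element is ready first, which is why it can be faster but gives up that reproducibility.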

ftw.datasets.reverb.make_reverb_fifo_sampler_dataset(client, environment_spec: Optional[acme.specs.EnvironmentSpec] = None, batch_size: Optional[int] = None, prefetch_size: Optional[int] = None, sequence_length: Optional[int] = None, extra_spec: Union[dm_env.specs.Array, Iterable[NestedSpec], Mapping[Any, NestedSpec], None] = None, transition_adder: bool = False, table: str = 'priority_table', convert_zero_size_to_none: bool = False, num_parallel_calls: int = 1, using_deprecated_adder: bool = False, deterministic: bool = True) → tensorflow.python.data.ops.dataset_ops.DatasetV2

Makes a TensorFlow dataset.

Ensures that there is only one worker per iterator for the ReplayDataset, since this might be of importance for agents using a queue, where the order of elements drawn from the dataset is relevant.

We need to explicitly specify up-front the shapes and dtypes of all the Tensors that will be drawn from the dataset. We require that the action and observation specs are given. The reward and discount specs use reasonable defaults if not given. We can also specify a boolean transition_adder which, if true, will specify the spec as transitions rather than timesteps (i.e. they have a trailing state). Additionally, an extra_spec parameter can be given which specifies “extra data”.

Args:

client: A TFClient (or list of TFClients) for talking to a replay server.
environment_spec: The environment’s spec.
batch_size: Optional. If specified, the dataset returned will combine consecutive elements into batches. This argument is also used to determine the cycle_length for tf.data.Dataset.interleave; if unspecified, the cycle length is set to tf.data.experimental.AUTOTUNE.
prefetch_size: How many batches to prefetch in the pipeline.
sequence_length: Optional. If specified, consecutive elements of each interleaved dataset will be combined into sequences.
extra_spec: Optional. A possibly nested structure of specs for extras. Note that whether or not this is present changes the format of the data.
transition_adder: Optional, defaults to False; whether the adder used with this dataset adds transitions.
table: The name of the table to sample from replay (defaults to adders.DEFAULT_PRIORITY_TABLE).
convert_zero_size_to_none: When True, this will convert specs with shapes 0 to None. This is useful for datasets that contain elements with different shapes, for example GraphsTuple from the graph_net library. For example, specs.Array((0, 5), tf.float32) will correspond to examples with shape tf.TensorShape([None, 5]).
num_parallel_calls: Number of parallel threads creating ReplayDatasets to interleave.
using_deprecated_adder: True if the adder used to generate the data is from acme/adders/reverb/deprecated.
deterministic: Whether to use deterministic interleaving of the dataset.

Returns:

A tf.data.Dataset that streams data from the replay server.
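The effect of convert_zero_size_to_none on a single shape can be sketched in plain Python. This is a minimal illustration under the assumption that every dimension of size 0 is mapped to None; zero_dims_to_none is a hypothetical helper and not part of this module.

```python
def zero_dims_to_none(shape):
    """Replace every dimension of size 0 with None (meaning: size not fixed)."""
    return tuple(None if dim == 0 else dim for dim in shape)


# Mirrors the example from the argument description: (0, 5) becomes (None, 5).
print(zero_dims_to_none((0, 5)))  # (None, 5)

# Shapes without zero-sized dimensions are left unchanged.
print(zero_dims_to_none((3, 5)))  # (3, 5)
```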

ftw.datasets.reverb.make_reverb_rnn_sequence_fifo_sampler_dataset(client, environment_spec: Optional[acme.specs.EnvironmentSpec] = None, batch_size: Optional[int] = None, prefetch_size: Optional[int] = None, sequence_length: Optional[int] = None, extra_spec: Union[dm_env.specs.Array, Iterable[NestedSpec], Mapping[Any, NestedSpec], None] = None, transition_adder: bool = False, table: str = 'priority_table', convert_zero_size_to_none: bool = False, num_parallel_calls: int = 1, using_deprecated_adder: bool = False, deterministic: bool = True) → tensorflow.python.data.ops.dataset_ops.DatasetV2