Reference Generators

Reference Generator Base Class

class gym_electric_motor.core.ReferenceGenerator

The abstract base class for reference generators in gym electric motor environments.

reference_space:

Space of reference observations as defined in the Farama Gymnasium Toolbox.

The reference generator is called twice per step.

Call of get_reference():

Returns the reference array, which has the same shape as the state array. It contains the reference values for the currently referenced state variables and a default value (e.g. zero) for non-referenced variables. This reference array is used to calculate rewards.

Example:

reference_array = np.array([0.8, 0.0, 0.0, 0.0, 0.0])

state_variables=['omega', 'torque', 'i', 'u', 'u_sup']

Here, the state consists of five quantities but only 'omega' is referenced during control.
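As a minimal sketch of how such a reference array could be assembled, the snippet below fills a state-shaped array with a default value and overwrites the referenced entries. The helper build_reference_array is hypothetical and not part of gym_electric_motor; only the state names come from the example above.

```python
import numpy as np

# State layout taken from the example above.
state_variables = ['omega', 'torque', 'i', 'u', 'u_sup']


def build_reference_array(state_variables, references, default=0.0):
    """Hypothetical helper: return an array shaped like the state.

    Referenced states receive their reference value, all other entries
    receive the default value.
    """
    reference_array = np.full(len(state_variables), default, dtype=float)
    for name, value in references.items():
        reference_array[state_variables.index(name)] = value
    return reference_array


# Only 'omega' is referenced, so the remaining four entries stay at zero.
reference_array = build_reference_array(state_variables, {'omega': 0.8})
```

The resulting array has one entry per state variable, which keeps it directly comparable to the state array.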

Call of get_reference_observation():

Returns the reference observation, which is shown to the agent. In general, any shape and content is valid, but all values must lie within the declared reference space. For example, the reference observation may contain the future reference values of the next n steps.

Example:

reference_observation = np.array([0.8, 0.6, 0.4])

This reference observation may be valid for an omega-controlled environment that shows the agent not only the reference for the next time step omega_(t+1)=0.8 but also omega_(t+2)=0.6 and omega_(t+3)=0.4.
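A preview like this could be cut from a precomputed trajectory as sketched below. The function reference_preview, the trajectory values, and the bounds low and high are assumptions made for illustration; holding the last value at the end of the trajectory is one possible choice, not necessarily what the library does.

```python
import numpy as np


def reference_preview(trajectory, k, n):
    """Hypothetical helper: reference values for steps k+1 ... k+n.

    Indices beyond the end of the trajectory are clipped, so the last
    value is repeated there.
    """
    last = len(trajectory) - 1
    indices = np.minimum(np.arange(k + 1, k + 1 + n), last)
    return trajectory[indices]


# Assumed normalized omega trajectory; index 0 is the current step.
omega_trajectory = np.array([1.0, 0.8, 0.6, 0.4, 0.2])

# Assumed bounds of a box-shaped reference space.
low, high = -1.0, 1.0

reference_observation = reference_preview(omega_trajectory, k=0, n=3)

# Every entry must stay within the declared reference space.
within_space = bool(
    np.all((reference_observation >= low) & (reference_observation <= high))
)
```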

close()

Called by the environment when it is deleted, so that the reference generator can close files, store logs, etc.

get_reference(state, *_, **__)

Returns the reference array of the current time step.

The reference array must have the same shape as the state array. For referenced states, the reference value is passed. For unreferenced states, a default value (e.g. zero) can be set in the reference array.

Parameters:

state (ndarray(float)) – Current state array of the environment.

Returns:

Current reference array.

Return type:

ndarray(float)

get_reference_observation(state, *_, **__)

Returns the reference observation for the next time step. This observation must lie within the reference space.

Parameters:

state (ndarray(float)) – Current state array of the environment.

Returns:

Observation for the next reference time step.

Return type:

value in reference_space

property reference_names

Returns: reference_names(list(str)): A list containing all names of the referenced states in the reference observation.

property referenced_states

Returns: ndarray(bool): Boolean array with the same length as state_variables, indicating which states are referenced.
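The two properties can be related to each other as in the following sketch, which uses plain NumPy arrays instead of a real generator. The state values are made up, and using the mask to pick out the control error is only an illustration of how a reward calculation might consume it.

```python
import numpy as np

state_variables = ['omega', 'torque', 'i', 'u', 'u_sup']

# Boolean mask with one entry per state variable; only 'omega' is referenced.
referenced_states = np.array([True, False, False, False, False])

# Names of the referenced states, derived from the mask.
reference_names = [
    name for name, is_referenced in zip(state_variables, referenced_states)
    if is_referenced
]

# Assumed example values for the current state and reference.
state = np.array([0.75, 0.1, 0.2, 0.5, 1.0])
reference_array = np.array([0.8, 0.0, 0.0, 0.0, 0.0])

# The mask restricts the error to the referenced entries, so the default
# zeros for unreferenced states do not contribute.
control_error = np.abs(reference_array - state)[referenced_states]
```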

reset(initial_state=None, initial_reference=None)

Reset of references for a new episode.

Parameters:
  • initial_state (ndarray(float)) – The initial state array of the environment.

  • initial_reference (ndarray(float)) – If not None: A desired initial reference array.

Returns:

reference_array(ndarray(float)): The reference array at time step 0.

reference_observation(value in reference_space): The reference observation for the next time step.

trajectories(dict(list(float))): If available, generated trajectories for the visualization can be passed here. Otherwise, None is returned.
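The three return values of reset() can be illustrated with a small stand-in class. ConstantReferenceSketch is hypothetical and does not inherit from the real ReferenceGenerator; it only mimics the documented return structure for a constant reference on a single state.

```python
import numpy as np


class ConstantReferenceSketch:
    """Hypothetical stand-in that mimics the documented reset() contract."""

    def __init__(self, n_states, referenced_index, value):
        self._n_states = n_states
        self._index = referenced_index
        self._value = value

    def reset(self, initial_state=None, initial_reference=None):
        if initial_reference is not None:
            # A desired initial reference array takes precedence.
            reference_array = np.asarray(initial_reference, dtype=float)
        else:
            reference_array = np.zeros(self._n_states)
            reference_array[self._index] = self._value
        # The agent only sees the referenced entry in this sketch.
        reference_observation = reference_array[[self._index]]
        # No precomputed trajectories are available for visualization.
        trajectories = None
        return reference_array, reference_observation, trajectories


generator = ConstantReferenceSketch(n_states=5, referenced_index=0, value=0.8)
reference_array, reference_observation, trajectories = generator.reset(
    initial_state=np.zeros(5)
)
```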

set_modules(physical_system)

Announcement of the PhysicalSystem to the ReferenceGenerator.

The environment announces the physical system to the ReferenceGenerator during its initialization. Subclasses should use this method to store all information they need from the physical system.

Parameters:

physical_system (PhysicalSystem) – The physical system of the environment.
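A subclass might use this hook roughly as sketched below. The class SetModulesSketch is hypothetical, and the physical system is replaced by a SimpleNamespace with an assumed state_names attribute; check the PhysicalSystem documentation for the attributes actually available.

```python
from types import SimpleNamespace

import numpy as np


class SetModulesSketch:
    """Hypothetical stand-in showing what set_modules() might store."""

    def __init__(self, reference_state='omega'):
        self._reference_state = reference_state
        self._referenced_states = None

    def set_modules(self, physical_system):
        # Assumption: the physical system exposes its state names in order.
        names = list(physical_system.state_names)
        self._referenced_states = np.array(
            [name == self._reference_state for name in names]
        )

    @property
    def referenced_states(self):
        return self._referenced_states


# Stand-in object; a real environment would pass its PhysicalSystem here.
physical_system = SimpleNamespace(
    state_names=['omega', 'torque', 'i', 'u', 'u_sup']
)

generator = SetModulesSketch(reference_state='omega')
generator.set_modules(physical_system)
```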