fragile.optimize.policy
Module Contents#
Classes#
ParticleIntegration | The Gaussian policy samples actions from a Gaussian distribution.
ESModel | The ESModel implements an evolutionary strategy policy.
CMAES | Implementation of CMAES algorithm from https://en.wikipedia.org/wiki/CMA-ES.
- class fragile.optimize.policy.ParticleIntegration(loc=0.0, scale=1.0, **kwargs)[source]#
Bases: fragile.core.policy.Gaussian
The Gaussian policy samples actions from a Gaussian distribution.
Inherits from ContinuousPolicy.
- default_outputs = ['actions', 'velocities']#
- default_inputs#
- property param_dict#
Return the dictionary defining all the data attributes that the component requires.
- Return type
fragile.core.typing.StateDict
- act(inplace=True, **kwargs)[source]#
Calculate SwarmState containing the data needed to interact with the environment.
- Parameters
inplace (bool) –
- Return type
Union[None, fragile.core.typing.StateData]
- select_actions(**kwargs)[source]#
Select actions by sampling from a Gaussian distribution.
- Parameters
**kwargs – Additional keyword arguments required for selecting actions.
- Returns
Selected actions as a tensor.
- Return type
Tensor
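As an illustration of what sampling actions from a Gaussian distribution means here, the following is a minimal sketch in plain Python. It does not use the fragile API: the function name `sample_gaussian_actions` and the `n_walkers`, `n_dims` and `seed` parameters are hypothetical, and only `loc` and `scale` mirror the constructor arguments documented above.

```python
import random


def sample_gaussian_actions(n_walkers, n_dims, loc=0.0, scale=1.0, seed=None):
    """Draw one action vector per walker, each coordinate from N(loc, scale**2).

    Hypothetical helper for illustration only; not part of fragile.
    """
    rng = random.Random(seed)
    return [
        [rng.gauss(loc, scale) for _ in range(n_dims)]
        for _ in range(n_walkers)
    ]


# One row per walker, one column per action dimension.
actions = sample_gaussian_actions(n_walkers=4, n_dims=3, loc=0.0, scale=1.0, seed=0)
```

A real policy would presumably return a tensor rather than nested lists; lists are used here only to keep the sketch free of dependencies.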
- reset(inplace=True, root_walker=None, states=None, **kwargs)[source]#
Reset the internal state of the PolicyAPI.
- Parameters
inplace (bool, optional) – If True, updates the swarm state with the selected actions. If False, returns the selected actions. Defaults to True.
root_walker (Optional[StateData], optional) – Set the internal state of the PolicyAPI to this value. Defaults to None.
states (Optional[StateData], optional) – Set the internal state of the PolicyAPI to this value. Defaults to None.
**kwargs – Other parameters required to reset the component.
- Returns
None if inplace is True. Otherwise, a StateData dictionary containing the selected actions.
- Return type
Union[None, StateData]
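The inplace contract described above (update internal state and return None, or leave the state alone and return the data) can be sketched with a toy class. `ToyPolicy`, its `state` attribute and the `n_walkers` argument are hypothetical stand-ins, not fragile classes or parameters.

```python
from typing import Dict, List, Optional


class ToyPolicy:
    """Hypothetical stand-in illustrating the inplace contract; not a fragile class."""

    def __init__(self) -> None:
        self.state: Dict[str, List[float]] = {}

    def _select(self, n_walkers: int) -> Dict[str, List[float]]:
        # Placeholder selection: one zero action per walker.
        return {"actions": [0.0] * n_walkers}

    def reset(self, inplace: bool = True, n_walkers: int = 2) -> Optional[Dict[str, List[float]]]:
        data = self._select(n_walkers)
        if inplace:
            # Store the data internally and return nothing.
            self.state.update(data)
            return None
        # Leave the internal state untouched and hand the data back.
        return data


policy = ToyPolicy()
policy.reset(inplace=True, n_walkers=3)           # updates policy.state
returned = policy.reset(inplace=False, n_walkers=2)  # returns a dictionary
```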
- class fragile.optimize.policy.ESModel(mutation=0.5, recombination=0.7, random_step_prob=0.1, *args, **kwargs)[source]#
Bases: fragile.core.policy.Gaussian
The ESModel implements an evolutionary strategy policy.
It randomly mutates some of the coordinates of the best solution found by substituting them with the corresponding coordinates of a proposal solution. The proposal solution is the difference between two random permutations of the best solution found.
It applies a Gaussian perturbation with a probability given by mutation.
- sample(batch_size, model_states=None, env_states=None, walkers_states=None, **kwargs)[source]#
Calculate the corresponding data to interact with the Environment and store it in model states.
- Parameters
batch_size (int) – Number of new points to be sampled.
model_states – SwarmState corresponding to the model data.
env_states – SwarmState corresponding to the environment data.
walkers_states – SwarmState corresponding to the walkers data.
kwargs – Passed to the Critic, if any.
- Returns
Tuple containing a tensor with the sampled actions and the new model states variable.
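The mutation scheme described for ESModel can be sketched as follows. This is one possible reading of the description, not the library's implementation: the function `es_mutate`, the per-coordinate probability `substitute_prob` and the `noise_scale` parameter are assumptions introduced for illustration, and the way `mutation` gates the Gaussian perturbation is inferred from the text above.

```python
import random


def es_mutate(best, substitute_prob=0.5, mutation=0.5, noise_scale=1.0, rng=None):
    """Return a mutated copy of `best` (a list of floats).

    Hypothetical sketch: `substitute_prob` and `noise_scale` are assumed
    parameters, not documented ESModel arguments.
    """
    rng = rng or random.Random()
    # Proposal: difference between two random permutations of the best solution.
    perm_a = rng.sample(best, len(best))
    perm_b = rng.sample(best, len(best))
    proposal = [a - b for a, b in zip(perm_a, perm_b)]

    candidate = []
    for current, proposed in zip(best, proposal):
        # Substitute some coordinates with the proposal's coordinates.
        value = proposed if rng.random() < substitute_prob else current
        # Apply a Gaussian perturbation with probability `mutation`.
        if rng.random() < mutation:
            value += rng.gauss(0.0, noise_scale)
        candidate.append(value)
    return candidate


best_solution = [1.0, 2.0, 3.0, 4.0]
new_point = es_mutate(best_solution, rng=random.Random(0))
```

Because the proposal is a difference of two permutations of the same vector, its coordinates always sum to zero, which gives a quick sanity check for this step.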
- class fragile.optimize.policy.CMAES(sigma, virtual_reward_fitness=False, *args, **kwargs)[source]#
Bases: fragile.core.policy.Gaussian
Implementation of CMAES algorithm from https://en.wikipedia.org/wiki/CMA-ES.
- sample(batch_size, model_states=None, env_states=None, walkers_states=None, **kwargs)[source]#
Calculate the corresponding data to interact with the Environment and store it in model states.
- Parameters
batch_size (int) – Number of new points to be sampled.
model_states – SwarmState corresponding to the model data.
env_states – SwarmState corresponding to the environment data.
walkers_states – SwarmState corresponding to the walkers data.
kwargs – Passed to the Critic, if any.
- Returns
Tuple containing a tensor with the sampled actions and the new model states variable.
- reset(batch_size=1, model_states=None, env_states=None, *args, **kwargs)[source]#
Return a new blank State for a CMAES instance, and a valid prediction based on that new state.
- Parameters
batch_size (int) – Number of walkers in the new model State.
model_states – StatesModel corresponding to the model data.
env_states – StatesEnv containing the environment data.
*args – Passed to predict.
**kwargs – Passed to predict.
- Returns
New model states containing sampled data.
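For readers unfamiliar with CMA-ES, the sketch below shows only its sampling and mean-update steps, with the covariance matrix fixed to the identity and the step size `sigma` held constant. It is a deliberately simplified illustration, not the CMAES class documented here: `cmaes_like_step`, `sphere`, `pop_size` and `fitness` are hypothetical names, and the assumption that `sigma` plays the role of the global step size is inferred from the constructor signature.

```python
import math
import random


def sphere(x):
    """Toy objective to minimise: sum of squares."""
    return sum(v * v for v in x)


def cmaes_like_step(mean, sigma, pop_size, fitness, rng):
    """One simplified generation: sample, rank, recombine.

    Assumes an identity covariance and a fixed sigma, so the covariance
    and step-size adaptation of full CMA-ES are intentionally omitted.
    """
    n = len(mean)
    # Sample candidates x_k = mean + sigma * z_k, with z_k ~ N(0, I).
    population = [
        [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        for _ in range(pop_size)
    ]
    # Rank candidates from best (lowest fitness) to worst.
    population.sort(key=fitness)
    mu = pop_size // 2
    # Log-decreasing recombination weights, normalised to sum to one.
    raw = [math.log(mu + 0.5) - math.log(i) for i in range(1, mu + 1)]
    total = sum(raw)
    weights = [w / total for w in raw]
    # New mean: weighted average of the mu best candidates.
    new_mean = [
        sum(w * ind[d] for w, ind in zip(weights, population[:mu]))
        for d in range(n)
    ]
    return new_mean, population


rng = random.Random(0)
start = [3.0, 3.0]
mean = list(start)
for _ in range(60):
    mean, _ = cmaes_like_step(mean, sigma=0.3, pop_size=12, fitness=sphere, rng=rng)
```

Full CMA-ES additionally adapts the covariance matrix and the step size from the ranked samples; keeping both fixed makes the sketch short at the cost of slower and noisier convergence.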