fragile.optimize.policy#

Module Contents#

Classes#

ParticleIntegration

The Gaussian policy samples actions from a Gaussian distribution.

ESModel

The ESModel implements an evolutionary strategy policy.

CMAES

Implementation of CMAES algorithm from https://en.wikipedia.org/wiki/CMA-ES.

class fragile.optimize.policy.ParticleIntegration(loc=0.0, scale=1.0, **kwargs)[source]#

Bases: fragile.core.policy.Gaussian

The Gaussian policy samples actions from a Gaussian distribution.

Inherits from ContinuousPolicy.

Parameters
  • loc (float) –

  • scale (float) –
default_outputs = ['actions', 'velocities']#
default_inputs#
property param_dict#

Return the dictionary defining all the data attributes that the component requires.

Return type

fragile.core.typing.StateDict

act(inplace=True, **kwargs)[source]#

Calculate SwarmState containing the data needed to interact with the environment.

Parameters

inplace (bool) –

Return type

Union[None, fragile.core.typing.StateData]

select_actions(**kwargs)[source]#

Select actions by sampling from a Gaussian distribution.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Returns

Selected actions as a tensor.

Return type

Tensor
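The sampling step behind select_actions can be illustrated with a standalone NumPy sketch. This is not the fragile API: the function name, the explicit n_dims argument, and the use of a NumPy generator are assumptions for illustration; the real policy reads its parameters from the swarm state and returns a backend tensor.

```python
import numpy as np

def select_actions_sketch(batch_size, n_dims, loc=0.0, scale=1.0, rng=None):
    """Sample a batch of actions from a Gaussian distribution.

    Standalone illustration of the Gaussian sampling step: one action
    vector of ``n_dims`` coordinates per walker in the batch.
    """
    rng = rng or np.random.default_rng()
    return rng.normal(loc=loc, scale=scale, size=(batch_size, n_dims))

actions = select_actions_sketch(batch_size=4, n_dims=2, loc=0.0, scale=1.0)
print(actions.shape)  # (4, 2)
```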

reset(inplace=True, root_walker=None, states=None, **kwargs)[source]#

Reset the internal state of the PolicyAPI.

Parameters
  • inplace (bool, optional) – If True, updates the swarm state with the selected actions. If False, returns the selected actions. Defaults to True.

  • root_walker (Optional[StateData], optional) – Set the internal state of the PolicyAPI to this value. Defaults to None.

  • states (Optional[StateData], optional) – Set the internal state of the PolicyAPI to this value. Defaults to None.

  • **kwargs – Other parameters required to reset the component.

Returns

None if inplace is True. Otherwise, a StateData dictionary containing the selected actions.

Return type

Union[None, StateData]

class fragile.optimize.policy.ESModel(mutation=0.5, recombination=0.7, random_step_prob=0.1, *args, **kwargs)[source]#

Bases: fragile.core.policy.Gaussian

The ESModel implements an evolutionary strategy policy.

It randomly mutates some of the coordinates of the best solution found by substituting them with a proposal solution. This proposal is the difference between two random permutations of the best solution found.

It applies a Gaussian perturbation with a probability given by mutation.

Parameters
  • mutation (float) –

  • recombination (float) –

  • random_step_prob (float) –
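The proposal mechanism described above can be sketched in standalone NumPy. This is a hedged illustration, not the fragile implementation: the function name is hypothetical, and the exact way the real policy mixes the proposal, the recombination mask, and the Gaussian perturbation may differ in detail.

```python
import numpy as np

def es_proposal_sketch(best, mutation=0.5, recombination=0.7, scale=1.0, rng=None):
    """Evolutionary-strategy proposal around the best solution found.

    The proposal is the difference between two random permutations of
    ``best``. Each coordinate of ``best`` is replaced by the proposal
    with probability ``recombination``, and a Gaussian perturbation is
    added with probability ``mutation``.
    """
    rng = rng or np.random.default_rng()
    proposal = rng.permutation(best) - rng.permutation(best)
    # Substitute a random subset of coordinates with the proposal.
    recombine = rng.random(best.shape) < recombination
    new = np.where(recombine, proposal, best)
    # Apply a Gaussian perturbation, coordinate-wise, with prob. ``mutation``.
    perturb = rng.random(best.shape) < mutation
    return new + perturb * rng.normal(scale=scale, size=best.shape)
```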

sample(batch_size, model_states=None, env_states=None, walkers_states=None, **kwargs)[source]#

Calculate the corresponding data to interact with the Environment and store it in model states.

Parameters
  • batch_size (int) – Number of new points to be sampled.

  • model_states – SwarmState corresponding to the model data.

  • env_states – SwarmState corresponding to the environment data.

  • walkers_states – SwarmState corresponding to the walkers data.

  • kwargs – Passed to the Critic if any.

Returns

Tuple containing a tensor with the sampled actions and the new model states variable.

class fragile.optimize.policy.CMAES(sigma, virtual_reward_fitness=False, *args, **kwargs)[source]#

Bases: fragile.core.policy.Gaussian

Implementation of CMAES algorithm from https://en.wikipedia.org/wiki/CMA-ES.

Parameters
  • sigma (float) –

  • virtual_reward_fitness (bool) –
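The sampling step of CMA-ES can be sketched in standalone NumPy: candidates are drawn from a multivariate normal N(m, sigma² C), realized through the eigendecomposition C = B diag(D²) Bᵀ. This is an illustration only; the function name and arguments are hypothetical, and the path and covariance adaptation handled by the private _update_evolution_paths/_adapt_* methods below is omitted.

```python
import numpy as np

def cmaes_sample_sketch(mean, sigma, cov, batch_size, rng=None):
    """One CMA-ES sampling step: draw candidates from N(mean, sigma^2 * C).

    ``cov`` is the current covariance matrix C. Its eigendecomposition
    C = B diag(D^2) B^T gives the sampling transform x = m + sigma * B D z,
    with z a standard normal vector. Adaptation of C and sigma is omitted.
    """
    rng = rng or np.random.default_rng()
    eigvals, eigvecs = np.linalg.eigh(cov)   # C = B diag(D^2) B^T
    d = np.sqrt(np.maximum(eigvals, 0.0))    # D: std devs along eigen axes
    z = rng.standard_normal((batch_size, mean.size))
    return mean + sigma * (z * d) @ eigvecs.T
```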

sample(batch_size, model_states=None, env_states=None, walkers_states=None, **kwargs)[source]#

Calculate the corresponding data to interact with the Environment and store it in model states.

Parameters
  • batch_size (int) – Number of new points to be sampled.

  • model_states – SwarmState corresponding to the model data.

  • env_states – SwarmState corresponding to the environment data.

  • walkers_states – SwarmState corresponding to the walkers data.

  • kwargs – Passed to the Critic if any.

Returns

Tuple containing a tensor with the sampled actions and the new model states variable.

_update_evolution_paths(actions)[source]#
_adapt_covariance_matrix(actions)[source]#
_adapt_sigma()[source]#
_cov_matrix_diagonalization()[source]#
reset(batch_size=1, model_states=None, env_states=None, *args, **kwargs)[source]#

Return a new blank State for a CMAES instance, and a valid prediction based on that new state.

Parameters
  • batch_size (int) – Number of walkers in the new model State.

  • model_states – StatesModel corresponding to the model data.

  • env_states – StatesEnv containing the environment data.

  • *args – Passed to predict.

  • **kwargs – Passed to predict.

Returns

New model states containing sampled data.

_sample_actions()[source]#
Return type

numpy.ndarray

_init_algorithm_params(batch_size)[source]#