fragile.optimize.policy#

Module Contents#

Classes#

ParticleIntegration

The Gaussian policy samples actions from a Gaussian distribution.

ESModel

The ESModel implements an evolutionary strategy policy.

CMAES

Implementation of CMAES algorithm from https://en.wikipedia.org/wiki/CMA-ES.

class fragile.optimize.policy.ParticleIntegration(loc=0.0, scale=1.0, **kwargs)[source]#

Bases: fragile.core.policy.Gaussian

The Gaussian policy samples actions from a Gaussian distribution.

Inherits from ContinuousPolicy.

Parameters
  • loc (float) –

  • scale (float) –
default_outputs = ['actions', 'velocities']#
default_inputs#
property param_dict#

Return the dictionary defining all the data attributes that the component requires.

Return type

fragile.core.typing.StateDict

act(inplace=True, **kwargs)[source]#

Calculate SwarmState containing the data needed to interact with the environment.

Parameters

inplace (bool) –

Return type

Union[None, fragile.core.typing.StateData]

select_actions(**kwargs)[source]#

Select actions by sampling from a Gaussian distribution.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Returns

Selected actions as a tensor.

Return type

Tensor
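The sampling step behind select_actions can be illustrated with a standalone NumPy sketch. This is not the fragile API: the function name, the explicit n_dims argument, and the use of a NumPy generator are assumptions for illustration; the real policy reads its parameters from the swarm state and returns a backend tensor.

```python
import numpy as np

def select_actions_sketch(batch_size, n_dims, loc=0.0, scale=1.0, rng=None):
    """Sample a batch of actions from a Gaussian distribution.

    Standalone illustration of the Gaussian sampling step: one action
    vector of ``n_dims`` coordinates per walker in the batch.
    """
    rng = rng or np.random.default_rng()
    return rng.normal(loc=loc, scale=scale, size=(batch_size, n_dims))

actions = select_actions_sketch(batch_size=4, n_dims=2, loc=0.0, scale=1.0)
print(actions.shape)  # (4, 2)
```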

reset(inplace=True, root_walker=None, states=None, **kwargs)[source]#

Reset the internal state of the PolicyAPI.

Parameters
  • inplace (bool, optional) – If True, updates the swarm state with the selected actions. If False, returns the selected actions. Defaults to True.

  • root_walker (Optional[StateData], optional) – Set the internal state of the PolicyAPI to this value. Defaults to None.

  • states (Optional[StateData], optional) – Set the internal state of the PolicyAPI to this value. Defaults to None.

  • **kwargs – Other parameters required to reset the component.

Returns

None if inplace is True. Otherwise, a StateData dictionary containing the selected actions.

Return type

Union[None, StateData]

class fragile.optimize.policy.ESModel(mutation=0.5, recombination=0.7, random_step_prob=0.1, *args, **kwargs)[source]#

Bases: fragile.core.policy.Gaussian

The ESModel implements an evolutionary strategy policy.

It randomly mutates some of the coordinates of the best solution found by substituting them with a proposal solution. This proposal is the difference between two random permutations of the best solution found.

It applies a Gaussian perturbation with a probability given by mutation.

Parameters
  • mutation (float) –

  • recombination (float) –

  • random_step_prob (float) –
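The proposal mechanism described above can be sketched in standalone NumPy. This is a hedged illustration, not the fragile implementation: the function name is hypothetical, and the exact way the real policy mixes the proposal, the recombination mask, and the Gaussian perturbation may differ in detail.

```python
import numpy as np

def es_proposal_sketch(best, mutation=0.5, recombination=0.7, scale=1.0, rng=None):
    """Evolutionary-strategy proposal around the best solution found.

    The proposal is the difference between two random permutations of
    ``best``. Each coordinate of ``best`` is replaced by the proposal
    with probability ``recombination``, and a Gaussian perturbation is
    added with probability ``mutation``.
    """
    rng = rng or np.random.default_rng()
    proposal = rng.permutation(best) - rng.permutation(best)
    # Substitute a random subset of coordinates with the proposal.
    recombine = rng.random(best.shape) < recombination
    new = np.where(recombine, proposal, best)
    # Apply a Gaussian perturbation, coordinate-wise, with prob. ``mutation``.
    perturb = rng.random(best.shape) < mutation
    return new + perturb * rng.normal(scale=scale, size=best.shape)
```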

sample(batch_size, model_states=None, env_states=None, walkers_states=None, **kwargs)[source]#

Calculate the corresponding data to interact with the Environment and store it in model states.

Parameters
  • batch_size (int) – Number of new points to be sampled.

  • model_states – SwarmState corresponding to the model data.

  • env_states – SwarmState corresponding to the environment data.

  • walkers_states – SwarmState corresponding to the walkers data.

  • kwargs – Passed to the Critic if any.

Returns

Tuple containing a tensor with the sampled actions and the new model states variable.

class fragile.optimize.policy.CMAES(sigma, virtual_reward_fitness=False, *args, **kwargs)[source]#

Bases: fragile.core.policy.Gaussian

Implementation of CMAES algorithm from https://en.wikipedia.org/wiki/CMA-ES.

Parameters
  • sigma (float) –

  • virtual_reward_fitness (bool) –
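The sampling step of CMA-ES can be sketched in standalone NumPy: candidates are drawn from a multivariate normal N(m, sigma² C), realized through the eigendecomposition C = B diag(D²) Bᵀ. This is an illustration only; the function name and arguments are hypothetical, and the path and covariance adaptation handled by the private _update_evolution_paths/_adapt_* methods below is omitted.

```python
import numpy as np

def cmaes_sample_sketch(mean, sigma, cov, batch_size, rng=None):
    """One CMA-ES sampling step: draw candidates from N(mean, sigma^2 * C).

    ``cov`` is the current covariance matrix C. Its eigendecomposition
    C = B diag(D^2) B^T gives the sampling transform x = m + sigma * B D z,
    with z a standard normal vector. Adaptation of C and sigma is omitted.
    """
    rng = rng or np.random.default_rng()
    eigvals, eigvecs = np.linalg.eigh(cov)   # C = B diag(D^2) B^T
    d = np.sqrt(np.maximum(eigvals, 0.0))    # D: std devs along eigen axes
    z = rng.standard_normal((batch_size, mean.size))
    return mean + sigma * (z * d) @ eigvecs.T
```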

sample(batch_size, model_states=None, env_states=None, walkers_states=None, **kwargs)[source]#

Calculate the corresponding data to interact with the Environment and store it in model states.

Parameters
  • batch_size (int) – Number of new points to be sampled.

  • model_states – SwarmState corresponding to the model data.

  • env_states – SwarmState corresponding to the environment data.

  • walkers_states – SwarmState corresponding to the walkers data.

  • kwargs – Passed to the Critic if any.

Returns

Tuple containing a tensor with the sampled actions and the new model states variable.

_update_evolution_paths(actions)[source]#
_adapt_covariance_matrix(actions)[source]#
_adapt_sigma()[source]#
_cov_matrix_diagonalization()[source]#
reset(batch_size=1, model_states=None, env_states=None, *args, **kwargs)[source]#

Return a new blank State for a CMAES instance, and a valid prediction based on that new state.

Parameters
  • batch_size (int) – Number of walkers in the new model State.

  • model_states – StatesModel corresponding to the model data.

  • env_states – StatesEnv containing the environment data.

  • *args – Passed to predict.

  • **kwargs – Passed to predict.

Returns

New model states containing sampled data.

_sample_actions()[source]#
Return type

numpy.ndarray

_init_algorithm_params(batch_size)[source]#