:py:mod:`fragile.optimize.policy`
=================================

.. py:module:: fragile.optimize.policy


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   fragile.optimize.policy.ParticleIntegration
   fragile.optimize.policy.ESModel
   fragile.optimize.policy.CMAES


.. py:class:: ParticleIntegration(loc = 0.0, scale = 1.0, **kwargs)

   Bases: :py:obj:`fragile.core.policy.Gaussian`

   The Gaussian policy samples actions from a Gaussian distribution.

   Inherits from :class:`ContinuousPolicy`.

   .. py:attribute:: default_outputs
      :annotation: = ['actions', 'velocities']

   .. py:attribute:: default_inputs

   .. py:method:: param_dict()
      :property:

      Return the dictionary defining all the data attributes that the component requires.

   .. py:method:: act(inplace = True, **kwargs)

      Calculate a SwarmState containing the data needed to interact with the environment.

   .. py:method:: select_actions(**kwargs)

      Select actions by sampling from a Gaussian distribution.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

      :returns: Selected actions as a tensor.
      :rtype: Tensor

   .. py:method:: reset(inplace = True, root_walker = None, states = None, **kwargs)

      Reset the internal state of the :class:`PolicyAPI`.

      :param inplace: If True, updates the swarm state with the selected actions.
                      If False, returns the selected actions. Defaults to True.
      :type inplace: bool, optional
      :param root_walker: Set the internal state of the PolicyAPI to this value. Defaults to None.
      :type root_walker: Optional[StateData], optional
      :param states: Set the internal state of the PolicyAPI to this value. Defaults to None.
      :type states: Optional[StateData], optional
      :param \*\*kwargs: Other parameters required to reset the component.

      :returns: None if inplace is True. Otherwise, a StateData dictionary containing
                the selected actions.
      :rtype: Union[None, StateData]
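As a rough illustration of what Gaussian action sampling looks like, the following NumPy sketch draws one action vector per walker. The function name and signature are illustrative only, not the library's API:

.. code-block:: python

   import numpy as np

   def select_actions(shape, loc=0.0, scale=1.0, rng=None):
       # Draw one Gaussian action vector per walker; `shape` is
       # (n_walkers, n_dims), matching the batch layout of the swarm.
       rng = np.random.default_rng() if rng is None else rng
       return rng.normal(loc=loc, scale=scale, size=shape)

   actions = select_actions((4, 2), loc=0.0, scale=1.0)  # 4 walkers, 2-D actions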
.. py:class:: ESModel(mutation = 0.5, recombination = 0.7, random_step_prob = 0.1, *args, **kwargs)

   Bases: :py:obj:`fragile.core.policy.Gaussian`

   The ESModel implements an evolutionary strategy policy.

   It randomly mutates some coordinates of the best solution found by
   substituting them with a proposal solution. This proposal is the difference
   between two random permutations of the best solution found. With probability
   ``mutation``, it also applies a Gaussian perturbation.

   .. py:method:: sample(batch_size, model_states=None, env_states=None, walkers_states=None, **kwargs)

      Calculate the data needed to interact with the Environment and store it in the model states.

      :param batch_size: Number of new points to be sampled.
      :param model_states: SwarmState corresponding to the model data.
      :param env_states: SwarmState corresponding to the environment data.
      :param walkers_states: SwarmState corresponding to the walkers data.
      :param kwargs: Passed to the :class:`Critic`, if any.

      :returns: Tuple containing a tensor with the sampled actions and the new
                model states variable.


.. py:class:: CMAES(sigma, virtual_reward_fitness = False, *args, **kwargs)

   Bases: :py:obj:`fragile.core.policy.Gaussian`

   Implementation of the CMA-ES algorithm (https://en.wikipedia.org/wiki/CMA-ES).

   .. py:method:: sample(batch_size, model_states=None, env_states=None, walkers_states=None, **kwargs)

      Calculate the data needed to interact with the Environment and store it in the model states.

      :param batch_size: Number of new points to be sampled.
      :param model_states: SwarmState corresponding to the model data.
      :param env_states: SwarmState corresponding to the environment data.
      :param walkers_states: SwarmState corresponding to the walkers data.
      :param kwargs: Passed to the :class:`Critic`, if any.

      :returns: Tuple containing a tensor with the sampled actions and the new
                model states variable.

   .. py:method:: _update_evolution_paths(actions)
   .. py:method:: _adapt_covariance_matrix(actions)

   .. py:method:: _adapt_sigma()

   .. py:method:: _cov_matrix_diagonalization()

   .. py:method:: reset(batch_size = 1, model_states=None, env_states=None, *args, **kwargs)

      Return a new blank State for a `DiscreteUniform` instance, and a valid
      prediction based on that new state.

      :param batch_size: Number of walkers in the new model `State`.
      :param model_states: :class:`StatesModel` corresponding to the model data.
      :param env_states: :class:`StatesEnv` containing the environment data.
      :param \*args: Passed to `predict`.
      :param \*\*kwargs: Passed to `predict`.

      :returns: New model states containing sampled data.

   .. py:method:: _sample_actions()

   .. py:method:: _init_algorithm_params(batch_size)
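The mutation scheme described for :class:`ESModel` above can be sketched in NumPy as follows. This is an assumption-laden illustration of the stated idea (difference of two permutations, coordinate-wise recombination, occasional Gaussian step), not the library's implementation; the function name and the exact use of each parameter are hypothetical:

.. code-block:: python

   import numpy as np

   def es_proposal(best, mutation=0.5, recombination=0.7, scale=1.0, rng=None):
       # Proposal = difference of two random permutations of the best
       # solution, mixed into `best` coordinate-wise with probability
       # `recombination`; a Gaussian perturbation is applied with
       # probability `mutation`.
       rng = np.random.default_rng() if rng is None else rng
       diff = rng.permutation(best) - rng.permutation(best)
       mask = rng.random(best.shape) < recombination
       proposal = np.where(mask, best + diff, best)
       if rng.random() < mutation:
           proposal = proposal + rng.normal(0.0, scale, size=best.shape)
       return proposal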
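For context on the :class:`CMAES` private methods above, the core CMA-ES sampling step draws candidates from a multivariate normal :math:`\mathcal{N}(m, \sigma^2 C)` via an eigendecomposition of the covariance matrix. The NumPy sketch below shows that step only (the adaptation of ``mean``, ``cov`` and ``sigma`` from ranked candidates is omitted); it illustrates the textbook algorithm, not this class's internals:

.. code-block:: python

   import numpy as np

   def cmaes_sample(mean, cov, sigma, batch_size, rng=None):
       # Draw `batch_size` candidates from N(mean, sigma^2 * C).
       rng = np.random.default_rng() if rng is None else rng
       # Diagonalize C = B diag(d^2) B^T so each sample can be built
       # from standard normals: x = mean + sigma * B @ (d * z).
       d2, B = np.linalg.eigh(cov)
       d = np.sqrt(np.maximum(d2, 0.0))
       z = rng.standard_normal((batch_size, mean.size))
       return mean + sigma * (z * d) @ B.T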