:py:mod:`fragile.optimize.policy`
=================================

.. py:module:: fragile.optimize.policy


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   fragile.optimize.policy.ParticleIntegration
   fragile.optimize.policy.ESModel
   fragile.optimize.policy.CMAES


.. py:class:: ParticleIntegration(loc = 0.0, scale = 1.0, **kwargs)

   Bases: :py:obj:`fragile.core.policy.Gaussian`

   The Gaussian policy samples actions from a Gaussian distribution.

   Inherits from :class:`ContinuousPolicy`.

   .. py:attribute:: default_outputs
      :annotation: = ['actions', 'velocities']

   .. py:attribute:: default_inputs

   .. py:method:: param_dict()
      :property:

      Return the dictionary defining all the data attributes that the component requires.

   .. py:method:: act(inplace = True, **kwargs)

      Calculate a SwarmState containing the data needed to interact with the environment.

   .. py:method:: select_actions(**kwargs)

      Select actions by sampling from a Gaussian distribution.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

      :returns: Selected actions as a tensor.
      :rtype: Tensor

   .. py:method:: reset(inplace = True, root_walker = None, states = None, **kwargs)

      Reset the internal state of the :class:`PolicyAPI`.

      :param inplace: If True, updates the swarm state with the selected actions.
                      If False, returns the selected actions. Defaults to True.
      :type inplace: bool, optional
      :param root_walker: Set the internal state of the PolicyAPI to this value. Defaults to None.
      :type root_walker: Optional[StateData], optional
      :param states: Set the internal state of the PolicyAPI to this value. Defaults to None.
      :type states: Optional[StateData], optional
      :param \*\*kwargs: Other parameters required to reset the component.

      :returns: None if inplace is True. Otherwise, a StateData dictionary containing
                the selected actions.
      :rtype: Union[None, StateData]
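As a rough illustration of what Gaussian action sampling looks like, the following NumPy sketch draws one action vector per walker. The function name and signature are illustrative only, not the library's API:

.. code-block:: python

   import numpy as np

   def select_actions(shape, loc=0.0, scale=1.0, rng=None):
       # Draw one Gaussian action vector per walker; `shape` is
       # (n_walkers, n_dims), matching the batch layout of the swarm.
       rng = np.random.default_rng() if rng is None else rng
       return rng.normal(loc=loc, scale=scale, size=shape)

   actions = select_actions((4, 2), loc=0.0, scale=1.0)  # 4 walkers, 2-D actions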
.. py:class:: ESModel(mutation = 0.5, recombination = 0.7, random_step_prob = 0.1, *args, **kwargs)

   Bases: :py:obj:`fragile.core.policy.Gaussian`

   The ESModel implements an evolutionary strategy policy.

   It randomly mutates some coordinates of the best solution found by
   substituting them with a proposal solution. This proposal is the difference
   between two random permutations of the best solution found. With probability
   ``mutation``, it also applies a Gaussian perturbation.

   .. py:method:: sample(batch_size, model_states=None, env_states=None, walkers_states=None, **kwargs)

      Calculate the data needed to interact with the Environment and store it in the model states.

      :param batch_size: Number of new points to be sampled.
      :param model_states: SwarmState corresponding to the model data.
      :param env_states: SwarmState corresponding to the environment data.
      :param walkers_states: SwarmState corresponding to the walkers data.
      :param kwargs: Passed to the :class:`Critic`, if any.

      :returns: Tuple containing a tensor with the sampled actions and the new
                model states variable.


.. py:class:: CMAES(sigma, virtual_reward_fitness = False, *args, **kwargs)

   Bases: :py:obj:`fragile.core.policy.Gaussian`

   Implementation of the CMA-ES algorithm (https://en.wikipedia.org/wiki/CMA-ES).

   .. py:method:: sample(batch_size, model_states=None, env_states=None, walkers_states=None, **kwargs)

      Calculate the data needed to interact with the Environment and store it in the model states.

      :param batch_size: Number of new points to be sampled.
      :param model_states: SwarmState corresponding to the model data.
      :param env_states: SwarmState corresponding to the environment data.
      :param walkers_states: SwarmState corresponding to the walkers data.
      :param kwargs: Passed to the :class:`Critic`, if any.

      :returns: Tuple containing a tensor with the sampled actions and the new
                model states variable.

   .. py:method:: _update_evolution_paths(actions)
   .. py:method:: _adapt_covariance_matrix(actions)

   .. py:method:: _adapt_sigma()

   .. py:method:: _cov_matrix_diagonalization()

   .. py:method:: reset(batch_size = 1, model_states=None, env_states=None, *args, **kwargs)

      Return a new blank State for a `DiscreteUniform` instance, and a valid
      prediction based on that new state.

      :param batch_size: Number of walkers in the new model `State`.
      :param model_states: :class:`StatesModel` corresponding to the model data.
      :param env_states: :class:`StatesEnv` containing the environment data.
      :param \*args: Passed to `predict`.
      :param \*\*kwargs: Passed to `predict`.

      :returns: New model states containing sampled data.

   .. py:method:: _sample_actions()

   .. py:method:: _init_algorithm_params(batch_size)
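The mutation scheme described for :class:`ESModel` above can be sketched in NumPy as follows. This is an assumption-laden illustration of the stated idea (difference of two permutations, coordinate-wise recombination, occasional Gaussian step), not the library's implementation; the function name and the exact use of each parameter are hypothetical:

.. code-block:: python

   import numpy as np

   def es_proposal(best, mutation=0.5, recombination=0.7, scale=1.0, rng=None):
       # Proposal = difference of two random permutations of the best
       # solution, mixed into `best` coordinate-wise with probability
       # `recombination`; a Gaussian perturbation is applied with
       # probability `mutation`.
       rng = np.random.default_rng() if rng is None else rng
       diff = rng.permutation(best) - rng.permutation(best)
       mask = rng.random(best.shape) < recombination
       proposal = np.where(mask, best + diff, best)
       if rng.random() < mutation:
           proposal = proposal + rng.normal(0.0, scale, size=best.shape)
       return proposal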
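For context on the :class:`CMAES` private methods above, the core CMA-ES sampling step draws candidates from a multivariate normal :math:`\mathcal{N}(m, \sigma^2 C)` via an eigendecomposition of the covariance matrix. The NumPy sketch below shows that step only (the adaptation of ``mean``, ``cov`` and ``sigma`` from ranked candidates is omitted); it illustrates the textbook algorithm, not this class's internals:

.. code-block:: python

   import numpy as np

   def cmaes_sample(mean, cov, sigma, batch_size, rng=None):
       # Draw `batch_size` candidates from N(mean, sigma^2 * C).
       rng = np.random.default_rng() if rng is None else rng
       # Diagonalize C = B diag(d^2) B^T so each sample can be built
       # from standard normals: x = mean + sigma * B @ (d * z).
       d2, B = np.linalg.eigh(cov)
       d = np.sqrt(np.maximum(d2, 0.0))
       z = rng.standard_normal((batch_size, mean.size))
       return mean + sigma * (z * d) @ B.T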