:py:mod:`fragile.core.policy`
=============================

.. py:module:: fragile.core.policy


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   fragile.core.policy.DummyPolicy
   fragile.core.policy.RandomPlangym
   fragile.core.policy.Discrete
   fragile.core.policy.BinarySwap
   fragile.core.policy.ContinuousPolicy
   fragile.core.policy.ZeroContinuous
   fragile.core.policy.Uniform
   fragile.core.policy.Gaussian
   fragile.core.policy.GaussianModulus


.. py:class:: DummyPolicy(swarm = None, param_dict = None, inputs = None, outputs = None)

   Bases: :py:obj:`fragile.core.api_classes.PolicyAPI`

   A policy computes the actions used to interact with the :class:`Environment`.

   The base class :class:`PolicyAPI` defines the policy that determines the actions
   for interacting with the :class:`Environment` in a swarm simulation. ``PolicyAPI`` is an
   abstract base class; concrete policies inherit from it
   and implement the ``select_actions`` method.

   .. py:method:: select_actions(**kwargs)

      Select actions for each walker in the swarm based on the current state.

      This method must be implemented by subclasses.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

      :returns: The selected actions as a Tensor or a StateData dictionary.
      :rtype: Union[Tensor, StateData]
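
The subclassing contract described above can be sketched without importing fragile. In the snippet below, ``PolicySketch`` and ``ConstantPolicy`` are hypothetical stand-ins, not classes from the library, and plain lists are used where the real policy would return a Tensor or a StateData dictionary.

```python
from abc import ABC, abstractmethod


class PolicySketch(ABC):
    """Hypothetical stand-in for an abstract policy base class."""

    def __init__(self, n_walkers):
        self.n_walkers = n_walkers

    @abstractmethod
    def select_actions(self, **kwargs):
        """Return one action per walker."""


class ConstantPolicy(PolicySketch):
    """Hypothetical subclass: every walker receives the same action."""

    def __init__(self, n_walkers, action=0):
        super().__init__(n_walkers)
        self.action = action

    def select_actions(self, **kwargs):
        # One entry per walker, all equal to the configured action.
        return [self.action] * self.n_walkers


policy = ConstantPolicy(n_walkers=3, action=1)
actions = policy.select_actions()
```

Instantiating ``PolicySketch`` directly raises ``TypeError`` because ``select_actions`` is abstract, which mirrors the requirement that subclasses implement it.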


.. py:class:: RandomPlangym(swarm = None, param_dict = None, inputs = None, outputs = None)

   Bases: :py:obj:`fragile.core.api_classes.PolicyAPI`

   Policy that selects random actions from the environment action space.

   .. py:method:: setup(swarm)

      Set up the policy.

      :param swarm: Swarm that will use this policy.
      :type swarm: SwarmAPI

      :raises TypeError: If the environment does not have a sample_action method or an action_space.


   .. py:method:: select_actions(**kwargs)

      Sample a random action from the environment action space.
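
The check performed during ``setup`` might look roughly like the following sketch. ``FakeEnv`` and ``resolve_sampler`` are hypothetical names, and both the order of the checks and the ``action_space.sample`` call are assumptions based on the gym-style convention, not the library's confirmed behavior.

```python
import random


class FakeEnv:
    """Hypothetical environment exposing a ``sample_action`` method."""

    def sample_action(self):
        return random.randrange(4)


def resolve_sampler(env):
    """Return a zero-argument callable that samples one action from ``env``."""
    if hasattr(env, "sample_action"):
        return env.sample_action
    if hasattr(env, "action_space"):
        # Assumption: the action space offers a gym-style ``sample`` method.
        return env.action_space.sample
    raise TypeError("Environment needs a sample_action method or an action_space.")


sampler = resolve_sampler(FakeEnv())
actions = [sampler() for _ in range(5)]  # one random action per walker
```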


.. py:class:: Discrete(actions = None, probs = None, **kwargs)

   Bases: :py:obj:`fragile.core.api_classes.PolicyAPI`

   A policy that selects discrete actions according to some probability distribution.

   .. py:method:: n_actions()
      :property:

      The number of possible actions.


   .. py:method:: actions()
      :property:

      The possible actions that can be taken by this policy.


   .. py:method:: select_actions(**kwargs)

      Select a random action from the possible actions.

      :returns: An array of shape (swarm.n_walkers,) containing the selected actions.


   .. py:method:: setup(swarm)

      Sets up the policy, inferring any missing parameters.

      :param swarm: The swarm that is using this policy.
      :type swarm: SwarmAPI

      :raises TypeError: If n_actions cannot be inferred.


   .. py:method:: _setup_params(actions = None)

      Set up the parameters of the policy.
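
As a rough illustration of how ``actions``, ``probs`` and ``n_actions`` could interact, the sketch below samples one discrete action per walker with the standard library. The helper ``select_discrete_actions`` and its ``seed`` argument are hypothetical, and the fallback from ``n_actions`` to ``range(n_actions)`` is an assumption about how missing parameters are inferred.

```python
import random


def select_discrete_actions(n_walkers, actions=None, probs=None, n_actions=None, seed=None):
    """Sample one action per walker, optionally weighted by ``probs``."""
    if actions is None:
        if n_actions is None:
            raise TypeError("n_actions cannot be inferred.")
        # Assumption: actions default to the integers 0 .. n_actions - 1.
        actions = list(range(n_actions))
    rng = random.Random(seed)
    # weights=None makes every action equally likely.
    return rng.choices(actions, weights=probs, k=n_walkers)


uniform_actions = select_discrete_actions(n_walkers=4, n_actions=3, seed=0)
biased_actions = select_discrete_actions(n_walkers=4, actions=[0, 1, 2], probs=[0, 1, 0])
```

With ``probs=[0, 1, 0]`` every walker receives action ``1``, since the other actions have zero weight.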


.. py:class:: BinarySwap(n_swaps, n_actions = None, **kwargs)

   Bases: :py:obj:`fragile.core.api_classes.PolicyAPI`

   A policy computes the actions used to interact with the :class:`Environment`.

   The base class :class:`PolicyAPI` defines the policy that determines the actions
   for interacting with the :class:`Environment` in a swarm simulation. ``PolicyAPI`` is an
   abstract base class; concrete policies inherit from it
   and implement the ``select_actions`` method.

   .. py:method:: n_actions()
      :property:


   .. py:method:: n_swaps()
      :property:


   .. py:method:: select_actions(**kwargs)

      Select actions for each walker in the swarm based on the current state.

      This method must be implemented by subclasses.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

      :returns: The selected actions as a Tensor or a StateData dictionary.
      :rtype: Union[Tensor, StateData]


   .. py:method:: setup(swarm)

      Prepare the component during the setup phase of the :class:`Swarm`.


.. py:class:: ContinuousPolicy(bounds=None, second_order = False, step = 1.0, **kwargs)

   Bases: :py:obj:`fragile.core.api_classes.PolicyAPI`

   ContinuousPolicy implements a continuous action space policy for interacting with the environment.

   :param bounds: Action space bounds. If not provided, the bounds are obtained
                  from the environment. Defaults to `None`.
   :type bounds: Bounds, optional
   :param second_order: If `True`, the policy is considered second-order, and the
                        action sampled will be added to the last value. Defaults to `False`.
   :type second_order: bool, optional
   :param step: The step size for updating the actions. Defaults to 1.0.
   :type step: float, optional
   :param \*\*kwargs: Additional keyword arguments for the base PolicyAPI class.

   .. attribute:: second_order

      If `True`, the policy is considered second-order.

      :type: bool

   .. attribute:: step

      The step size for updating the actions.

      :type: float

   .. attribute:: bounds

      Action space bounds.

      :type: Bounds

   .. attribute:: _env_bounds

      Environment action space bounds.

      :type: Bounds

   .. py:method:: env_bounds()
      :property:

      Returns the environment action space bounds.


   .. py:method:: select_actions(**kwargs)
      :abstractmethod:

      Implement the functionality for selecting actions in the derived class. This method is
      called during the act operation.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.


   .. py:method:: act(inplace = True, **kwargs)

      Calculate the data needed to interact with the :class:`Environment`.

      :param inplace: If `True`, updates the swarm state with the selected actions.
                      If `False`, returns the selected actions. Defaults to `True`.
      :type inplace: bool, optional
      :param \*\*kwargs: Additional keyword arguments required for acting.

      :returns: A dictionary containing the selected actions if ``inplace`` is
                `False`. Otherwise, returns `None`.
      :rtype: Union[None, StateData]


   .. py:method:: setup(swarm)

      Set up the policy with the provided swarm object.

      :param swarm: The swarm object to set up the policy with.
      :type swarm: SwarmAPI

      :returns: None
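
The roles of ``second_order``, ``step`` and ``bounds`` can be pictured with the sketch below. The helper ``apply_action`` is hypothetical; scaling by ``step`` before adding the previous value and clipping the result to the bounds are assumptions made for illustration, not confirmed details of the library.

```python
def apply_action(sampled, last=None, second_order=False, step=1.0, low=-1.0, high=1.0):
    """Combine a sampled action with the previous one and clip it to the bounds."""
    # Assumption: the step size scales the freshly sampled action.
    scaled = [step * a for a in sampled]
    if second_order and last is not None:
        # Second-order mode: the sampled action is added to the last value.
        scaled = [prev + a for prev, a in zip(last, scaled)]
    # Assumption: the result is clipped to the action space bounds.
    return [min(max(a, low), high) for a in scaled]


first_order = apply_action([1.0, -1.0], step=0.5)
second_order = apply_action([0.5, 0.5], last=[0.75, -0.25], second_order=True)
```

In the second call the first component would reach 1.25 and is clipped to the upper bound of 1.0, while the second component becomes 0.25.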


.. py:class:: ZeroContinuous(bounds=None, second_order = False, step = 1.0, **kwargs)

   Bases: :py:obj:`ContinuousPolicy`

   The ZeroContinuous policy always selects actions equal to zero.

   Inherits from :class:`ContinuousPolicy`.

   .. py:method:: select_actions(**kwargs)

      Select a vector of zeros.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

      :returns: Selected actions as a tensor.
      :rtype: Tensor


.. py:class:: Uniform(bounds=None, second_order = False, step = 1.0, **kwargs)

   Bases: :py:obj:`ContinuousPolicy`

   The Uniform policy samples actions uniformly from the bounds of the action space.

   Inherits from :class:`ContinuousPolicy`.

   .. py:method:: select_actions(**kwargs)

      Select actions by sampling uniformly from the action space bounds.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

      :returns: Selected actions as a tensor.
      :rtype: Tensor
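
Uniform sampling within per-dimension bounds can be sketched with the standard library as follows. The helper ``sample_uniform`` and its ``low``, ``high`` and ``seed`` arguments are hypothetical; the real policy reads its limits from a ``Bounds`` object and returns a Tensor rather than nested lists.

```python
import random


def sample_uniform(n_walkers, low, high, seed=None):
    """Draw one action per walker, uniformly between ``low`` and ``high`` per dimension."""
    rng = random.Random(seed)
    return [
        [rng.uniform(lo, hi) for lo, hi in zip(low, high)]
        for _ in range(n_walkers)
    ]


# Three walkers, two action dimensions with different limits.
actions = sample_uniform(n_walkers=3, low=[-1.0, 0.0], high=[1.0, 5.0], seed=0)
```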


.. py:class:: Gaussian(loc = 0.0, scale = 1.0, **kwargs)

   Bases: :py:obj:`ContinuousPolicy`

   The Gaussian policy samples actions from a Gaussian distribution.

   Inherits from :class:`ContinuousPolicy`.

   .. py:method:: select_actions(**kwargs)

      Select actions by sampling from a Gaussian distribution.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

      :returns: Selected actions as a tensor.
      :rtype: Tensor
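
A Gaussian policy parameterized by ``loc`` and ``scale`` can be pictured with the sketch below. The helper ``sample_gaussian`` is hypothetical, and clipping the samples to scalar ``low`` and ``high`` limits is an assumption; this page does not state how the library handles samples that fall outside the action space bounds.

```python
import random


def sample_gaussian(n_walkers, n_dims, loc=0.0, scale=1.0, low=None, high=None, seed=None):
    """Draw Gaussian actions, one row per walker, optionally clipped to [low, high]."""
    rng = random.Random(seed)
    actions = [
        [rng.gauss(loc, scale) for _ in range(n_dims)]
        for _ in range(n_walkers)
    ]
    if low is not None and high is not None:
        # Assumption: out-of-bounds samples are clipped rather than resampled.
        actions = [[min(max(a, low), high) for a in row] for row in actions]
    return actions


clipped = sample_gaussian(n_walkers=4, n_dims=2, low=-0.5, high=0.5, seed=0)
degenerate = sample_gaussian(n_walkers=2, n_dims=1, loc=2.0, scale=0.0)
```

With ``scale=0.0`` every sample equals ``loc``, which is a quick way to check that the parameters are wired as intended.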


.. py:class:: GaussianModulus(loc = 0.0, scale = 1.0, **kwargs)

   Bases: :py:obj:`ContinuousPolicy`

   ContinuousPolicy implements a continuous action space policy for interacting with the environment.

   :param bounds: Action space bounds. If not provided, the bounds are obtained
                  from the environment. Defaults to `None`.
   :type bounds: Bounds, optional
   :param second_order: If `True`, the policy is considered second-order, and the
                        action sampled will be added to the last value. Defaults to `False`.
   :type second_order: bool, optional
   :param step: The step size for updating the actions. Defaults to 1.0.
   :type step: float, optional
   :param \*\*kwargs: Additional keyword arguments for the base PolicyAPI class.

   .. attribute:: second_order

      If `True`, the policy is considered second-order.

      :type: bool

   .. attribute:: step

      The step size for updating the actions.

      :type: float

   .. attribute:: bounds

      Action space bounds.

      :type: Bounds

   .. attribute:: _env_bounds

      Environment action space bounds.

      :type: Bounds

   .. py:method:: select_actions(**kwargs)

      Implement the functionality for selecting actions in the derived class. This method is
      called during the act operation.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.