:py:mod:`fragile.core.policy`
=============================

.. py:module:: fragile.core.policy


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   fragile.core.policy.DummyPolicy
   fragile.core.policy.RandomPlangym
   fragile.core.policy.Discrete
   fragile.core.policy.BinarySwap
   fragile.core.policy.ContinuousPolicy
   fragile.core.policy.ZeroContinuous
   fragile.core.policy.Uniform
   fragile.core.policy.Gaussian
   fragile.core.policy.GaussianModulus


.. py:class:: DummyPolicy(swarm = None, param_dict = None, inputs = None, outputs = None)

   Bases: :py:obj:`fragile.core.api_classes.PolicyAPI`

   The policy is in charge of calculating the interactions with the :class:`Environment`.

   The PolicyAPI class is responsible for defining the policy that determines the actions
   for interacting with the :class:`Environment` in a swarm simulation. This is an abstract
   base class, and specific policy implementations should inherit from this class and
   implement the 'select_actions' method.

   .. py:method:: select_actions(**kwargs)

      Select actions for each walker in the swarm based on the current state.

      This method must be implemented by subclasses.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

      :returns: The selected actions as a Tensor or a StateData dictionary.
      :rtype: Union[Tensor, StateData]


.. py:class:: RandomPlangym(swarm = None, param_dict = None, inputs = None, outputs = None)

   Bases: :py:obj:`fragile.core.api_classes.PolicyAPI`

   Policy that selects random actions from the environment action space.

   .. py:method:: setup(swarm)

      Set up the policy.

      :param swarm: Swarm that will use this policy.
      :type swarm: SwarmAPI

      :raises TypeError: If the environment does not have a sample_action method or an action_space.

   .. py:method:: select_actions(**kwargs)

      Sample a random action from the environment action space.


.. py:class:: Discrete(actions = None, probs = None, **kwargs)

   Bases: :py:obj:`fragile.core.api_classes.PolicyAPI`

   A policy that selects discrete actions according to some probability distribution.

   .. py:method:: n_actions()
      :property:

      The number of possible actions.

   .. py:method:: actions()
      :property:

      The possible actions that can be taken by this policy.

   .. py:method:: select_actions(**kwargs)

      Select a random action from the possible actions.

      :returns: An array of shape (swarm.n_walkers,) containing the selected actions.

   .. py:method:: setup(swarm)

      Set up the policy, inferring any missing parameters.

      :param swarm: The swarm that is using this policy.
      :type swarm: SwarmAPI

      :raises TypeError: If n_actions cannot be inferred.

   .. py:method:: _setup_params(actions = None)

      Set up the parameters of the policy.


.. py:class:: BinarySwap(n_swaps, n_actions = None, **kwargs)

   Bases: :py:obj:`fragile.core.api_classes.PolicyAPI`

   The policy is in charge of calculating the interactions with the :class:`Environment`.

   The PolicyAPI class is responsible for defining the policy that determines the actions
   for interacting with the :class:`Environment` in a swarm simulation. This is an abstract
   base class, and specific policy implementations should inherit from this class and
   implement the 'select_actions' method.

   .. py:method:: n_actions()
      :property:

   .. py:method:: n_swaps()
      :property:

   .. py:method:: select_actions(**kwargs)

      Select actions for each walker in the swarm based on the current state.

      This method must be implemented by subclasses.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

      :returns: The selected actions as a Tensor or a StateData dictionary.
      :rtype: Union[Tensor, StateData]

   .. py:method:: setup(swarm)

      Prepare the component during the setup phase of the :class:`Swarm`.


.. py:class:: ContinuousPolicy(bounds=None, second_order = False, step = 1.0, **kwargs)

   Bases: :py:obj:`fragile.core.api_classes.PolicyAPI`

   ContinuousPolicy implements a continuous action space policy for interacting with the environment.

   :param bounds: Action space bounds. If not provided, the bounds are obtained from
                  the environment. Defaults to `None`.
   :type bounds: Bounds, optional
   :param second_order: If `True`, the policy is considered second-order, and the
                        sampled action is added to the last value. Defaults to `False`.
   :type second_order: bool, optional
   :param step: The step size for updating the actions. Defaults to 1.0.
   :type step: float, optional
   :param \*\*kwargs: Additional keyword arguments for the base PolicyAPI class.

   .. attribute:: second_order

      If `True`, the policy is considered second-order.

      :type: bool

   .. attribute:: step

      The step size for updating the actions.

      :type: float

   .. attribute:: bounds

      Action space bounds.

      :type: Bounds

   .. attribute:: _env_bounds

      Environment action space bounds.

      :type: Bounds

   .. py:method:: env_bounds()
      :property:

      Returns the environment action space bounds.

   .. py:method:: select_actions(**kwargs)
      :abstractmethod:

      Implement the functionality for selecting actions in the derived class. This method is called
      during the act operation.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

   .. py:method:: act(inplace = True, **kwargs)

      Calculate the data needed to interact with the :class:`Environment`.

      :param inplace: If `True`, updates the swarm state with the selected actions.
                      If `False`, returns the selected actions. Defaults to `True`.
      :type inplace: bool, optional
      :param \*\*kwargs: Additional keyword arguments required for acting.

      :returns: A dictionary containing the selected actions if inplace is `False`.
                Otherwise, returns `None`.
      :rtype: Union[None, StateData]

   .. py:method:: setup(swarm)

      Set up the policy with the provided swarm object.

      :param swarm: The swarm object to set up the policy with.
      :type swarm: SwarmAPI

      :returns: None


.. py:class:: ZeroContinuous(bounds=None, second_order = False, step = 1.0, **kwargs)

   Bases: :py:obj:`ContinuousPolicy`

   ZeroContinuous policy always selects actions equal to zero.

   Inherits from :class:`ContinuousPolicy`.

   .. py:method:: select_actions(**kwargs)

      Select a vector of zeros.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

      :returns: Selected actions as a tensor.
      :rtype: Tensor


.. py:class:: Uniform(bounds=None, second_order = False, step = 1.0, **kwargs)

   Bases: :py:obj:`ContinuousPolicy`

   Uniform policy samples actions uniformly from the bounds of the action space.

   Inherits from :class:`ContinuousPolicy`.

   .. py:method:: select_actions(**kwargs)

      Select actions by sampling uniformly from the action space bounds.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

      :returns: Selected actions as a tensor.
      :rtype: Tensor


.. py:class:: Gaussian(loc = 0.0, scale = 1.0, **kwargs)

   Bases: :py:obj:`ContinuousPolicy`

   The Gaussian policy samples actions from a Gaussian distribution.

   Inherits from :class:`ContinuousPolicy`.

   .. py:method:: select_actions(**kwargs)

      Select actions by sampling from a Gaussian distribution.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.

      :returns: Selected actions as a tensor.
      :rtype: Tensor


.. py:class:: GaussianModulus(loc = 0.0, scale = 1.0, **kwargs)

   Bases: :py:obj:`ContinuousPolicy`

   ContinuousPolicy implements a continuous action space policy for interacting with the environment.

   :param bounds: Action space bounds. If not provided, the bounds are obtained from
                  the environment. Defaults to `None`.
   :type bounds: Bounds, optional
   :param second_order: If `True`, the policy is considered second-order, and the
                        sampled action is added to the last value. Defaults to `False`.
   :type second_order: bool, optional
   :param step: The step size for updating the actions. Defaults to 1.0.
   :type step: float, optional
   :param \*\*kwargs: Additional keyword arguments for the base PolicyAPI class.

   .. attribute:: second_order

      If `True`, the policy is considered second-order.

      :type: bool

   .. attribute:: step

      The step size for updating the actions.

      :type: float

   .. attribute:: bounds

      Action space bounds.

      :type: Bounds

   .. attribute:: _env_bounds

      Environment action space bounds.

      :type: Bounds

   .. py:method:: select_actions(**kwargs)

      Implement the functionality for selecting actions in the derived class. This method is called
      during the act operation.

      :param \*\*kwargs: Additional keyword arguments required for selecting actions.
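

The sampling behaviours described for ``Discrete``, ``Uniform``, ``Gaussian`` and the ``second_order`` option can be illustrated with a small standalone sketch. It does not import or call ``fragile``: every function name below is hypothetical, plain Python lists stand in for tensors, and the second-order rule ``last + step * sampled`` is an assumption about how ``step`` is applied, since this reference only states that the sampled action is added to the last value.

```python
import random


def sample_discrete(actions, probs, n_walkers, rng):
    """Pick one action per walker; probs=None means equal probability."""
    return rng.choices(actions, weights=probs, k=n_walkers)


def sample_uniform(low, high, n_walkers, rng):
    """Draw one action vector per walker, uniform within [low, high] per dimension."""
    return [[rng.uniform(lo, hi) for lo, hi in zip(low, high)] for _ in range(n_walkers)]


def sample_gaussian(loc, scale, n_dims, n_walkers, rng):
    """Draw one action vector per walker from a normal distribution N(loc, scale)."""
    return [[rng.gauss(loc, scale) for _ in range(n_dims)] for _ in range(n_walkers)]


def apply_second_order(last_actions, sampled, step=1.0):
    """Assumed second-order update: new action = last action + step * sampled."""
    return [
        [last + step * new for last, new in zip(last_row, new_row)]
        for last_row, new_row in zip(last_actions, sampled)
    ]


rng = random.Random(0)  # seeded so repeated runs draw the same samples
low, high = [-1.0, 0.0], [1.0, 2.0]

discrete_actions = sample_discrete([0, 1, 2], None, n_walkers=4, rng=rng)
uniform_actions = sample_uniform(low, high, n_walkers=4, rng=rng)
gaussian_actions = sample_gaussian(0.0, 1.0, n_dims=2, n_walkers=4, rng=rng)
updated_actions = apply_second_order(uniform_actions, gaussian_actions, step=0.5)
```

Each helper returns one entry per walker, mirroring the ``(swarm.n_walkers,)`` shape documented for ``Discrete.select_actions``. Whether the library clips second-order actions back into ``bounds`` is not stated in this reference, so the sketch leaves them unclipped.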