fragile.core.policy
Module Contents#
Classes#
DummyPolicy: The policy is in charge of calculating the interactions with the Environment.
RandomPlangym: Policy that selects random actions from the environment action space.
Discrete: A policy that selects discrete actions according to some probability distribution.
BinarySwap: The policy is in charge of calculating the interactions with the Environment.
ContinuousPolicy: Implements a continuous action space policy for interacting with the environment.
ZeroContinuous: Policy that samples actions equal to zero.
Uniform: Policy that samples actions uniformly from the bounds of the action space.
Gaussian: The Gaussian policy samples actions from a Gaussian distribution.
GaussianModulus: A continuous action space policy for interacting with the environment.
- class fragile.core.policy.DummyPolicy(swarm=None, param_dict=None, inputs=None, outputs=None)[source]#
Bases: fragile.core.api_classes.PolicyAPI
The policy is in charge of calculating the interactions with the Environment. The PolicyAPI class is responsible for defining the policy that determines the actions for interacting with the Environment in a swarm simulation. This is an abstract base class, and specific policy implementations should inherit from this class and implement the select_actions method.
- select_actions(**kwargs)[source]#
Select actions for each walker in the swarm based on the current state.
This method must be implemented by subclasses.
- Parameters
**kwargs – Additional keyword arguments required for selecting actions.
- Returns
The selected actions as a Tensor or a StateData dictionary.
- Return type
Union[Tensor, StateData]
- class fragile.core.policy.RandomPlangym(swarm=None, param_dict=None, inputs=None, outputs=None)[source]#
Bases: fragile.core.api_classes.PolicyAPI
Policy that selects random actions from the environment action space.
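The selection logic of a random policy can be sketched with NumPy for a discrete, plangym-style action space. This is an illustrative helper, not the fragile implementation; the function and parameter names are assumptions:

```python
import numpy as np

def random_discrete_actions(n_walkers: int, n_actions: int, rng=None) -> np.ndarray:
    """Sample one independent random action per walker, mimicking a policy
    that draws uniformly from a discrete environment action space."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.integers(0, n_actions, size=n_walkers)

# One action per walker, each drawn uniformly from {0, ..., 3}.
actions = random_discrete_actions(n_walkers=8, n_actions=4, rng=np.random.default_rng(0))
```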
- class fragile.core.policy.Discrete(actions=None, probs=None, **kwargs)[source]#
Bases: fragile.core.api_classes.PolicyAPI
A policy that selects discrete actions according to some probability distribution.
- Parameters
actions (Optional[fragile.core.typing.Tensor]) –
probs (Optional[fragile.core.typing.Tensor]) –
- property actions#
The possible actions that can be taken by this policy.
- Return type
fragile.core.typing.Tensor
- select_actions(**kwargs)[source]#
Select a random action from the possible actions.
- Returns
An array of shape (swarm.n_walkers,) containing the selected actions.
- Return type
fragile.core.typing.Tensor
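The behaviour described above, drawing one of a fixed set of actions per walker according to a probability vector, can be sketched with `numpy.random.Generator.choice`. The helper below is illustrative, not the class implementation:

```python
import numpy as np

def sample_discrete(actions, probs, n_walkers, rng=None):
    """Draw n_walkers actions from `actions` with per-action probabilities
    `probs`, returning an array of shape (n_walkers,)."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.choice(actions, size=n_walkers, p=probs)

# Ten walkers choosing among three actions with weights 0.2 / 0.3 / 0.5.
acts = sample_discrete(np.array([0, 1, 2]), np.array([0.2, 0.3, 0.5]),
                       n_walkers=10, rng=np.random.default_rng(1))
```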
- class fragile.core.policy.BinarySwap(n_swaps, n_actions=None, **kwargs)[source]#
Bases: fragile.core.api_classes.PolicyAPI
The policy is in charge of calculating the interactions with the Environment. The PolicyAPI class is responsible for defining the policy that determines the actions for interacting with the Environment in a swarm simulation. This is an abstract base class, and specific policy implementations should inherit from this class and implement the select_actions method.
- property n_actions#
- property n_swaps#
- select_actions(**kwargs)[source]#
Select actions for each walker in the swarm based on the current state.
This method must be implemented by subclasses.
- Parameters
**kwargs – Additional keyword arguments required for selecting actions.
- Returns
The selected actions as a Tensor or a StateData dictionary.
- Return type
Union[Tensor, StateData]
- setup(swarm)[source]#
Prepare the component during the setup phase of the Swarm.
- Parameters
swarm (fragile.core.api_classes.SwarmAPI) –
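Judging by its name and parameters (`n_swaps`, `n_actions`), BinarySwap appears to flip a fixed number of entries in each walker's binary action vector. The sketch below implements that reading; it is an assumption about the semantics, not the documented behaviour:

```python
import numpy as np

def binary_swap(actions: np.ndarray, n_swaps: int, rng=None) -> np.ndarray:
    """Flip `n_swaps` randomly chosen bits in each walker's binary action
    vector (assumed semantics; see lead-in). Input shape: (n_walkers, n_actions)."""
    rng = np.random.default_rng() if rng is None else rng
    flipped = actions.copy()
    for row in flipped:
        # Choose n_swaps distinct positions in this row and invert them.
        idx = rng.choice(row.shape[0], size=n_swaps, replace=False)
        row[idx] = 1 - row[idx]
    return flipped

before = np.zeros((4, 6), dtype=int)
after = binary_swap(before, n_swaps=2, rng=np.random.default_rng(2))
```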
- class fragile.core.policy.ContinuousPolicy(bounds=None, second_order=False, step=1.0, **kwargs)[source]#
Bases: fragile.core.api_classes.PolicyAPI
ContinuousPolicy implements a continuous action space policy for interacting with the environment.
- Parameters
bounds (Bounds, optional) – Action space bounds. If not provided, the bounds are obtained from the environment. Defaults to None.
second_order (bool, optional) – If True, the policy is considered second-order, and the action sampled will be added to the last value. Defaults to False.
step (float, optional) – The step size for updating the actions. Defaults to 1.0.
**kwargs – Additional keyword arguments for the base PolicyAPI class.
- bounds#
Action space bounds.
- Type
Bounds
- _env_bounds#
Environment action space bounds.
- Type
Bounds
- property env_bounds#
Returns the environment action space bounds.
- Return type
judo.Bounds
- abstract select_actions(**kwargs)[source]#
Implement the functionality for selecting actions in the derived class. This method is called during the act operation.
- Parameters
**kwargs – Additional keyword arguments required for selecting actions.
- act(inplace=True, **kwargs)[source]#
Calculate the data needed to interact with the Environment.
- Parameters
inplace (bool, optional) – If True, updates the swarm state with the selected actions. If False, returns the selected actions. Defaults to True.
**kwargs – Additional keyword arguments required for acting.
- Returns
A dictionary containing the selected actions if inplace is False. Otherwise, returns None.
- Return type
Union[None, StateData]
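The `second_order` and `step` parameters combine as described above: the sampled action is scaled by the step size and, in second-order mode, added to the previous action, with the result kept inside the action-space bounds. A minimal sketch of that update rule, with illustrative names (clipping to the bounds is an assumption):

```python
import numpy as np

def apply_action(sampled, last_actions, low, high, step=1.0, second_order=False):
    """Combine a freshly sampled action with the previous one.

    First-order: the scaled sample is the action. Second-order: the scaled
    sample is added to the last action. Either way, the result is clipped
    to the action-space bounds."""
    new = last_actions + step * sampled if second_order else step * sampled
    return np.clip(new, low, high)

# Second-order update: the second component overshoots and is clipped to -1.
a = apply_action(np.array([0.5, -2.0]), np.array([0.0, 0.0]),
                 low=-1.0, high=1.0, step=1.0, second_order=True)
```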
- class fragile.core.policy.ZeroContinuous(bounds=None, second_order=False, step=1.0, **kwargs)[source]#
Bases: ContinuousPolicy
Policy that samples actions equal to zero.
Inherits from ContinuousPolicy.
- class fragile.core.policy.Uniform(bounds=None, second_order=False, step=1.0, **kwargs)[source]#
Bases: ContinuousPolicy
Uniform policy samples actions uniformly from the bounds of the action space.
Inherits from ContinuousPolicy.
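Sampling uniformly within per-dimension bounds can be sketched with `numpy.random.Generator.uniform`; the helper and its names are illustrative, not the class implementation:

```python
import numpy as np

def uniform_actions(low, high, n_walkers, rng=None):
    """Sample each walker's action uniformly within the per-dimension
    action-space bounds [low, high]."""
    rng = np.random.default_rng() if rng is None else rng
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    # Broadcasting draws each dimension from its own [low_i, high_i] interval.
    return rng.uniform(low, high, size=(n_walkers,) + low.shape)

acts = uniform_actions(low=[-1.0, 0.0], high=[1.0, 2.0], n_walkers=5,
                       rng=np.random.default_rng(3))
```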
- class fragile.core.policy.Gaussian(loc=0.0, scale=1.0, **kwargs)[source]#
Bases: ContinuousPolicy
The Gaussian policy samples actions from a Gaussian distribution.
Inherits from ContinuousPolicy.
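The `loc` and `scale` parameters map directly onto a normal distribution. The sketch below draws Gaussian actions and clips them into the bounds inherited from ContinuousPolicy; the clipping step and all names are assumptions for illustration:

```python
import numpy as np

def gaussian_actions(loc, scale, low, high, n_walkers, n_dims, rng=None):
    """Sample actions from N(loc, scale) per walker and dimension, then
    clip them into the action-space bounds (clipping is an assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    raw = rng.normal(loc, scale, size=(n_walkers, n_dims))
    return np.clip(raw, low, high)

acts = gaussian_actions(loc=0.0, scale=1.0, low=-1.0, high=1.0,
                        n_walkers=6, n_dims=3, rng=np.random.default_rng(4))
```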
- class fragile.core.policy.GaussianModulus(loc=0.0, scale=1.0, **kwargs)[source]#
Bases: ContinuousPolicy
A continuous action space policy for interacting with the environment.
- Parameters
bounds (Bounds, optional) – Action space bounds. If not provided, the bounds are obtained from the environment. Defaults to None.
second_order (bool, optional) – If True, the policy is considered second-order, and the action sampled will be added to the last value. Defaults to False.
step (float, optional) – The step size for updating the actions. Defaults to 1.0.
**kwargs – Additional keyword arguments for the base PolicyAPI class.
loc (float) –
scale (float) –
- bounds#
Action space bounds.
- Type
Bounds
- _env_bounds#
Environment action space bounds.
- Type
Bounds
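The class name suggests sampling a random direction and scaling it by a Gaussian-distributed modulus. The sketch below implements that reading; it is an assumption about the semantics, not the documented behaviour:

```python
import numpy as np

def gaussian_modulus_actions(loc, scale, n_walkers, n_dims, rng=None):
    """Sample a unit-length direction per walker and scale it by
    |N(loc, scale)| (assumed semantics; see lead-in)."""
    rng = np.random.default_rng() if rng is None else rng
    # Normalizing standard-normal draws yields directions uniform on the sphere.
    directions = rng.normal(size=(n_walkers, n_dims))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    moduli = np.abs(rng.normal(loc, scale, size=(n_walkers, 1)))
    return directions * moduli

acts = gaussian_modulus_actions(loc=0.0, scale=1.0, n_walkers=5, n_dims=3,
                                rng=np.random.default_rng(5))
```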