fragile.core.policy
Module Contents
Classes
DummyPolicy | The policy is in charge of calculating the interactions with the Environment.
RandomPlangym | Policy that selects random actions from the environment action space.
Discrete | A policy that selects discrete actions according to some probability distribution.
BinarySwap | The policy is in charge of calculating the interactions with the Environment.
ContinuousPolicy | ContinuousPolicy implements a continuous action space policy for interacting with the environment.
ZeroContinuous | Policy that samples actions equal to zero.
Uniform | Uniform policy samples actions uniformly from the bounds of the action space.
Gaussian | The Gaussian policy samples actions from a Gaussian distribution.
GaussianModulus | ContinuousPolicy implements a continuous action space policy for interacting with the environment.
- class fragile.core.policy.DummyPolicy(swarm=None, param_dict=None, inputs=None, outputs=None)
Bases: fragile.core.api_classes.PolicyAPI
The policy is in charge of calculating the interactions with the Environment.
The PolicyAPI class is responsible for defining the policy that determines the actions for interacting with the Environment in a swarm simulation. This is an abstract base class, and specific policy implementations should inherit from this class and implement the ‘select_actions’ method.
- Parameters
- select_actions(**kwargs)
Select actions for each walker in the swarm based on the current state.
This method must be implemented by subclasses.
- Parameters
**kwargs – Additional keyword arguments required for selecting actions.
- Returns
The selected actions as a Tensor or a StateData dictionary.
- Return type
Union[Tensor, StateData]
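The subclassing contract described above can be illustrated with a minimal, self-contained sketch. `SketchPolicy` and `ConstantPolicy` are hypothetical stand-ins written for this example, not classes from `fragile`; the real `PolicyAPI` takes additional constructor arguments and is wired into a swarm.

```python
from abc import ABC, abstractmethod


class SketchPolicy(ABC):
    """Hypothetical stand-in for an abstract policy base (not fragile's PolicyAPI)."""

    def __init__(self, n_walkers):
        self.n_walkers = n_walkers

    @abstractmethod
    def select_actions(self, **kwargs):
        """Return one action per walker; subclasses must implement this."""


class ConstantPolicy(SketchPolicy):
    """Toy subclass that returns the same action for every walker."""

    def __init__(self, n_walkers, action=0):
        super().__init__(n_walkers)
        self.action = action

    def select_actions(self, **kwargs):
        return [self.action] * self.n_walkers


constant_policy = ConstantPolicy(n_walkers=4, action=1)
constant_actions = constant_policy.select_actions()
```

Instantiating `SketchPolicy` directly raises `TypeError`, which is how Python's `abc` module enforces that `select_actions` is overridden.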
- class fragile.core.policy.RandomPlangym(swarm=None, param_dict=None, inputs=None, outputs=None)
Bases: fragile.core.api_classes.PolicyAPI
Policy that selects random actions from the environment action space.
- Parameters
- class fragile.core.policy.Discrete(actions=None, probs=None, **kwargs)
Bases: fragile.core.api_classes.PolicyAPI
A policy that selects discrete actions according to some probability distribution.
- Parameters
actions (Optional[fragile.core.typing.Tensor]) –
probs (Optional[fragile.core.typing.Tensor]) –
- property actions
The possible actions that can be taken by this policy.
- Return type
fragile.core.typing.Tensor
- select_actions(**kwargs)
Select a random action from the possible actions.
- Returns
An array of shape (swarm.n_walkers,) containing the selected actions.
- Return type
fragile.core.typing.Tensor
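As a rough illustration of drawing one discrete action per walker from a probability distribution, the sketch below uses only the standard library. The helper `sample_discrete_actions` and its `n_walkers` and `rng` arguments are assumptions made for this example; treating a missing `probs` as a uniform distribution is also an assumption, and none of this is the library's implementation.

```python
import random


def sample_discrete_actions(actions, probs=None, n_walkers=1, rng=None):
    """Draw one action per walker from `actions`, weighted by `probs`.

    Hypothetical helper: when `probs` is None, every action is assumed
    to be equally likely.
    """
    rng = rng or random.Random()
    if probs is not None and len(probs) != len(actions):
        raise ValueError("probs must have one entry per action")
    # random.choices samples with replacement; weights=None means uniform.
    return rng.choices(actions, weights=probs, k=n_walkers)


discrete_sample = sample_discrete_actions(
    [0, 1, 2], probs=[0.2, 0.5, 0.3], n_walkers=5, rng=random.Random(0)
)
```

The result has one entry per walker, mirroring the `(swarm.n_walkers,)` shape documented above.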
- class fragile.core.policy.BinarySwap(n_swaps, n_actions=None, **kwargs)
Bases: fragile.core.api_classes.PolicyAPI
The policy is in charge of calculating the interactions with the Environment.
The PolicyAPI class is responsible for defining the policy that determines the actions for interacting with the Environment in a swarm simulation. This is an abstract base class, and specific policy implementations should inherit from this class and implement the ‘select_actions’ method.
- property n_actions
- property n_swaps
- select_actions(**kwargs)
Select actions for each walker in the swarm based on the current state.
This method must be implemented by subclasses.
- Parameters
**kwargs – Additional keyword arguments required for selecting actions.
- Returns
The selected actions as a Tensor or a StateData dictionary.
- Return type
Union[Tensor, StateData]
- setup(swarm)
Prepare the component during the setup phase of the Swarm.
- Parameters
swarm (fragile.core.api_classes.SwarmAPI) –
- class fragile.core.policy.ContinuousPolicy(bounds=None, second_order=False, step=1.0, **kwargs)
Bases: fragile.core.api_classes.PolicyAPI
ContinuousPolicy implements a continuous action space policy for interacting with the environment.
- Parameters
bounds (Bounds, optional) – Action space bounds. If not provided, the bounds are obtained from the environment. Defaults to None.
second_order (bool, optional) – If True, the policy is considered second-order, and the action sampled will be added to the last value. Defaults to False.
step (float, optional) – The step size for updating the actions. Defaults to 1.0.
**kwargs – Additional keyword arguments for the base PolicyAPI class.
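To make the `second_order` and `step` parameters concrete, here is a small sketch of how a sampled value might be combined with the previous action. The function `update_action` is hypothetical. Two details are assumptions rather than documented behavior: that `step` multiplies the sampled value, and that the result is clipped to the bounds.

```python
def update_action(sampled, last, low, high, second_order=False, step=1.0):
    """Combine a freshly sampled action with the previous one.

    Assumptions (not confirmed by the source): `step` scales the sampled
    value, and the result is clipped to the interval [low, high].
    """
    scaled = step * sampled
    # Second-order: the sampled value is added to the last action.
    candidate = last + scaled if second_order else scaled
    return max(low, min(high, candidate))


first_order_action = update_action(0.5, last=2.0, low=-3.0, high=3.0)
second_order_action = update_action(
    0.5, last=2.0, low=-3.0, high=3.0, second_order=True
)
clipped_action = update_action(
    2.0, last=2.0, low=-3.0, high=3.0, second_order=True
)
```

With `second_order=False` the previous action is ignored; with `second_order=True` the sampled value acts as an increment on top of it.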
- bounds
Action space bounds.
- Type
Bounds
- _env_bounds
Environment action space bounds.
- Type
Bounds
- property env_bounds
Returns the environment action space bounds.
- Return type
judo.Bounds
- abstract select_actions(**kwargs)
Implement the functionality for selecting actions in the derived class. This method is called during the act operation.
- Parameters
**kwargs – Additional keyword arguments required for selecting actions.
- act(inplace=True, **kwargs)
Calculate the data needed to interact with the Environment.
- Parameters
inplace (bool, optional) – If True, updates the swarm state with the selected actions. If False, returns the selected actions. Defaults to True.
**kwargs – Additional keyword arguments required for acting.
- Returns
A dictionary containing the selected actions if inplace is False. Otherwise, returns None.
- Return type
Union[None, StateData]
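The `inplace` behavior of `act` can be sketched with a toy class. `SketchContinuousPolicy` is a hypothetical stand-in, its `state` dict plays the role of the swarm state, and the `"actions"` key is an assumption made for this example rather than a name confirmed by the source.

```python
class SketchContinuousPolicy:
    """Hypothetical stand-in; `state` is a plain dict acting as the swarm state."""

    def __init__(self, n_walkers):
        self.n_walkers = n_walkers
        self.state = {}

    def select_actions(self, **kwargs):
        # Placeholder choice: one zero-valued action per walker.
        return [0.0] * self.n_walkers

    def act(self, inplace=True, **kwargs):
        data = {"actions": self.select_actions(**kwargs)}
        if inplace:
            self.state.update(data)  # write the actions into the shared state
            return None
        return data  # leave the state untouched and hand the data back


sketch_policy = SketchContinuousPolicy(n_walkers=3)
returned = sketch_policy.act(inplace=False)
state_before = dict(sketch_policy.state)
result = sketch_policy.act(inplace=True)
```

Calling with `inplace=False` is convenient for inspecting the selected actions without mutating anything, while the default writes them where later components can read them.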
- class fragile.core.policy.ZeroContinuous(bounds=None, second_order=False, step=1.0, **kwargs)
Bases: ContinuousPolicy
Policy that samples actions equal to zero.
Inherits from ContinuousPolicy.
- class fragile.core.policy.Uniform(bounds=None, second_order=False, step=1.0, **kwargs)
Bases: ContinuousPolicy
Uniform policy samples actions uniformly from the bounds of the action space.
Inherits from ContinuousPolicy.
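A minimal sketch of uniform sampling within per-dimension bounds, using only the standard library. The helper `sample_uniform_actions` and the representation of bounds as two lists (`low`, `high`) are assumptions for illustration; the library itself works with a `Bounds` object and tensors.

```python
import random


def sample_uniform_actions(low, high, n_walkers, rng=None):
    """Sample one action vector per walker, uniformly within per-dimension bounds."""
    rng = rng or random.Random()
    if len(low) != len(high):
        raise ValueError("low and high must have the same length")
    return [
        [rng.uniform(lo, hi) for lo, hi in zip(low, high)]
        for _ in range(n_walkers)
    ]


uniform_actions = sample_uniform_actions(
    [-1.0, 0.0], [1.0, 10.0], n_walkers=4, rng=random.Random(0)
)
```

Each inner list is one walker's action, with every component drawn independently from its own interval.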
- class fragile.core.policy.Gaussian(loc=0.0, scale=1.0, **kwargs)
Bases: ContinuousPolicy
The Gaussian policy samples actions from a Gaussian distribution.
Inherits from ContinuousPolicy.
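The role of `loc` (mean) and `scale` (standard deviation) can be sketched as follows. The helper `sample_gaussian_actions` and its `n_walkers`, `n_dims`, and `rng` arguments are hypothetical. Whether the real policy also clips samples to the action bounds is not stated in the source, so this sketch does not clip.

```python
import random


def sample_gaussian_actions(loc=0.0, scale=1.0, n_walkers=1, n_dims=1, rng=None):
    """Sample an n_walkers x n_dims list of actions from N(loc, scale**2)."""
    rng = rng or random.Random()
    if scale < 0:
        raise ValueError("scale must be non-negative")
    return [
        [rng.gauss(loc, scale) for _ in range(n_dims)]
        for _ in range(n_walkers)
    ]


gaussian_actions = sample_gaussian_actions(
    loc=2.0, scale=0.5, n_walkers=3, n_dims=2, rng=random.Random(0)
)
```

Setting `scale=0.0` collapses every sample onto `loc`, which is a quick way to check that the two parameters are wired as expected.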
- class fragile.core.policy.GaussianModulus(loc=0.0, scale=1.0, **kwargs)
Bases: ContinuousPolicy
ContinuousPolicy implements a continuous action space policy for interacting with the environment.
- Parameters
bounds (Bounds, optional) – Action space bounds. If not provided, the bounds are obtained from the environment. Defaults to None.
second_order (bool, optional) – If True, the policy is considered second-order, and the action sampled will be added to the last value. Defaults to False.
step (float, optional) – The step size for updating the actions. Defaults to 1.0.
**kwargs – Additional keyword arguments for the base PolicyAPI class.
loc (float) –
scale (float) –
- bounds
Action space bounds.
- Type
Bounds
- _env_bounds
Environment action space bounds.
- Type
Bounds