fragile.core.policy#

Module Contents#

Classes#

DummyPolicy

The policy is in charge of calculating the interactions with the Environment.

RandomPlangym

Policy that selects random actions from the environment action space.

Discrete

A policy that selects discrete actions according to some probability distribution.

BinarySwap

Policy that flips a given number of bits (swaps) in a binary action vector.

ContinuousPolicy

ContinuousPolicy implements a continuous action space policy for interacting with the environment.

ZeroContinuous

Policy that samples actions equal to zero.

Uniform

Uniform policy samples actions uniformly from the bounds of the action space.

Gaussian

The Gaussian policy samples actions from a Gaussian distribution.

GaussianModulus

Policy that samples actions whose modulus (magnitude) is drawn from a Gaussian distribution.

class fragile.core.policy.DummyPolicy(swarm=None, param_dict=None, inputs=None, outputs=None)[source]#

Bases: fragile.core.api_classes.PolicyAPI

The policy is in charge of calculating the interactions with the Environment.

The PolicyAPI class is responsible for defining the policy that determines the actions for interacting with the Environment in a swarm simulation. This is an abstract base class, and specific policy implementations should inherit from this class and implement the ‘select_actions’ method.

Parameters
  • swarm (Optional[SwarmAPI]) –

  • param_dict (Optional[fragile.core.typing.StateDict]) –

  • inputs (Optional[fragile.core.typing.InputDict]) –

  • outputs (Optional[Tuple[str]]) –

select_actions(**kwargs)[source]#

Select actions for each walker in the swarm based on the current state.

This method must be implemented by subclasses.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Returns

The selected actions as a Tensor or a StateData dictionary.

Return type

Union[Tensor, StateData]

class fragile.core.policy.RandomPlangym(swarm=None, param_dict=None, inputs=None, outputs=None)[source]#

Bases: fragile.core.api_classes.PolicyAPI

Policy that selects random actions from the environment action space.

Parameters
  • swarm (Optional[SwarmAPI]) –

  • param_dict (Optional[fragile.core.typing.StateDict]) –

  • inputs (Optional[fragile.core.typing.InputDict]) –

  • outputs (Optional[Tuple[str]]) –

setup(swarm)[source]#

Set up the policy.

Parameters

swarm (SwarmAPI) – Swarm that will use this policy.

Raises

TypeError – If the environment does not have a sample_action method or an action_space.

select_actions(**kwargs)[source]#

Sample a random action from the environment action space.

Return type

list
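For illustration, the sampling behavior described above can be sketched as follows. `MockActionSpace` and `select_random_actions` are illustrative stand-ins, not part of fragile or plangym; the only assumption taken from the docs is that the environment exposes an action space with a `sample` method.

```python
import numpy as np

class MockActionSpace:
    """Stand-in for a plangym-style discrete action space with `sample`."""
    def __init__(self, n):
        self.n = n
    def sample(self):
        # Draw one random action index from [0, n).
        return int(np.random.randint(self.n))

def select_random_actions(action_space, n_walkers):
    # One independently sampled action per walker; select_actions returns a list.
    return [action_space.sample() for _ in range(n_walkers)]

space = MockActionSpace(n=4)
actions = select_random_actions(space, n_walkers=8)
assert len(actions) == 8 and all(0 <= a < 4 for a in actions)
```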

class fragile.core.policy.Discrete(actions=None, probs=None, **kwargs)[source]#

Bases: fragile.core.api_classes.PolicyAPI

A policy that selects discrete actions according to some probability distribution.

Parameters
  • actions (Optional[fragile.core.typing.Tensor]) –

  • probs (Optional[fragile.core.typing.Tensor]) –

property n_actions#

The number of possible actions.

Return type

int

property actions#

The possible actions that can be taken by this policy.

Return type

fragile.core.typing.Tensor

select_actions(**kwargs)[source]#

Select a random action from the possible actions.

Returns

An array of shape (swarm.n_walkers,) containing the selected actions.

Return type

fragile.core.typing.Tensor

setup(swarm)[source]#

Set up the policy, inferring any missing parameters.

Parameters

swarm (SwarmAPI) – The swarm that is using this policy.

Raises

TypeError – If n_actions cannot be inferred.

Return type

None

_setup_params(actions=None)[source]#

Set up the parameters of the policy.

Parameters

actions (Optional[fragile.core.typing.Tensor]) –
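The sampling behavior of `Discrete` can be sketched with NumPy: draw one action per walker from a fixed set of actions according to a probability vector. The helper name is illustrative; only `actions` and `probs` mirror the constructor arguments.

```python
import numpy as np

def select_discrete_actions(actions, probs, n_walkers, rng=None):
    """Sample one action per walker according to a probability distribution.

    Sketch of the documented behavior, not the library's implementation.
    """
    rng = np.random.default_rng() if rng is None else rng
    # An array of shape (n_walkers,) of actions drawn with probabilities `probs`.
    return rng.choice(actions, size=n_walkers, p=probs)

acts = select_discrete_actions(np.arange(3), np.array([0.2, 0.3, 0.5]), n_walkers=10)
assert acts.shape == (10,)
assert set(acts).issubset({0, 1, 2})
```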

class fragile.core.policy.BinarySwap(n_swaps, n_actions=None, **kwargs)[source]#

Bases: fragile.core.api_classes.PolicyAPI

Policy that flips a given number of bits (swaps) in a binary action vector.

Parameters
  • n_swaps (int) –

  • n_actions (Optional[int]) –

property n_actions#

property n_swaps#

select_actions(**kwargs)[source]#

Select actions for each walker in the swarm based on the current state.

This method must be implemented by subclasses.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Returns

The selected actions as a Tensor or a StateData dictionary.

Return type

Union[Tensor, StateData]

setup(swarm)[source]#

Prepare the component during the setup phase of the Swarm.

Parameters

swarm (fragile.core.api_classes.SwarmAPI) –
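A plausible sketch of the bit-flip scheme suggested by the class name and the `n_swaps` parameter; the exact implementation may differ.

```python
import numpy as np

def binary_swap(states, n_swaps, rng=None):
    """Flip `n_swaps` randomly chosen bits in each walker's binary vector.

    Illustrative sketch of one possible swap scheme, not the library's code.
    """
    rng = np.random.default_rng() if rng is None else rng
    new_states = states.copy()
    for row in new_states:
        # Choose n_swaps distinct positions and flip them (0 -> 1, 1 -> 0).
        idx = rng.choice(len(row), size=n_swaps, replace=False)
        row[idx] = 1 - row[idx]
    return new_states

states = np.zeros((4, 10), dtype=int)
flipped = binary_swap(states, n_swaps=3)
assert flipped.shape == (4, 10)
assert (flipped.sum(axis=1) == 3).all()  # exactly 3 bits flipped per walker
```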

class fragile.core.policy.ContinuousPolicy(bounds=None, second_order=False, step=1.0, **kwargs)[source]#

Bases: fragile.core.api_classes.PolicyAPI

ContinuousPolicy implements a continuous action space policy for interacting with the environment.

Parameters
  • bounds (Bounds, optional) – Action space bounds. If not provided, the bounds are obtained from the environment. Defaults to None.

  • second_order (bool, optional) – If True, the policy is considered second-order, and the action sampled will be added to the last value. Defaults to False.

  • step (float, optional) – The step size for updating the actions. Defaults to 1.0.

  • **kwargs – Additional keyword arguments for the base PolicyAPI class.

second_order#

If True, the policy is considered second-order.

Type

bool

step#

The step size for updating the actions.

Type

float

bounds#

Action space bounds.

Type

Bounds

_env_bounds#

Environment action space bounds.

Type

Bounds

property env_bounds#

Returns the environment action space bounds.

Return type

judo.Bounds

abstract select_actions(**kwargs)[source]#

Implement the functionality for selecting actions in the derived class. This method is called during the act operation.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

act(inplace=True, **kwargs)[source]#

Calculate the data needed to interact with the Environment.

Parameters
  • inplace (bool, optional) – If True, updates the swarm state with the selected actions. If False, returns the selected actions. Defaults to True.

  • **kwargs – Additional keyword arguments required for acting.

Returns

A dictionary containing the selected actions if inplace is False; otherwise, None.

Return type

Union[None, StateData]

setup(swarm)[source]#

Set up the policy with the provided swarm object.

Parameters

swarm (SwarmAPI) – The swarm object to set up the policy with.

Returns

None

Return type

None
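The first- versus second-order behavior described in the parameters above can be sketched as follows. `apply_action` is a hypothetical helper, and the exact way `step` enters the update is an assumption.

```python
import numpy as np

def apply_action(last_actions, sampled, step=1.0, second_order=False):
    """Combine a freshly sampled action with the previous one.

    Sketch of the documented first-/second-order semantics (assumed, not
    taken from the library source).
    """
    if second_order:
        # Second-order: the sample acts as a delta added to the last action.
        return last_actions + step * sampled
    # First-order: the sample is used directly, scaled by the step size.
    return step * sampled

last = np.array([1.0, -1.0])
delta = np.array([0.5, 0.5])
assert np.allclose(apply_action(last, delta, step=1.0, second_order=True), [1.5, -0.5])
assert np.allclose(apply_action(last, delta, step=2.0, second_order=False), [1.0, 1.0])
```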

class fragile.core.policy.ZeroContinuous(bounds=None, second_order=False, step=1.0, **kwargs)[source]#

Bases: ContinuousPolicy

Policy that samples actions equal to zero.

Inherits from ContinuousPolicy.

select_actions(**kwargs)[source]#

Select a vector of zeros.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Returns

Selected actions as a tensor.

Return type

Tensor

class fragile.core.policy.Uniform(bounds=None, second_order=False, step=1.0, **kwargs)[source]#

Bases: ContinuousPolicy

Uniform policy samples actions uniformly from the bounds of the action space.

Inherits from ContinuousPolicy.

select_actions(**kwargs)[source]#

Select actions by sampling uniformly from the action space bounds.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Returns

Selected actions as a tensor.

Return type

Tensor
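A sketch of uniform sampling within per-dimension bounds; `low` and `high` stand in for the arrays held by the Bounds object.

```python
import numpy as np

def sample_uniform(low, high, n_walkers, rng=None):
    """Sample one action per walker uniformly within per-dimension bounds.

    Sketch of the documented behavior, not the library call.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Broadcast the bound arrays over the (n_walkers, n_dims) sample shape.
    return rng.uniform(low, high, size=(n_walkers, len(low)))

low, high = np.array([-1.0, 0.0]), np.array([1.0, 2.0])
acts = sample_uniform(low, high, n_walkers=5)
assert acts.shape == (5, 2)
assert (acts >= low).all() and (acts <= high).all()
```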

class fragile.core.policy.Gaussian(loc=0.0, scale=1.0, **kwargs)[source]#

Bases: ContinuousPolicy

The Gaussian policy samples actions from a Gaussian distribution.

Inherits from ContinuousPolicy.

select_actions(**kwargs)[source]#

Select actions by sampling from a Gaussian distribution.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Returns

Selected actions as a tensor.

Return type

Tensor
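A sketch of Gaussian sampling with `loc` and `scale` mirroring the constructor arguments; clipping the samples to the action-space bounds is an assumption, not necessarily what the library does.

```python
import numpy as np

def sample_gaussian(loc, scale, low, high, n_walkers, n_dims, rng=None):
    """Sample Gaussian actions and clip them to the action-space bounds.

    Illustrative sketch; the clipping step is assumed.
    """
    rng = np.random.default_rng() if rng is None else rng
    actions = rng.normal(loc, scale, size=(n_walkers, n_dims))
    # Keep the actions inside the environment's action space.
    return np.clip(actions, low, high)

acts = sample_gaussian(loc=0.0, scale=1.0, low=-1.0, high=1.0, n_walkers=6, n_dims=3)
assert acts.shape == (6, 3)
assert (np.abs(acts) <= 1.0).all()
```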

class fragile.core.policy.GaussianModulus(loc=0.0, scale=1.0, **kwargs)[source]#

Bases: ContinuousPolicy

Policy that samples actions whose modulus (magnitude) is drawn from a Gaussian distribution.

Parameters
  • bounds (Bounds, optional) – Action space bounds. If not provided, the bounds are obtained from the environment. Defaults to None.

  • second_order (bool, optional) – If True, the policy is considered second-order, and the action sampled will be added to the last value. Defaults to False.

  • step (float, optional) – The step size for updating the actions. Defaults to 1.0.

  • **kwargs – Additional keyword arguments for the base PolicyAPI class.

  • loc (float) –

  • scale (float) –

second_order#

If True, the policy is considered second-order.

Type

bool

step#

The step size for updating the actions.

Type

float

bounds#

Action space bounds.

Type

Bounds

_env_bounds#

Environment action space bounds.

Type

Bounds

select_actions(**kwargs)[source]#

Select actions by sampling their modulus from a Gaussian distribution. This method is called during the act operation.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Return type

fragile.core.typing.Tensor
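One plausible reading of GaussianModulus, inferred from the class name alone: sample a unit-norm direction per walker and scale it by a Gaussian-distributed modulus. Purely illustrative; the library may implement something different.

```python
import numpy as np

def sample_gaussian_modulus(loc, scale, n_walkers, n_dims, rng=None):
    """Hypothetical GaussianModulus sketch: random direction times a
    Gaussian-distributed magnitude. The semantics are assumed, not documented.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Random directions, normalized to unit length per walker.
    directions = rng.normal(size=(n_walkers, n_dims))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Non-negative modulus drawn from a Gaussian with the given loc and scale.
    modulus = np.abs(rng.normal(loc, scale, size=(n_walkers, 1)))
    return directions * modulus

acts = sample_gaussian_modulus(loc=0.0, scale=1.0, n_walkers=4, n_dims=2)
assert acts.shape == (4, 2)
```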