fragile.core.policy#

Module Contents#

Classes#

DummyPolicy

The policy is in charge of calculating the interactions with the Environment.

RandomPlangym

Policy that selects random actions from the environment action space.

Discrete

A policy that selects discrete actions according to some probability distribution.

BinarySwap

Policy that flips a given number of bits (swaps) in a binary action vector.

ContinuousPolicy

ContinuousPolicy implements a continuous action space policy for interacting with the environment.

ZeroContinuous

Policy that samples actions equal to zero.

Uniform

Uniform policy samples actions uniformly from the bounds of the action space.

Gaussian

The Gaussian policy samples actions from a Gaussian distribution.

GaussianModulus

Policy that samples actions whose modulus (magnitude) is drawn from a Gaussian distribution.

class fragile.core.policy.DummyPolicy(swarm=None, param_dict=None, inputs=None, outputs=None)[source]#

Bases: fragile.core.api_classes.PolicyAPI

The policy is in charge of calculating the interactions with the Environment.

The PolicyAPI class is responsible for defining the policy that determines the actions for interacting with the Environment in a swarm simulation. This is an abstract base class, and specific policy implementations should inherit from this class and implement the ‘select_actions’ method.

Parameters
  • swarm (Optional[SwarmAPI]) –

  • param_dict (Optional[fragile.core.typing.StateDict]) –

  • inputs (Optional[fragile.core.typing.InputDict]) –

  • outputs (Optional[Tuple[str]]) –

select_actions(**kwargs)[source]#

Select actions for each walker in the swarm based on the current state.

This method must be implemented by subclasses.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Returns

The selected actions as a Tensor or a StateData dictionary.

Return type

Union[Tensor, StateData]

class fragile.core.policy.RandomPlangym(swarm=None, param_dict=None, inputs=None, outputs=None)[source]#

Bases: fragile.core.api_classes.PolicyAPI

Policy that selects random actions from the environment action space.

Parameters
  • swarm (Optional[SwarmAPI]) –

  • param_dict (Optional[fragile.core.typing.StateDict]) –

  • inputs (Optional[fragile.core.typing.InputDict]) –

  • outputs (Optional[Tuple[str]]) –

setup(swarm)[source]#

Set up the policy.

Parameters

swarm (SwarmAPI) – Swarm that will use this policy.

Raises

TypeError – If the environment does not have a sample_action method or an action_space.

select_actions(**kwargs)[source]#

Sample a random action from the environment action space.

Return type

list
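For illustration, the sampling behavior described above can be sketched as follows. `MockActionSpace` and `select_random_actions` are illustrative stand-ins, not part of fragile or plangym; the only assumption taken from the docs is that the environment exposes an action space with a `sample` method.

```python
import numpy as np

class MockActionSpace:
    """Stand-in for a plangym-style discrete action space with `sample`."""
    def __init__(self, n):
        self.n = n
    def sample(self):
        # Draw one random action index from [0, n).
        return int(np.random.randint(self.n))

def select_random_actions(action_space, n_walkers):
    # One independently sampled action per walker; select_actions returns a list.
    return [action_space.sample() for _ in range(n_walkers)]

space = MockActionSpace(n=4)
actions = select_random_actions(space, n_walkers=8)
assert len(actions) == 8 and all(0 <= a < 4 for a in actions)
```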

class fragile.core.policy.Discrete(actions=None, probs=None, **kwargs)[source]#

Bases: fragile.core.api_classes.PolicyAPI

A policy that selects discrete actions according to some probability distribution.

Parameters
  • actions (Optional[fragile.core.typing.Tensor]) –

  • probs (Optional[fragile.core.typing.Tensor]) –

property n_actions#

The number of possible actions.

Return type

int

property actions#

The possible actions that can be taken by this policy.

Return type

fragile.core.typing.Tensor

select_actions(**kwargs)[source]#

Select a random action from the possible actions.

Returns

An array of shape (swarm.n_walkers,) containing the selected actions.

Return type

fragile.core.typing.Tensor

setup(swarm)[source]#

Set up the policy, inferring any missing parameters.

Parameters

swarm (SwarmAPI) – The swarm that is using this policy.

Raises

TypeError – If n_actions cannot be inferred.

Return type

None

_setup_params(actions=None)[source]#

Set up the parameters of the policy.

Parameters

actions (Optional[fragile.core.typing.Tensor]) –
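The sampling behavior of `Discrete` can be sketched with NumPy: draw one action per walker from a fixed set of actions according to a probability vector. The helper name is illustrative; only `actions` and `probs` mirror the constructor arguments.

```python
import numpy as np

def select_discrete_actions(actions, probs, n_walkers, rng=None):
    """Sample one action per walker according to a probability distribution.

    Sketch of the documented behavior, not the library's implementation.
    """
    rng = np.random.default_rng() if rng is None else rng
    # An array of shape (n_walkers,) of actions drawn with probabilities `probs`.
    return rng.choice(actions, size=n_walkers, p=probs)

acts = select_discrete_actions(np.arange(3), np.array([0.2, 0.3, 0.5]), n_walkers=10)
assert acts.shape == (10,)
assert set(acts).issubset({0, 1, 2})
```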

class fragile.core.policy.BinarySwap(n_swaps, n_actions=None, **kwargs)[source]#

Bases: fragile.core.api_classes.PolicyAPI

Policy that flips a given number of bits (swaps) in a binary action vector.

Parameters
  • n_swaps (int) –

  • n_actions (Optional[int]) –

property n_actions#

property n_swaps#

select_actions(**kwargs)[source]#

Select actions for each walker in the swarm based on the current state.

This method must be implemented by subclasses.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Returns

The selected actions as a Tensor or a StateData dictionary.

Return type

Union[Tensor, StateData]

setup(swarm)[source]#

Prepare the component during the setup phase of the Swarm.

Parameters

swarm (fragile.core.api_classes.SwarmAPI) –
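A plausible sketch of the bit-flip scheme suggested by the class name and the `n_swaps` parameter; the exact implementation may differ.

```python
import numpy as np

def binary_swap(states, n_swaps, rng=None):
    """Flip `n_swaps` randomly chosen bits in each walker's binary vector.

    Illustrative sketch of one possible swap scheme, not the library's code.
    """
    rng = np.random.default_rng() if rng is None else rng
    new_states = states.copy()
    for row in new_states:
        # Choose n_swaps distinct positions and flip them (0 -> 1, 1 -> 0).
        idx = rng.choice(len(row), size=n_swaps, replace=False)
        row[idx] = 1 - row[idx]
    return new_states

states = np.zeros((4, 10), dtype=int)
flipped = binary_swap(states, n_swaps=3)
assert flipped.shape == (4, 10)
assert (flipped.sum(axis=1) == 3).all()  # exactly 3 bits flipped per walker
```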

class fragile.core.policy.ContinuousPolicy(bounds=None, second_order=False, step=1.0, **kwargs)[source]#

Bases: fragile.core.api_classes.PolicyAPI

ContinuousPolicy implements a continuous action space policy for interacting with the environment.

Parameters
  • bounds (Bounds, optional) – Action space bounds. If not provided, the bounds are obtained from the environment. Defaults to None.

  • second_order (bool, optional) – If True, the policy is considered second-order, and the action sampled will be added to the last value. Defaults to False.

  • step (float, optional) – The step size for updating the actions. Defaults to 1.0.

  • **kwargs – Additional keyword arguments for the base PolicyAPI class.

second_order#

If True, the policy is considered second-order.

Type

bool

step#

The step size for updating the actions.

Type

float

bounds#

Action space bounds.

Type

Bounds

_env_bounds#

Environment action space bounds.

Type

Bounds

property env_bounds#

Returns the environment action space bounds.

Return type

judo.Bounds

abstract select_actions(**kwargs)[source]#

Implement the functionality for selecting actions in the derived class. This method is called during the act operation.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

act(inplace=True, **kwargs)[source]#

Calculate the data needed to interact with the Environment.

Parameters
  • inplace (bool, optional) – If True, updates the swarm state with the selected actions. If False, returns the selected actions. Defaults to True.

  • **kwargs – Additional keyword arguments required for acting.

Returns

A dictionary containing the selected actions if inplace is False; otherwise, None.

Return type

Union[None, StateData]

setup(swarm)[source]#

Set up the policy with the provided swarm object.

Parameters

swarm (SwarmAPI) – The swarm object to set up the policy with.

Returns

None

Return type

None
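The first- versus second-order behavior described in the parameters above can be sketched as follows. `apply_action` is a hypothetical helper, and the exact way `step` enters the update is an assumption.

```python
import numpy as np

def apply_action(last_actions, sampled, step=1.0, second_order=False):
    """Combine a freshly sampled action with the previous one.

    Sketch of the documented first-/second-order semantics (assumed, not
    taken from the library source).
    """
    if second_order:
        # Second-order: the sample acts as a delta added to the last action.
        return last_actions + step * sampled
    # First-order: the sample is used directly, scaled by the step size.
    return step * sampled

last = np.array([1.0, -1.0])
delta = np.array([0.5, 0.5])
assert np.allclose(apply_action(last, delta, step=1.0, second_order=True), [1.5, -0.5])
assert np.allclose(apply_action(last, delta, step=2.0, second_order=False), [1.0, 1.0])
```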

class fragile.core.policy.ZeroContinuous(bounds=None, second_order=False, step=1.0, **kwargs)[source]#

Bases: ContinuousPolicy

Policy that samples actions equal to zero.

Inherits from ContinuousPolicy.

select_actions(**kwargs)[source]#

Select a vector of zeros.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Returns

Selected actions as a tensor.

Return type

Tensor

class fragile.core.policy.Uniform(bounds=None, second_order=False, step=1.0, **kwargs)[source]#

Bases: ContinuousPolicy

Uniform policy samples actions uniformly from the bounds of the action space.

Inherits from ContinuousPolicy.

select_actions(**kwargs)[source]#

Select actions by sampling uniformly from the action space bounds.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Returns

Selected actions as a tensor.

Return type

Tensor
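A sketch of uniform sampling within per-dimension bounds; `low` and `high` stand in for the arrays held by the Bounds object.

```python
import numpy as np

def sample_uniform(low, high, n_walkers, rng=None):
    """Sample one action per walker uniformly within per-dimension bounds.

    Sketch of the documented behavior, not the library call.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Broadcast the bound arrays over the (n_walkers, n_dims) sample shape.
    return rng.uniform(low, high, size=(n_walkers, len(low)))

low, high = np.array([-1.0, 0.0]), np.array([1.0, 2.0])
acts = sample_uniform(low, high, n_walkers=5)
assert acts.shape == (5, 2)
assert (acts >= low).all() and (acts <= high).all()
```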

class fragile.core.policy.Gaussian(loc=0.0, scale=1.0, **kwargs)[source]#

Bases: ContinuousPolicy

The Gaussian policy samples actions from a Gaussian distribution.

Inherits from ContinuousPolicy.

select_actions(**kwargs)[source]#

Select actions by sampling from a Gaussian distribution.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Returns

Selected actions as a tensor.

Return type

Tensor
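A sketch of Gaussian sampling with `loc` and `scale` mirroring the constructor arguments; clipping the samples to the action-space bounds is an assumption, not necessarily what the library does.

```python
import numpy as np

def sample_gaussian(loc, scale, low, high, n_walkers, n_dims, rng=None):
    """Sample Gaussian actions and clip them to the action-space bounds.

    Illustrative sketch; the clipping step is assumed.
    """
    rng = np.random.default_rng() if rng is None else rng
    actions = rng.normal(loc, scale, size=(n_walkers, n_dims))
    # Keep the actions inside the environment's action space.
    return np.clip(actions, low, high)

acts = sample_gaussian(loc=0.0, scale=1.0, low=-1.0, high=1.0, n_walkers=6, n_dims=3)
assert acts.shape == (6, 3)
assert (np.abs(acts) <= 1.0).all()
```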

class fragile.core.policy.GaussianModulus(loc=0.0, scale=1.0, **kwargs)[source]#

Bases: ContinuousPolicy

Policy that samples actions whose modulus (magnitude) is drawn from a Gaussian distribution.

Parameters
  • bounds (Bounds, optional) – Action space bounds. If not provided, the bounds are obtained from the environment. Defaults to None.

  • second_order (bool, optional) – If True, the policy is considered second-order, and the action sampled will be added to the last value. Defaults to False.

  • step (float, optional) – The step size for updating the actions. Defaults to 1.0.

  • **kwargs – Additional keyword arguments for the base PolicyAPI class.

  • loc (float) –

  • scale (float) –

second_order#

If True, the policy is considered second-order.

Type

bool

step#

The step size for updating the actions.

Type

float

bounds#

Action space bounds.

Type

Bounds

_env_bounds#

Environment action space bounds.

Type

Bounds

select_actions(**kwargs)[source]#

Select actions by sampling their modulus from a Gaussian distribution. This method is called during the act operation.

Parameters

**kwargs – Additional keyword arguments required for selecting actions.

Return type

fragile.core.typing.Tensor
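One plausible reading of GaussianModulus, inferred from the class name alone: sample a unit-norm direction per walker and scale it by a Gaussian-distributed modulus. Purely illustrative; the library may implement something different.

```python
import numpy as np

def sample_gaussian_modulus(loc, scale, n_walkers, n_dims, rng=None):
    """Hypothetical GaussianModulus sketch: random direction times a
    Gaussian-distributed magnitude. The semantics are assumed, not documented.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Random directions, normalized to unit length per walker.
    directions = rng.normal(size=(n_walkers, n_dims))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Non-negative modulus drawn from a Gaussian with the given loc and scale.
    modulus = np.abs(rng.normal(loc, scale, size=(n_walkers, 1)))
    return directions * modulus

acts = sample_gaussian_modulus(loc=0.0, scale=1.0, n_walkers=4, n_dims=2)
assert acts.shape == (4, 2)
```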