:py:mod:`fragile.core.walkers`
==============================

.. py:module:: fragile.core.walkers


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   fragile.core.walkers.SimpleWalkers
   fragile.core.walkers.ScoreMetric
   fragile.core.walkers.RewardScore
   fragile.core.walkers.SonicScore
   fragile.core.walkers.MarioScore
   fragile.core.walkers.DiversityMetric
   fragile.core.walkers.RandomDistance
   fragile.core.walkers.Walkers
   fragile.core.walkers.ExplorationWalkers
   fragile.core.walkers.NoBalance


Functions
~~~~~~~~~

.. autoapisummary::

   fragile.core.walkers.l2_norm


.. py:class:: SimpleWalkers(accumulate_reward = True, score_scale = 1.0, diversity_scale = 1.0, minimize = False, **kwargs)

   Bases: :py:obj:`fragile.core.api_classes.WalkersAPI`

   The WalkersAPI class defines the base functionality for managing walkers in a swarm simulation. This class inherits from the SwarmComponent class.

   .. py:attribute:: default_inputs

   .. py:attribute:: default_param_dict

   .. py:attribute:: default_outputs

   .. py:method:: run_epoch(observs, rewards, scores, oobs=None, inplace = True, **kwargs)

      Implement the functionality for running an epoch in the derived class. This method is called during the balance operation.

      :param inplace: If ``True``, updates the swarm state with the data generated during the epoch. If ``False``, returns the data. Defaults to ``True``.
      :type inplace: bool, optional
      :param \*\*kwargs: Additional keyword arguments required for running the epoch.

      :returns: A dictionary containing the data generated during the epoch if ``inplace`` is ``False``. Otherwise, returns ``None``.
      :rtype: StateData


.. py:function:: l2_norm(x, y)

   Compute the L2 distance (the L2 norm of the difference) between two tensors.

   :param x: The first tensor.
   :type x: Tensor
   :param y: The second tensor.
   :type y: Tensor

   :returns: The L2 distance between the two input tensors.
   :rtype: Tensor


.. py:class:: ScoreMetric(swarm = None, param_dict = None, inputs = None, outputs = None)

   Bases: :py:obj:`fragile.core.api_classes.WalkersMetric`

   A base class for score metrics in a swarm of walkers.

   .. attribute:: default_param_dict

      Default parameters dictionary for the score metric.

      :type: dict

   .. attribute:: default_outputs

      Tuple containing the default output keys.

      :type: tuple

   .. py:attribute:: default_param_dict

   .. py:attribute:: default_outputs


.. py:class:: RewardScore(accumulate_reward = True, keep_max_reward = False, **kwargs)

   Bases: :py:obj:`ScoreMetric`

   A class representing the reward score metric for a swarm of walkers.

   .. attribute:: default_inputs

      Default inputs dictionary for the reward score metric.

      :type: dict

   .. py:attribute:: default_inputs

   .. py:method:: inputs()
      :property:

      Return a dictionary containing the data that this component needs to function.

   .. py:method:: calculate(rewards = None, scores=None, **kwargs)

      Calculate the scores for the walkers based on rewards.

      :param rewards: Array of walkers' rewards.
      :type rewards: array
      :param scores: Array of walkers' scores.
      :type scores: array, optional
      :param \*\*kwargs: Additional keyword arguments.

      :returns: A dictionary containing the scores.
      :rtype: dict

   .. py:method:: reset(inplace = True, root_walker = None, states = None, **kwargs)

      Reset the reward score metric.

      :param inplace: Whether to reset in place. Defaults to ``True``.
      :type inplace: bool, optional
      :param root_walker: The root walker to reset. Defaults to ``None``.
      :type root_walker: StateData, optional
      :param states: The state data to reset. Defaults to ``None``.
      :type states: Optional[StateData], optional
      :param \*\*kwargs: Additional keyword arguments.


.. py:class:: SonicScore(swarm = None, param_dict = None, inputs = None, outputs = None)

   Bases: :py:obj:`ScoreMetric`

   A base class for score metrics in a swarm of walkers.

   .. attribute:: default_param_dict

      Default parameters dictionary for the score metric.

      :type: dict

   .. attribute:: default_outputs

      Tuple containing the default output keys.

      :type: tuple

   .. py:attribute:: accumulate_reward
      :annotation: = False

   .. py:attribute:: default_inputs

   .. py:method:: score_from_info(info)
      :staticmethod:

   .. py:method:: calculate(rewards, scores=None, **kwargs)


.. py:class:: MarioScore(swarm = None, param_dict = None, inputs = None, outputs = None)

   Bases: :py:obj:`ScoreMetric`

   A base class for score metrics in a swarm of walkers.

   .. attribute:: default_param_dict

      Default parameters dictionary for the score metric.

      :type: dict

   .. attribute:: default_outputs

      Tuple containing the default output keys.

      :type: tuple

   .. py:attribute:: name
      :annotation: = MarioScore

   .. py:attribute:: accumulate_reward
      :annotation: = False

   .. py:attribute:: default_inputs

   .. py:method:: score_from_info(info)
      :staticmethod:

   .. py:method:: calculate(rewards=None, scores=None, **kwargs)


.. py:class:: DiversityMetric(swarm = None, param_dict = None, inputs = None, outputs = None)

   Bases: :py:obj:`fragile.core.api_classes.WalkersMetric`

   A base class for diversity metrics in a swarm of walkers.

   .. attribute:: default_param_dict

      Default parameters dictionary for the diversity metric.

      :type: dict

   .. attribute:: default_outputs

      Tuple containing the default output keys.

      :type: tuple

   .. py:attribute:: default_param_dict

   .. py:attribute:: default_outputs


.. py:class:: RandomDistance(swarm = None, param_dict = None, inputs = None, outputs = None)

   Bases: :py:obj:`DiversityMetric`

   A class representing the random distance diversity metric for a swarm of walkers.

   .. attribute:: default_inputs

      Default inputs dictionary for the random distance diversity metric.

      :type: dict

   .. py:attribute:: use_pbc
      :annotation: = False

   .. py:attribute:: default_inputs

   .. py:method:: calculate(observs, oobs, **kwargs)

      Calculate the diversities for the walkers based on the L2 distance to another walker chosen at random.

      :param observs: Array of walkers' observations.
      :type observs: array
      :param oobs: Array of out-of-bounds flags for walkers.
      :type oobs: array
      :param \*\*kwargs: Additional keyword arguments.

      :returns: A dictionary containing the diversities.
      :rtype: dict


.. py:class:: Walkers(score = None, diversity = None, minimize = False, score_scale = 1.0, diversity_scale = 1.0, track_data=None, accumulate_reward = True, keep_max_reward = False, clone_period = 1, freeze_walkers=True, **kwargs)

   Bases: :py:obj:`fragile.core.api_classes.WalkersAPI`

   Walkers class handles the walkers' dynamics, including scoring, diversity, cloning, and resetting.

   Inherits from WalkersAPI.

   .. py:attribute:: default_param_dict

   .. py:attribute:: default_outputs
      :annotation: = ['compas_clone', 'virtual_rewards', 'clone_probs', 'will_clone']

   .. py:attribute:: default_inputs

   .. py:method:: param_dict()
      :property:

      Return the dictionary defining all the data attributes that the component requires.

   .. py:method:: inputs()
      :property:

      Return a dictionary containing the data that this component needs to function.

   .. py:method:: outputs()
      :property:

      Return a tuple containing the names of the data attributes that the component outputs.

   .. py:method:: setup(swarm)

      Set up the Walkers object by initializing its swarm.

      :param swarm: Swarm object to be initialized.

   .. py:method:: balance(inplace = True, **kwargs)

      Balance the walkers.

      :param inplace: Whether to perform the operation in place. Defaults to ``True``.
      :type inplace: bool, optional
      :param \*\*kwargs: Additional keyword arguments.

      :returns: ``None`` if ``inplace=True``, otherwise a StateData object.
      :rtype: Union[None, StateData]

   .. py:method:: run_epoch(inplace = True, oobs=None, **kwargs)

      Execute an epoch in the walkers' dynamics.

      :param inplace: Whether to perform the operation in place. Defaults to ``True``.
      :type inplace: bool, optional
      :param oobs: Out-of-bounds flags.
      :param \*\*kwargs: Additional keyword arguments.

      :returns: State data after the epoch execution.
      :rtype: StateData

   .. py:method:: calculate_virtual_reward(scores, diversities, **kwargs)

      Calculate the virtual rewards for walkers based on scores and diversities.

      :param scores: Array of walkers' scores.
      :type scores: array
      :param diversities: Array of walkers' diversities.
      :type diversities: array
      :param \*\*kwargs: Additional keyword arguments.

      :returns: A dictionary containing the virtual rewards.
      :rtype: dict

   .. py:method:: calculate_clones(virtual_rewards, oobs=None)

      Calculate the walkers that will clone and their target companions.

      :param virtual_rewards: Array of walkers' virtual rewards.
      :type virtual_rewards: array
      :param oobs: Array of out-of-bounds flags for walkers. Defaults to ``None``.
      :type oobs: array, optional

      :returns: A dictionary containing clone probabilities, will_clone flags, companion clones, and active flags.
      :rtype: dict

   .. py:method:: reset(inplace = True, **kwargs)

      Reset the Walkers object and its score and diversity components.

      :param inplace: Whether to perform the operation in place. Defaults to ``True``.
      :type inplace: bool, optional
      :param \*\*kwargs: Additional keyword arguments.


.. py:class:: ExplorationWalkers(exploration_scale = 1.0, **kwargs)

   Bases: :py:obj:`Walkers`

   Walkers class handles the walkers' dynamics, including scoring, diversity, cloning, and resetting.

   Inherits from WalkersAPI.

   .. py:method:: explore_counts()
      :property:

   .. py:method:: calculate_virtual_reward(scores, diversities, **kwargs)

      Apply the virtual reward formula to account for all the different goal scores.

   .. py:method:: get_explore_rewards()

   .. py:method:: get_coords_keys()


.. py:class:: NoBalance(score = None, diversity = None, minimize = False, score_scale = 1.0, diversity_scale = 1.0, track_data=None, accumulate_reward = True, keep_max_reward = False, clone_period = 1, freeze_walkers=True, **kwargs)

   Bases: :py:obj:`Walkers`

   A class representing walkers with no balancing behavior in a swarm.

   Inherits from the Walkers class and modifies the properties and methods to disable balancing.

   .. py:method:: param_dict()
      :property:

      Return the dictionary defining all the data attributes that the component requires.

   .. py:method:: inputs()
      :property:

      Return a dictionary containing the data that this component needs to function.

   .. py:method:: outputs()
      :property:

      Return a tuple containing the names of the data attributes that the component outputs.

   .. py:method:: run_epoch(inplace = True, oobs=None, **kwargs)

      Run an epoch for the walkers without balancing.

      :param inplace: Whether to run the epoch in place. Defaults to ``True``.
      :type inplace: bool, optional
      :param oobs: Array of out-of-bounds walkers. Defaults to ``None``.
      :type oobs: array, optional
      :param \*\*kwargs: Additional keyword arguments.

      :returns: A dictionary containing the scores.
      :rtype: dict
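
The ``RandomDistance.calculate`` entry above describes a diversity score based on the L2 distance from each walker to another walker chosen at random. The following is a minimal standalone sketch of that idea using only the Python standard library. It is not fragile's implementation: the helper names ``l2_distance_sketch`` and ``random_distance_sketch``, the ``"diversities"`` and ``"companions"`` dictionary keys, and the rule of drawing companions only from in-bounds walkers are all assumptions made for illustration.

```python
import math
import random


def l2_distance_sketch(x, y):
    """Row-wise L2 distance between two equally sized batches of vectors.

    Hypothetical helper: it mirrors the documented purpose of ``l2_norm``
    but is not the library function.
    """
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(row_x, row_y)))
        for row_x, row_y in zip(x, y)
    ]


def random_distance_sketch(observs, oobs, rng):
    """Distance from each walker to a companion picked uniformly at random.

    Assumption: companions are drawn with replacement from walkers that are
    in bounds; if every walker is out of bounds, any walker may be picked.
    """
    in_bounds = [i for i, out in enumerate(oobs) if not out]
    pool = in_bounds or list(range(len(observs)))
    companions = [rng.choice(pool) for _ in observs]
    picked = [observs[j] for j in companions]
    # The dictionary keys are illustrative, not fragile's output names.
    return {
        "diversities": l2_distance_sketch(observs, picked),
        "companions": companions,
    }


# A 3-4-5 triangle gives an exact distance of 5.0.
print(l2_distance_sketch([[0.0, 0.0]], [[3.0, 4.0]]))

rng = random.Random(0)  # seeded only to make the run repeatable
observs = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
oobs = [False, False, True]  # the third walker is out of bounds
print(random_distance_sketch(observs, oobs, rng))
```

In this sketch a walker may draw itself as its companion, which yields a diversity of zero for that walker. Whether fragile permits self-pairing is not stated in this reference.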