.. include:: color_roles.rst

Architecture
============

Class descriptions
------------------

The fragile framework is designed to quickly implement and test new FractalAI-based algorithms.
The different classes are combined using composition to isolate the different aspects
of the algorithms:

* :swarm:`Swarm`: Defines the main computation loop of the algorithm and coordinates all the other classes involved in it.

The following classes represent different aspects of modelling a given problem.
They are stateless: all the data they produce is written to the :states:`States`.

* :env:`Environment`: It represents the problem that is being solved, and it
  allows sampling new states as the algorithm evolves.

* :model:`Model`: This implements the policy used to sample new actions that
  will be applied to the :env:`Environment`.

* :walkers:`Walkers`: This class takes care of the data processing necessary
  to implement the FractalAI algorithms.

All the data needed to make a :swarm:`Swarm` evolve is stored inside :states:`States`, and all the
historical data generated is managed by the :tree:`StateTree`.

The :states:`States` classes provide functionality for storing, accessing, and cloning the data of
their respective classes.
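
A minimal sketch of such a container, assuming that every attribute is a NumPy array whose
first axis indexes the walkers. ``SketchStates`` and its methods are hypothetical names chosen
for illustration; they are not the fragile API.

.. code-block:: python

    import copy

    import numpy as np


    class SketchStates:
        """Hypothetical container holding one row of every array per walker."""

        def __init__(self, **arrays):
            self._names = list(arrays)
            for name, value in arrays.items():
                # np.array copies the input, so the caller's data is never mutated.
                setattr(self, name, np.array(value))

        def __getitem__(self, name):
            # Access a stored array by its attribute name.
            return getattr(self, name)

        def copy(self):
            # Independent snapshot of all the stored arrays.
            return copy.deepcopy(self)

        def clone(self, will_clone, compas_ix):
            # Walkers flagged in will_clone overwrite their rows with the rows
            # of their companions; all the other walkers are left untouched.
            for name in self._names:
                data = getattr(self, name)
                data[will_clone] = data[compas_ix][will_clone]


    states = SketchStates(observs=[[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
                          rewards=[0.0, 1.0, 2.0])
    snapshot = states.copy()
    states.clone(will_clone=np.array([True, False, False]),
                 compas_ix=np.array([2, 0, 1]))

Here only the first walker clones, so it takes the data of its companion (walker 2), while
``snapshot`` keeps the values from before the cloning step.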

* :tree:`StateTree`: Data structure that stores the data generated by the algorithm, keeping
  track of its history as a directed acyclic graph.

* :env-st:`StatesEnv`: Contains the data provided by the :env:`Environment`,
  and it describes the problem being solved. It contains at least the following attributes:

    - **states**: This data tracks the internal state of the :class:`Environment` simulation,
      and it is only used to save and restore that state.

    - **observs**: This is the data that corresponds to the observations of the current
      :class:`Environment` state. The observations are used for calculating distances.

    - **rewards**: This vector contains the rewards associated with each observation.

    - **oobs**: Stands for **Out Of Bounds**. It is a vector of booleans that represents an arbitrary
      boundary condition. If a value is ``True``, the corresponding state will be treated as
      being outside the :class:`Environment` domain. The states considered out of bounds
      will be avoided by the sampling algorithms.

    - **terminals**: Vector of booleans representing the successful termination of an environment.
      A ``True`` value indicates that the :class:`Environment` has successfully reached a
      terminal state that is not out of bounds.

* :model-st:`StatesModel`: Contains the data provided by the :class:`Model`. This data describes the
  internal state of the :model:`Model`, and the actions applied to the :env:`Environment`.

    - **actions**: Array containing the actions that will be applied to the states.

* :walkers-st:`StatesWalkers`: This data describes the internal state of the :walkers:`Walkers`.

    - **id_walkers**: Array of integers that uniquely identify a given state. They are obtained by hashing the states.

    - **compas_clone**: Array of integers containing the index of the walkers selected as companions for cloning.

    - **processed_rewards**: Array of normalized rewards. It contains positive values with an
      average of 1. Values greater than one correspond to rewards above the average, and values
      lower than one correspond to rewards below the average.

    - **virtual_rewards**: Array containing the virtual rewards assigned to each walker.

    - **cum_rewards**: Array of rewards used to compute the virtual rewards. This value can
      accumulate the rewards provided by the :class:`Environment` during an algorithm run.

    - **distances**: Array containing the similarity metric of each walker used to compute the virtual rewards.

    - **clone_probs**: Array containing the probability that a walker clones to its companion during the cloning phase.

    - **will_clone**: Boolean array. A ``True`` value indicates that the corresponding walker will clone to its companion.

    - **in_bounds**: Boolean array. A ``True`` value indicates that a walker is in the domain defined by the :class:`Environment`.

The :walkers-st:`StatesWalkers` also track the data associated with the best state found during the algorithm run:

    - **best_state**: State of the walker with the best ``cum_reward`` found during the algorithm run.

    - **best_obs**: Observation corresponding to the ``best_state``.

    - **best_reward**: Best ``cum_reward`` found during the algorithm run.

    - **best_id**: Integer representing the hash of the ``best_state``.
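
The bookkeeping above can be sketched as follows. The function ``sketch_update_best``, its
arguments, the use of Python's built-in ``hash`` to build the ids, and the choice to maximise
``cum_rewards`` over in-bounds walkers only are assumptions made for illustration, not the
library's implementation.

.. code-block:: python

    import numpy as np


    def sketch_update_best(best, states, observs, cum_rewards, id_walkers, in_bounds):
        """Return the best record, replaced if an in-bounds walker improves on it."""
        if not in_bounds.any():
            return best
        # Out-of-bounds walkers can never be selected as the best one (assumption).
        masked = np.where(in_bounds, cum_rewards, -np.inf)
        ix = int(np.argmax(masked))
        if masked[ix] > best["best_reward"]:
            return {
                "best_state": states[ix].copy(),
                "best_obs": observs[ix].copy(),
                "best_reward": float(cum_rewards[ix]),
                "best_id": int(id_walkers[ix]),
            }
        return best


    states = np.array([[0, 0], [1, 1], [2, 2]])
    observs = states * 10.0
    cum_rewards = np.array([1.0, 5.0, 9.0])
    # One integer id per walker, obtained by hashing the raw bytes of its state.
    id_walkers = np.array([hash(s.tobytes()) for s in states])
    in_bounds = np.array([True, True, False])

    best = {"best_state": None, "best_obs": None,
            "best_reward": -np.inf, "best_id": None}
    best = sketch_update_best(best, states, observs, cum_rewards, id_walkers, in_bounds)

The walker with the highest ``cum_rewards`` is out of bounds in this example, so the record
ends up pointing to the second walker.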


Class dependency
----------------

.. figure:: /../source/resources/images/fragile_architecture.png
   :alt: Composition relationships among the different fragile classes.

   Composition relationship among the different classes

Swarm algorithm loop
--------------------
When :swarm:`run_swarm` is called, the algorithm loop runs the following methods of the classes listed above.

Each method name is colored according to the class that implements it. The :states:`States` classes that a method reads
are listed in parentheses after its name, and the :states:`States` that it modifies are listed after the ``->``.

A.  :swarm:`reset`

    1. :env:`reset` -> :env-st:`StatesEnv (SE)`

    2. :model:`reset` -> :model-st:`StatesModel (SM)`

    3. :walkers:`reset` -> :walkers-st:`StatesWalkers (SW)`

    4. :tree:`reset`

B. While :walkers:`calculate_end_condition` is false, call :swarm:`run_step`:

    1. :swarm:`step_and_update_best`

        1.1 :walkers:`update_best` (:env-st:`SE`) -> :walkers-st:`SW`

        1.2 :walkers:`fix_best` (:walkers-st:`SW`) -> :walkers-st:`SW`, :env-st:`SE`

        1.3 :swarm:`step_walkers`

            1.3.1 :model:`predict` (:env-st:`SE`, :model-st:`SM`, :walkers-st:`SW`) -> :model-st:`SM`

            1.3.2 :env:`step` (:env-st:`SE`, :model-st:`SM`, :walkers-st:`SW`) -> :env-st:`SE`

            1.3.3 :walkers:`update_states` (:env-st:`SE`, :model-st:`SM`) -> :env-st:`SE`, :model-st:`SM`, :walkers-st:`SW`

            1.3.4 :walkers:`update_id` (:walkers-st:`SW`) -> :walkers-st:`SW`

            1.3.5 :tree:`add_states` (:env-st:`SE`, :model-st:`SM`, :walkers-st:`SW`)

    2. :swarm:`balance_and_prune`

        2.1 :walkers:`balance`

        2.2 :walkers:`calculate_distances` (:env-st:`SE`) -> :walkers-st:`SW`

        2.3 :walkers:`calculate_virtual_reward` (:walkers-st:`SW`) -> :walkers-st:`SW`

        2.4 :walkers:`update_clone_probs` (:walkers-st:`SW`) -> :walkers-st:`SW`

        2.5 :walkers:`clone_walkers` (:walkers-st:`SW`) -> :walkers-st:`SW`, :env-st:`SE`, :model-st:`SM`

        2.6 :tree:`prune_tree`
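
The call order listed above can be sketched with stand-in objects that only record which method
was invoked. ``Recorder``, ``SketchWalkers``, ``SketchSwarm``, the ``max_epochs`` stopping rule
and the ``epoch`` counter are hypothetical and exist only for this illustration. The method
names come from the outline, the calls are made in the order in which they are listed there,
and the data passed between the classes is omitted.

.. code-block:: python

    class Recorder:
        """Hypothetical stand-in that logs every method called on it."""

        def __init__(self, name, log):
            self.name, self.log = name, log

        def __getattr__(self, method):
            # Only reached for attributes that are not defined explicitly.
            def call(*args, **kwargs):
                self.log.append(f"{self.name}.{method}")
            return call


    class SketchWalkers(Recorder):
        def __init__(self, log, max_epochs):
            super().__init__("walkers", log)
            self.epoch, self.max_epochs = 0, max_epochs

        def reset(self):
            self.epoch = 0
            self.log.append("walkers.reset")

        def calculate_end_condition(self):
            # Assumed stopping rule: a fixed number of epochs.
            return self.epoch >= self.max_epochs


    class SketchSwarm:
        def __init__(self, max_epochs=1):
            self.log = []
            self.env = Recorder("env", self.log)
            self.model = Recorder("model", self.log)
            self.tree = Recorder("tree", self.log)
            self.walkers = SketchWalkers(self.log, max_epochs)

        def reset(self):
            self.env.reset()
            self.model.reset()
            self.walkers.reset()
            self.tree.reset()

        def step_walkers(self):
            self.model.predict()
            self.env.step()
            self.walkers.update_states()
            self.walkers.update_id()
            self.tree.add_states()

        def step_and_update_best(self):
            self.walkers.update_best()
            self.walkers.fix_best()
            self.step_walkers()

        def balance_and_prune(self):
            self.walkers.balance()
            self.walkers.calculate_distances()
            self.walkers.calculate_virtual_reward()
            self.walkers.update_clone_probs()
            self.walkers.clone_walkers()
            self.tree.prune_tree()

        def run_step(self):
            self.step_and_update_best()
            self.balance_and_prune()
            self.walkers.epoch += 1  # Assumed bookkeeping for the stopping rule.

        def run_swarm(self):
            self.reset()
            while not self.walkers.calculate_end_condition():
                self.run_step()


    swarm = SketchSwarm(max_epochs=1)
    swarm.run_swarm()
    print("\n".join(swarm.log))

Running the sketch prints one ``class.method`` entry per call, which makes it easy to compare
the order against the outline.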