2.2.3. FL Repository enabler

2.2.3.1. Introduction 

The FL Repository is a set of databases that store initial ML algorithms, already trained ML models suitable for specific data sets and formats, averaging approaches, and auxiliary repositories for additional functionalities that may be needed but have not been specifically identified yet.

2.2.3.2. Features 

Provide storage for FL-related data such as initial ML algorithms, already trained ML models suitable for specific data sets and formats, averaging approaches, and auxiliary repositories for additional functionalities that may be needed but have not been specifically identified yet.
Provide interfaces to put and retrieve data from different components of the enabler.
Communicate with other FL enablers.

2.2.3.3. Place in architecture 

The FL Repository enabler is one of the Federated Learning enablers that together make it possible to deploy a federated learning environment. Functionally, it operates on the scalability and manageability verticals of the Assist-IoT architecture.

More specifically, the following figure provides the semantic diagram of the enabler:

Semantic Diagram of FL Repository Enabler

2.2.3.4. User guide 

Interactions with this enabler are done through a REST API. In the FL environment, this enabler interacts with the FL Orchestrator, FL Training Collector and FL Local Operations enablers.

Method	Endpoint	Description
POST	/model	Add a new ML model to the repository
PUT	/model/{id}/{version}	Update a model that is already in the repository under the given identifier and version
PUT	/model/meta/{id}/{version}	Update the metadata of a model that is already in the repository under the given identifier and version
GET	/model	Retrieve the list of all models stored in the repository
GET	/model/{id}/{version}	Retrieve the model with a specific identifier and version
DELETE	/model/{id}/{version}	Delete the model with a specific identifier and version
POST	/algorithm	Add a new ML algorithm to the repository
PUT	/algorithm/{name}/{version}	Update an algorithm that is already in the repository with the given name and version
PUT	/algorithm/meta/{name}/{version}	Update the metadata of an algorithm that is already in the repository with the given name and version
GET	/algorithm	Retrieve the list of all ML algorithms stored in the repository
GET	/algorithm/{name}/{version}	Retrieve the ML algorithm identified by the given name and version
DELETE	/algorithm/{name}/{version}	Delete the ML algorithm with a specific name and version
POST	/collector	Add a new ML training collector algorithm to the repository
PUT	/collector/{name}/{version}	Update an ML training collector algorithm that is already in the repository with the given name and version
PUT	/collector/meta/{name}/{version}	Update the metadata of an ML training collector algorithm that is already in the repository with the given name and version
GET	/collector	Retrieve the list of all ML training collector algorithms stored in the repository
GET	/collector/{name}/{version}	Retrieve the ML training collector algorithm identified by the given name and version
DELETE	/collector/{name}/{version}	Delete the ML training collector algorithm with a specific name and version
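
The endpoint paths in the table can be assembled programmatically. The following is a minimal sketch using only the Python standard library; it builds requests but does not send them. The base URL, the helper name build_request, and the use of a JSON body are assumptions made for illustration, since this guide does not specify the host, port, or payload format of the enabler.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote

# Assumption: address of a running FL Repository instance (not given in this guide).
BASE_URL = "http://localhost:8000"

RESOURCES = ("model", "algorithm", "collector")


def build_request(method: str, resource: str, *parts: str,
                  payload: Optional[dict] = None,
                  meta: bool = False) -> urllib.request.Request:
    """Build (but do not send) a request for one of the documented endpoints.

    parts holds the identifier (or name) and the version when the endpoint
    needs them; meta=True selects the /<resource>/meta/... variant.
    """
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource}")
    segments = [resource]
    if meta:
        segments.append("meta")
    # Percent-encode each path parameter so it stays a single path segment.
    segments.extend(quote(part, safe="") for part in parts)
    url = BASE_URL + "/" + "/".join(segments)
    # Assumption: payloads are sent as JSON; the real enabler may expect files.
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


list_models = build_request("GET", "model")
get_model = build_request("GET", "model", "my-model", "1.0")
update_meta = build_request("PUT", "algorithm", "my-algo", "2",
                            payload={"description": "demo"}, meta=True)
```

A request built this way could be sent with urllib.request.urlopen(get_model) once the enabler is reachable at the assumed address.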

2.2.3.5. Prerequisites 

The main prerequisites are Docker and docker-compose, which are needed when running the enabler as a (Docker) container. It is also possible to run the component independently. In that case, Python must be installed on the machine where the enabler will be executed; version 3.8 or later is recommended (3.8 is the version of the Python image being used). The additional libraries and packages that must be installed are listed in the requirements.txt file (inside the application folder).

2.2.3.6. Installation 

The installation procedure for this enabler is under development and will be provided once the release of the enabler is completed.

2.2.3.7. Configuration options 

There are no configuration options for this enabler.

2.2.3.8. Developer guide 

2.2.3.8.1. Components

2.2.3.8.1.1. ML Algorithms Libraries

These libraries will be used by local nodes to instantiate local processes. The libraries (modules) will be stored in a way similar to standard ML libraries. The repository will make available ML algorithms that can be used for either regular ML modelling or FL modelling. Moreover, as is common with external ML modules, the appropriate ML library modules are to be downloaded to the local node, installed, and used to complete model training.

2.2.3.8.1.2. FL Collectors

As described in the FL Training Collector enabler, different Federated averaging algorithms can be applied to combine local results. This component of the FL repository will store them.
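
This guide does not name the averaging algorithms that will be stored. As an illustration of the kind of algorithm meant, the following minimal sketch implements the widely used sample-weighted federated averaging (FedAvg) step on plain Python lists. The function name and the flat-list representation of model parameters are assumptions for the example, not the interface of the enabler.

```python
from typing import List, Sequence


def federated_average(local_weights: Sequence[Sequence[float]],
                      sample_counts: Sequence[int]) -> List[float]:
    """Combine local parameter vectors into one, weighting each node
    by the number of samples it trained on."""
    if not local_weights or len(local_weights) != len(sample_counts):
        raise ValueError("need one sample count per local weight vector")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(local_weights[0])
    if any(len(weights) != dim for weights in local_weights):
        raise ValueError("all weight vectors must have the same length")
    # Weighted mean, computed independently for each parameter position.
    return [
        sum(weights[i] * count
            for weights, count in zip(local_weights, sample_counts)) / total
        for i in range(dim)
    ]


# Two nodes: the second trained on three times as many samples as the first.
averaged = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(averaged)  # [2.5, 3.5]
```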

2.2.3.8.1.3. ML Model Libraries

The repository will also persist trained ML models. How these models are stored depends on which of two “scenarios” applies.

If the enabler is installed on a local node, it will store models that are currently in training and/or are “in use” by this node.
If the repository is instantiated in some “more central location”, it will store current versions of shared models (including initial models). Here, depending on the topology, shared models may represent a group of nodes (e.g., when mediators are used), or be common to all nodes.
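
The two scenarios can be pictured as two instances of the same store, both keyed by model identifier and version as in the REST API. The following is a minimal in-memory sketch; the class ModelStore and its methods are hypothetical stand-ins and do not reflect the enabler's actual persistence layer.

```python
from typing import Dict, List, Tuple


class ModelStore:
    """Hypothetical in-memory stand-in for a model library keyed by (id, version)."""

    def __init__(self) -> None:
        self._models: Dict[Tuple[str, str], bytes] = {}

    def put(self, model_id: str, version: str, blob: bytes) -> None:
        # Adding and updating are the same operation in this sketch.
        self._models[(model_id, version)] = blob

    def get(self, model_id: str, version: str) -> bytes:
        return self._models[(model_id, version)]

    def versions(self, model_id: str) -> List[str]:
        return sorted(v for (m, v) in self._models if m == model_id)


shared_store = ModelStore()  # repository in a "more central location"
local_store = ModelStore()   # repository installed on a local node

# The central instance holds the initial shared model ...
shared_store.put("demo", "1", b"initial weights")
# ... which a local node copies and then trains further.
local_store.put("demo", "1", shared_store.get("demo", "1"))
local_store.put("demo", "2", b"locally trained weights")
```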

2.2.3.8.1.4. Auxiliary

Any other modules that may be needed to instantiate FL can also be stored in the FL Repository. Possible examples include modules related to process verification, error handling, stopping criteria, and authorization.

2.2.3.8.1.5. Local communication

Communication between external entities and the enabler.

2.2.3.8.2. Technologies

2.2.3.8.2.1. RDF

The W3C Resource Description Framework (RDF) is a standard for representing information on the Web, designed as a data model for metadata. It is one of the foundations of semantic technologies. It will provide a flexible and adaptable model for ML algorithm metadata or any auxiliary data. Components: ML Algorithms Libraries, Auxiliary
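
RDF describes resources as subject-predicate-object triples. The following minimal sketch shows how the metadata of one ML algorithm might be expressed as triples and written out in N-Triples syntax using only the Python standard library. The namespace, the property names, and the helper functions are hypothetical; this guide does not define the vocabulary actually used by the enabler.

```python
from typing import List, Tuple

# Hypothetical namespace and property names, for illustration only.
NS = "http://example.org/fl-repository/"

Triple = Tuple[str, str, str]  # (subject IRI, predicate IRI, literal value)


def algorithm_metadata(name: str, version: str, framework: str) -> List[Triple]:
    """Describe one algorithm as a list of RDF triples with literal objects."""
    subject = f"{NS}algorithm/{name}/{version}"
    return [
        (subject, NS + "name", name),
        (subject, NS + "version", version),
        (subject, NS + "framework", framework),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples whose objects are plain string literals as N-Triples."""
    def escape(text: str) -> str:
        return text.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join(f'<{s}> <{p}> "{escape(o)}" .' for s, p, o in triples)


triples = algorithm_metadata("demo-algo", "1", "keras")
print(to_ntriples(triples))
```

Because every statement is an independent triple, new properties can be added later without changing a fixed schema, which is the flexibility this section refers to.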

2.2.3.8.2.2. FedML

Research library and benchmark for Federated ML containing federated algorithms and optimizers. Components: FL Collectors, Auxiliary

2.2.3.8.2.3. Python

Python is an interpreted, high-level, general-purpose programming language with a set of libraries. It is very popular for data analysis and ML applications. Component: Local communication

2.2.3.8.2.4. FastAPI

A popular web microframework written in Python, FastAPI is known for being both robust and high performing. It is based on OpenAPI (previously Swagger) standards. Component: Local communication

2.2.3.8.2.5. MongoDB

MongoDB is a source-available, cross-platform, document-oriented database program, classified as a NoSQL database. Components: ML Model Libraries, Auxiliary