MAFL: Model-Agnostic Federated Learning

MAFL: Model-Agnostic Federated Learning (formerly OpenFederatedLearning-extended (OpenFL-x)) is an open-source extension of Intel® OpenFL 1.4 supporting federated bagging and boosting of any ML model. The software is entirely Python-based and comes with extensive examples, described below, based on scikit-learn models. It has been successfully tested on x86_64, ARM, and RISC-V platforms.

Installation

It is highly recommended to create a virtual environment (with either venv or conda) before installing OpenFL-x, to avoid compatibility issues with Python software already installed on the system. Furthermore, since OpenFL-x is an extended version of OpenFL, it integrates all the features of the base framework plus the federated bagging and boosting functionalities: it is therefore advisable not to install both frameworks in the same virtual environment, since this can lead to issues with both of them.

You can get the latest version of OpenFL-x from PyPI:

pip install openfl-x

or, alternatively, you can clone this repository and run pip install:

git clone https://github.com/alpha-unito/OpenFL-extended.git
cd OpenFL-extended
pip install .

If this procedure completes successfully, you now have access to all the base features of OpenFL, plus the distributed bagging and boosting functionalities. Enjoy!

Getting Started

The quickest way to test OpenFL-x is to execute the examples available in the openfl-tutorials/boosting-examples folder. Many of them can be run out of the box, each employing a different dataset and a different number of participants in the federation:

| Name | Algorithm | Dataset | # Train samples | # Test samples | # Labels | # Features | # Envoys | ML model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AdaBoostF_Adult | Federated boosting | Adult | 30,132 | 15,060 | 2 | 14 | 10 | DecisionTreeClassifier |
| AdaBoostF_Iris | Federated boosting | Iris | 35 | 15 | 3 | 4 | 2 | DecisionTreeClassifier |
| AdaBoostF_forestcover | Federated boosting | ForestCover | 250,000 | 245,141 | 2 | 54 | 10 | DecisionTreeClassifier |
| AdaBoostF_krvskp | Federated boosting | KrvsKp | 2,557 | 639 | 2 | 36 | 10 | DecisionTreeClassifier |
| AdaBoostF_letter | Federated boosting | Letter | 16,000 | 4,000 | 26 | 16 | 10 | DecisionTreeClassifier |
| AdaBoostF_pendigits | Federated boosting | Pendigits | 7,494 | 3,498 | 10 | 16 | 10 | DecisionTreeClassifier |
| AdaBoostF_sat | Federated boosting | Sat | 4,435 | 2,000 | 8 | 36 | 10 | DecisionTreeClassifier |
| AdaBoostF_segmentation | Federated boosting | Segmentation | 209 | 2,099 | 7 | 19 | 10 | DecisionTreeClassifier |
| AdaBoostF_splice | Federated boosting | Splice | 2,552 | 638 | 3 | 61 | 10 | DecisionTreeClassifier |
| AdaBoostF_vowel | Federated boosting | Vowel | 792 | 198 | 11 | 27 | 10 | DecisionTreeClassifier |
| AdaBoostF_vehicle | Federated boosting | Vehicle | 677 | 169 | 4 | 18 | 10 | DecisionTreeClassifier |
| RandomForest_Iris | Federated bagging | Iris | 35 | 15 | 3 | 4 | 2 | DecisionTreeClassifier |

The user can customise each example by changing the data distribution across the envoys, the number of envoys, the ML model used as weak learner, and the federation's aggregation algorithm. More information on how to run and personalise the examples is available in the boosting-examples folder.
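
To illustrate the data-distribution aspect, the following is a minimal, framework-independent sketch of how a dataset could be split into per-envoy shards. The names, the number of envoys, and the uniform splitting strategy are assumptions made for illustration only and do not reflect OpenFL-x internals.

```python
# Illustrative sketch: partition a dataset into per-envoy shards.
# This is not OpenFL-x code; names and strategy are assumptions.
import numpy as np
from sklearn.datasets import load_iris

NUM_ENVOYS = 2  # hypothetical number of envoys in the federation

X, y = load_iris(return_X_y=True)

# Shuffle once, then split indices into roughly equal shards, one per envoy.
rng = np.random.default_rng(seed=0)
indices = rng.permutation(len(X))
shards = np.array_split(indices, NUM_ENVOYS)

# Each envoy would then train its weak learner on its own shard only.
for envoy_id, shard in enumerate(shards):
    X_local, y_local = X[shard], y[shard]
    print(f"envoy {envoy_id}: {len(shard)} samples")
```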

Example results of the execution of these tests and their variations are freely available on WandB. During the experimentation, many scikit-learn classifiers have been used as weak learners, such as ExtremelyRandomizedTree, RidgeLinearRegression, MultiLayerPerceptron, KNearestNeighbors, GaussianNaiveBayes, and DecisionTree.
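
As a hedged sketch, the weak learners listed above can be instantiated through their scikit-learn counterparts; the mapping between the names in the text and the classes below is an assumption, not OpenFL-x code.

```python
# Possible scikit-learn counterparts of the weak learners mentioned above
# (assumed mapping, for illustration only).
from sklearn.ensemble import ExtraTreesClassifier    # ExtremelyRandomizedTree
from sklearn.linear_model import RidgeClassifier     # RidgeLinearRegression
from sklearn.neural_network import MLPClassifier     # MultiLayerPerceptron
from sklearn.neighbors import KNeighborsClassifier   # KNearestNeighbors
from sklearn.naive_bayes import GaussianNB           # GaussianNaiveBayes
from sklearn.tree import DecisionTreeClassifier      # DecisionTree

# Any of these can serve as the weak learner trained locally by each envoy,
# as long as it exposes the standard fit/predict interface.
weak_learner = DecisionTreeClassifier(max_depth=10)
```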

Aggregation algorithms

OpenFL-x offers the possibility to experiment with basic federated bagging and with AdaBoost.F, a federated boosting method developed by Polato et al. [1]. That work also proposes two other federated versions of AdaBoost, namely DistBoost.F and PreWeak.F; however, AdaBoost.F has been selected since it provided the best experimental performance compared to the other two approaches.

Concisely, AdaBoost.F iteratively builds an AdaBoost model by selecting the best-performing weak learner at each federated round. Each round proceeds as follows (a simplified aggregator-side sketch is given after the list):

  1. The aggregator receives the dataset size N from each collaborator and sends them an initial version of the weak hypothesis.
  2. The aggregator receives the weak hypothesis h_i from each collaborator and broadcasts the entire hypothesis space to every collaborator.
  3. Each collaborator computes the errors ε of the received weak hypotheses on its local data and sends them to the aggregator.
  4. The aggregator exploits this error information to select the best weak hypothesis c, adds it to the global strong hypothesis, and sends the computed AdaBoost coefficient α to the collaborators.
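
The following is a simplified, framework-independent sketch of the aggregator-side selection step described above. It assumes a SAMME-style boosting coefficient and dataset-size-weighted averaging of the local errors; all names are illustrative, and the exact formulation used by AdaBoost.F is given in the paper cited below.

```python
# Simplified sketch of the aggregator-side step of an AdaBoost.F round:
# select the best weak hypothesis and compute the boosting coefficient.
# Illustrative only; not the actual OpenFL-x implementation.
import numpy as np

def aggregate_round(errors, sample_counts, num_classes):
    """errors[i][j]: error of hypothesis j reported by collaborator i;
    sample_counts[i]: dataset size N_i of collaborator i."""
    errors = np.asarray(errors, dtype=float)
    weights = np.asarray(sample_counts, dtype=float)
    weights /= weights.sum()

    # Global error of each hypothesis: local errors averaged with weights
    # proportional to each collaborator's dataset size.
    global_errors = weights @ errors

    # Select the best-performing weak hypothesis of this round.
    best = int(np.argmin(global_errors))
    eps = float(np.clip(global_errors[best], 1e-10, 1 - 1e-10))

    # SAMME-style AdaBoost coefficient (assumption; reduces to the classic
    # binary AdaBoost coefficient when num_classes == 2).
    alpha = np.log((1 - eps) / eps) + np.log(num_classes - 1)
    return best, alpha
```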

More details can be found in the original paper reported below.

[1] Polato, Mirko, Roberto Esposito, and Marco Aldinucci. "Boosting the federation: Cross-silo federated learning without gradient descent." 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 2022.

Performance optimisations

Together with the new aggregation algorithms made available, OpenFL-x also introduces several smaller optimisations to the original OpenFL code, all aiming to reduce execution time and improve the computational performance of the framework:

  1. we empirically optimised the buffer sizes used by gRPC to accommodate larger models and avoid resizing operations (∼1.5% execution time improvement over the base implementation);
  2. we employed the Cloudpickle serialisation framework over other available options, like dill and pickle (∼2.6% execution time improvement over the base implementation); a small illustration of this serialisation choice is given after this list;
  3. we modified the TensorDB to store only the essential information of the last two federation rounds (∼14.4% execution time improvement over the base implementation);
  4. we lowered the few sleeps present in the code from 10 seconds to 0.01 seconds, which, according to our experimentation on cluster systems, is the lowest value still yielding an improvement (∼48.2% execution time improvement over the base implementation).
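
As an illustration of the serialisation choice mentioned in point 2, the snippet below shows cloudpickle round-tripping a fitted scikit-learn model; cloudpickle.dumps/loads is the library's real API, but the model and its use here are illustrative and not the OpenFL-x transport code.

```python
# Illustrative snippet: cloudpickle can serialise arbitrary Python objects,
# including fitted scikit-learn models, which supports model-agnostic
# exchange between federation participants. Not OpenFL-x internals.
import cloudpickle
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(max_depth=3).fit(X, y)

payload = cloudpickle.dumps(model)      # bytes to be sent over the wire
restored = cloudpickle.loads(payload)   # reconstructed on the receiving side
assert (restored.predict(X) == model.predict(X)).all()
```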

These minor optimisations achieved an overall 5.46x speedup over the base software. This performance improvement makes it possible to run OpenFL-x, together with lightweight weak learners, on a wide range of computing platforms, including low-power ones.

Publication

Gianluca Mittone, Walter Riviera, Iacopo Colonnelli, Robert Birke, and Marco Aldinucci. "Model-Agnostic Federated Learning." In: Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, August 28 - September 1, 2023, Proceedings. Ed. by José Cano, Marios D. Dikaiakos, George A. Papadopoulos, Miquel Pericàs, and Rizos Sakellariou. Vol. 14100. Lecture Notes in Computer Science. Springer, Sept. 2023, pp. 383-396. DOI: 10.1007/978-3-031-39698-4_26. URL: https://doi.org/10.1007/978-3-031-39698-4_26

A pre-print version of this software's paper is available on arXiv.

Contacts

This software is developed and maintained by Gianluca Mittone (gianluca.mittone@unito.it).
