Scikit-learn benchmark takes very long time to finish (~18 hrs) when EDMM is enabled #1612

vasanth-intel · 2023-10-18T10:37:35Z

Description of the problem

When EDMM is enabled, the scikit-learn benchmark takes about 18 hours to finish 1 iteration with Gramine-SGX. With EDMM disabled, the benchmark completes 10 iterations successfully in ~20 hours with the Linux Native, Gramine-Direct and Gramine-SGX execution modes.

Steps to reproduce

Clone the benchmark from https://github.com/IntelPython/scikit-learn_bench.
Install the benchmark's dependencies with pip install, using requirements-common.txt and sklearn_bench/requirements.txt from the above repository.

OR

Save the below lines in requirements.txt and install using the command pip install -r requirements.txt.

tqdm
numpy==1.24.1
scipy==1.10.0
daal==2023.0.1
daal4py==2023.0.1
pandas==1.5.2
scikit-learn==1.2.0
dpcpp-cpp-rt==2023.0.0
scikit-learn-intelex==2023.0.1

Edit configs/skl_config.json to include kmeans and knn_clsf algorithms only.
Update the manifest to enable EDMM and generate the relevant SGX manifest.
Create a new results directory for the benchmark output.
Execute the below benchmark command.

gramine-sgx sklearnex runner.py --configs configs/skl_config.json --output-file results/sgx_output.json
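
The config-editing step above could be scripted roughly as follows. This is only a sketch: it assumes that configs/skl_config.json holds a top-level "cases" list whose entries each carry an "algorithm" key, which should be checked against the actual file. The helper names keep_algorithms and rewrite_config are hypothetical and not part of the benchmark.

```python
import json
from pathlib import Path


def keep_algorithms(config, keep):
    """Return a shallow copy of config with only the wanted algorithms.

    Assumption: config has a top-level "cases" list and every case has an
    "algorithm" key. All other top-level keys are passed through unchanged.
    """
    filtered = dict(config)
    filtered["cases"] = [
        case for case in config.get("cases", [])
        if case.get("algorithm") in keep
    ]
    return filtered


def rewrite_config(path, keep=("kmeans", "knn_clsf")):
    """Filter the JSON config at path in place (hypothetical helper)."""
    path = Path(path)
    config = json.loads(path.read_text())
    path.write_text(json.dumps(keep_algorithms(config, set(keep)), indent=4))
```

Calling rewrite_config("configs/skl_config.json") from the repository root would then leave only the kmeans and knn_clsf cases, provided the layout assumption holds.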

Expected results

With EDMM enabled in Gramine-SGX execution mode, the benchmark should complete its 10 iterations within ~20 hours.

Actual results

With EDMM enabled in Gramine-SGX execution mode, the benchmark takes ~18 hours to complete 1 iteration.

Gramine commit hash

master

dimakuv · 2023-10-19T09:43:33Z

@anjalirai-intel Have you tried the perf optimizations for EDMM? In particular, PR #1513

vasanth-intel · 2023-10-19T10:12:56Z

@dimakuv The above issue was first tested and observed on PR #1513. Later, it was also tested on master and on Gramine v1.5 with only the sgx.edmm_enable flag set to true. Since the issue is observed on master as well, @kailun-qin suggested tracking it against master.

dimakuv · 2023-10-19T11:19:41Z

@vasanth-intel So PR #1513 doesn't help, right? The performance overhead is still huge?

vasanth-intel · 2023-10-19T12:15:59Z

With the scikit-learn workload, we could not draw a conclusion about the performance overhead, because we were unable to run the workload for 10 iterations when sgx.edmm_enable is set to true. This holds for PR #1513, master and Gramine v1.5. We could run only 1 iteration, which took ~18 hours, and that is why the issue was raised.

monavij · 2023-11-04T01:04:41Z

I wonder if this workload does a lot of dynamic allocation AND deallocation. Maybe we will need "lazy free" optimization as well.
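
One way to probe the allocation/deallocation hypothesis above in isolation is a small churn microbenchmark that repeatedly maps, touches and unmaps anonymous memory, run once with sgx.edmm_enable set to true and once without. The sketch below is hypothetical and not part of scikit-learn_bench. It assumes that each map and unmap turns into dynamic enclave page work when EDMM is enabled, and it leaves out the Gramine manifest needed to run a Python script.

```python
import mmap
import time


def churn(iterations=1000, size=4 * 1024 * 1024):
    """Map, touch and unmap an anonymous region repeatedly.

    Returns the elapsed wall-clock time in seconds. The default iteration
    count and region size are arbitrary illustration values.
    """
    page = mmap.PAGESIZE
    start = time.perf_counter()
    for _ in range(iterations):
        region = mmap.mmap(-1, size)        # anonymous mapping
        for offset in range(0, size, page):
            region[offset] = 1              # touch every page so it is backed
        region.close()                      # unmap the region again
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = churn()
    print(f"{elapsed:.3f} s for 1000 map/touch/unmap cycles")
```

A large gap between the two runs would be consistent with the hypothesis, although it would not by itself show that the benchmark's own allocation pattern is the cause.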
