Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OVEP: Bug Fixes, Refactoring, and Contrib Ops Update #23742

Merged
merged 7 commits into from
Feb 21, 2025

Conversation

saurabhkale17
Copy link
Contributor

Description

This pull request combines multiple improvements, bug fixes for the OpenVINO Execution Provider (OVEP). The changes are summarized as follows:

  1. Support for various contrib Ops in OVEP.

  2. Dimension Check Fixes for Greater, Pad, and MAX Ops: Fixed dimension check failures for the Greater, Pad, and MAX ops in OVEP, ensuring they now pass validation for all supported models.

  3. Refactor Core and Shared Context Lifetimes: Refactored the lifetimes of the OpenVINO core and shared context to remove dependency on shutdown calls. This change avoids relying on static lifetime management and improves stability and resource cleanup.

  4. Fix for Duplicate DQ Node Removal: Addressed an issue where duplicate Dequantize (DQ) nodes that were initializers were incorrectly removed. Initializers should always be preserved, and this fix ensures that all duplicate DQ nodes that are initializers are retained.

ankitm3k and others added 7 commits February 17, 2025 14:55

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
* update: Update MSFT Contrib Ops from OV

* modified data_ops.cc to remove unsupported ops

* disabled tests for EmbedLayerNormalisation and MatMulNBits

---------

Co-authored-by: n1harika <niharika.sathish@intel.com>
* Internal ci for PTL 1.1 (#523)

* update: Update MSFT Contrib Ops in OVEP (#521)

* update: Update MSFT Contrib Ops from OV

* modified data_ops.cc to remove unsupported ops

* disabled tests for EmbedLayerNormalisation and MatMulNBits

---------

Co-authored-by: n1harika <niharika.sathish@intel.com>

* Add Max op to no_dimension_supported list

---------

Co-authored-by: jatinwadhwa921 <110383850+jatinwadhwa921@users.noreply.github.com>
Co-authored-by: Ankit Maheshkar <ankit.maheshkar@intel.com>
Co-authored-by: n1harika <niharika.sathish@intel.com>
Python bindings don't call the provider factory shutdown method. We relied on this
to avoid destruction order issues with statically scoped ov::Core objects.

 Refactor ov core and shared context lifetimes such that we don't need to rely on shutdown
 calls to manage life times and we avoid a static lifetime of the core.

Co-authored-by: Eric Crawford <eric.r.crawford@intel.com>
…582)

This change addresses an issue where a Pad op in quantized models fails due to an unsupported dimension input. The fix adds logic to detect if a Pad op is part of a quantized model by checking for a DequantizeLinear input. If found, the op is marked as quantized and the unsupported dimension check is bypassed, ensuring that the pad_value remains constant as required by the VPUX compiler.

Related-to: EISW-152222

Co-authored-by: Surendar Rama Sitaraman <surendar.rama.sitaraman@intel.com>
@saurabhkale17
Copy link
Contributor Author

@jywu-msft

@jywu-msft
Copy link
Member

/azp run Linux OpenVINO CI Pipeline

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jywu-msft jywu-msft requested a review from yihonglyu February 19, 2025 00:03
@yihonglyu
Copy link
Contributor

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, Linux QNN CI Pipeline

Copy link

Azure Pipelines successfully started running 8 pipeline(s).

@yihonglyu
Copy link
Contributor

/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows ARM64 QNN CI Pipeline, Windows x64 QNN CI Pipeline, Big Models

Copy link

Azure Pipelines successfully started running 5 pipeline(s).

@yihonglyu
Copy link
Contributor

/azp run Linux Android Emulator QNN CI Pipeline, Windows GPU CUDA CI Pipeline, Windows GPU DML CI Pipeline, Windows GPU Doc Gen CI Pipeline

Copy link

Azure Pipelines successfully started running 4 pipeline(s).

@yihonglyu
Copy link
Contributor

/azp run Win TRT Minimal CUDA Test CI Pipeline

Copy link

No pipelines are associated with this pull request.

@yihonglyu
Copy link
Contributor

/azp run Windows TRT Minimal CUDA Test CI Pipeline

Copy link

No pipelines are associated with this pull request.

@yihonglyu
Copy link
Contributor

/azp run Windows TensorRT Minimal CUDA Test CI Pipeline

Copy link

No pipelines are associated with this pull request.

@yihonglyu
Copy link
Contributor

/azp run Win_TRT_Minimal_CUDA_Test_CI

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@yihonglyu
Copy link
Contributor

Description

This pull request combines multiple improvements, bug fixes for the OpenVINO Execution Provider (OVEP). The changes are summarized as follows:

  1. Support for various contrib Ops in OVEP.
  2. Dimension Check Fixes for Greater, Pad, and MAX Ops: Fixed dimension check failures for the Greater, Pad, and MAX ops in OVEP, ensuring they now pass validation for all supported models.
  3. Refactor Core and Shared Context Lifetimes: Refactored the lifetimes of the OpenVINO core and shared context to remove dependency on shutdown calls. This change avoids relying on static lifetime management and improves stability and resource cleanup.
  4. Fix for Duplicate DQ Node Removal: Addressed an issue where duplicate Dequantize (DQ) nodes that were initializers were incorrectly removed. Initializers should always be preserved, and this fix ensures that all duplicate DQ nodes that are initializers are retained.

Could you list the contrib ops supported by this PR?

@saurabhkale17
Copy link
Contributor Author

These are contrib ops

  1. SkipLayerNormalization
  2. MatMulNBits
  3. FusedGemm
  4. FusedConv
  5. EmbedLayerNormalization
  6. BiasGelu
  7. Attention

@yihonglyu

@jywu-msft
Copy link
Member

/azp run Linux GPU CI Pipeline

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@yihonglyu
Copy link
Contributor

These are contrib ops

  1. SkipLayerNormalization
  2. MatMulNBits
  3. FusedGemm
  4. FusedConv
  5. EmbedLayerNormalization
  6. BiasGelu
  7. Attention

@yihonglyu

@saurabhkale17 Please list them in the git commit message.

Copy link
Contributor

@yihonglyu yihonglyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you enable tests for the contrib ops added in this PR?

Copy link
Contributor

@yihonglyu yihonglyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests for:

  1. Dimension check fixes for Greater, Pad, and MAX ops.
  2. Fix for duplicate DQ node removal.

in the following PR.

@jywu-msft jywu-msft merged commit 754ee21 into microsoft:main Feb 21, 2025
76 checks passed
jatinwadhwa921 added a commit to intel/onnxruntime that referenced this pull request Feb 21, 2025
### Description
This pull request combines multiple improvements, bug fixes for the
OpenVINO Execution Provider (OVEP). The changes are summarized as
follows:

1. Support for various contrib Ops in OVEP.

2. Dimension Check Fixes for Greater, Pad, and MAX Ops: Fixed dimension
check failures for the Greater, Pad, and MAX ops in OVEP, ensuring they
now pass validation for all supported models.

3. Refactor Core and Shared Context Lifetimes: Refactored the lifetimes
of the OpenVINO core and shared context to remove dependency on shutdown
calls. This change avoids relying on static lifetime management and
improves stability and resource cleanup.

4. Fix for Duplicate DQ Node Removal: Addressed an issue where duplicate
Dequantize (DQ) nodes that were initializers were incorrectly removed.
Initializers should always be preserved, and this fix ensures that all
duplicate DQ nodes that are initializers are retained.

---------

Co-authored-by: Ankit Maheshkar <ankit.maheshkar@intel.com>
Co-authored-by: n1harika <niharika.sathish@intel.com>
Co-authored-by: rayngun <103146671+rayngun@users.noreply.github.com>
Co-authored-by: jatinwadhwa921 <110383850+jatinwadhwa921@users.noreply.github.com>
Co-authored-by: Eric Crawford <eric.r.crawford@intel.com>
Co-authored-by: Surendar Rama Sitaraman <surendar.rama.sitaraman@intel.com>
guschmue pushed a commit that referenced this pull request Mar 6, 2025
### Description
This pull request combines multiple improvements, bug fixes for the
OpenVINO Execution Provider (OVEP). The changes are summarized as
follows:

1. Support for various contrib Ops in OVEP.

2. Dimension Check Fixes for Greater, Pad, and MAX Ops: Fixed dimension
check failures for the Greater, Pad, and MAX ops in OVEP, ensuring they
now pass validation for all supported models.

3. Refactor Core and Shared Context Lifetimes: Refactored the lifetimes
of the OpenVINO core and shared context to remove dependency on shutdown
calls. This change avoids relying on static lifetime management and
improves stability and resource cleanup.

4. Fix for Duplicate DQ Node Removal: Addressed an issue where duplicate
Dequantize (DQ) nodes that were initializers were incorrectly removed.
Initializers should always be preserved, and this fix ensures that all
duplicate DQ nodes that are initializers are retained.

---------

Co-authored-by: Ankit Maheshkar <ankit.maheshkar@intel.com>
Co-authored-by: n1harika <niharika.sathish@intel.com>
Co-authored-by: rayngun <103146671+rayngun@users.noreply.github.com>
Co-authored-by: jatinwadhwa921 <110383850+jatinwadhwa921@users.noreply.github.com>
Co-authored-by: Eric Crawford <eric.r.crawford@intel.com>
Co-authored-by: Surendar Rama Sitaraman <surendar.rama.sitaraman@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants