
Add NPU backend support for val and inference #2109

Merged 2 commits on Oct 19, 2024

Conversation

MengqingCao (Contributor)

I am an NPU user. When I used TIMM recently, I found that it does not support NPU natively. It's a pleasure to see that someone has already contributed NPU support to TIMM in #2102, but that PR currently only covers using the NPU during training. This PR extends NPU support to the validate and inference entry points, addressing this limitation.

Specify the device as "npu" and you can use the NPU as the accelerator during inference and validation.
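
For example (a minimal sketch, assuming the torch_npu extension is installed alongside PyTorch):

```python
import torch
import torch_npu  # registers the "npu" device type with PyTorch (Ascend extension)

# Any model / tensor moves to the NPU the same way it would to CUDA.
device = torch.device('npu')
model = torch.nn.Linear(8, 2).to(device)
x = torch.randn(4, 8, device=device)

with torch.no_grad():
    print(model(x).shape)  # torch.Size([4, 2]), computed on the NPU
```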

This was tested on:

  • model: tiny_vit_21m_512
  • dataset: the val subset of ImageNet-1K

Validation script

```bash
python validate.py ../open_clip/data/ImageNet-1000/val/ --device npu --model ./model_ckpts/tiny_vit_21m_512 --batch-size 64 --pretrained
```

Results

The validation results on the val subset of ImageNet-1K are as follows:

| top-1 acc | top-5 acc |
| --- | --- |
| 86.040% | 97.750% |

Inference script

```bash
python inference.py ./data/ --device npu --batch-size 64 --model ./model_ckpts/tiny_vit_21m_512 --label-type detail --topk 5
```


Results

Below are some top-5 classification predictions from running inference with tiny_vit_21m_512; everything works well on NPU.

@MengqingCao changed the title add npu support for val and inference → Add NPU backend support for val and inference (Mar 14, 2024)
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@MengqingCao (Contributor, Author)

cc @rwightman

@rwightman (Collaborator)

@MengqingCao see #2138 ... I need a better design to centralize device-specific accelerator module loading, etc., instead of spreading it out across many files; it's not a sustainable approach.

Also, another challenge is that I don't have easy access to many of the potential accelerators, so I definitely need help testing; I can't realistically run my normal CI or tests with them as I do across my GitHub and local CI right now...

@MengqingCao (Contributor, Author)

> @MengqingCao see #2138 ... I need a better design to centralize device-specific accelerator module loading, etc., instead of spreading it out across many files; it's not a sustainable approach.
>
> Also, another challenge is that I don't have easy access to many of the potential accelerators, so I definitely need help testing; I can't realistically run my normal CI or tests with them as I do across my GitHub and local CI right now...

Good day! @rwightman, thanks for your reply.

For your first concern, I agree that importing device-specific modules in many files is not a smart way to enable these devices. Inspired by the way device-related module loading is centralized in train.py#L415, I think we could autoload such modules when the whole library is initialized, since the approach in train.py#L415 still requires redundant handling in many files.

My initial idea was to load the device accelerator module via a dedicated environment variable (e.g. TIMM_DEVICE_EXT). This variable would be set in timm/__init__.py by reading configuration info from a specific file (e.g., a JSON file), and the module preloaded according to it, so that device-related imports are activated across the entire TIMM library instead of being imported separately everywhere. The device-specific hardcoding would still need to be reworked, though.
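
Something like this sketch (TIMM_DEVICE_EXT is only a proposed name, not an existing timm setting, and I've simplified it to read the environment variable directly rather than a config file):

```python
# Hypothetical autoloading hook, e.g. at the top of timm/__init__.py.
import importlib
import os

_ext = os.environ.get('TIMM_DEVICE_EXT')  # e.g. "torch_npu"
if _ext:
    try:
        # Importing the extension registers the accelerator with PyTorch,
        # so no other file in timm needs a device-specific import.
        importlib.import_module(_ext)
    except ImportError as e:
        raise ImportError(
            f'TIMM_DEVICE_EXT={_ext} is set but the module could not be imported'
        ) from e
```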

For your second concern, making a machine with an Ascend NPU available to the community is on my to-do list, so that the correctness of the code can be verified and maintained.

Let me know if you have any ideas or questions!

@MengqingCao (Contributor, Author)

Hi @rwightman, I have just committed the code implementing the above solution. Please review it, thanks!

@MengqingCao (Contributor, Author)

Hi @rwightman, sorry to bother you. Could you help review the latest code in this PR? Thanks in advance!

@rwightman (Collaborator)

@MengqingCao I don't really have any way to test this, so I don't want support for other hardware like this touching as many files. The same goes for Intel and other hardware that requires extra imports, etc. PyTorch 2.4 should have a mechanism for auto-importing device dependencies, so I'll probably wait for that...

@MengqingCao (Contributor, Author)

> @MengqingCao I don't really have any way to test this, so I don't want support for other hardware like this touching as many files. The same goes for Intel and other hardware that requires extra imports, etc. PyTorch 2.4 should have a mechanism for auto-importing device dependencies, so I'll probably wait for that...

Thanks a lot for your reply! I'm applying for an NPU machine for CI so that you can use an NPU for testing. The latest code also avoids touching too many files. However, as far as I know, auto-importing may be postponed to PyTorch 2.5, so if you don't mind waiting a little, maybe we could wait for PyTorch to support auto-importing device dependencies.

@MengqingCao (Contributor, Author) commented Oct 16, 2024

@rwightman Good day! I'm happy to tell you that PyTorch now supports autoloading device-related dependencies through pytorch/pytorch#127074. This feature will be included in torch 2.5.0. The latest commit was tested on a torch 2.5 dev build, and everything works well on Ascend NPU.

Please review the code; if these changes are acceptable, maybe we could merge as soon as PyTorch 2.5 is released?
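
For reference, as I understand the upstream mechanism, a device extension opts into autoloading by declaring a Python entry point in the torch.backends group, which `import torch` then discovers and imports automatically. A rough sketch (the package and function names are illustrative, not the actual torch_npu setup):

```python
# setup.py of a hypothetical out-of-tree backend package
from setuptools import setup

setup(
    name='torch_foo',
    entry_points={
        # `import torch` (>= 2.5) discovers this entry point and calls it,
        # so users no longer need an explicit `import torch_foo`.
        'torch.backends': [
            'torch_foo = torch_foo:_autoload',
        ],
    },
)
```

If I read the upstream docs correctly, setting TORCH_DEVICE_BACKEND_AUTOLOAD=0 disables the behavior, so there is an escape hatch if an extension misbehaves.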

@rwightman (Collaborator)

@MengqingCao thanks, this is looking better. Two issues flagged above; will merge once 2.5 is out.

@MengqingCao (Contributor, Author)

> @MengqingCao thanks, this is looking better. Two issues flagged above; will merge once 2.5 is out.

Thanks!

@rwightman (Collaborator)

@MengqingCao I tried these additions on pytorch 2.5 and 2.4 to ensure nothing broke in normal use. Seems fine.

I noticed there were some other possible errors where different devices might not be supported, so I did a bit of cleanup in #2308 ... I think that would be needed for the grad scaler & AMP to work fully with NPU?

I don't have an NPU, could you confirm that your changes here + my new ones on that branch work well?
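
For reference, the device-agnostic pattern I have in mind is the torch.amp API, which takes a device type string rather than the CUDA-only torch.cuda.amp variants. A minimal sketch (using 'npu' assumes torch_npu is installed; on NVIDIA hardware it would be 'cuda'):

```python
import torch

device_type = 'npu'  # 'cuda' on NVIDIA hardware
device = torch.device(device_type)

model = torch.nn.Linear(8, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.amp.GradScaler(device_type)  # device-agnostic grad scaler

x = torch.randn(4, 8, device=device)
y = torch.randint(0, 2, (4,), device=device)

optimizer.zero_grad()
# autocast runs the forward pass in float16 on whichever device was selected
with torch.amp.autocast(device_type=device_type, dtype=torch.float16):
    loss = torch.nn.functional.cross_entropy(model(x), y)
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```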

@rwightman (Collaborator)

@MengqingCao I merged the contents of this branch into the device_amp_cleanup branch mentioned in the comment above. It'd be great if you could try the combination before I merge.

@MengqingCao (Contributor, Author)

@rwightman The cleanup you did is necessary for NPU and makes the code cleaner. I have tested #2308 on my NPU device and everything works fine. Thanks!

@rwightman merged commit 81b59fa into huggingface:main on Oct 19, 2024
22 checks passed
@rwightman (Collaborator)

@MengqingCao all merged, I'll tweet about the torch autoload support and this addition for NPU in a day or so, and will look at OpenCLIP merge and test tomorrow or Monday. Feel free to let other Ascend users know this should work now.

@MengqingCao (Contributor, Author)

Thanks! I'm excited to announce this good news to more TIMM & Ascend users :-)
