model-acceleration

Here are 16 public repositories matching this topic...

ksm26 / Efficiently-Serving-LLMs

Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase’s LoRAX framework inference server.

text-generation batch-processing server-optimization model-serving model-acceleration inference-optimization optimization-techniques machine-learning-operations deep-learning-techniques model-inference-service performance-enhancement scalability-strategies serving-infrastructure large-scale-deployment

Updated Apr 12, 2024
Jupyter Notebook

dhingratul / Model-Compression

Star

Reduce the model complexity by 612 times, and memory footprint by 19.5 times compared to base model, while achieving worst case accuracy threshold.

lightweight deep-learning tensorflow keras cnn mnist inception ensemble-model model-complexity model-compression memory-footprint mobile-platform model-acceleration 1x1 conv1x1

Updated Jan 26, 2018
Jupyter Notebook

likholat / openvino_quantization

Star

This sample shows how to convert TensorFlow model to OpenVINO IR model and how to quantize OpenVINO model.

tensorflow quantization model-acceleration openvino model-optimizer model-conversion post-training-quantization

Updated Oct 6, 2022
Python

bnabis93 / vision-language-examples

Star

Vision-lanugage model example code.

tutorial example pytorch transformer embedding-models model-acceleration vision-language model-optimization vision-language-model

Updated Sep 6, 2023
Python

TaehyeonKim-pyomu / CNN_compression_rank_selection_BayesOpt

Star

Bayesian Optimization-Based Global Optimal Rank Selection for Compression of Convolutional Neural Networks, IEEE Access

cnn pytorch bayesopt convolutional-neural-networks bayesian-optimization tensorly tucker model-compression cnn-compression network-acceleration rank-selection model-acceleration neural-network-compression gpyopt low-rank network-compression-acceleration

Updated Mar 21, 2021
Python

MingSun-Tse / Caffe_IncReg

Star

[IJCNN'19, IEEE JSTSP'19] Caffe code for our paper "Structured Pruning for Efficient ConvNets via Incremental Regularization"; [BMVC'18] "Structured Probabilistic Pruning for Convolutional Neural Network Acceleration"

pruning model-compression model-acceleration

Updated Feb 14, 2020
Makefile

cantbebetter2 / Awesome-Diffusion-Distillation

Star

A list of papers, docs, codes about diffusion distillation.This repo collects various distillation methods for the Diffusion model. Welcome to PR the works (papers, repositories) missed by the repo.

awesome deep-learning model-compression distillation model-acceleration diffusion-models lightweight-neural-network

Updated Dec 10, 2023

signalogic / SigDL

Star

Deep Learning Compression and Acceleration SDK -- deep model compression for Edge and IoT embedded systems, and deep model acceleration for clouds and private servers

cloud acceleration compression deep-learning model-compression flow-diagram nvidia-jetson-tx2 embedded-targets model-acceleration

Updated Mar 17, 2018

Lee-Gihun / MicroNet_OSI-AI

Star

(NeurIPS-2019 MicroNet Challenge - 3rd Winner) Open source code for "SIPA: A simple framework for efficient networks"

pruning model-compression micronet model-acceleration neurips-2019 micronet-challenge early-exiting adaptive-computation compact-neural-network

Updated Dec 18, 2022
Python

sdc17 / CrossGET

Star

[ICML 2024] CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers.

framework transformer image-captioning visual-reasoning multimodal-learning visual-question-answering model-acceleration efficient-deep-learning vision-language-transformer image-text-retrieval text-image-retrieval token-ensemble token-matching

Updated Oct 4, 2023

wangxb96 / Awesome-AI-on-the-Edge

Star

Resources of our survey paper "Enabling AI on Edges: Techniques, Applications and Challenges"

machine-learning deep-learning awesome-list data-preprocessing efficient-algorithm model-compression edge-computing model-deployment model-acceleration edge-ai tiny-ml model-design model-inference

Updated May 17, 2024

musco-ai / musco-pytorch

Star

MUSCO: MUlti-Stage COmpression of neural networks

deep-neural-networks pytorch tensor-decomposition cp-decomposition tucker model-compression network-acceleration model-acceleration truncated-svd network-compression low-rank vbmf

Updated Feb 16, 2021
Jupyter Notebook

chester256 / Model-Compression-Papers

Star

Papers for deep neural network compression and acceleration

deep-neural-networks deep-learning papers model-compression model-acceleration

Updated Jun 21, 2021

guan-yuan / awesome-AutoML-and-Lightweight-Models

Star

A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.

tensorflow pytorch hyperparameter-optimization awesome-list quantization nas automl model-compression neural-architecture-search meta-learning architecture-search quantized-training model-acceleration automated-feature-engineering quantized-neural-network

Updated Jun 19, 2021

htqin / awesome-model-quantization

Star

A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research, we are continuously improving the project. Welcome to PR the works (papers, repositories) that are missed by the repo.

awesome deep-learning quantization binarization model-compression model-acceleration binary-network binarized-neural-networks lightweight-neural-network model-quantization efficient-deep-learning

Updated Apr 29, 2024

he-y / Awesome-Pruning

Star

A curated list of neural network pruning resources.

awesome-list pruning model-compression model-acceleration

Updated Apr 4, 2024

Improve this page

Add a description, image, and links to the model-acceleration topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the model-acceleration topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

model-acceleration

Here are 16 public repositories matching this topic...

ksm26 / Efficiently-Serving-LLMs

dhingratul / Model-Compression

likholat / openvino_quantization

bnabis93 / vision-language-examples

TaehyeonKim-pyomu / CNN_compression_rank_selection_BayesOpt

MingSun-Tse / Caffe_IncReg

cantbebetter2 / Awesome-Diffusion-Distillation

signalogic / SigDL

Lee-Gihun / MicroNet_OSI-AI

sdc17 / CrossGET

wangxb96 / Awesome-AI-on-the-Edge

musco-ai / musco-pytorch

chester256 / Model-Compression-Papers

guan-yuan / awesome-AutoML-and-Lightweight-Models

htqin / awesome-model-quantization

he-y / Awesome-Pruning

Improve this page

Add this topic to your repo