speculative-decoding

Here are 13 public repositories matching this topic...

intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

retrieval chatbot rag habana large-language-model chatpdf llm-inference 4-bits speculative-decoding llm-cpu streamingllm intel-optimized-llamacpp neural-chat neural-chat-7b autoround gaudi3

Updated Jun 12, 2024
Python

SafeAILab / EAGLE

[ICML'24] EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty

large-language-models llm-inference speculative-decoding

Updated May 26, 2024
Python

Infini-AI-Lab / Sequoia

Scalable and robust tree-based speculative decoding algorithm

efficiency inference llm speculative-decoding

Updated Jun 7, 2024
Python

Infini-AI-Lab / TriForce

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

acceleration efficiency inference llm long-context llm-inference speculative-decoding

Updated Apr 20, 2024
Python

kssteven418 / BigLittleDecoder

[NeurIPS'23] Speculative Decoding with Big Little Decoder

decoding efficient-inference speculative-execution fast-inference llm speculative-decoding

Updated Feb 6, 2024
Python

hemingkx / SpecDec

Code for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings)

non-autoregressive speculative-decoding

Updated Dec 9, 2023
Python

mscheong01 / speculative_decoding.c

Minimal C implementation of speculative decoding based on llama2.c

c artificial-intelligence llm llama2 speculative-decoding

Updated Apr 22, 2024
C

romsto / Speculative-Decoding

Implementation of the paper "Fast Inference from Transformers via Speculative Decoding" (Leviathan et al., 2023).

fast-inference llm llm-inference speculative-decoding llm-optimization

Updated Jun 7, 2024
Python

pinqian77 / Dynasurge

Dynasurge: Dynamic Tree Speculation for Prompt-Specific Decoding

large-language-models speculative-decoding

Updated Apr 29, 2024
Python

AutonomicPerfectionist / PipeInfer

PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation

inference llm llamacpp speculative-decoding

Updated Apr 15, 2024
C++

u-hyszk / japanese-speculative-decoding

Verification of the effect of speculative decoding in Japanese.

nlp japanese fast-inference speculative-decoding

Updated Mar 4, 2024
Python

kinshukdua / SpecDec

Some experiments aimed at increasing LLM throughput and efficiency via Speculative Decoding.

inference llm speculative-decoding

Updated Jul 31, 2023
Python

PopoDev / CSE481N_Project

Reproducibility Project for [NeurIPS'23] Speculative Decoding with Big Little Decoder

fast-inference llm speculative-decoding

Updated May 30, 2024
Python
