speech

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

speech image-editing caption data-generation 3d-whole-body-pose-estimation open-vocabulary-detection open-vocabulary-segmentation automatic-labeling-system

Updated May 23, 2024
Jupyter Notebook

kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

shell c-plus-plus cuda speech speech-recognition speech-to-text kaldi speaker-verification speaker-id

Updated Jun 3, 2024
Shell

AIGC-Audio / AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

audio music speech sound gpt talking-head

Updated Apr 2, 2024
Python

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

speech speech-recognition speech-to-text whisper asr

Updated Jun 2, 2024
Python

mozilla / TTS

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

python text-to-speech deep-learning speech pytorch tts vocoder tacotron tensorflow2 tacotron2 melgan speaker-encoder dataset-analysis glow-tts multiband-melgan gantts

Updated Nov 9, 2023
Jupyter Notebook

PaddlePaddle / models

Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.

nlp natural-language-processing computer-vision deep-learning neural-network models cv speech recommendation paddlepaddle

Updated Sep 5, 2023
Python

TalAter / annyang

💬 Speech recognition for your site

voice speech speech-recognition speech-to-text

Updated Jun 12, 2024
JavaScript

netease-youdao / EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

python text-to-speech ai deep-learning style prompt speech emotion pytorch tts speech-synthesis multi-speaker emotivoice

Updated Feb 6, 2024
Python

modelscope / modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

python nlp science machine-learning deep-learning cv speech multi-modal

Updated Jun 11, 2024
Python

snakers4 / silero-models

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Updated Oct 18, 2023
Jupyter Notebook

shu223 / iOS-10-Sampler

Code examples for new APIs of iOS 10.

ios demo metal speech cnn swift-3 image-recognition convolutional-neural-networks ios10 uiviewpropertyanimator swift-4 metal-performance-shaders metal-cnn

Updated May 1, 2024
Swift

metavoiceio / metavoice-src

Foundational model for human-like, expressive TTS

text-to-speech ai deep-learning speech pytorch tts speech-synthesis voice-clone zero-shot-tts

Updated Jun 8, 2024
Python

tensorflow / lingvo

Lingvo

nlp research translation tensorflow machine-translation speech distributed tts speech-synthesis mnist speech-recognition lm seq2seq speech-to-text gpu-computing language-model asr

Updated Jun 6, 2024
Python

hahahumble / speechgpt

💬 SpeechGPT is a web application that enables you to converse with ChatGPT.

chat chatbot language-learning speech conversation chatgpt

Updated Oct 16, 2023
TypeScript

readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

Updated Jun 11, 2024
Python

pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch

audio python machine-learning speech pytorch io audio-processing

Updated Jun 13, 2024
Python

speech

Here are 1,630 public repositories matching this topic...

babysor / MockingBird

coqui-ai / TTS

svc-develop-team / so-vits-svc

huggingface / datasets

IDEA-Research / Grounded-Segment-Anything
