# attention-mechanism

Here are 1,499 public repositories matching this topic...

RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings.

  • Updated Jun 11, 2024
  • Python
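To make the "parallelizable RNN with attention-like mixing" idea concrete, here is a minimal sketch of a WKV-style recurrence of the kind RWKV uses for time-mixing. It assumes a positive per-channel decay `w` and a current-token bonus `u`, and omits the receptance gating and numerical-stability tricks of the real implementation; the function name and shapes are illustrative.

```python
# Minimal, simplified WKV-style recurrence: an attention-like weighted average
# of past values computed with a running state instead of a T x T matrix.
import numpy as np

def wkv_recurrence(k, v, w, u):
    """k, v: (T, C) keys/values; w: (C,) positive per-channel decay;
    u: (C,) bonus weight for the current token. Returns (T, C)."""
    T, C = k.shape
    a = np.zeros(C)                 # running sum of exp(k_i) * v_i, decayed over time
    b = np.zeros(C)                 # running sum of exp(k_i), decayed over time
    out = np.empty_like(v, dtype=float)
    for t in range(T):
        e_now = np.exp(u + k[t])    # the current token gets an extra "bonus" weight
        out[t] = (a + e_now * v[t]) / (b + e_now)
        a = np.exp(-w) * a + np.exp(k[t]) * v[t]   # decay the state, add this token
        b = np.exp(-w) * b + np.exp(k[t])
    return out

# Usage: T=16 timesteps, C=8 channels of random keys/values.
T, C = 16, 8
rng = np.random.default_rng(0)
out = wkv_recurrence(rng.normal(size=(T, C)), rng.normal(size=(T, C)),
                     w=np.ones(C), u=np.zeros(C))
print(out.shape)  # (16, 8)
```

Because the state `(a, b)` summarizes the whole past, inference cost per token is constant in sequence length, which is the source of the "infinite" ctx_len claim.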

Researching causal relationships in time series data using Temporal Convolutional Networks (TCNs) combined with attention mechanisms. This approach aims to identify complex temporal interactions. Additionally, we're incorporating uncertainty quantification to enhance the reliability of our causal predictions.

  • Updated Jun 10, 2024
  • Jupyter Notebook
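The repository's exact architecture isn't shown in the listing, but a common way to combine TCNs with attention is to let dilated causal convolutions extract temporal features and then apply self-attention across time steps, whose weights can be inspected as a rough signal of temporal influence. The sketch below illustrates that pattern; the layer sizes and the use of PyTorch's `nn.MultiheadAttention` are assumptions, not the project's code.

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Conv1d):
    """Conv1d that left-pads so the output at time t sees only inputs <= t."""
    def forward(self, x):
        pad = (self.kernel_size[0] - 1) * self.dilation[0]
        return super().forward(nn.functional.pad(x, (pad, 0)))

class TCNAttention(nn.Module):
    def __init__(self, in_ch, hidden=64, levels=3, heads=4):
        super().__init__()
        convs = []
        for i in range(levels):     # exponentially growing dilation = wide receptive field
            convs += [CausalConv1d(in_ch if i == 0 else hidden, hidden,
                                   kernel_size=3, dilation=2 ** i),
                      nn.ReLU()]
        self.tcn = nn.Sequential(*convs)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)

    def forward(self, x):                      # x: (batch, channels, time)
        h = self.tcn(x).transpose(1, 2)        # -> (batch, time, hidden)
        out, weights = self.attn(h, h, h)      # weights: (batch, time, time)
        return out, weights

model = TCNAttention(in_ch=5)
y, w = model(torch.randn(2, 5, 100))
print(y.shape, w.shape)  # torch.Size([2, 100, 64]) torch.Size([2, 100, 100])
```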

QuillGPT is a PyTorch implementation of the GPT decoder block based on the architecture from the paper "Attention Is All You Need" by Vaswani et al. Additionally, this repository contains two pre-trained models (Shakespearean GPT and Harpoon GPT), a Streamlit playground, a containerized FastAPI microservice, and training and inference scripts and notebooks.

  • Updated Jun 7, 2024
  • Jupyter Notebook
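For reference, the decoder block described consists of masked multi-head self-attention plus a position-wise feed-forward network, each wrapped in a residual connection. Below is a generic pre-norm sketch in PyTorch with illustrative dimensions; it is not QuillGPT's actual code.

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    def __init__(self, d_model=256, n_heads=8, d_ff=1024, dropout=0.1):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                nn.Linear(d_ff, d_model), nn.Dropout(dropout))

    def forward(self, x):                      # x: (batch, seq, d_model)
        T = x.size(1)
        # Upper-triangular mask: position t may not attend to positions > t.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out                       # residual around attention
        return x + self.ff(self.ln2(x))        # residual around feed-forward

block = DecoderBlock()
print(block(torch.randn(4, 32, 256)).shape)  # torch.Size([4, 32, 256])
```

The boolean causal mask is what makes this a decoder block: position t can attend only to positions at or before t, so the model trains on full sequences in parallel yet generates autoregressively.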
