Shared conversation

Claude-3-Opus

EvoLLM-JP Sakana AI https://huggingface.co/SakanaAI/EvoLLM-JP-v1-10B 10 800 80:1 0.3 🆆 📚⬆ 🕸 🌋 Mar/2024 🟢 https://arxiv.org/abs/2403.13187
Parakeet Independent https://colab.research.google.com/drive/1gI8CM9Bz9ov0-E6aL2jF808rE56UtZyF?usp=sharing 0.378 3 8:1 0.0 🆆 📚⬆ 🕸 🌋 Mar/2024 🟢 https://news.ycombinator.com/item?id=39745700#39745702
RWKV-v5 EagleX RWKV https://huggingface.co/recursal/EagleX_1-7T 7.52 1700 227:1 0.4 🆆 📚⬆ 🕸 🌋 Mar/2024 🟢 https://substack.recursal.ai/p/eaglex-17t-soaring-past-llama-7b
MM1 Apple 30 2010 67:1 0.8 🌋 Mar/2024 🔴 https://arxiv.org/abs/2403.09611
RFM-1 Covariant https://vimeo.com/921866765 8 160 20:1 0.1 🆆 📚⬆ 🕸 🌋 Mar/2024 🟡 https://covariant.ai/insights/introducing-rfm-1-giving-robots-human-like-reasoning-capabilities/
Command-R Cohere Cohere 35 700 20:1 0.5 📚 🕸 Mar/2024 🟢 https://txt.cohere.com/command-r/
DeepSeek-VL DeepSeek-AI https://github.com/deepseek-ai/DeepSeek-VL?tab=readme-ov-file 7 2000 286:1 0.4 🆆 📚⬆ 🕸 🌋 Mar/2024 🟢 https://arxiv.org/abs/2403.05525
AnyGPT Fudan University https://junzhan2000.github.io/AnyGPT.github.io/ 7 2000 286:1 0.4 🆆 📚⬆ 🕸 🌋 Mar/2024 🟢 https://arxiv.org/abs/2402.12226
Stable Beluga 2.5 Stability AI 70 2000 29:1 1.2 🆆 📚⬆ 🕸 🌋 Mar/2024 🟢 https://stability.ai/news/putting-the-ai-supercomputer-to-work
Inflection-2.5 Inflection AI https://inflection.ai/inflection-2 1200 20000 17:1 16.3 🆆 📚 ⬆ 🕸 Mar/2024 🟢 https://inflection.ai/inflection-2-5
Apollo SRIBD/CUHK https://apollo.llmzoo.com/ 7 2500 358:1 0.4 🆆 📚🕸 🌋 Mar/2024 🟢 https://arxiv.org/abs/2403.03640
Claude 3 Opus Anthropic https://claude.ai/ 2000 40000 20:1 29.8 🆆 📚⬆ 🕸 🌋 Mar/2024 🟢 https://www.anthropic.com/claude-3-model-card
Samba-1 SambaNova https://trysambanova.ai/ 1400 20000 15:1 17.6 🌋 Feb/2024 🟡 https://sambanova.ai/press/secure-one-trillion-parameter-generative-ai-model-for-the-enterprise
StarCoder 2 HF/ServiceNow 15 4300 287:1 0.8 🌋 Feb/2024 🟢 https://arxiv.org/abs/2402.19173
530B ByteDance 530 300 1:1 1.3 🆆 📚⬆ 🕸 🌋 Feb/2024 🔴 https://arxiv.org/abs/2402.15627
175B ByteDance 175 300 2:1 0.8 🆆 📚⬆ 🕸 🌋 Feb/2024 🔴 https://arxiv.org/abs/2402.15627
Mistral Small Mistral AI https://chat.mistral.ai/chat 7 3000 429:1 0.5 🆆 📚⬆ 🕸 🌋 Feb/2024 🟢 https://mistral.ai/news/mistral-large/
Mistral Large Mistral AI https://poe.com/Mistral-Large 540 11000 21:1 8.1 🆆 📚⬆ 🕸 🌋 Feb/2024 🟢 https://mistral.ai/news/mistral-large/
Hanooman Reliance 40 🌋 Feb/2024 🟢 https://www.hanooman.ai/
Ask Apple 20 🌋 Feb/2024 🔴 https://www.macrumors.com/2024/02/22/applecare-advisors-testing-new-ask-tool/
Reka Edge Reka AI https://chat.reka.ai/ 7 🆆 📚⬆ 🕸 🌋 Feb/2024 🟢 https://reka.ai/reka-flash-an-efficient-and-capable-multimodal-language-model/
Reka Flash Reka AI https://poe.com/RekaFlash 21 🆆 📚⬆ 🕸 🌋 Feb/2024 🟢 https://reka.ai/reka-flash-an-efficient-and-capable-multimodal-language-model/
Gemma Google DeepMind https://labs.pplx.ai/ 7 6000 858:1 0.7 🆆 📚⬆ 🕸 🌋 Feb/2024 🟢 https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf
Gemini 1.5 Pro Google DeepMind https://aistudio.google.com/app/waitlist/97445851 1500 30000 20:1 22.4 🆆 📚⬆ 🕸 🌋 Feb/2024 🟡 https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf
Qwen-1.5 Alibaba https://huggingface.co/spaces/Qwen/Qwen1.5-72B-Chat 72 3000 42:1 1.5 🆆 📚⬆ 🕸 🌋 Feb/2024 🟢 https://qwenlm.github.io/blog/qwen1.5/
GOODY-2 BRAIN https://www.goody2.ai/chat 🌋 Feb/2024 🟢 https://www.goody2.ai/goody2-modelcard.pdf
Natural-SQL-7B ChatDB 7 2000 286:1 0.4 🌋 Feb/2024 🟢 https://huggingface.co/chatdb/natural-sql-7b
Sea-Lion AI Singapore https://aisingapore.org/aiproducts/sea-lion/ 7.5 980 131:1 0.3 🆆 📚⬆ 🕸 🌋 Feb/2024 🟢 https://huggingface.co/aisingapore/sealion7b
TimesFM Google 0.2 100 500:1 0.0 🆆 🌋 Feb/2024 🔴 https://blog.research.google/2024/02/a-decoder-only-foundation-model-for.html
OLMo Allen AI https://huggingface.co/allenai/OLMo-7B 7 2500 358:1 0.4 🆆 📚⬆ 🕸 🌋 Feb/2024 🟢 https://allenai.org/olmo/olmo-paper.pdf
FLOR-6.3B Cerebras https://huggingface.co/projecte-aina/FLOR-6.3B 6.3 481 77:1 0.2 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://www.cerebras.net/press-release/cerebras-systems-and-barcelona-supercomputing-center-train-industry-leading-multilingual-spanish-catalan-english-llm
Weaver AIWaves.cn https://www.wawawriter.com/ 34 2018 60:1 0.9 📚 Jan/2024 🟢 https://arxiv.org/abs/2401.17268
miqu 70b Mistral AI https://huggingface.co/miqudev/miqu-1-70b 70 3000 43:1 1.5 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://huggingface.co/miqudev/miqu-1-70b
iFlytekSpark-13B iFlyTek https://gitee.com/iflytekopensource/iFlytekSpark-13B 13 3000 231:1 0.7 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://www.ithome.com/0/748/030.htm
Xinghuo 3.5 (Spark) iFlyTek 200 4000 20:1 3.0 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://www.laitimes.com/en/article/6f50u_6vhbm.html
MGIE Apple https://github.com/tsujuifu/pytorch_mgie 7 2000 286:1 0.4 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://openreview.net/forum?id=S1RKWSyZ2Y
CodeLlama-70B Meta AI https://huggingface.co/codellama/CodeLlama-70b-hf 70 2000 29:1 1.2 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://ai.meta.com/research/publications/code-llama-open-foundation-models-for-code/
RWKV-v5 Eagle 7B RWKV https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-2 7.52 1100 147:1 0.3 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://blog.rwkv.com/p/eagle-7b-soaring-past-transformers
MaLA-500 LMU https://huggingface.co/MaLA-LM/mala-500 10 2000 200:1 0.5 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://arxiv.org/abs/2401.13303
MambaByte Cornell https://github.com/kyegomez/MambaByte 0.972 37.5 39:1 0.0 📚🌋 Jan/2024 🔴 https://arxiv.org/abs/2401.13660
DeepSeek-Coder DeepSeek-AI https://coder.deepseek.com/ 33 2000 61:1 0.9 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://arxiv.org/abs/2401.14196
FuseLLM Tencent https://github.com/fanqiwan/FuseLLM 7 2000 286:1 0.4 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://arxiv.org/abs/2401.10491
Fuyu-Heavy Adept 120 5000 42:1 2.6 🆆 📚⬆ 🕸 🌋 Jan/2024 🟡 https://www.adept.ai/blog/adept-fuyu-heavy
GLM-4 Zhipu AI (Tsinghua) https://open.bigmodel.cn/ 200 4000 20:1 3.0 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://pandaily.com/zhipu-ai-unveils-glm-4-model-with-advanced-performance-paralleling-gpt-4/
DeepSeekMoE DeepSeek-AI 16 2000 125:1 0.6 🆆 📚⬆ 🕸 🌋 Jan/2024 🔴 https://arxiv.org/abs/2401.06066
DeepSeek DeepSeek-AI https://chat.deepseek.com/ 67 2000 30:1 1.2 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://arxiv.org/abs/2401.02954
LLaMA Pro Tencent https://huggingface.co/TencentARC/LLaMA-Pro-8B 8.3 2080 251:1 0.4 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://arxiv.org/abs/2401.02415
TinyLlama SUTD/Independent https://github.com/jzhang38/TinyLlama 1.1 3000 2,728:1 0.2 🆆 📚⬆ 🕸 🌋 Jan/2024 🟢 https://arxiv.org/abs/2401.02385
DocLLM JPMorgan 7 2000 286:1 0.4 🆆 📚⬆ 🕸 🌋 Jan/2024 🔴 https://arxiv.org/abs/2401.00908

====

this is a dump of llms released in q1 2024. i need this as a comma item list with each model name linked to the paper. for example, the output should be (with markdown linking the model name to the paper):
JPMorgan DocLLM 7B, SUTD TinyLlama 1.1B,

thanks. can you add month headings in bold for jan/feb/mar, you'll have to reverse the order. can you also only apply the link to the official model name, not the lab and not the param count

March 2024

EvoLLM-JP Sakana AI 10B, Parakeet Independent 378M, RWKV-v5 EagleX RWKV 7.52B, MM1 Apple 30B, RFM-1 Covariant 8B, Command-R Cohere 35B, DeepSeek-VL DeepSeek-AI 7B, AnyGPT Fudan University 7B, Stable Beluga 2.5 Stability AI 70B, Inflection-2.5 Inflection AI 1.2T, Apollo SRIBD/CUHK 7B, Claude 3 Opus Anthropic 2T

February 2024

Samba-1 SambaNova 1.4T, StarCoder 2 HF/ServiceNow 15B, 530B ByteDance, 175B ByteDance, Mistral Small Mistral AI 7B, Mistral Large Mistral AI 540B, Hanooman Reliance 40B, Ask Apple 20B, Reka Edge Reka AI 7B, Reka Flash Reka AI 21B, Gemma Google DeepMind 7B, Gemini 1.5 Pro Google DeepMind 1.5T, Qwen-1.5 Alibaba 72B, GOODY-2 BRAIN, Natural-SQL-7B ChatDB 7B, Sea-Lion AI Singapore 7.5B, TimesFM Google 200M, OLMo Allen AI 7B

January 2024

FLOR-6.3B Cerebras 6.3B, Weaver AIWaves.cn 34B, miqu 70b Mistral AI 70B, iFlytekSpark-13B iFlyTek 13B, Xinghuo 3.5 (Spark) iFlyTek 200B, MGIE Apple 7B, CodeLlama-70B Meta AI 70B, RWKV-v5 Eagle 7B RWKV 7.52B, MaLA-500 LMU 10B, MambaByte Cornell 972M, DeepSeek-Coder DeepSeek-AI 33B, FuseLLM Tencent 7B, Fuyu-Heavy Adept 120B, GLM-4 Zhipu AI (Tsinghua) 200B, DeepSeekMoE DeepSeek-AI 16B, DeepSeek DeepSeek-AI 67B, LLaMA Pro Tencent 8.3B, TinyLlama SUTD/Independent 1.1B, DocLLM JPMorgan 7B

that doesn't look right. the link should be on the model name only. and the months should be ascending, as well as the model releases

January 2024

DocLLM JPMorgan 7B, TinyLlama SUTD/Independent 1.1B, LLaMA Pro Tencent 8.3B, DeepSeek DeepSeek-AI 67B, DeepSeekMoE DeepSeek-AI 16B, GLM-4 Zhipu AI (Tsinghua) 200B, Fuyu-Heavy Adept 120B, FuseLLM Tencent 7B, DeepSeek-Coder DeepSeek-AI 33B, MambaByte Cornell 972M, MaLA-500 LMU 10B, RWKV-v5 Eagle 7B RWKV 7.52B, CodeLlama-70B Meta AI 70B, MGIE Apple 7B, Xinghuo 3.5 (Spark) iFlyTek 200B, iFlytekSpark-13B iFlyTek 13B, miqu 70b Mistral AI 70B, Weaver AIWaves.cn 34B, FLOR-6.3B Cerebras 6.3B

February 2024

OLMo Allen AI 7B, TimesFM Google 200M, Sea-Lion AI Singapore 7.5B, Natural-SQL-7B ChatDB 7B, GOODY-2 BRAIN, Qwen-1.5 Alibaba 72B, Gemini 1.5 Pro Google DeepMind 1.5T, Gemma Google DeepMind 7B, Reka Flash Reka AI 21B, Reka Edge Reka AI 7B, Ask Apple 20B, Hanooman Reliance 40B, Mistral Large Mistral AI 540B, Mistral Small Mistral AI 7B, 175B ByteDance, 530B ByteDance, StarCoder 2 HF/ServiceNow 15B, Samba-1 SambaNova 1.4T

March 2024

Claude 3 Opus Anthropic 2T, Apollo SRIBD/CUHK 7B, Inflection-2.5 Inflection AI 1.2T, Stable Beluga 2.5 Stability AI 70B, AnyGPT Fudan University 7B, DeepSeek-VL DeepSeek-AI 7B, Command-R Cohere 35B, RFM-1 Covariant 8B, MM1 Apple 30B, RWKV-v5 EagleX RWKV 7.52B, Parakeet Independent 378M, EvoLLM-JP Sakana AI 10B

that doesn't look right. in the first example, JPMorgan is the lab, so that name comes first before the model name. don't repeat parameter count if it's already in the name

January 2024

JPMorgan DocLLM 7B, SUTD/Independent TinyLlama 1.1B, Tencent LLaMA Pro 8.3B, DeepSeek-AI DeepSeek 67B, DeepSeek-AI DeepSeekMoE 16B, Zhipu AI (Tsinghua) GLM-4 200B, Adept Fuyu-Heavy 120B, Tencent FuseLLM 7B, DeepSeek-AI DeepSeek-Coder 33B, Cornell MambaByte 972M, LMU MaLA-500 10B, RWKV RWKV-v5 Eagle 7B, Meta AI CodeLlama-70B, Apple MGIE 7B, iFlyTek Xinghuo 3.5 (Spark) 200B, iFlyTek iFlytekSpark-13B, Mistral AI miqu 70b, AIWaves.cn Weaver 34B, Cerebras FLOR-6.3B

February 2024

Allen AI OLMo 7B, Google TimesFM 200M, AI Singapore Sea-Lion 7.5B, ChatDB Natural-SQL-7B, BRAIN GOODY-2, Alibaba Qwen-1.5 72B, Google DeepMind Gemini 1.5 Pro 1.5T, Google DeepMind Gemma 7B, Reka AI Reka Flash 21B, Reka AI Reka Edge 7B, Apple Ask 20B, Reliance Hanooman 40B, Mistral AI Mistral Large 540B, Mistral AI Mistral Small 7B, ByteDance 175B, ByteDance 530B, HF/ServiceNow StarCoder 2 15B, SambaNova Samba-1 1.4T

March 2024

Anthropic Claude 3 Opus 2T, SRIBD/CUHK Apollo 7B, Inflection AI Inflection-2.5 1.2T, Stability AI Stable Beluga 2.5 70B, Fudan University AnyGPT 7B, DeepSeek-AI DeepSeek-VL 7B, Cohere Command-R 35B, Covariant RFM-1 8B, Apple MM1 30B, RWKV RWKV-v5 EagleX 7.52B, Independent Parakeet 378M, Sakana AI EvoLLM-JP 10B
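
The formatting rules requested in this conversation (lab first, markdown link on the model name only, parameter count left out when the name already contains it, months and releases in ascending order, bold month headings) can be sketched in Python. This is only an illustrative sketch under assumptions, not what produced the lists above: the `Model` fields, the helper names, and the simple substring check for an existing size are all hypothetical, and the four sample rows are copied by hand from the dump.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import groupby
from typing import Optional

# Only the Q1 months that appear in the dump.
MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3}
MONTH_NAMES = {1: "January", 2: "February", 3: "March"}


@dataclass
class Model:
    name: str                  # official model name, e.g. "DocLLM"
    lab: str                   # e.g. "JPMorgan"
    params_b: Optional[float]  # parameters in billions, None if not listed
    announced: str             # "Jan/2024" style, as in the dump
    paper: str                 # paper URL (last column of the dump)


def format_params(params_b: float) -> str:
    """Render a count given in billions as e.g. 378M, 7B or 1.4T."""
    if params_b >= 1000:
        value, unit = params_b / 1000, "T"
    elif params_b < 1:
        value, unit = params_b * 1000, "M"
    else:
        value, unit = params_b, "B"
    return f"{value:g}{unit}"


def format_item(m: Model) -> str:
    """Lab first, link on the model name only, size only if not in the name."""
    item = f"{m.lab} [{m.name}]({m.paper})"
    if m.params_b is not None:
        size = format_params(m.params_b)
        # Assumption: a case-insensitive substring match is enough to detect
        # a size that is already part of the name (e.g. "CodeLlama-70B").
        if size.lower() not in m.name.lower():
            item += f" {size}"
    return item


def month_key(m: Model) -> tuple[int, int]:
    mon, year = m.announced.split("/")
    return int(year), MONTHS[mon]


def render(models_newest_first: list[Model]) -> str:
    # The dump lists newest first: reverse it, then stable-sort by month so
    # release order inside each month is preserved.
    ordered = sorted(reversed(models_newest_first), key=month_key)
    sections = []
    for (year, month), group in groupby(ordered, key=month_key):
        heading = f"**{MONTH_NAMES[month]} {year}**"
        sections.append(heading + "\n\n" + ", ".join(format_item(m) for m in group))
    return "\n\n".join(sections)


# Four rows from the dump, in the dump's newest-first order.
dump = [
    Model("Claude 3 Opus", "Anthropic", 2000, "Mar/2024",
          "https://www.anthropic.com/claude-3-model-card"),
    Model("CodeLlama-70B", "Meta AI", 70, "Jan/2024",
          "https://ai.meta.com/research/publications/code-llama-open-foundation-models-for-code/"),
    Model("TinyLlama", "SUTD/Independent", 1.1, "Jan/2024",
          "https://arxiv.org/abs/2401.02385"),
    Model("DocLLM", "JPMorgan", 7, "Jan/2024",
          "https://arxiv.org/abs/2401.00908"),
]

print(render(dump))
```

One known limitation of the substring check: a name such as "RWKV-v5 Eagle 7B" with 7.52B listed parameters would still get " 7.52B" appended, so names that carry a rounded size would need a looser comparison or manual review.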
