EvoLLM-JP Sakana AI https://huggingface.co/SakanaAI/EvoLLM-JP-v1-10B 10 800 80:1 0.3 🆆 📚⬆ 🕸 🌋 Mar/2024 🟢 https://arxiv.org/abs/2403.13187 Parakeet Independent https://colab.research.google.com/drive/

Shared conversation

this is a dump of llms released in q1 2024. i need this as a comma-separated list with each model name linked to the paper. for example, the output should be (with markdown linking the model name to the paper):
JPMorgan DocLLM 7B, SUTD TinyLlama 1.1B,

recursal/EagleX_1-7T · Hugging Face (https://huggingface.co/recursal/EagleX_1-7T)

Command-R: RAG at Production Scale (https://txt.cohere.com/command-r/)

RFM-1 Release: Language on Vimeo (https://vimeo.com/921866765)

Google Colab (https://colab.research.google.com/drive/1gI8CM9Bz9ov0-E6aL2jF808rE56UtZyF?usp=sharing)

🦅 EagleX 1.7T : Soaring past LLaMA 7B 2T in both English and Multi-lang evals (RWKV-v5) (https://substack.recursal.ai/p/eaglex-17t-soaring-past-llama-7b)

Parakeet: A Tiny LLM | Hacker News (https://news.ycombinator.com/item?id=39745700#39745702)

SakanaAI/EvoLLM-JP-v1-10B · Hugging Face (https://huggingface.co/SakanaAI/EvoLLM-JP-v1-10B)

[2403.09611] MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training (https://arxiv.org/abs/2403.09611)

Covariant | Powering the Future of Automation, Today (https://covariant.ai/insights/introducing-rfm-1-giving-robots-human-like-reasoning-capabilities/)

[2403.13187] Evolutionary Optimization of Model Merging Recipes (https://arxiv.org/abs/2403.13187)

Claude-3-Opus

Poe

Here is the comma-separated list with each model name linked to the respective paper or source using Markdown:

EvoLLM-JP 10B, Parakeet 378M, RWKV-v5 EagleX 1.7T, Apple MM1, Covariant RFM-1 8B, Cohere Command-R 35B, DeepSeek-VL 7B, Fudan University AnyGPT 7B, Stability AI Stable Beluga 2.5 70B, Inflection AI Inflection-2.5 1.2T, SRIBD/CUHK Apollo 7B, Anthropic Claude 3 Opus, SambaNova Samba-1 1.4T, HF/ServiceNow StarCoder 2 15B, ByteDance 530B, ByteDance 175B, Mistral AI Mistral Small 7B, Mistral AI Mistral Large 540B, Reliance Hanooman 40B, Apple Ask 20B, Reka AI Reka Edge 7B, Reka AI Reka Flash 21B, Google DeepMind Gemma 7B, Google DeepMind Gemini 1.5 Pro 1.5T, Alibaba Qwen-1.5 72B, BRAIN GOODY-2, ChatDB Natural-SQL-7B, AI Singapore Sea-Lion 7.5B, Google TimesFM 200M, Allen AI OLMo 7B, Cerebras FLOR-6.3B, AIWaves.cn Weaver 34B, Mistral AI miqu 70b, iFlyTek iFlytekSpark-13B, iFlyTek Xinghuo 3.5 (Spark) 200B, Apple MGIE 7B, Meta AI CodeLlama-70B, RWKV RWKV-v5 Eagle 7B, LMU MaLA-500 10B, Cornell MambaByte 972M, DeepSeek-AI DeepSeek-Coder 33B, Tencent FuseLLM 7B, Adept Fuyu-Heavy 120B, Zhipu AI (Tsinghua) GLM-4 200B, DeepSeek-AI DeepSeekMoE 16B, DeepSeek-AI DeepSeek 67B, Tencent LLaMA Pro 8.3B, SUTD/Independent TinyLlama 1.1B, JPMorgan DocLLM 7B

thanks. can you add month headings in bold for jan/feb/mar, you'll have to reverse the order. can you also only apply the link to the official model name, not the lab and not the param count

Claude-3-Opus

March 2024

EvoLLM-JP Sakana AI 10B, Parakeet Independent 378M, RWKV-v5 EagleX RWKV 1.7T, MM1 Apple, RFM-1 Covariant 8B, Command-R Cohere 35B, DeepSeek-VL DeepSeek-AI 7B, AnyGPT Fudan University 7B, Stable Beluga 2.5 Stability AI 70B, Inflection-2.5 Inflection AI 1.2T, Apollo SRIBD/CUHK 7B, Claude 3 Opus Anthropic 2T

February 2024

Samba-1 SambaNova 1.4T, StarCoder 2 HF/ServiceNow 15B, 530B ByteDance, 175B ByteDance, Mistral Small Mistral AI 7B, Mistral Large Mistral AI 540B, Hanooman Reliance 40B, Ask Apple 20B, Reka Edge Reka AI 7B, Reka Flash Reka AI 21B, Gemma Google DeepMind 7B, Gemini 1.5 Pro Google DeepMind 1.5T, Qwen-1.5 Alibaba 72B, GOODY-2 BRAIN, Natural-SQL-7B ChatDB 7B, Sea-Lion AI Singapore 7.5B, TimesFM Google 200M, OLMo Allen AI 7B, FLOR-6.3B Cerebras 6.3B

January 2024

Weaver AIWaves.cn 34B, miqu 70b Mistral AI 70B, iFlytekSpark-13B iFlyTek 13B, Xinghuo 3.5 (Spark) iFlyTek 200B, MGIE Apple 7B, CodeLlama-70B Meta AI 70B, RWKV-v5 Eagle 7B RWKV 7.52B, MaLA-500 LMU 10B, MambaByte Cornell 972M, DeepSeek-Coder DeepSeek-AI 33B, FuseLLM Tencent 7B, Fuyu-Heavy Adept 120B, GLM-4 Zhipu AI (Tsinghua) 200B, DeepSeekMoE DeepSeek-AI 16B, DeepSeek DeepSeek-AI 67B, LLaMA Pro Tencent 8.3B, TinyLlama SUTD/Independent 1.1B, DocLLM JPMorgan 7B

that doesn't look right. the link should be on the model name only. and the months should be ascending, as well as the model releases

Claude-3-Opus

January 2024

DocLLM JPMorgan 7B, TinyLlama SUTD/Independent 1.1B, LLaMA Pro Tencent 8.3B, DeepSeek DeepSeek-AI 67B, DeepSeekMoE DeepSeek-AI 16B, GLM-4 Zhipu AI (Tsinghua) 200B, Fuyu-Heavy Adept 120B, FuseLLM Tencent 7B, DeepSeek-Coder DeepSeek-AI 33B, MambaByte Cornell 972M, MaLA-500 LMU 10B, RWKV-v5 Eagle 7B RWKV 7.52B, CodeLlama-70B Meta AI 70B, MGIE Apple 7B, Xinghuo 3.5 (Spark) iFlyTek 200B, iFlytekSpark-13B iFlyTek 13B, miqu 70b Mistral AI 70B, Weaver AIWaves.cn 34B

February 2024

FLOR-6.3B Cerebras 6.3B, OLMo Allen AI 7B, TimesFM Google 200M, Sea-Lion AI Singapore 7.5B, Natural-SQL-7B ChatDB 7B, GOODY-2 BRAIN, Qwen-1.5 Alibaba 72B, Gemini 1.5 Pro Google DeepMind 1.5T, Gemma Google DeepMind 7B, Reka Flash Reka AI 21B, Reka Edge Reka AI 7B, Ask Apple 20B, Hanooman Reliance 40B, Mistral Large Mistral AI 540B, Mistral Small Mistral AI 7B, 175B ByteDance, 530B ByteDance, StarCoder 2 HF/ServiceNow 15B, Samba-1 SambaNova 1.4T

March 2024

Claude 3 Opus Anthropic 2T, Apollo SRIBD/CUHK 7B, Inflection-2.5 Inflection AI 1.2T, Stable Beluga 2.5 Stability AI 70B, AnyGPT Fudan University 7B, DeepSeek-VL DeepSeek-AI 7B, Command-R Cohere 35B, RFM-1 Covariant 8B, MM1 Apple, RWKV-v5 EagleX RWKV 1.7T, Parakeet Independent 378M, EvoLLM-JP Sakana AI 10B

that doesn't look right. in the first example, JPMorgan is the lab, so that name comes first before the model name. don't repeat parameter count if it's already in the name

Claude-3-Opus

January 2024

JPMorgan DocLLM 7B, SUTD/Independent TinyLlama 1.1B, Tencent LLaMA Pro 8.3B, DeepSeek-AI DeepSeek 67B, DeepSeek-AI DeepSeekMoE 16B, Zhipu AI (Tsinghua) GLM-4 200B, Adept Fuyu-Heavy 120B, Tencent FuseLLM 7B, DeepSeek-AI DeepSeek-Coder 33B, Cornell MambaByte 972M, LMU MaLA-500 10B, RWKV RWKV-v5 Eagle 7B, Meta AI CodeLlama-70B, Apple MGIE 7B, iFlyTek Xinghuo 3.5 (Spark) 200B, iFlyTek iFlytekSpark-13B, Mistral AI miqu 70b, AIWaves.cn Weaver 34B

February 2024

Cerebras FLOR-6.3B, Allen AI OLMo 7B, Google TimesFM 200M, AI Singapore Sea-Lion 7.5B, ChatDB Natural-SQL-7B, BRAIN GOODY-2, Alibaba Qwen-1.5 72B, Google DeepMind Gemini 1.5 Pro 1.5T, Google DeepMind Gemma 7B, Reka AI Reka Flash 21B, Reka AI Reka Edge 7B, Apple Ask 20B, Reliance Hanooman 40B, Mistral AI Mistral Large 540B, Mistral AI Mistral Small 7B, ByteDance 175B, ByteDance 530B, HF/ServiceNow StarCoder 2 15B, SambaNova Samba-1 1.4T

March 2024

Anthropic Claude 3 Opus 2T, SRIBD/CUHK Apollo 7B, Inflection AI Inflection-2.5 1.2T, Stability AI Stable Beluga 2.5 70B, Fudan University AnyGPT 7B, DeepSeek-AI DeepSeek-VL 7B, Cohere Command-R 35B, Covariant RFM-1 8B, Apple MM1, RWKV RWKV-v5 EagleX 1.7T, Independent Parakeet 378M, Sakana AI EvoLLM-JP 10B
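
The formatting rules settled on over this conversation (lab first, Markdown link on the model name only, parameter count left out when the name already contains it, months and releases in ascending order, bold month headings) could be sketched in Python roughly as follows. This is only an illustration under assumptions: the record fields (`lab`, `model`, `params`, `month`, `url`), the helper names, and the `example.com` URLs are hypothetical placeholders, and the input is assumed to be in dump order (newest first) with months written like `Mar/2024`. Only the two arXiv links appear in the conversation itself.

```python
# Month abbreviation -> (sort position, full name); avoids locale-dependent parsing.
MONTHS = {"Jan": (1, "January"), "Feb": (2, "February"), "Mar": (3, "March")}

# Hypothetical records in dump order (newest first).
RECORDS = [
    {"lab": "Sakana AI", "model": "EvoLLM-JP", "params": "10B",
     "month": "Mar/2024", "url": "https://arxiv.org/abs/2403.13187"},
    {"lab": "Apple", "model": "MM1", "params": "",
     "month": "Mar/2024", "url": "https://arxiv.org/abs/2403.09611"},
    # The two URLs below are placeholders, not the real paper links.
    {"lab": "Meta AI", "model": "CodeLlama-70B", "params": "70B",
     "month": "Jan/2024", "url": "https://example.com/codellama-70b"},
    {"lab": "JPMorgan", "model": "DocLLM", "params": "7B",
     "month": "Jan/2024", "url": "https://example.com/docllm"},
]


def format_entry(rec):
    """Lab first, link on the model name only, params only if not in the name."""
    parts = [rec["lab"], f"[{rec['model']}]({rec['url']})"]
    params = rec["params"]
    # Simple substring check: skip the count when the name already carries it.
    if params and params.lower() not in rec["model"].lower():
        parts.append(params)
    return " ".join(parts)


def format_by_month(records):
    """Group entries under bold month headings, oldest month and release first."""
    groups = {}
    for rec in reversed(records):  # the dump is newest first, so reverse it
        abbr, year = rec["month"].split("/")
        key = (int(year), MONTHS[abbr][0], f"{MONTHS[abbr][1]} {year}")
        groups.setdefault(key, []).append(format_entry(rec))
    blocks = []
    for key in sorted(groups):
        blocks.append(f"**{key[2]}**")
        blocks.append(", ".join(groups[key]))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(format_by_month(RECORDS))
```

The substring check is deliberately naive: it would not notice that a name ending in "7B" already covers a listed count of 7.52B, so a case like that would still need manual handling.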

This conversation may have been edited by the sharer.