Falcon 180B
Open Weights
Technology Innovation Institute
Falcon 180B is a 180B-parameter open-access model from the Technology Innovation Institute (TII), trained on 3.5T tokens. It uses a causal decoder-only architecture with 80 layers, a hidden dimension of 14,848, and a vocabulary of 65,024 tokens. Training ran on up to 4,096 A100 GPUs on Amazon SageMaker and took about 7M GPU hours. Roughly 85% of the training data comes from RefinedWeb; the remainder is curated conversations, technical papers, and code (~3%). At release it scored 68.74 on the Hugging Face Open LLM Leaderboard, the highest among open models at the time and ahead of Meta's Llama 2, and it was reported to approach proprietary models such as GPT-4 and PaLM 2 on some benchmarks. It is 2.5x larger than Llama 2 70B and was trained with 4x more compute. Released under the Falcon 180B TII License, which is based on Apache 2.0 and permits research and commercial use subject to its terms.
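As a rough sanity check on the figures quoted above, the short Python sketch below derives the tokens-per-parameter ratio and a lower bound on wall-clock training time. The lower bound assumes all 4,096 GPUs were in use for the whole run, which the description does not claim ("up to" 4,096). The final line uses the common 6 x parameters x tokens rule of thumb for training FLOPs. That rule is an assumption brought in for illustration, not a figure published for this model.

```python
# Figures quoted in the description above.
params = 180e9      # 180B parameters
tokens = 3.5e12     # 3.5T training tokens
gpu_hours = 7e6     # ~7M GPU hours
max_gpus = 4096     # peak GPU count ("up to 4,096")

# Training tokens seen per model parameter.
tokens_per_param = tokens / params

# Lower bound on wall-clock time: assumes every GPU was busy for the
# entire run, which is an idealisation.
min_wall_clock_days = gpu_hours / max_gpus / 24

# Rule-of-thumb estimate (assumption): ~6 FLOPs per parameter per token.
approx_train_flops = 6 * params * tokens

print(f"tokens per parameter: {tokens_per_param:.1f}")
print(f"minimum wall-clock days: {min_wall_clock_days:.1f}")
print(f"approximate training FLOPs: {approx_train_flops:.2e}")
```

These numbers are order-of-magnitude estimates only; the actual schedule and hardware utilisation are not given in the description.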