Swallow 70B
Open Weights
Tokyo Institute of Technology
A 70B-parameter, Japanese-enhanced Llama model from Tokyo Institute of Technology. It is based on Llama 3.3 and continually pre-trained on ~200B tokens drawn from Swallow Corpus v2, Japanese and English Wikipedia, and math/code content. Its vocabulary is expanded with Japanese characters and subwords, so Japanese text encodes into fewer tokens, which makes inference notably faster. It is evaluated on 10 Japanese benchmarks (including JCommonsenseQA, JEMHopQA, NIILC, and JSQuAD) and 10 English benchmarks (including OpenBookQA, TriviaQA, SQuAD 2.0, XWINO, and HellaSwag), and is reported to achieve the best 70B-class performance for Japanese as of Dec 2023. Llama 3.1 variants are also available.
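The link between an expanded vocabulary and faster inference can be seen in a toy example. The sketch below is illustrative only: the `tokenize` function, both vocabularies, and the sample text are hypothetical, and greedy longest-match with UTF-8 byte fallback is a simplification, not the tokenizer Swallow actually ships. It shows that when Japanese subwords are in the vocabulary, the same text needs far fewer tokens, and fewer tokens means fewer decoding steps.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization with UTF-8 byte fallback (toy model)."""
    tokens = []
    i = 0
    max_len = max((len(v) for v in vocab), default=1)
    while i < len(text):
        # Try the longest candidate piece first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            # Character not covered by the vocabulary:
            # emit one token per UTF-8 byte, as byte-fallback tokenizers do.
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens


# Hypothetical vocabularies: an English-centric base and a Japanese-expanded one.
base_vocab = set("abcdefghijklmnopqrstuvwxyz ")
expanded_vocab = base_vocab | {"東京", "工業", "大学"}

text = "東京工業大学"
base_tokens = tokenize(text, base_vocab)
expanded_tokens = tokenize(text, expanded_vocab)

print(len(base_tokens), len(expanded_tokens))  # 18 3
```

In this toy setting each of the six characters falls back to three byte tokens under the base vocabulary, while the expanded vocabulary covers the text with three subword tokens. Real savings depend on the actual tokenizer and text.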
Strengths
Caveats
Capabilities
Vision
Audio
Video
Tool Use