by cyberagent
OpenCALM-3B is a Japanese language model developed by CyberAgent for generating Japanese text. It was pre-trained on Japanese datasets such as Wikipedia and Common Crawl, and it can be applied to tasks including writing, summarization, and question answering. It is the 2.7B-parameter member of the OpenCALM suite (32 layers, model dimension 2560, 32 attention heads). The model is released under an open license (CC BY-SA 4.0), which permits use and sharing provided that clear attribution is given and derivatives are shared under the same terms. It is intended for developers and researchers who need an accessible Japanese language model.
OpenCALM is a suite of decoder-only language models pre-trained on Japanese datasets, developed by CyberAgent, Inc.
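```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the weights in float16; device_map="auto" places them on the available
# device(s) and requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained("cyberagent/open-calm-3b", device_map="auto", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("cyberagent/open-calm-3b")

# Tokenize a Japanese prompt ("Thanks to AI, our lives ...") and move it to the model's device.
inputs = tokenizer("AIによって私達の暮らしは、", return_tensors="pt").to(model.device)

# Sample a continuation of up to 64 new tokens without tracking gradients.
with torch.no_grad():
    tokens = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.05,
        pad_token_id=tokenizer.pad_token_id,
    )

# Decode the generated token IDs back into text.
output = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(output)
```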
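
The usage example samples with `temperature=0.7` and `top_p=0.9`. As a rough illustration of what those two settings do to a next-token distribution, the sketch below applies temperature scaling and nucleus (top-p) filtering to a handful of toy logits in plain Python. It is a simplified model of the idea, not the `transformers` implementation: the function names and the toy logits are hypothetical, and `repetition_penalty` is left out.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of most likely tokens whose total mass reaches
    top_p, then renormalize so the kept probabilities sum to 1."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=0.7, top_p=0.9, rng=random):
    """Draw one token index from the temperature-scaled, top-p-filtered distribution."""
    nucleus = top_p_filter(softmax_with_temperature(logits, temperature), top_p)
    ids = list(nucleus)
    weights = [nucleus[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


# Toy logits for a five-token vocabulary (hypothetical values).
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
nucleus = top_p_filter(softmax_with_temperature(toy_logits, 0.7), 0.9)
print(sorted(nucleus))  # indices of the tokens that survive filtering
print(sample_token(toy_logits, rng=random.Random(0)))
```

With these toy values, the two most likely tokens already carry slightly more than 90% of the probability mass at temperature 0.7, so only they remain candidates. Lowering the temperature sharpens the distribution further, and raising `top_p` toward 1.0 lets more of the tail back in.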
| Model | Params | Layers | Hidden dim | Heads | Dev perplexity |
|---|---|---|---|---|---|
| cyberagent/open-calm-small | 160M | 12 | 768 | 12 | 19.7 |
| cyberagent/open-calm-medium | 400M | 24 | 1024 | 16 | 13.8 |
| cyberagent/open-calm-large | 830M | 24 | 1536 | 16 | 11.3 |
| cyberagent/open-calm-1b | 1.4B | 24 | 2048 | 16 | 10.3 |
| cyberagent/open-calm-3b | 2.7B | 32 | 2560 | 32 | 9.7 |
| cyberagent/open-calm-7b | 6.8B | 32 | 4096 | 32 | 8.2 |
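
A few quantities can be derived from the table with simple arithmetic. The sketch below copies the rows into a small data structure and computes the per-head dimension (dimension divided by heads) and an approximate size of the weights when loaded in float16 at 2 bytes per parameter. It also shows the standard definition of perplexity as the exponential of the mean negative log-likelihood per token. The names `ModelRow`, `head_dim`, `fp16_weight_gib`, and `perplexity` are hypothetical helpers. The memory figure covers weights only and ignores activations and caches, and the source does not describe the exact evaluation setup behind the reported dev perplexities.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRow:
    name: str
    params: float  # parameter count as listed in the table
    layers: int
    dim: int
    heads: int


# Values copied from the table above.
ROWS = [
    ModelRow("cyberagent/open-calm-small", 160e6, 12, 768, 12),
    ModelRow("cyberagent/open-calm-medium", 400e6, 24, 1024, 16),
    ModelRow("cyberagent/open-calm-large", 830e6, 24, 1536, 16),
    ModelRow("cyberagent/open-calm-1b", 1.4e9, 24, 2048, 16),
    ModelRow("cyberagent/open-calm-3b", 2.7e9, 32, 2560, 32),
    ModelRow("cyberagent/open-calm-7b", 6.8e9, 32, 4096, 32),
]


def head_dim(row):
    """Dimension of each attention head: model dimension split evenly across heads."""
    return row.dim // row.heads


def fp16_weight_gib(row):
    """Approximate size of the weights alone in float16 (2 bytes per parameter), in GiB."""
    return row.params * 2 / 2**30


def perplexity(token_log_probs):
    """Perplexity from natural-log token probabilities: exp of the mean negative log-likelihood."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


for row in ROWS:
    print(f"{row.name}: head_dim={head_dim(row)}, fp16 weights ~{fp16_weight_gib(row):.1f} GiB")
```

Under these assumptions, the 3B model needs roughly 5 GiB for its weights in float16, which gives a lower bound on the memory required to run the usage example.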
```bibtex
@software{gpt-neox-library,
  title = {{GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch}},
  author = {Andonian, Alex and Anthony, Quentin and Biderman, Stella and Black, Sid and Gali, Preetham and Gao, Leo and Hallahan, Eric and Levy-Kramer, Josh and Leahy, Connor and Nestler, Lucas and Parker, Kip and Pieler, Michael and Purohit, Shivanshu and Songz, Tri and Phil, Wang and Weinbach, Samuel},
  url = {https://www.github.com/eleutherai/gpt-neox},
  doi = {10.5281/zenodo.5879544},
  month = {8},
  year = {2021},
  version = {0.0.1},
}
```