
gpt-neox-japanese-2.7b

by abeja

Open source · 100k downloads · 58 likes

Rating 2.2 (58 reviews) · Chat · API & Local
About

The gpt-neox-japanese-2.7b model is a language model specialized in Japanese text generation, based on the GPT-NeoX architecture and featuring 2.7 billion parameters. Trained on a large corpus that includes Japanese Wikipedia, Common Crawl-derived data, and other Japanese sources, it produces natural, coherent Japanese text. Its key capabilities include writing articles, answering questions, summarizing texts, and generating dialogue, while adapting to different language registers. The model stands out for its moderate size, which balances output quality against hardware accessibility, and for a specialized tokenizer designed to handle the particularities of written Japanese efficiently. It is aimed at developers and researchers who want to integrate Japanese text generation into their applications, offering an open-source alternative to proprietary solutions.

Documentation

gpt-neox-japanese-2.7b

The open PR was merged on 2022/9/14. You can use this model with transformers v4.23 and later as follows.

Code
pip install transformers
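
Since the model requires transformers v4.23 or later, it can help to verify the installed version before loading it. A minimal check (a sketch; the packaging module is already a dependency of transformers):

Python
from packaging import version
import transformers

# The GPT-NeoX-Japanese model classes landed in transformers v4.23.
assert version.parse(transformers.__version__) >= version.parse("4.23.0"), \
    "upgrade with: pip install -U 'transformers>=4.23'"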

This repository provides a 2.7B-parameter Japanese GPT-NeoX-based model. The model was trained by ABEJA, Inc.

How to use

When using the pipeline API for text generation:

Python
from transformers import pipeline

generator = pipeline("text-generation", model="abeja/gpt-neox-japanese-2.7b")
generated = generator(
    "人とAIが協調するためには、",
    max_length=300,            # maximum total length of prompt + continuation, in tokens
    do_sample=True,            # sample from the token distribution instead of greedy decoding
    num_return_sequences=3,    # generate three independent continuations
    top_p=0.95,                # nucleus sampling: keep the smallest token set covering 95% probability mass
    top_k=50                   # also cap candidates at the 50 most likely tokens
)
print(*generated, sep="\n")

"""
[out]
{"generated_text": "人とAIが協調するためには、「人が持っている優れた能力とAIの得意とする分野を掛け合わせる」ことが不可欠になります。"}
{"generated_text": "人とAIが協調するためには、双方の長所を活かしていくことが不可欠だと考えています。"}
{"generated_text": "人とAIが協調するためには、人間がAIを理解する、ということが重要です。人間には「AIに対してAIが何をするべきか」ということを明確に教えないと、AIはある程度の知識はあっても何をすべきかがわかりません。だから、コンピューターが考えたり、決めたりすることはAIではなく、人間が解釈して理解できるようにしなくて"}
"""

When using PyTorch directly:

Python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("abeja/gpt-neox-japanese-2.7b")
model = AutoModelForCausalLM.from_pretrained("abeja/gpt-neox-japanese-2.7b")

input_text = "人とAIが協調するためには、"
input_ids = tokenizer.encode(input_text, return_tensors="pt")  # encode the prompt as a tensor of token ids
gen_tokens = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,
    num_return_sequences=3,
    top_p=0.95,
    top_k=50,
)
for gen_text in tokenizer.batch_decode(gen_tokens, skip_special_tokens=True):
    print(gen_text)
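
A 2.7B-parameter model generates slowly on CPU. If a GPU is available, moving the model and inputs onto it and disabling gradient tracking speeds up inference considerably; a sketch assuming a CUDA device, falling back to CPU otherwise:

Python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("abeja/gpt-neox-japanese-2.7b")
model = AutoModelForCausalLM.from_pretrained("abeja/gpt-neox-japanese-2.7b").to(device)

input_ids = tokenizer.encode("人とAIが協調するためには、", return_tensors="pt").to(device)
with torch.no_grad():  # inference only, so skip gradient bookkeeping
    gen_tokens = model.generate(input_ids, max_length=100, do_sample=True, top_p=0.95, top_k=50)
print(tokenizer.decode(gen_tokens[0], skip_special_tokens=True))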

Dataset

The model was trained on Japanese CC-100, Japanese Wikipedia, and Japanese OSCAR.

Tokenization

The model uses a special sub-word tokenizer. Please refer to the original repository or GPT-NeoX-Japanese for details.
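
To see what this tokenizer actually produces, you can inspect the sub-word pieces and their ids for a short sentence. A quick illustration (the exact splits depend on the model's vocabulary, so treat the output as indicative):

Python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("abeja/gpt-neox-japanese-2.7b")
text = "人とAIが協調するためには、"
tokens = tokenizer.tokenize(text)   # sub-word pieces
ids = tokenizer.encode(text)        # corresponding vocabulary ids
print(tokens)
print(ids)
print(tokenizer.decode(ids))        # round-trips back to the original text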

License

The MIT license

Capabilities & Tags
transformers · pytorch · gpt_neox_japanese · text-generation · ja · japanese · gpt_neox · gpt · lm · nlp
Specifications

Category: Chat
Access: API & Local
License: Open Source
Pricing: Open Source
Parameters: 2.7B
Rating: 2.2 (58 reviews)
