by monologg
Open source
KoBERT is a Korean-specialized language model built on the BERT architecture. It excels at understanding Korean text, with core capabilities including semantic analysis, text classification, and question answering. The model is particularly useful for applications requiring a nuanced grasp of Korean, such as chatbots, sentiment analysis tools, or machine translation systems. What sets it apart is its training on Korean-specific corpora, which lets it outperform generic multilingual models on Korean-language tasks.
To load the KoBERT tokenizer with AutoTokenizer, you must pass trust_remote_code=True.
from transformers import AutoModel, AutoTokenizer

# Load the pretrained KoBERT encoder.
model = AutoModel.from_pretrained("monologg/kobert")
# The custom tokenizer lives in the model repo, so trust_remote_code is required.
tokenizer = AutoTokenizer.from_pretrained("monologg/kobert", trust_remote_code=True)
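Once both are loaded, the tokenizer and model can be chained to produce contextual embeddings. A minimal end-to-end sketch, assuming transformers and torch are installed and the Hugging Face Hub is reachable (the example sentence is arbitrary):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# trust_remote_code=True is needed because the tokenizer code ships with the repo.
tokenizer = AutoTokenizer.from_pretrained("monologg/kobert", trust_remote_code=True)
model = AutoModel.from_pretrained("monologg/kobert")

# Tokenize a Korean sentence ("The weather is nice today") into a batch of size 1.
inputs = tokenizer("오늘 날씨가 좋네요", return_tensors="pt")

# Run the encoder without tracking gradients (inference only).
with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state holds one 768-dimensional vector per input token.
print(outputs.last_hidden_state.shape)
```

The [CLS] vector, outputs.last_hidden_state[:, 0], is the usual starting point for downstream classification heads.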