by mlx-community
This model, gpt-oss-20b-MXFP4-Q8, is a quantized conversion of openai/gpt-oss-20b adapted to run with the MLX library on Apple silicon. It handles text generation, contextual understanding, and general natural-language tasks, with primary use cases in assisted writing, text analysis, content creation, and conversational applications. What sets it apart is the quantized MXFP4 Q8 format, which reduces memory consumption for efficient local execution while preserving response quality, making it a practical choice for developers and researchers who want a balance between capability and accessibility.
This model mlx-community/gpt-oss-20b-MXFP4-Q8 was converted to MLX format from openai/gpt-oss-20b using mlx-lm version 0.27.0.
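A conversion like this can be reproduced with mlx-lm's command-line converter. The invocation below is a sketch: the `--hf-path`, `--mlx-path`, and `-q` flags follow `mlx_lm.convert`'s CLI, but the exact quantization settings used to produce this particular MXFP4-Q8 build are not stated in the card and are an assumption here.

```shell
# Download the original Hugging Face weights, convert them to MLX
# format, and quantize (-q). The default quantization options are an
# assumption; they may differ from those used for this repository.
mlx_lm.convert \
    --hf-path openai/gpt-oss-20b \
    --mlx-path gpt-oss-20b-mlx \
    -q
```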
```bash
pip install mlx-lm
```
```python
from mlx_lm import load, generate

# Downloads the weights on first use and loads model + tokenizer.
model, tokenizer = load("mlx-community/gpt-oss-20b-MXFP4-Q8")

prompt = "hello"

# If the tokenizer ships a chat template, wrap the prompt in the
# model's expected conversation format before generating.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
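For quick experiments without writing any Python, mlx-lm also ships a command-line generator. A minimal sketch, using `mlx_lm.generate`'s standard `--model`, `--prompt`, and `--max-tokens` flags:

```shell
# One-off generation from the terminal; the model is fetched from the
# Hugging Face Hub on first use. --max-tokens caps the response length.
mlx_lm.generate \
    --model mlx-community/gpt-oss-20b-MXFP4-Q8 \
    --prompt "hello" \
    --max-tokens 256
```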