---
license: apache-2.0
base_model:
- nvidia/OpenReasoning-Nemotron-32B
datasets:
- HuggingFaceH4/ultrachat_200k
---

# OpenReasoning-Nemotron-32B-W8A8-INT8-Dynamic

## Method

Quantised using [vllm-project/llm-compressor](https://github.com/vllm-project/llm-compressor.git) with the following recipe:

```python
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
from llmcompressor.modifiers.quantization import GPTQModifier

recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]
```
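To illustrate what the recipe above does numerically, here is a minimal NumPy sketch of SmoothQuant activation smoothing followed by W8A8 quantization with static per-channel weights and dynamic per-token activations. This is an illustrative toy, not llm-compressor's implementation; all names and shapes are our own, and the real pipeline additionally uses GPTQ error compensation when rounding weights.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 32)).astype(np.float32)  # weight [out_features, in_features]
x = rng.normal(size=(8, 32)).astype(np.float32)   # activations [tokens, in_features]
x[:, 0] *= 20.0                                   # simulate an outlier activation channel

# SmoothQuant: migrate activation outliers into the weights via a per-channel
# scale s, so x @ w.T == (x / s) @ (w * s).T exactly in float.
# alpha = 0.8 matches smoothing_strength=0.8 in the recipe.
alpha = 0.8
s = np.abs(x).max(axis=0) ** alpha / np.abs(w).max(axis=0) ** (1 - alpha)
x_s, w_s = x / s, w * s

def int8_quantize(t, axis):
    # Symmetric int8 quantization along the given axis.
    scale = np.abs(t).max(axis=axis, keepdims=True) / 127.0
    q = np.clip(np.round(t / scale), -127, 127).astype(np.int8)
    return q, scale

qw, sw = int8_quantize(w_s, axis=1)  # W8: static, per output channel
qx, sx = int8_quantize(x_s, axis=1)  # A8: dynamic, per token (computed at runtime)

# int8 matmul accumulated in int32, then dequantized with both scales.
y = (qx.astype(np.int32) @ qw.T.astype(np.int32)).astype(np.float32) * sx * sw.T

y_ref = x @ w.T  # float reference
rel_err = np.abs(y - y_ref).max() / np.abs(y_ref).max()
```

Smoothing divides the outlier channel out of the activations (making per-token INT8 scales well-behaved) and folds it into the weights, which tolerate quantization better; `rel_err` stays small despite the 20x outlier.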