ruixiang63 (Ruixiang Wang) · GitHub

🎯 Focusing

Ruixiang Wang ruixiang63

51 followers · 172 following

NVIDIA
https://ruixiang63.github.io/

ruixiang63/README.md

Hi, I'm Ruixiang 👋

I am a Senior DevTech Engineer at NVIDIA.

🚀 Recent Open Source Contributions

llama.cpp

#23869 — Speed-bench: standardized speculative decoding performance evaluation benchmark
#18039 — Eagle3 speculative decoding: 1.2–3.28× speedup across many model families
#24593 — Eagle3 support for Qwen3.5 and Qwen3.6, achieving up to 1.94× speedup
#22105 — DFlash speculative decoding: up to 8× speedup on Qwen3 models
#24536 — Add speculative decoding metrics for better observability and parameter tuning
#24655 — Support GPU-backend sampling to improve Eagle3 performance

HuggingFace Transformers

#45665 — Performance fix: eliminated implicit H2D copies in Gated DeltaNet

Unsloth

This NVIDIA-Unsloth blog post explains the following optimizations in detail.
#534 — Double-buffered checkpoint reload via CUDA streams + events, fine-tuning speedup of +8.4% on 8B and +6.7% on 14B models
#4173 — Packed-sequence metadata caching, +14.3% fine-tuning speedup on Qwen3-14B QLoRA SFT
#535 — GPT-OSS MoE expert routing optimization, ~10-15% fine-tuning speedup on GPT-OSS models

✍️ Technical Writing — NVIDIA Developer Blog

Model Quantization Series:

Pinned

Research-Project-Title-Embedding Public

This project aims to improve the quality of eBay product title embeddings. Here are the slides and my master's thesis. The source code lives in the company's repository and cannot be released at this time.

1

microgpt-cpp Public

C++ version of MicroGPT with GPU acceleration

C++ 1

llama.cpp Public

Forked from ggml-org/llama.cpp

LLM inference in C/C++

C++ 5 2

ggml-org/llama.cpp Public

LLM inference in C/C++

C++ 119k 20.2k

unslothai/unsloth Public

Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

Python 67.8k 6.1k

unslothai/unsloth-zoo Public

Utils for Unsloth https://github.com/unslothai/unsloth

Python 286 289