NaviGen connects dual-identifier construction, staged SFT, GRPO alignment, constrained personalized inference, and image/video generation.
🤗 Model Checkpoints: PillowTa1k/NaviGen
NaviGen is a personalized AIGC framework built around dual item identifiers. It couples behavioral collaborative codes with semantic textual codes, then trains a Qwen3-based policy to recommend the next item and write visually generatable image or video instructions.
- ❌ Behavior-semantic gap - Recommendation signals capture user preference, while generation prompts need semantic and visual details.
- ❌ Generic instructions - AIGC prompts may be fluent but underspecified, weakly personalized, or hard for image/video models to realize.
- ❌ Misaligned optimization - Next-item prediction, TID prediction, instruction quality, and visual generatability are often optimized separately.
🧭 NaviGen treats personalized generation as a unified recommendation-to-instruction workflow. It represents each item with a collaborative identifier and a textual identifier, distills reasoning through staged SFT, and uses GRPO rewards to align recommendation accuracy, structured output, semantic consistency, and AIGC instruction quality.
- Codebase organized around preprocessing, staged SFT, GRPO training, constrained inference, and image/video generation.
- Released GRPO step-600 checkpoints are hosted on Hugging Face: PillowTa1k/NaviGen.
- The repository no longer stores released model checkpoint folders; download the checkpoints from Hugging Face and pass the local path to inference scripts.
- CID: Collaborative code, a tokenized behavioral identifier such as <|cid_begin|><s_a_3855><s_b_7257><s_c_3681><|cid_end|>.
- TID: Textual code, a compact list of English semantic terms for an item.
- Dual identifier: the coupled CID + TID representation used by NaviGen.
- cid2cid: recommend the next CID from user history.
- cid2ins: predict target TID and generate a personalized AIGC instruction.
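To make the CID format concrete, here is a minimal sketch that splits a CID string into its three components and rebuilds it. The pattern is inferred from the single example above; the helper names `parse_cid` and `format_cid` are hypothetical and not part of the repository.

```python
import re

# Pattern inferred from the example CID above. Assumption: every CID has
# exactly three levels, <s_a_*><s_b_*><s_c_*>, each carrying an integer index.
CID_PATTERN = re.compile(
    r"<\|cid_begin\|><s_a_(\d+)><s_b_(\d+)><s_c_(\d+)><\|cid_end\|>"
)


def parse_cid(cid: str) -> tuple:
    """Return the (s_a, s_b, s_c) integer indices of a CID string."""
    match = CID_PATTERN.fullmatch(cid.strip())
    if match is None:
        raise ValueError(f"not a well-formed CID: {cid!r}")
    a, b, c = match.groups()
    return int(a), int(b), int(c)


def format_cid(a: int, b: int, c: int) -> str:
    """Rebuild the CID string from its three integer indices."""
    return f"<|cid_begin|><s_a_{a}><s_b_{b}><s_c_{c}><|cid_end|>"
```

Working with the integer triple rather than the raw string makes it easy to compare predictions level by level.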
NaviGen/
  assets/            # Framework figure and project assets
  dataset/           # Parquet splits for cid2tid, tid2cid, cid2cid, cid2ins, and catalog mapping
  preprocess/        # TID generation, identifier reasoning, prompt search, CID vocab expansion
  train/             # Stage-1 SFT, Stage-2 SFT, and GRPO training
  infer/             # Constrained cid2cid and cid2ins inference
  generation/        # Z-Image image generation and OpenSora video generation
  project_env.py     # Lightweight .env loader used by repository scripts
  requirements.txt   # Python dependencies
We recommend using conda to create a reproducible Python environment.
conda create -n navigen python=3.10 -y
conda activate navigen
pip install -r requirements.txt
For CUDA training machines, install PyTorch, Unsloth, and vLLM builds that match your driver and CUDA version. Image generation expects a local Z-Image model directory, while video generation requires an OpenSora environment and local OpenSora checkpoints.
Main dependencies include torch, transformers, datasets, accelerate, peft, trl, unsloth, pyarrow, lm-format-enforcer, dashscope, diffusers, torchao, and optional environment-specific packages such as deepspeed, flash-attn, and bitsandbytes.
Repository scripts load .env automatically through project_env.py. Keep real API keys and machine-specific paths local.
DASHSCOPE_API_KEY=""
DASHSCOPE_API_KEYS=""
NAVIGEN_TEACHER_MODEL="qwen3.5-flash"
NAVIGEN_QWEN3_BASE_MODEL="./Qwen3-1.7B"
NAVIGEN_CID_MODEL_DIR="./Qwen3-1.7B-cid-expanded-clean"
NAVIGEN_SFT_INPUT_DIR="./dataset"
NAVIGEN_INFER_INPUT_DIR="./dataset"
NAVIGEN_PID2CID2TID_PATH="./dataset/pid2cid2tid.parquet"
NAVIGEN_ZIMAGE_PATH="./Z-Image-Turbo"
Bundled parquet data lives under dataset/:
train_cid2tid.parquet
valid_cid2tid.parquet
test_cid2tid.parquet
train_tid2cid.parquet
valid_tid2cid.parquet
test_tid2cid.parquet
train_cid2cid.parquet
valid_cid2cid.parquet
test_cid2cid.parquet
train_cid2ins.parquet
valid_cid2ins.parquet
test_cid2ins.parquet
pid2cid2tid.parquet
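Before training or inference, it can help to confirm that all thirteen files listed above are present. The following is a minimal sketch; the function names are hypothetical, and it checks file names only, not parquet schemas.

```python
from pathlib import Path

# Task and split names taken from the file list above.
TASKS = ("cid2tid", "tid2cid", "cid2cid", "cid2ins")
SPLITS = ("train", "valid", "test")


def expected_files() -> list:
    """Names of the bundled parquet files: 4 tasks x 3 splits plus the mapping."""
    names = [f"{split}_{task}.parquet" for task in TASKS for split in SPLITS]
    names.append("pid2cid2tid.parquet")
    return names


def missing_files(dataset_dir: str) -> list:
    """Return the expected file names that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in expected_files() if not (root / name).is_file()]
```

Calling `missing_files("./dataset")` returns an empty list when the directory is complete.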
Compatibility mapping:
Generate TIDs from item captions:
python preprocess/step0_generate_tid_from_caption.py \
--input products_user_pid2caption.json \
--output products_user_pid2tid.json \
--resume
Run identifier reasoning and prompt distillation when rebuilding supervision:
python preprocess/step1_generate_identifier_think.py --resume
python preprocess/step2_evolutionary_prompt_search.py --resume
python preprocess/step3_distill_oneshot_prompt_think.py --resume
Expand the Qwen3 tokenizer/model with CID tokens:
python preprocess/expand_qwen3_cid_vocab.py \
--model_name_or_path ./Qwen3-1.7B \
--parquet_path dataset/pid2cid2tid.parquet \
--output_dir ./Qwen3-1.7B-cid-expanded-clean \
--trust_remote_code
Stage-1 aligns the expanded model with CID/TID conversion and identifier reasoning tasks.
python train/sft_aigc_stage1_embed.py
Stage-2 trains the full personalized recommendation and AIGC instruction generation behavior.
python train/sft_aigc_stage2_full_ft.py
GRPO jointly optimizes cid2cid recommendation and cid2ins instruction generation. By default, the script launches torchrun and uses a vLLM rollout endpoint.
python train/rl_grpo_rec_aigc_constrained.py \
--nproc_per_node 4 \
--base_model_dir /path/to/stage2/final \
--train_cid2cid_path dataset/train_cid2cid.parquet \
--train_cid2ins_path dataset/train_cid2ins.parquet \
--val_cid2cid_path dataset/valid_cid2cid.parquet \
--val_cid2ins_path dataset/valid_cid2ins.parquet \
--judge_api_key YOUR_DASHSCOPE_KEY \
--output_dir rl_output/grpo_run
The GRPO reward combines:
R = R_cid2cid + R_cid2ins + R_format + R_think
- R_cid2cid: CID matching reward with s_a, s_b, and s_c component weights.
- R_cid2ins: semantic and instruction-quality reward for target TID and generated AIGC instruction.
- R_format: JSON serialization and schema reward.
- R_think: reasoning-format gate for valid thinking traces.
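The additive structure of the reward can be sketched as follows. This is an illustration under stated assumptions, not the trainer's implementation: the component weights in `CID_WEIGHTS` are hypothetical (the values are not given here), the format check is reduced to "parses as a JSON object with the required keys", and R_cid2ins and R_think are taken as precomputed inputs.

```python
import json

# Hypothetical weights for the s_a, s_b, and s_c components; the actual
# values used by the GRPO trainer are not stated in this README.
CID_WEIGHTS = (0.5, 0.25, 0.25)


def cid_match_reward(pred: tuple, target: tuple) -> float:
    """Sum the weights of the CID components on which pred and target agree."""
    return sum(w for w, p, t in zip(CID_WEIGHTS, pred, target) if p == t)


def format_reward(raw: str, required_keys: tuple) -> float:
    """Return 1.0 if raw is a JSON object containing every required key, else 0.0."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(obj, dict):
        return 0.0
    return 1.0 if all(key in obj for key in required_keys) else 0.0


def total_reward(r_cid2cid: float, r_cid2ins: float,
                 r_format: float, r_think: float) -> float:
    """R = R_cid2cid + R_cid2ins + R_format + R_think, as in the formula above."""
    return r_cid2cid + r_cid2ins + r_format + r_think
```

With these hypothetical weights, a prediction that matches the target on s_a and s_b but not on s_c earns a CID reward of 0.75.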
The GRPO trainer saves the final adapter under rl_output/grpo_run/final_rl_lora unless --output_dir is changed.
Download the released checkpoint from PillowTa1k/NaviGen, then pass its local directory to --model_dir.
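One way to fetch the checkpoint is the `snapshot_download` function from the `huggingface_hub` package. The sketch below assumes that package is installed; the default `local_dir` is a hypothetical path. The layout of the Hugging Face repository is not described here, so the directory to pass to --model_dir may be a subfolder of the download.

```python
REPO_ID = "PillowTa1k/NaviGen"


def download_checkpoint(local_dir: str = "./checkpoints/NaviGen") -> str:
    """Download the model repository and return the local directory path."""
    # Imported lazily so this module still loads without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Call `download_checkpoint()` once, inspect the resulting folder, and pass the checkpoint directory to the inference scripts.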
Run constrained next-CID recommendation:
python infer/infer_sft_aigc_stage2_cid2cid_constrained.py \
--model_dir /path/to/downloaded/checkpoint \
--input_dir dataset \
--num_candidates 40 \
--generation_mode direct_json_prefix
Run personalized TID and instruction generation:
python infer/infer_sft_aigc_stage2_cid2ins.py \
--model_dir /path/to/downloaded/checkpoint \
--input_dir dataset \
--generation_mode two_stage \
--max_rows 10
For full evaluation, remove --max_rows or set it to a larger value. The inference scripts write JSONL predictions and metrics to their configured pred_dir.
Generate images with Z-Image from normalized prediction outputs:
python generation/gen_image_zimage.py \
--baseline oracle \
--height 512 \
--width 512
Generate videos with OpenSora from prediction JSON/JSONL:
python generation/video_gen_opensora.py \
--input-json outputs.jsonl \
--prompt-field prediction.target_ins \
--num-gpus 3
The complete generation flow is:
user history
-> cid2cid recommendation
-> cid2ins target TID and instruction
-> normalized AIGC prompt
-> image or video synthesis
-> personalized generated content
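The hand-off from prediction files to the generators can be sketched as pulling one prompt per JSONL record. This sketch assumes that a field name such as `prediction.target_ins` is a dot-separated path into each record; how the generation scripts actually resolve `--prompt-field` is not documented here, and `get_field` and `iter_prompts` are hypothetical helpers.

```python
import json
from typing import Any, Iterable, Iterator


def get_field(record: dict, dotted: str) -> Any:
    """Follow a dot-separated key path, e.g. 'prediction.target_ins'."""
    value: Any = record
    for key in dotted.split("."):
        value = value[key]
    return value


def iter_prompts(lines: Iterable[str],
                 prompt_field: str = "prediction.target_ins") -> Iterator[str]:
    """Yield the prompt from each non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield get_field(json.loads(line), prompt_field)
```

For a file on disk, `iter_prompts(open("outputs.jsonl", encoding="utf-8"))` yields the instructions in order; a missing key raises `KeyError`, which surfaces malformed records early.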
Model weights are hosted on Hugging Face (PillowTa1k/NaviGen) instead of stored in this repository.
The release includes the GRPO step-600 checkpoint. Download the assets you need and pass their local paths to the corresponding inference or reproduction scripts.
cid2cid inference produces the next collaborative identifier:
{
"target_cid": "<|cid_begin|><s_a_3855><s_b_7257><s_c_3681><|cid_end|>"
}
cid2ins inference predicts semantic terms and an AIGC-ready instruction:
{
"target_tid": ["dress", "summer", "floral"],
"target_ins": "Generate a bright summer product image featuring a floral dress on a clean studio background."
}
🌟 If this project helps your research, please consider giving NaviGen a Star!
Thanks for visiting NaviGen ✨
