OneVision-Encoder
Issue Explainers
Interactive visual explanations for community questions
#105
Clarification on how OCR annotations are used during training
How PaddleOCR tags become multi-label Partial FC training signals. Includes an animated ArcFace visualization, KaTeX formulas, and a full pipeline walkthrough.
#112
Clarification on Intra-frame (I-frame) interval in data preprocessing
The fixed GOP=16 is set during HEVC encoding (Step 2). Step 3 only reads the existing bitstream and anchors the first sampled frame as an I-frame. Includes an animated GOP visualization, code references, and KaTeX formulas.
#113
Are all I-frame tokens intended to be preserved in the current implementation?
Analysis of a mismatch between the paper and the implementation. The DALI dataloader zeros out I-frame residuals, which prevents I-frame tokens from being selected by Top-K. Includes code references and a comparison with the Compressed Video Reader's keep_first_full_frame option.
#116
Frame-wise Normalization & Global Top-K
Why we normalize per frame before selecting patches globally. Includes an animated matrix demo, a pipeline walkthrough, and code references.
