Tong Zhao

Preprint

DyWeight: Dynamic Gradient Weighting for Few-Step Diffusion Sampling

Tong Zhao*, Mingkun Lei*, Liangyu Yuan*, Yanming Yang, Chenxi Song, Yang Wang, Beier Zhu, Chi Zhang

A learnable few-step ODE solver that predicts time-varying weights to adaptively combine historical gradients, staying accurate at very large step sizes.

arXiv Code

CVPR 2026

Few-Step Diffusion Sampling Through Instance-Aware Discretizations

Liangyu Yuan*, Ruoyu Wang*, Tong Zhao*, Dingwen Fu, Mingkun Lei, Beier Zhu, Chi Zhang

Learns instance-aware time discretizations: a lightweight network predicts a tailored set of sampling timesteps for each input, instead of one fixed schedule.

arXiv Code

ICML 2026

Budget-Constrained Step-Level Diffusion Caching

Mingkun Lei, Tong Zhao, Liangyu Yuan, Chi Zhang

Fixes the compute budget in advance and searches for the step-level cache policy that best preserves output quality, giving controllable acceleration of up to 3.7× on FLUX.1-dev.

Code

ICCV 2025

Distilling Parallel Gradients for Fast ODE Solvers of Diffusion Models

Beier Zhu*, Ruoyu Wang*, Tong Zhao, Hanwang Zhang, Chi Zhang

A learnable high-order ODE solver that evaluates several gradients in parallel per step and combines them with distilled weights, cutting truncation error at very few function evaluations.

arXiv Code Project

CVPR 2026 Highlight

Taming Video Models for 3D and 4D Generation via Zero-Shot Camera Control

Chenxi Song, Yanming Yang, Tong Zhao, Ruibo Li, Chi Zhang

A training-free framework that repurposes pretrained video diffusion models for controllable 3D and 4D generation via inference-time camera-trajectory guidance.

arXiv Code Project

CVPR 2026

Improving Diffusion Generalization with Weak-to-Strong Segmented Guidance

Liangyu Yuan*, Yufei Huang*, Mingkun Lei, Tong Zhao, Ruoyu Wang, Changxi Chi, Yiwei Wang, Chi Zhang

Improves diffusion generalization and sample quality with a weak-to-strong segmented guidance scheme that hybridizes classifier-free guidance and autoguidance.

arXiv Code

CVPR 2026

Fast3Dcache: Training-free 3D Geometry Synthesis Acceleration

Mengyu Yang, Yanming Yang, Chenyi Xu, Chenxi Song, Yufan Zuo, Tong Zhao, Ruibo Li, Chi Zhang

A training-free, geometry-aware caching method that reuses stable intermediate features to accelerate 3D shape-generation diffusion (~27% faster) without corrupting geometry.

arXiv Code Project

JCR Q1

Linearity in Deng Entropy

Tong Zhao, Zhen Li, Yong Deng · Chaos, Solitons & Fractals

Characterizes when Deng entropy grows linearly with the scale of the frame of discernment, and shows the slope is exactly the information fractal dimension of the mass function.

Publisher PDF

JCR Q1

Information Fractal Dimension of Random Permutation Set

Tong Zhao, Zhen Li, Yong Deng · Chaos, Solitons & Fractals

Defines the information fractal dimension for the permutation mass function of a Random Permutation Set, and shows the dimension at maximum RPS entropy is 2.

Publisher PDF

Tech Report

FunReason-MT Technical Report: Advanced Data Synthesis for Real-world Multi-Turn Tool Use

Zengzhuang Xu*, Bingguang Hao*, Zechuan Wang, Yuntao Wen, Xinyi Xu, Yang Liu, Long Chen, Dong Wang, Maolin Wang, Tong Zhao, Yicheng Chen, Cunyin Peng, Jinjie Gu, Leilei Gan, Xiangyu Zhao, Chenyi Zhuang, Shi Gu

A data-synthesis framework for real-world multi-turn function calling — environment–API graph interactions, advanced tool/query synthesis, and a guided iterative chain — that brings a 4B model to SOTA among similar-sized models on BFCLv3.

arXiv Code

Preprint

From Failure to Mastery: Generating Hard Samples for Tool-use Agents

Bingguang Hao*, Zengzhuang Xu*, Yuntao Wen*, Xinyi Xu*, Yang Liu*, Tong Zhao, Maolin Wang, Long Chen, Dong Wang, Yicheng Chen, Cunyin Peng, Xiangyu Zhao, Chenyi Zhuang, Ji Zhang

HardGen builds a dynamic API graph from agent failure cases to synthesize genuinely hard tool-use traces and refine verifiable chain-of-thought — letting a 4B model rival much larger systems on tool use.

arXiv Code