🦖 TEMPFLOW-GRPO: WHEN TIMING MATTERS FOR GRPO IN FLOW MODELS

TempFlow-GRPO (Temporal Flow GRPO) is a principled GRPO framework that captures and exploits the temporal structure inherent in flow-based generation.

TempFlow-GRPO Structure

๐Ÿ—บ๏ธ Roadmap for TempFlow-GRPO

TempFlow-GRPO (Temporal Flow GRPO) is a principled GRPO framework that captures and exploits the temporal structure inherent in flow-based generation. It introduces two key innovations: (i) a trajectory branching mechanism that provides process rewards by concentrating stochasticity at designated branching points, enabling precise credit assignment without requiring specialized intermediate reward models; and (ii) a noise-aware weighting scheme that modulates policy optimization according to the intrinsic exploration potential of each timestep, prioritizing learning during high-impact early stages while ensuring stable refinement in later phases. Together, these give the model a temporally aware optimization procedure that respects the underlying generative dynamics, leading to state-of-the-art performance in human preference alignment and on standard text-to-image benchmarks.
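To make the two ideas above more concrete, here is a minimal toy sketch in plain Python. It is not the implementation from the paper: the 1-D sampler, the noise schedule (`toy_noise_schedule`), the reward (`toy_reward`), and every function name are hypothetical stand-ins chosen for illustration. The sketch assumes that noise is injected at exactly one branching step per rollout, that rewards are normalized within each group in the usual GRPO style, and that each step's advantage is scaled by a weight proportional to that step's noise level.

```python
import math
import random
from statistics import mean, pstdev


def toy_noise_schedule(num_steps, scale=0.7):
    """Hypothetical noise levels: largest at the first step, shrinking later."""
    return [scale * math.sqrt(1.0 - k / num_steps) for k in range(num_steps)]


def noise_weights(noise_levels):
    """Rescale noise levels so the per-step weights average to 1."""
    total = sum(noise_levels)
    return [len(noise_levels) * s / total for s in noise_levels]


def group_advantages(rewards, eps=1e-6):
    """GRPO-style advantage: center and scale rewards within one group."""
    mu, sd = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sd + eps) for r in rewards]


def branched_rollout(x0, branch_step, noise_levels, rng, dt=0.1):
    """Toy 1-D sampler: deterministic steps, one random kick at branch_step."""
    x = x0
    for k, sigma in enumerate(noise_levels):
        x = x - dt * x  # deterministic update (stand-in for an ODE step)
        if k == branch_step:
            # All stochasticity is concentrated at the branching point.
            x = x + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x


def toy_reward(x_final, target=0.5):
    """Hypothetical terminal reward: closer to the target is better."""
    return -(x_final - target) ** 2


def weighted_advantages(x0, num_steps=5, group_size=4, seed=0):
    """For each branching step, score a group of branches and weight by noise."""
    rng = random.Random(seed)
    sigmas = toy_noise_schedule(num_steps)
    weights = noise_weights(sigmas)
    out = []
    for k in range(num_steps):
        # Every branch in the group differs only in the kick at step k,
        # so differences in final reward can be credited to that step.
        finals = [branched_rollout(x0, k, sigmas, rng) for _ in range(group_size)]
        adv = group_advantages([toy_reward(x) for x in finals])
        out.append([weights[k] * a for a in adv])
    return out
```

Calling `weighted_advantages(1.0)` returns one list of weighted advantages per branching step. In a real training loop these values would multiply the policy's log-probability term for the corresponding step; that part is omitted here because it depends on the model and sampler.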

Ideas and contributions are welcome. Stay tuned!

🆕 News

We have presented an improved Flow-GRPO method, TempFlow-GRPO. 🔥🔥🔥

  • [2025-08-06] We have released the first version of our paper. 🔥🔥🔥

📊 Experimental Performance

PickScore
GenEval

📺 Visualization

Visualization

For more details, please read our paper.
Download

BibTeX

@misc{2508.04324,
    Author = {Xiaoxuan He and Siming Fu and Yuke Zhao and Wanli Li and Jian Yang and Dacheng Yin and Fengyun Rao and Bo Zhang},
    Title = {TempFlow-GRPO: When Timing Matters for GRPO in Flow Models},
    Year = {2025},
    Eprint = {arXiv:2508.04324},
}