I've attempted to reproduce the results on the RealEstate-10K dataset following the [DFoT](https://github.com/kwsong0113/diffusion-forcing-transformer) protocol (**measure the difference between the generated videos and the ground-truth dataset**), but the metrics I obtain differ significantly from the numbers reported in the paper.
Setup and preprocessing
Randomly selected 150 scenes as test set following DFoT protocol
Annotated test set with [ViPE](https://github.com/nv-tlabs/vipe) to obtain metric depth videos and camera poses
Generated detailed prompts using Gemini-3-pro from the given video
Adapted `data_engine/create_input.py` to match expected input conditions
Evaluation metrics
FVD, PSNR, LPIPS, SSIM for comparing generations against ground truth
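For reference, here is a minimal sketch of how PSNR could be computed per frame and averaged over a video. It is not the paper's or DFoT's evaluation code. The function names `psnr` and `video_psnr`, the 8-bit pixel range (`max_val=255.0`), and the per-frame averaging are all assumptions made for illustration; a different pixel range or averaging order would change the numbers.

```python
import numpy as np


def psnr(pred, target, max_val=255.0):
    """PSNR in dB between two same-shaped frames (hypothetical helper).

    Assumes pixel values lie in [0, max_val].
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if pred.shape != target.shape:
        raise ValueError(f"shape mismatch: {pred.shape} vs {target.shape}")
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))


def video_psnr(pred_frames, target_frames, max_val=255.0):
    """Mean of per-frame PSNR over two videos of equal length (assumed convention)."""
    if len(pred_frames) != len(target_frames):
        raise ValueError("videos must have the same number of frames")
    scores = [psnr(p, t, max_val) for p, t in zip(pred_frames, target_frames)]
    return float(np.mean(scores))
```

If the generated frames are stored in [0, 1] while the ground truth is in [0, 255] (or vice versa), the score will be badly off, so it is worth checking both ranges before comparing against published numbers.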
Results comparison
Here are the input and output:
video_voyager.mp4
The result shows that the generated video contains severe color shift and restoration shift. Could you release the evaluation code for RealEstate-10K, or help me debug the reproduction problem?
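To put a number on the color shift, one rough diagnostic is the per-channel mean difference between the generated and ground-truth videos. The sketch below assumes both videos are arrays of shape `(T, H, W, 3)` in the same pixel range; `channel_mean_shift` is a hypothetical helper, not part of any released code.

```python
import numpy as np


def channel_mean_shift(pred_video, target_video):
    """Per-channel mean of (pred - target) over all frames and pixels.

    Assumes both inputs have shape (T, H, W, C) and the same pixel range.
    A value far from zero in one channel suggests a global color shift.
    """
    pred = np.asarray(pred_video, dtype=np.float64)
    target = np.asarray(target_video, dtype=np.float64)
    if pred.shape != target.shape:
        raise ValueError(f"shape mismatch: {pred.shape} vs {target.shape}")
    # Average over time, height, and width; keep the channel axis.
    return (pred - target).mean(axis=(0, 1, 2))
```

A consistent offset across all frames would point toward a normalization or color-space mismatch in preprocessing rather than a failure of the model itself, although this is only a heuristic.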