Introducing CaRDiff, a framework for video saliency prediction that combines multimodal large language model (MLLM) chain-of-thought (CoT) reasoning with a diffusion model.