Yunlong (Yolo) Tang
Hi there / 你好 / こんにちは Welcome to my homepage!
My name is Yunlong Tang. I’m a second-year Ph.D. student in Computer Science at the University of Rochester (UR), advised by Prof. Chenliang Xu. I obtained my B.Eng. (2019-2023) in Intelligence Science & Technology from Southern University of Science and Technology (SUSTech), under the supervision of Prof. Feng Zheng. I’ve interned at ByteDance and Tencent.
My research focuses on Multimodal Learning, especially Video Understanding & Generation. I also have a keen interest in AI Agents and Computational Arts.
I am actively looking for collaborations. Please feel free to contact me if you are interested!
News
Date | News |
---|---|
Oct 13, 2024 | 🚀 MMComposition has been publicly released. Read our Paper, check out the latest 🏆 Leaderboard, and access the Code to evaluate your own models. |
Aug 23, 2024 | Introducing CaRDiff, a framework for video saliency prediction using MLLM CoT reasoning and a diffusion model. |
Aug 05, 2024 | 🏅 We won first place in the AIM 2024 Challenge on Video Saliency Prediction @ ECCV Workshop! Thanks to Gen Zhan and Li Yang! |
Jul 23, 2024 | 📢 We've recently updated our survey: "Video Understanding with Large Language Models: A Survey"! |
Jul 15, 2024 | One paper on egocentric video understanding with LLMs has been accepted to ACM MM 2024. |
Selected Research
* Equal Contribution | † Corresponding Author
- arXiv preprint arXiv:2411.10979, 2024
- arXiv preprint arXiv:2410.09733, 2024
- arXiv preprint arXiv:2408.12009, 2024
- ACM MM
- arXiv preprint arXiv:2312.17432, 2023
- In Proceedings of the Asian Conference on Computer Vision (ACCV), 2022
Misc.
Fun Facts
- My nickname, YOLO, is a soramimi/mondegreen for Yunlong.
- I'm a Tech-otaku, ACGN enthusiast, J-Pop fan, and 🚀 e/acc proponent.
- I have an artistic background (10+ years of experience in drawing/painting 🎨).
"What I cannot create, I do not understand."
—— Richard Feynman