Xubo Liu

Xubo Liu 刘徐博

I am a final-year Ph.D. student at the Centre for Vision, Speech & Signal Processing at the University of Surrey, advised by Prof. Wenwu Wang and Prof. Mark D. Plumbley. My passion is building AI models that understand the world across multiple modalities and engage with humans. Currently, I work on computational auditory scene analysis, multimodal content creation, and large language models for audio, speech, and music signals.

Previously, I spent four months working with Dr. Christian Fuegen and Dr. Egor Lakomkin at Meta AI, London. During my PhD, I worked closely with Dr. Qiuqiang Kong at the Chinese University of Hong Kong (CUHK). I graduated with First Class Honors from Queen Mary University of London in 2020 with a BSc in Telecommunications Engineering.

I am open to research collaborations. Please feel free to email me.

Email: xubo.liu@surrey.ac.uk

Personal: [Google Scholar] | [GitHub] | [LinkedIn] | [Twitter]

Publications

	Separate Anything You Describe. Xubo Liu, Qiuqiang Kong, Yan Zhao, Haohe Liu, Yi Yuan, Yuzhuo Liu, Rui Xia, Yuxuan Wang, Mark D. Plumbley, Wenwu Wang. arXiv:2308.05037. paper | project | code
	WavJourney: Compositional Audio Creation with Large Language Models. Xubo Liu, Zhongkai Zhu, Haohe Liu, Yi Yuan, Meng Cui, Qiushi Huang, Jinhua Liang, Yin Cao, Qiuqiang Kong, Mark D. Plumbley, Wenwu Wang. arXiv:2307.14335. paper | project | code
	SynthVSR: Scaling Up Visual Speech Recognition with Synthetic Supervision. Xubo Liu, Egor Lakomkin, Konstantinos Vougioukas, Pingchuan Ma, Honglie Chen, Ruiming Xie, Morrie Doulaty, Niko Moritz, Jachym Kolar, Stavros Petridis, Maja Pantic, Christian Fuegen. CVPR 2023. paper | project
	Visually-Aware Audio Captioning with Adaptive Audio-Visual Attention. Xubo Liu, Qiushi Huang, Xinhao Mei, Haohe Liu, Qiuqiang Kong, Jianyuan Sun, Shengchen Li, Tom Ko, Yu Zhang, Lilian H Tang, Mark D. Plumbley, Volkan Kılıç, Wenwu Wang. Interspeech 2023. paper | code
	Simple Pooling Front-ends For Efficient Audio Classification. Xubo Liu, Haohe Liu, Qiuqiang Kong, Xinhao Mei, Mark D. Plumbley, Wenwu Wang. ICASSP 2023. paper | code
	Separate What You Describe: Language-Queried Audio Source Separation. Xubo Liu, Haohe Liu, Qiuqiang Kong, Xinhao Mei, Jinzheng Zhao, Qiushi Huang, Mark D. Plumbley, Wenwu Wang. Interspeech 2022. paper | project | code
	Conditional Sound Generation Using Neural Discrete Time-Frequency Representation Learning. Xubo Liu, Turab Iqbal, Jinzheng Zhao, Qiushi Huang, Mark D. Plumbley, Wenwu Wang. MLSP 2021. paper | code
	CL4AC: A Contrastive Loss for Audio Captioning. Xubo Liu, Qiushi Huang, Xinhao Mei, Tom Ko, H Lilian Tang, Mark D. Plumbley, Wenwu Wang. DCASE Workshop 2021. paper | code

Professional Services

Special Session Chair of "Multimodal Learning for Audio and Language" at EUSIPCO 2023
Journal reviewer: IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), IEEE Signal Processing Letters, International Journal of Computer Vision (IJCV)
Conference reviewer: ECCV (24), ICML (24), CVPR (24), EMNLP (23), ICASSP (23-24), INTERSPEECH (22-24), MLSP (23)
