- 👋 Hi, I’m Ye Zhen, a PhD student at HKUST.
- 👀 I’m interested in multimodal generation and speech synthesis.
- If you have any questions, please feel free to contact me at zhenye312@gmail.com.
🍉
Speech synthesis, Audio generation, Speech LLM
- Hong Kong University of Science and Technology
- Hong Kong
- @zhenye234
- https://huggingface.co/ZhenYe234
- in/zhen-ye-25734a358
Pinned
- LLaSA_training (Public)
  LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
- X-Codec-2.0 (Public)
  Codec for the paper "LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis"
- Talker-T2AV (Public)
  Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling
- CoMoSpeech (Public)
  ACM MM 2023. CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
- FlashSpeech (Public)
  ACM MM 2024. FlashSpeech: Efficient Zero-Shot Speech Synthesis
