
Academic Talk: New Challenges and Recent Progress on Speech Processing in a Cocktail Party



Time: 2022/11/18 (Friday), 11:00 a.m.-12:00 noon

Venue: Tencent Meeting 818-897-203

Title: New Challenges and Recent Progress on Speech Processing in a Cocktail Party

Abstract: Although intelligent speech processing has advanced greatly in research and is widely used in real-life applications, a large performance gap remains between controlled environments and real-life scenarios. One of the core problems in real-world conditions is known as the cocktail party problem. The cocktail party describes a complicated scenario in which multiple talkers speak simultaneously in the presence of background noise and reverberation. Humans can easily attend to a target source of interest and recognize its speech in such conditions, but the mechanism behind this strong capability is not yet well understood. Over the past few decades, researchers have tried to develop algorithms that let machines mimic this human capability in the cocktail party scenario, but performance is still far from satisfactory. In this talk, we will summarize recent progress and present our efforts on speech processing for the cocktail party problem, especially the new techniques for speech separation and automatic speech recognition (ASR) developed at SJTU. Finally, we will discuss new challenges and potential directions for solving the cocktail party problem.

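Speaker Bio: Yanmin Qian (钱彦旻) is a professor and doctoral supervisor in the Department of Computer Science and Engineering at Shanghai Jiao Tong University. He received his Ph.D. from Tsinghua University and was a postdoctoral researcher in the Department of Engineering at the University of Cambridge, UK. He is a recipient of the National Science Fund for Excellent Young Scholars, the Shanghai Sailing Program for young talents, and the First Prize of the Wu Wenjun AI Natural Science Award (as first contributor). He is an IEEE Senior Member and an ISCA member, and one of the 13 founding members of the international open-source Kaldi speech recognition toolkit. He serves as an area chair and TPC member for international conferences such as InterSpeech and ISCSLP, and as a reviewer for journals and conferences including IEEE T-ASLP, IEEE J-STSP, IEEE SPL, ICASSP, and InterSpeech. He has more than 10 years of research and industrialization experience in intelligent speech and language processing, human-computer interaction, pattern recognition, and machine learning. He has published more than 200 papers in leading international journals and conferences in the field, with over 10,000 Google Scholar citations, has filed more than 60 patents in China and the US, and has co-authored and translated several foreign-language books. He has received best paper awards from leading international journals and conferences in the field three times, and has led teams to first place in international evaluations three times. As principal investigator or key participant, he has taken part in projects funded by the National Natural Science Foundation of China, the national brain science program, the National Key R&D Program, the National 863 Program, and the UK EPSRC, among others. His current research interests include speech recognition, speaker and language recognition, speech denoising and separation, speech emotion perception, natural language understanding, deep learning modeling, and multimedia signal processing.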