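Personal Information
Associate Researcher
English name: Wei-Qiang ZHANG
Pinyin name: zhangweiqiang
Email:
Office: Room 5-111, Electronic Engineering Building
Contact: 010-62781847
Degree: Doctorate
Alma mater: Tsinghua University
Discipline: Signal and Information Processing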
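Speech and Audio Technology Laboratory Graduate Ke-Xin He Receives Tsinghua University's Outstanding Master's Thesis and Outstanding Master's Graduate Awards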
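Ke-Xin He, a graduate of the Speech and Audio Technology Laboratory, has received Tsinghua University's Outstanding Master's Thesis and Outstanding Master's Graduate awards. Ke-Xin He enrolled in 2018 and completed the master's degree in two years. During the master's program, Ke-Xin He earned excellent grades, received the Tsinghua University Comprehensive Scholarship, placed second in the weakly supervised audio tagging task of the 2019 DCASE international challenge on audio event detection and scene classification, and published several papers at top speech conferences such as ICASSP and INTERSPEECH.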
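Ke-Xin He's master's thesis covers audio event detection and audio scene classification, with an emphasis on audio event detection under weakly supervised learning. Here, weakly supervised learning refers mainly to two types of problems: inexact supervision and inaccurate supervision. For inexact supervision, the thesis proposes a multi-level pooling structure for the pooling function used in multiple instance learning, which improves detection performance without changing the number of network parameters. For inaccurate supervision, it proposes a multi-stage training strategy that improves learning from noisy labels and yields clear gains on a large-scale, noisy audio tagging dataset. In addition, the thesis improves the model structure and loss function for audio event detection, addressing the problems of sparse label distribution and blurred label boundaries.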
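To make the idea of a multi-level pooling function concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the structure proposed in the thesis: the function names `pool` and `hierarchical_pool`, the fixed `segment_len`, and the choice of max pooling within segments followed by mean pooling across segments are all hypothetical. It only shows how frame-level class probabilities can be aggregated into a clip-level prediction in two stages without adding any trainable parameters.

```python
import numpy as np


def pool(probs, mode, axis):
    """Aggregate probabilities along `axis` with a parameter-free pooling function."""
    if mode == "max":
        return probs.max(axis=axis)
    if mode == "mean":
        return probs.mean(axis=axis)
    raise ValueError(f"unknown pooling mode: {mode}")


def hierarchical_pool(frame_probs, segment_len, inner="max", outer="mean"):
    """Two-level pooling of frame-level probabilities (illustrative assumption).

    frame_probs: array of shape (num_frames, num_classes).
    Frames are grouped into non-overlapping segments of `segment_len` frames,
    pooled inside each segment, then pooled across segments.
    Trailing frames that do not fill a whole segment are dropped.
    """
    num_frames, num_classes = frame_probs.shape
    usable = (num_frames // segment_len) * segment_len
    if usable == 0:
        raise ValueError("segment_len is larger than the number of frames")
    segments = frame_probs[:usable].reshape(-1, segment_len, num_classes)
    segment_probs = pool(segments, inner, axis=1)   # (num_segments, num_classes)
    return pool(segment_probs, outer, axis=0)       # (num_classes,)


if __name__ == "__main__":
    # Four frames, two sound classes; the values are made up for illustration.
    frame_probs = np.array([[0.1, 0.9],
                            [0.3, 0.8],
                            [0.7, 0.2],
                            [0.5, 0.4]])
    print(hierarchical_pool(frame_probs, segment_len=2))
```

Because both pooling stages are fixed functions, swapping a single-level pooling layer for this kind of two-level one leaves the parameter count of the network unchanged, which is the property the paragraph above describes.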
Academic papers published by Ke-Xin He during the master's program:
K.-X. He, Y. Shen, W.-Q. Zhang, and J. Liu, “Staged training strategy and multi-activation for audio tagging with noisy and sparse multi-label data,” in Proc. ICASSP, Barcelona, Spain, May 2020, pp. 631–635.
doi: 10.1109/ICASSP40776.2020.9053776.
K.-X. He, Y. Shen, and W.-Q. Zhang, “Hierarchical pooling structure for weakly labeled sound event detection,” in Proc. INTERSPEECH, Graz, Austria, Sep. 2019, pp. 3624–3628.
doi: 10.21437/Interspeech.2019-2049.
K.-X. He, Y. Shen, and W.-Q. Zhang, “Multiple neural networks with ensemble method for audio tagging with noisy labels and minimal supervision,” in Proc. DCASE Workshop, New York, USA, Oct. 2019, pp. 89–93.
doi: 10.33682/r7nr-v396.
K.-X. He, W.-Q. Zhang, J. Liu, and Y. Liu, “Dilated-gated convolutional neural network with a new loss function on sound event detection,” in Proc. APSIPA ASC, Lanzhou, China, Nov. 2019, pp. 1491–1495.
doi: 10.1109/APSIPAASC47483.2019.9023308.
Y. Shen, K.-X. He, and W.-Q. Zhang, “Learning how to listen: A temporal-frequential attention model for sound event detection,” in Proc. INTERSPEECH, Graz, Austria, Sep. 2019, pp. 2563–2567.
doi: 10.21437/Interspeech.2019-2045.
Y. Shen, K.-X. He, and W.-Q. Zhang, “SAM-GCNN: A gated convolutional neural network with segment-level attention mechanism for home activity monitoring,” in Proc. ISSPIT, Louisville, Kentucky, USA, Dec. 2018, pp. 679–684.
doi: 10.1109/ISSPIT.2018.8642767.