Personal Information
English Name: Wei-Qiang Zhang
Pinyin Name: Zhang Wei Qiang
Email:
Office: Room 5-111, Electronic Engineering Building
Phone: 010-62781847
Degree: Ph.D.
Alma Mater: Tsinghua University
Discipline: Signal and Information Processing
Research Projects
2023–2026: NSFC General Program "Anomalous Sound Detection Based on Self-Supervised Pre-trained Models", Principal Investigator.
2021–2023: MIIT Artificial Intelligence Industry Innovation open-competition project "AI Training Resource Library", Subproject Leader.
2019–2022: NSFC Joint Key Program "Speaker Recognition and Keyword Search in Speech Data under Complex Environments", Principal Investigator.
2019–2023: National Key R&D Program subproject "Analysis Based on Speech Information", Subproject Leader.
2019–2021: Ministry of Education project "Research on AI Security Applications and Key Technologies for AI Security Protection", Subproject Leader.
2014–2017: NSFC General Program "Speaker Recognition under Noisy and Short-Utterance Conditions", Principal Investigator.
2011–2013: NSFC Young Scientists Program "Spam Filtering and Data Selection Methods for Massive Speech Information Processing", Principal Investigator.
2010–2013: NSFC Major Research Plan key support project "Speech Separation, Content Analysis and Understanding in Multi-Party Conversations", Participant.
2009–2011: National 863 Program key project "Companion Robot", Participant.
Journal Papers
N. Si, H. Zhang, W. Zhang, W.-Q. Zhang, H. Chang, and D. Qu, “Gradient-aware knowledge distillation: Tackling gradient insensitivity through teacher guided gradient scaling,” Neural Networks, vol. 195, Art. no. 108229, Mar. 2026. doi: 10.1016/j.neunet.2025.108229.
W.-Q. Zhang, “Accelerating cross-correlation for long sequences with short lag constraints: An optimized block-wise approach,” Digital Signal Processing, vol. 168, Art. no. 105509, Jan. 2026. doi: 10.1016/j.dsp.2025.105509.
B. Han, A. Jiang, X. Zheng, W.-Q. Zhang, J. Liu, P. Fan, and Y. Qian, “Exploring self-supervised audio models for generalized anomalous sound detection,” IEEE Transactions on Audio, Speech and Language Processing, vol. 33, pp. 4126-4141, 2025. doi: 10.1109/TASLPRO.2025.3606200.
J. Du, J. Li, G. Chen, and W.-Q. Zhang, “SpeechColab leaderboard: An open-source platform for automatic speech recognition evaluation,” Computer Speech & Language, vol. 94, Art. no. 101805, Nov. 2025. doi: 10.1016/j.csl.2025.101805.
Y.-F. Shao, F. Guo, P. Jiang, W. Li, and W.-Q. Zhang, “Damage detection and classification of carbon fiber-reinforced polymer composite materials based on acoustic emission and convolutional recurrent neural network,” Structural Health Monitoring, vol. 24, no. 6, pp. 3344-3362, Nov. 2025. doi: 10.1177/14759217241270883.
H. Wang and W.-Q. Zhang, “Unstructured pruning and low rank factorisation of self-supervised pre-trained speech models,” IEEE Journal of Selected Topics in Signal Processing, vol. 18, no. 6, pp. 1046–1058, Sept. 2024. doi: 10.1109/JSTSP.2024.3433616.
Y.-F. Shao, P. Jiang, Y. Dong, W. Li, and W.-Q. Zhang, “AE-IRMamba: Low complexity inverted residual Mamba for identification of piezoelectric ceramic and optical fiber acoustic emission sensors signals,” IEEE Sensors Journal, vol. 24, no. 21, pp. 34549–34560, Nov. 2024. doi: 10.1109/JSEN.2024.3457913.
H. Zhang, N. Si, Y. Chen, X. Yang, D. Qu, and W.-Q. Zhang, “Improving speech translation by cross-modal multi-grained contrastive learning,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 1075–1086, Feb. 2023. doi: 10.1109/TASLP.2023.3244521.
X. Chen, J. Wang, X.-L. Zhang, W.-Q. Zhang, and K. Yang, “LMD: A learnable mask network to detect adversarial examples for speaker verification,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 2476–2490, Jun. 2023. doi: 10.1109/TASLP.2023.3288417.
J. Yao, X. Chen, X.-L. Zhang, W.-Q. Zhang, and K. Yang, “Symmetric saliency-based adversarial attack to speaker identification,” IEEE Signal Processing Letters, vol. 30, 2023. doi: 10.1109/LSP.2023.3236509.
Y. Qin, L. Sun, H. Chen, W. Yang, W.-Q. Zhang, J. Fei, and G. Wang, “MVKT-ECG: Efficient single-lead ECG classification for multi-label arrhythmia by multi-view knowledge transferring,” Computers in Biology and Medicine, vol. 166, Art. no. 107503, Sept. 2023. doi: 10.1016/j.compbiomed.2023.107503.
C. Wu, F. Wu, T. Qi, W.-Q. Zhang, X. Xie, and Y. Huang, “Removing AI’s sentiment manipulation of personalized news delivery,” Humanities & Social Sciences Communications, vol. 9, Art. no. 459, 2022. doi: 10.1057/s41599-022-01473-1.
J. Zhao and W.-Q. Zhang, “Improving automatic speech recognition performance for low-resource languages with self-supervised models,” IEEE Journal of Selected Topics in Signal Processing, vol. 16, no. 6, pp. 1227–1241, Oct. 2022. doi: 10.1109/JSTSP.2022.3184480.
Z. Zhao and W.-Q. Zhang, “End-to-end keyword search system based on attention mechanism and energy scorer for low resource languages,” Neural Networks, vol. 139, pp. 326-334, Jul. 2021. doi: 10.1016/j.neunet.2021.04.002.
C. Lu, Y. Liu, W.-Q. Zhang and S. Zhang, “Tightness of a new and enhanced semidefinite relaxation for MIMO detection,” SIAM Journal on Optimization, vol. 29, no. 1, pp. 719-742, Jan. 2019. doi: 10.1137/17M115075X.
J. Kang, W.-Q. Zhang, W.-W. Liu, J. Liu, and M. T. Johnson, “Lattice based transcription loss for end-to-end speech recognition,” Journal of Signal Processing Systems, vol. 90, no. 7, pp. 1013-1023, Sept. 2018. doi: 10.1007/s11265-017-1292-0.
C. Lu, Z. Deng, W.-Q. Zhang, and S.-C. Fang, “Argument division based branch-and-bound algorithm for unit-modulus constrained complex quadratic programming,” Journal of Global Optimization, vol. 70, no. 1, pp. 171-187, Jan. 2018. doi: 10.1007/s10898-017-0551-8.
X.-K. Yang, L. He, D. Qu, and W.-Q. Zhang, “Semi-supervised minimum redundancy maximum relevance feature selection for audio classification,” Multimedia Tools and Applications, vol. 77, pp. 713-739, Jan. 2018. doi: 10.1007/s11042-016-4287-0.
X. Yang, D. Qu, W.-L. Zhang, and W.-Q. Zhang, “An adapted data selection for deep learning-based audio segmentation in multi-genre broadcast channel,” Digital Signal Processing, vol. 81, pp. 8-15, Oct. 2018. doi: 10.1016/j.dsp.2018.03.004.
W.-W. Liu, M. Cai, W.-Q. Zhang, J. Liu, and M. T. Johnson, “Discriminative boosting algorithm for diversified front-end phonotactic language recognition,” Journal of Signal Processing Systems, vol. 82, no. 2, pp. 229-239, Feb. 2016. doi: 10.1007/s11265-015-1017-1.
W.-Q. Zhang, “Fast Doppler rate estimation based on fourth-order moment spectrum,” Electronics Letters, vol. 51, no. 23, pp. 1926–1928, Nov. 2015. doi: 10.1049/el.2015.2182.
Z.-Y. Li, W.-Q. Zhang, and J. Liu, “Multi-resolution time frequency feature and complementary combination for short utterance speaker recognition,” Multimedia Tools and Applications, vol. 74, pp. 937-953, Feb. 2015. doi: 10.1007/s11042-013-1705-4.
W.-Q. Zhang, W.-W. Liu, Z.-Y. Li, Y.-Z. Shi, and J. Liu, “Spoken language recognition based on gap-weighted subsequence kernels,” Speech Communication, vol. 60, pp. 1-12, May 2014. doi: 10.1016/j.specom.2014.01.005.
Y.-Z. Shi, W.-Q. Zhang, J. Liu, and M. T. Johnson, “Efficient one-pass decoding with NNLM for speech recognition,” IEEE Signal Processing Letters, vol. 21, no. 4, pp. 377-381, Apr. 2014. doi: 10.1109/LSP.2014.2303136.
W.-L. Zhang, D. Qu, W.-Q. Zhang, and B.-C. Li, “Rapid speaker adaptation using compressive sensing,” Speech Communication, vol. 55, no. 10, pp. 950-963, Nov.-Dec. 2013. doi: 10.1016/j.specom.2013.06.012.
W.-L. Zhang, W.-Q. Zhang, B.-C. Li, D. Qu, and M. T. Johnson, “Bayesian speaker adaptation based on a new hierarchical probabilistic model,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 7, pp. 2002-2015, Sept. 2012. doi: 10.1109/TASL.2012.2193390.
W.-Q. Zhang, L. He, Y. Deng, J. Liu, and M. T. Johnson, “Time-frequency cepstral feature and constrained heteroscedastic linear discriminant analysis for language recognition,” IEEE Transactions on Audio, Speech and Language Processing, vol. 19, no. 2, pp. 266-272, Feb. 2011. doi: 10.1109/TASL.2010.2047680.
W.-Q. Zhang, T. Hou, and J. Liu, “Discriminative score fusion for language identification,” Chinese Journal of Electronics, vol. 19, no. 1, pp. 124–128, Jan. 2010. doi: 10.23919/CJE.2010.10159256.
W.-Q. Zhang and J. Liu, “An equalized heteroscedastic linear discriminant analysis algorithm,” IEEE Signal Processing Letters, vol. 15, pp. 585-588, 2008. doi: 10.1109/LSP.2008.2001561.
R. Tao, W.-Q. Zhang, and E.-Q. Chen, “Two-stage method for joint time delay and Doppler shift estimation,” IET Radar, Sonar & Navigation, vol. 2, no. 1, pp. 71-77, Feb. 2008. doi: 10.1049/iet-rsn:20060014.
R. Tao, B. Deng, W.-Q. Zhang, and Y. Wang, “Sampling and sampling rate conversion of band limited signals in the fractional Fourier transform domain,” IEEE Transactions on Signal Processing, vol. 56, no. 1, pp. 158-171, Jan. 2008. doi: 10.1109/TSP.2007.901666.
Conference Papers
Z. Weng, D. Shen, T. Liu, G. Chen, R. Shi, J. Chen, C. Ding, W.-Q. Zhang, and Z. Chen, “VocalRep: Structure-aware vocal representations for multimodal generation,” to be published in Proc. ACL, 2026.
S. Yan, Y. Chen, R. Zhou, Z. Yao, S. Chen, T. Zhang, S. Zhang, W.-Q. Zhang, Y. Huang, H. Duan, and Y. Zhang, “Explore-on-Graph: Incentivizing autonomous exploration of large language models on knowledge graphs with path-refined reward modeling,” to be published in Proc. ICLR, 2026.
G. Lin, Z. Chen, Y. Fu, K. Li, and W.-Q. Zhang, “Enhancing multilingual LLM-based ASR with mixture of experts and dynamic downsampling,” to be published in Proc. ICASSP, 2026.
W. Liang, Y. Qiu, A. Jiang, B. Han, T. Liu, X. Zheng, P. Fan, C. Lu, J. Liu, and W.-Q. Zhang, “RefGen: Reference-guided synthetic data generation for anomalous sound detection,” to be published in Proc. ICASSP, 2026.
J. Fan, W. Liang, and W.-Q. Zhang, “SARNet: A spike-aware consecutive validation framework for accurate remaining useful life prediction,” to be published in Proc. ICASSP, 2026.
R. Bao, H. Ma, S. Liu, C. Gong, C. Zhang, X.-L. Zhang, W.-Q. Zhang, and X. Li, “ALMA-Chor: Leveraging audio-lyric alignment with mamba for chorus detection,” to be published in Proc. ICASSP, 2026.
Y. Yang, Z. Song, J. Zhuo, M. Cui, J. Li, B. Yang, Y. Du, Z. Ma, X. Liu, Z. Wang, K. Li, S. Fan, K. Yu, W.-Q. Zhang, G. Chen, and X. Chen, “GigaSpeech 2: An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement,” in Proc. ACL, 2025, pp. 2673–2686. doi: 10.18653/v1/2025.acl-long.135.
Y. Pu and W.-Q. Zhang, “Integrating pause information with word embeddings in language models for Alzheimer’s disease detection from spontaneous speech,” in Proc. ICASSP, 2025. doi: 10.1109/ICASSP49660.2025.10888563.
Z. Wan, Z. Qiu, Y. Liu, and W.-Q. Zhang, “Metadata-enhanced speech emotion recognition: Augmented residual integration and co-attention in two-stage fine-tuning,” in Proc. ICASSP, 2025. doi: 10.1109/ICASSP49660.2025.10890812.
Z. Chen, Y.-F. Shao, Y. Ma, M. Wei, L. Zhang, and W.-Q. Zhang, “Improving acoustic scene classification in low-resource conditions,” in Proc. ICASSP, 2025. doi: 10.1109/ICASSP49660.2025.10888928.
A. Jiang, X. Zheng, B. Han, Y. Qiu, P. Fan, W.-Q. Zhang, L. Cheng, and J. Liu, “Adaptive prototype learning for anomalous sound detection with partially known attributes,” in Proc. ICASSP, 2025. doi: 10.1109/ICASSP49660.2025.10889514.
B. Han, W. Huang, Z. Chen, A. Jiang, P. Fan, L. Cheng, Z. Lv, J. Liu, W.-Q. Zhang, and Y. Qian, “Data-efficient low-complexity acoustic scene classification via distilling and progressive pruning,” in Proc. ICASSP, 2025. doi: 10.1109/ICASSP49660.2025.10890296.
K. Pang, M. Bai, J. Yang, W.-Q. Zhang, M. Jiang, and Y. Huang, “Winstega: An adaptive robust enhancement framework for generative linguistic steganography,” in Proc. ICASSP, 2025. doi: 10.1109/ICASSP49660.2025.10888944.
K. Jia, J. Li, K. Li, and W.-Q. Zhang, “Whisper-based multilingual Alzheimer’s disease detection and improvements for low-resource language,” in Proc. Interspeech, 2025, pp. 549-553. doi: 10.21437/Interspeech.2025-1118.
Q. Sun, Z. Qiu, Y. Pu, J. Li, X. Chen, and W.-Q. Zhang, “PPGs-BERT: Leveraging phoneme sequence and BERT for Alzheimer’s disease detection from spontaneous speech,” in Proc. Interspeech, 2025, pp. 554-558. doi: 10.21437/Interspeech.2025-489.
Y. Pu, X. Liu, G. Zhang, Z. Yan, W.-Q. Zhang, and X. Chen, “Empowering large language models for end-to-end speech translation leveraging synthetic data,” in Proc. Interspeech, 2025, pp. 26-30. doi: 10.21437/Interspeech.2025-2341.
W. Liang, R. Zhang, X. Zhang, Y. Ma, and W.-Q. Zhang, “DepressGEN: Synthetic data generation framework for depression detection,” in Proc. Interspeech, 2025, pp. 464-468. doi: 10.21437/Interspeech.2025-280.
B. Han, Z. Lv, A. Jiang, W. Huang, Z. Chen, Y. Deng, J. Ding, C. Lu, W.-Q. Zhang, P. Fan, J. Liu, and Y. Qian, “Exploring large scale pre-trained models for robust machine anomalous sound detection,” in Proc. ICASSP, 2024, pp. 1327–1330. doi: 10.1109/ICASSP48485.2024.10447183.
J. Li and W.-Q. Zhang, “Whisper-based transfer learning for Alzheimer disease classification: Leveraging speech segments with full transcripts as prompts,” in Proc. ICASSP, 2024, pp. 11211–11215. doi: 10.1109/ICASSP48485.2024.10448004.
H. Wang, G. Hu, G. Lin, W.-Q. Zhang, and J. Li, “Simul-Whisper: Attention-guided streaming Whisper with truncation detection,” in Proc. Interspeech, 2024, pp. 4483–4487. doi: 10.21437/Interspeech.2024-1814.
J. Li, Y. Pu, Q. Sun, and W.-Q. Zhang, “Improving Whisper’s recognition performance for under-represented language Kazakh leveraging unpaired speech and text,” in Proc. Interspeech, 2024, pp. 2514–2518. doi: 10.21437/Interspeech.2024-1790.
A. Jiang, B. Han, Z. Lv, Y. Deng, W.-Q. Zhang, X. Chen, Y. Qian, J. Liu, and P. Fan, “AnoPatch: Towards better consistency in machine anomalous sound detection,” in Proc. Interspeech, 2024, pp. 107–111. doi: 10.21437/Interspeech.2024-1761.
X. Zheng, A. Jiang, B. Han, Y. Qian, P. Fan, J. Liu, and W.-Q. Zhang, “Improving anomalous sound detection via low-rank adaptation fine-tuning of pre-trained audio models,” in Proc. SLT, 2024, pp. 979–984. doi: 10.1109/SLT61566.2024.10832335.
A. Jiang, Y. Shi, P. Fan, W.-Q. Zhang, and J. Liu, “CoopASD: Cooperative machine anomalous sound detection with privacy concerns,” in Proc. GLOBECOM, 2024, pp. 346–351. doi: 10.1109/GLOBECOM52923.2024.10901774.
X. Chen, Y. Pu, J. Li, and W.-Q. Zhang, “Cross-lingual Alzheimer’s disease detection based on paralinguistic and pre-trained features,” in Proc. ICASSP, 2023. doi: 10.1109/ICASSP49357.2023.10095522.
A. Jiang, W.-Q. Zhang, Y. Deng, P. Fan, and J. Liu, “Unsupervised anomaly detection and localization of machine audio: A GAN-based approach,” in Proc. ICASSP, 2023. doi: 10.1109/ICASSP49357.2023.10096813.
H. Wang, S. Wang, W.-Q. Zhang, and J. Bai, “DistilXLSR: A light weight cross-lingual speech representation model,” in Proc. Interspeech, 2023, pp. 2273–2277. doi: 10.21437/Interspeech.2023-1444.
H. Wang, S. Wang, W.-Q. Zhang, H. Suo, and Y. Wan, “Task-agnostic structured pruning of speech representation models,” in Proc. Interspeech, 2023, pp. 231–235. doi: 10.21437/Interspeech.2023-1442.
Z. Cui, W. Wu, C. Zhang, W.-Q. Zhang, and J. Wu, “Transferring speech-generic and depression-specific knowledge for Alzheimer’s disease detection,” in Proc. ASRU, 2023. doi: 10.1109/ASRU57964.2023.10389785.
Y. Wang, C. Tang, Z. Ma, Z. Zheng, X. Chen, and W.-Q. Zhang, “Exploring effective distillation of self-supervised speech models for automatic speech recognition,” in Proc. ASRU, 2023. doi: 10.1109/ASRU57964.2023.10389746.
Q. Hou, A. Jiang, W.-Q. Zhang, P. Fan, and J. Liu, “Decoupling detectors for scalable anomaly detection in AIoT systems with multiple machines,” in Proc. GLOBECOM, 2023, pp. 5943–5948. doi: 10.1109/GLOBECOM54140.2023.10436800.
J. Zhao, H. Wang, J. Li, S. Chai, G. Wang, G. Chen, and W.-Q. Zhang, “The THUEE system description for the IARPA OpenASR21 challenge,” in Proc. Interspeech, 2022. doi: 10.21437/Interspeech.2022-269.
J. Zhao, G. Shi, G.-B. Wang, and W.-Q. Zhang, “Automatic speech recognition for low-resource languages: The THUEE systems for the IARPA OpenASR20 evaluation,” in Proc. ASRU, 2021, pp. 335–341. doi: 10.1109/ASRU51503.2021.9688260.
L. Xue, K. Song, D. Wu, X. Tan, N. L. Zhang, T. Qin, W.-Q. Zhang, and T.-Y. Liu, “DeepRapper: Neural rap generation with rhyme and rhythm modeling,” in Proc. ACL, 2021, pp. 69-81. doi: 10.18653/v1/2021.acl-long.6.
G. Chen, S. Chai, G. Wang, J. Du, W.-Q. Zhang, C. Weng, D. Su, D. Povey, J. Trmal, J. Zhang, M. Jin, S. Khudanpur, S. Watanabe, S. Zhao, W. Zou, X. Li, X. Yao, Y. Wang, Y. Wang, Z. You, and Z. Yan, “GigaSpeech: An evolving, multi-domain ASR corpus with 10,000 hours of transcribed audio,” in Proc. Interspeech, 2021, pp. 3670-3674. doi: 10.21437/Interspeech.2021-1965.
J. Zhao, Z. Lv, A. Han, G. Wang, G. Shi, J. Kang, J. Yan, P. Hu, S. Huang, and W.-Q. Zhang, “The TNT team system descriptions of Cantonese and Mongolian for IARPA OpenASR20,” in Proc. Interspeech, 2021, pp. 4344-4348. doi: 10.21437/Interspeech.2021-1063.
H. Yu, J. Zhao, S. Yang, Z. Wu, Y. Nie, and W.-Q. Zhang, “Language recognition based on unsupervised pretrained models,” in Proc. Interspeech, 2021, pp. 3271-3275. doi: 10.21437/Interspeech.2021-807.
Y. Yan, X. Tan, B. Li, G. Zhang, T. Qin, S. Zhao, Y. Shen, W.-Q. Zhang, and T.-Y. Liu, “Adaptive text to speech for spontaneous style,” in Proc. Interspeech, 2021, pp. 4668-4672. doi: 10.21437/Interspeech.2021-584.
K. He, Y. Shen, W.-Q. Zhang, and J. Liu, “Staged training strategy and multi-activation for audio tagging with noisy and sparse multi-label data,” in Proc. ICASSP, 2020, pp. 631-635. doi: 10.1109/ICASSP40776.2020.9053776.
J. Xie, R. Yan, S. Xiao, L. Peng, M. T. Johnson, and W.-Q. Zhang, “Dynamic temporal residual learning for speech recognition,” in Proc. ICASSP, 2020, pp. 7709-7713. doi: 10.1109/ICASSP40776.2020.9054653.
Z. Zhao and W.-Q. Zhang, “End-to-end keyword search based on attention and energy scorer for low resource languages,” in Proc. Interspeech, 2020, pp. 2587-2591. doi: 10.21437/Interspeech.2020-2613.
R. Li, T. Liang, D. Song, Y. Liu, Y. Wu, C. Xu, P. Ouyang, X. Zhang, X. Chen, W.-Q. Zhang, S. Yin, and L. He, “THUEE system for NIST SRE19 CTS challenge,” in Proc. Interspeech, 2020, pp. 2232-2236. doi: 10.21437/Interspeech.2020-1245.
Z. Li, L. He, J. Li, L. Wang, and W.-Q. Zhang, “Towards discriminative representations and unbiased predictions: Class-specific angular softmax for speech emotion recognition,” in Proc. Interspeech, 2019, pp. 1696-1700. doi: 10.21437/Interspeech.2019-1683.
K. He, Y. Shen, and W.-Q. Zhang, “Hierarchical pooling structure for weakly labeled sound event detection,” in Proc. Interspeech, 2019, pp. 3624-3628. doi: 10.21437/Interspeech.2019-2049.
H. Yang and W.-Q. Zhang, “Music genre classification using duplicated convolutional layers in neural networks,” in Proc. Interspeech, 2019, pp. 3382-3386. doi: 10.21437/Interspeech.2019-1298.
Y. Shen, K. He, and W.-Q. Zhang, “Learning how to listen: A temporal-frequential attention model for sound event detection,” in Proc. Interspeech, 2019, pp. 2563-2567. doi: 10.21437/Interspeech.2019-2045.
J. Kang, W.-Q. Zhang, and J. Liu, “Gated convolutional networks based hybrid acoustic models for low resource speech recognition,” in Proc. ASRU, 2017, pp. 157-164. doi: 10.1109/ASRU.2017.8268930.
Z.-Q. Lv, J. Kang, W.-Q. Zhang, and J. Liu, “An LSTM-CTC based verification system for proxy-word based OOV keyword search,” in Proc. ICASSP, 2017, pp. 5655-5659. doi: 10.1109/ICASSP.2017.7953239.
Y. Tian, L. He, M. Cai, W.-Q. Zhang, and J. Liu, “Deep neural networks based speaker modeling at different levels of phonetic granularity,” in Proc. ICASSP, 2017, pp. 5440-5444. doi: 10.1109/ICASSP.2017.7953196.
X.-K. Yang, D. Qu, W.-L. Zhang, and W.-Q. Zhang, “The NDSC transcription system for the 2016 multi-genre broadcast challenge,” in Proc. SLT, 2016, pp. 273-278. doi: 10.1109/SLT.2016.7846276.
Z.-Q. Lv, M. Cai, W.-Q. Zhang, and J. Liu, “A novel discriminative score calibration method for keyword search,” in Proc. Interspeech, 2016, pp. 745-749. doi: 10.21437/Interspeech.2016-606.
Y. Tian, M. Cai, H. Liang, W.-Q. Zhang, and J. Liu, “Improving deep neural networks based speaker verification using unlabeled data,” in Proc. Interspeech, 2016, pp. 1863-1867. doi: 10.21437/Interspeech.2016-614.
Z.-Q. Lv, M. Cai, C. Lu, J. Kang, L.-K. Hui, W.-Q. Zhang, and J. Liu, “Improved system fusion for keyword search,” in Proc. ASRU, 2015, pp. 231-236. doi: 10.1109/ASRU.2015.7404799.
M. Cai, Z.-Q. Lv, B.-L. Song, Y.-Z. Shi, W.-L. Wu, C. Lu, W.-Q. Zhang, and J. Liu, “The THUEE system for the OpenKWS14 keyword search evaluation,” in Proc. ICASSP, 2015, pp. 4734-4738. doi: 10.1109/ICASSP.2015.7178869.
J. Kang, C. Lu, M. Cai, W.-Q. Zhang, and J. Liu, “Neuron sparseness versus connection sparseness in deep neural network for large vocabulary speech recognition,” in Proc. ICASSP, 2015, pp. 4954-4958. doi: 10.1109/ICASSP.2015.7178913.
Y.-Z. Shi, W.-Q. Zhang, M. Cai, and J. Liu, “Variance regularization of RNNLM for speech recognition,” in Proc. ICASSP, 2014, pp. 4931-4935. doi: 10.1109/ICASSP.2014.6854532.
W.-W. Liu, W.-Q. Zhang, Y.-Z. Shi, A. Ji, J. Xu, and J. Liu, “Improved phonotactic language recognition based on RNN feature reconstruction,” in Proc. ICASSP, 2014, pp. 5359-5363. doi: 10.1109/ICASSP.2014.6854619.
W.-W. Liu, W.-Q. Zhang, and J. Liu, “Phonotactic language identification based on time-gap-weighted lattice kernels,” in Proc. Interspeech, 2014, pp. 3022-3026. doi: 10.21437/Interspeech.2014-606.
W.-L. Zhang, D. Qu, W.-Q. Zhang, and B.-C. Li, “Speaker adaptation based on sparse and low-rank eigenphone matrix estimation,” in Proc. Interspeech, 2014, pp. 2792-2796. doi: 10.21437/Interspeech.2014-496.
Z.-Y. Li, W.-Q. Zhang, W.-W. Liu, Y. Tian, and J. Liu, “Text-independent speaker verification via state alignment,” in Proc. Odyssey, 2014, pp. 68–72. doi: 10.21437/Odyssey.2014-10.
Y.-Z. Shi, W.-Q. Zhang, M. Cai, and J. Liu, “Temporal kernel neural network language model,” in Proc. ICASSP, 2013, pp. 8247-8251. doi: 10.1109/ICASSP.2013.6639273.
W.-Q. Zhang, Z.-Y. Li, W. Liu, and J. Liu, “THU-EE system fusion for the NIST 2012 speaker recognition evaluation,” in Proc. Interspeech, 2013, pp. 2474-2478. doi: 10.21437/Interspeech.2013-413.
W. Liu, W.-Q. Zhang, Z.-Y. Li, and J. Liu, “Parallel absolute-relative feature based phonotactic language recognition,” in Proc. Interspeech, 2013, pp. 59-63. doi: 10.21437/Interspeech.2013-38.
W.-L. Zhang, W.-Q. Zhang, and B.-C. Li, “Compact acoustic modeling based on acoustic manifold using a mixture of factor analyzers,” in Proc. ASRU, 2013, pp. 37-42. doi: 10.1109/ASRU.2013.6707702.
Z.-Y. Li, W.-Q. Zhang, L. He, and J. Liu, “Complementary combination in i-vector level for language recognition,” in Proc. Odyssey, 2012, pp. 334-337. Available: https://www.isca-archive.org/odyssey_2012/li12_odyssey.html
Y.-Z. Shi, W.-Q. Zhang, and J. Liu, “Robust audio fingerprinting based on local spectral luminance maxima scheme,” in Proc. Interspeech, 2011, pp. 2485-2488. doi: 10.21437/Interspeech.2011-636.
W.-L. Zhang, W.-Q. Zhang, and B.-C. Li, “Speaker adaptation based on speaker-dependent eigenphone estimation,” in Proc. ASRU, 2011, pp. 48-52. doi: 10.1109/ASRU.2011.6163904.
W.-Q. Zhang, Y. Deng, L. He, and J. Liu, “Variant time-frequency cepstral features for speaker recognition,” in Proc. Interspeech, 2010, pp. 2122-2125. doi: 10.21437/Interspeech.2010-160.
S. Meng, W.-Q. Zhang, and J. Liu, “Combining Chinese spoken term detection systems via side-information conditioned linear logistic regression,” in Proc. Interspeech, 2010, pp. 685-688. doi: 10.21437/Interspeech.2010-260.
J. Yang, J. Liu, and W.-Q. Zhang, “A fast query by humming system based on notes,” in Proc. Interspeech, 2010, pp. 2898-2901. doi: 10.21437/Interspeech.2010-753.
W.-Q. Zhang, Y. Shan, and J. Liu, “Multiple background models for speaker verification,” in Proc. Odyssey, 2010, pp. 47-51. Available: https://www.isca-archive.org/odyssey_2010/zhang10_odyssey.html
W.-Q. Zhang and J. Liu, “Two-stage method for specific audio retrieval,” in Proc. ICASSP, 2007, pp. IV-85-88. doi: 10.1109/ICASSP.2007.367169.

