
Basic Information

  • Gender:
  • Appointed Technical Position: Associate Researcher
  • Education: Graduate education
  • Phone:
  • Email: yanhua@shnu.edu.cn
  • Mailing Address: 100 Guilin Road, Xuhui District, Shanghai
  • Department: College of Information, Mechanical and Electrical Engineering
  • Degree: Ph.D. in Engineering
  • Alma Mater: University of Science and Technology of China
  • Office Address: College of Information, Mechanical and Electrical Engineering, Shanghai Normal University

Research Interests


Artificial intelligence, big data mining, intelligent speech information processing, and machine learning, including:

(1) End-to-end deep learning architectures and Transformer-based acoustic modeling algorithms (see the illustrative sketch after this list)

(2) Acoustic modeling for large-vocabulary continuous speech recognition (LVCSR), voice wake-up and speech recognition technologies

(3) Voiceprint recognition: anti-spoofing speaker verification, end-to-end speaker recognition, etc.

(4) Noise robustness of far-field speech signals based on deep learning, e.g., speech enhancement and microphone array signal processing

(5) Detection of special acoustic events
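
As a concrete illustration of direction (1), the sketch below shows how a Transformer-based acoustic encoder for speech recognition can be assembled in PyTorch. It is a minimal, hypothetical example: the class name `ToyAcousticEncoder` and all hyperparameter values are assumptions for illustration only and are not taken from the SHNU systems described on this page.

```python
# Minimal sketch of a Transformer-based acoustic encoder for ASR (illustration only;
# model name and hyperparameters are hypothetical, not from the systems listed below).
import torch
import torch.nn as nn


class ToyAcousticEncoder(nn.Module):
    def __init__(self, feat_dim=80, d_model=256, nhead=4, num_layers=6, vocab_size=5000):
        super().__init__()
        # Project frame-level acoustic features (e.g., 80-dim filterbanks) to the model dimension.
        self.input_proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           dim_feedforward=1024, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # Per-frame distribution over output tokens (e.g., for a CTC-style objective).
        self.output_proj = nn.Linear(d_model, vocab_size)

    def forward(self, feats):
        # feats: (batch, time, feat_dim) acoustic feature sequences.
        x = self.input_proj(feats)
        x = self.encoder(x)
        return self.output_proj(x)


if __name__ == "__main__":
    model = ToyAcousticEncoder()
    dummy = torch.randn(2, 100, 80)   # 2 utterances, 100 frames, 80-dim features
    print(model(dummy).shape)         # expected: torch.Size([2, 100, 5000])
```

In practice such an encoder is paired with an attention- or CTC-based decoder and trained on transcribed speech; the block above only sketches the overall shape of the model.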


She has been working on artificial intelligence, deep learning, and intelligent speech information processing since 2006, and has led and participated in projects funded by the National Natural Science Foundation of China (NSFC), the 11th Five-Year Plan national defense pre-research program, and the UK EPSRC, among others. She has produced research results in automatic speech recognition, speaker recognition, and machine learning and pattern recognition, and has published a number of papers at ICASSP and INTERSPEECH, the two leading international conferences in the speech field. She also serves as head of the "Shanghai Normal University - Unisound Natural Human-Machine Interaction" joint laboratory, working mainly on intelligent speech-based human-machine interaction.


Education and Work Experience:

  • 2013/06-present, Shanghai Normal University, Department of Electrical and Information Engineering, Associate Researcher

  • 2011/10-2013/04, University of Cambridge, UK, Postdoctoral Researcher (advisor: Phil Woodland)

  • 2009/09-2010/02, Microsoft Research Asia, Speech Group, Intern

  • 2008/07-2009/02, Nanyang Technological University and the Institute for Infocomm Research (I2R), Singapore, HLT Department, exchange student

  • 2006/09-2011/06, University of Science and Technology of China, iFLYTEK Speech Lab, Ph.D. (advisor: Lirong Dai)



International Research Collaboration:

1. Collaboration with Prof. Haizhou Li of the National University of Singapore (NUS), jointly supervising current graduate students


Research Awards and Academic Evaluations:

(1) 2018: Second Prize, Shanghai Outstanding Industry-University-Research Cooperation Project Award.

(2) 2016: In the international Chinese-English mixed speech recognition competition (OC16 Chinese-English MixASR Challenge), the submitted "SHNU" Chinese-English code-switching recognition system ranked 2nd internationally.

(3) July-August 2018: In the 5th CHiME Speech Separation and Recognition Challenge (CHiME-5), the submitted "SHNU" system ranked 9th internationally.

(4) January-February 2019: In the Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2019), the submitted "SHNU" system ranked 13th out of 156 internationally.

(5) September 2019: Students from the joint laboratory took part in the "Multi-channel far-field text-dependent speaker verification - AISHELL Speaker Verification Challenge 2019" competition and ranked 4th out of 50.

(6) 2008 NIST Speaker Recognition Evaluation (SRE): In the core test task, as a key technical member and team leader, her team ranked 1st internationally on both the EER and minDCF metrics and 3rd on DCF, for the best overall result internationally; this achievement was reported by more than 100 media outlets, including the websites of the National Natural Science Foundation of China and the Chinese Academy of Sciences.

(7) 2009 NIST Language Recognition Evaluation: The team ranked 2nd internationally overall on the general language test; in the more challenging 8 dialect-pair tests, its performance on 6 of the pairs far exceeded that of all other participating institutions, ranking 1st internationally overall.

(8) 2010 NIST Speaker Recognition Evaluation: As a key technical member and team leader, her team ranked 2nd internationally on the combined EER, minDCF and DCF metrics.



Research Projects (as Principal Investigator):

(1) National Natural Science Foundation of China project, "Research on Key Acoustic Modeling Techniques for Chinese-English Code-Switching Speech Recognition", ongoing.

(2) Shanghai Young Science and Technology Talent Sailing Program, "Research on Deep-Learning-Based Voiceprint Recognition Methods", completed.

(3) 2018 Alliance Program project, "R&D of Key Techniques for Anti-Spoofing Voiceprint Recognition", completed.

(4) Joint-laboratory industry-funded project, "R&D of Key Techniques for Natural Human-Machine Interaction", completed.

(5) University general research project, "Research on Paralinguistic Information Annotation Algorithms for Speech Recognition", completed.

(6) Industry-funded project (Alliance Program), "Development of Multilingual Code-Switching Speech Recognition", completed.

(7) University industry-university-research project, "Development of a Chinese-English Code-Switching Speech Recognition System for Noisy Environments", completed.


Recent Publications:

Xinyuan Zhou, Grandee Lee, Emre Yilmaz, Yanhua Long, Jiaen Liang and Haizhou Li. "Self-and-Mixed Attention Decoder with Deep Acoustic Structure for Transformer-based LVCSR", Interspeech, 2020, accepted. 


Xinyuan Zhou, Emre Yilmaz, Yanhua Long, Jiaen Liang and Haizhou Li. "Multi-Encoder-Decoder Transformer for code-switching speech recognition", Interspeech, 2020, accepted. 


Yanhua Long, Qiaozheng Zhang, Shuang Wei, Hong Ye and Jichen Yang. "Acoustic data augmentation for Mandarin-English code-switching speech recognition", Applied Acoustics, 2020, online: https://doi.org/10.1016/j.apacoust.2019.107175


Renke He, Yanhua Long, Yijie Li and Jiaen Liang. "Mask-based blind source separation and MVDR beamforming in ASR", International Journal of Speech Technology, 2020, online: http://link.springer.com/article/10.1007/s10772-019-09666-x


Yan Shi, Juanjuan Zhou, Yanhua Long, Yijie Li, Hongwei Mao. "Addressing Text-Dependent Speaker Verification Using Singing Speech". Applied Sciences, 2019, 9(13), 2636. 


Yanhua Long, Shuang Wei, Qiaozheng Zhang, Chunxia Yang. "Large-Scale Semi-Supervised Training in Deep Learning Acoustic Model for ASR". IEEE Access, 2019, 7: 133615-133627.


Zhiming Feng, Qiqi Tong, Yanhua Long, Shuang Wei, Chunxia Yang, Qiaozheng Zhang. "SHNU Anti-spoofing systems for ASVspoof 2019 Challenge", APSIPA 2019, pp.548-552.


Yanhua Long, Yijie Li, Bo Zhang. "Offline to online speaker adaptation for real-time deep neural network based LVCSR systems". Multimedia Tools and Applications, 2018, 77(21): 28101-28119.


Yanhua Long, Renke He. "The SHNU System for the CHiME-5 Challenge". Proc. CHiME 2018 Workshop on Speech Processing in Everyday Environments. 2018: 64-66. 


Yanhua Long, Hong Ye, Yijie Li, Jiaen Liang. "Active Learning for LF-MMI Trained Neural Networks in ASR". Proc. Interspeech, 2018: 2898-2902.


Yan Zhang, Yanhua Long, Xiangrong Shen, et al. "Articulatory movement features for short-duration text-dependent speaker verification". International Journal of Speech Technology, 2017, 20(4): 753-759.


Yanhua Long, Yijie Li, Hong Ye, Hongwei Mao. "Domain adaptation of lattice-free MMI based TDNN models for speech recognition". International Journal of Speech Technology, 2017, 20(1): 171-178.


Yanhua Long, Hong Ye, Jifeng Ni. "Domain Compensation Based on Phonetically Discriminative Features for Speaker Verification", Computer Speech & Language, 2017, 41: 161-179.


Haoran Wei, Yanhua Long, Hongwei Mao. "Improvements on self-adaptive voice activity detector for telephone data", International Journal of Speech Technology, 2016, 19(3): 623-630.


Yanhua Long, Jifeng Ni, Hong Ye. "Speaker channel adaptation method based on deep neural networks" (in Chinese), 2016, 48(2): 151-155.


Yanhua Long, Hong Ye. "Filled Pause Refinement Based on the Pronunciation Probability for Lecture Speech", PLoS ONE, 2015, 10(4): e0123466. doi: 10.1371/journal.pone.0123466


Bo Li, Yanhua Long, Hong Ye. "Outlier Detection and Cluster Center Initialization for K-means Algorithm", Journal of Computational Information Systems, 2015, 11(12): 4333-4342.


Yanhua Long, Lirong Dai. "Speaker verification system using M-vectors and support vector machines" (in Chinese). Journal of Huazhong University of Science and Technology (Natural Science Edition), 2014, 42(8): 63-68.


Yanhua Long, Mark Gales, Pierre Lanchantin, Xunying Liu, Matthew Stephen Seigel, Phil Woodland. “Improving Lightly Supervised Training for Broadcast Transcription”. Interspeech, pp.2187-2191, 2013.


Pierre  Lanchantin, Peter Bell, Mark Gales, Thomas Hain, Xunying Liu, Yanhua Long, Jennifer Quinnell, Steve Renals, et al. “Automatic transcription of multi-genre media archives”. SLAM, pp.26-31, 2013.


P. Bell, M. Gales, P. Lanchantin, X. Liu, Y. Long, S. Renals, P. Swietojanski, P.C. Woodland, “Transcription of Multi-Genre Media Archives Using Out-of-domain data”, SLT, pp.324-329, 2012.


Yanhua Long, Zhi-Jie Yan, Frank K. Soong, Lirong Dai, Wu Guo. "Improvements in Speaker Characterization Using Spectral Subband Energy Based on Harmonic plus Noise Model", pp. 373-376, INTERSPEECH, 2011.


Yanhua Long, Zhi-Jie Yan, Frank K. Soong, Lirong Dai, Wu Guo. "Speaker Characterization using Spectral Subband Energy Ratio based on Harmonic Plus Noise Model", pp. 4520-4523, ICASSP, 2011.


Ying Xu, Yan Song, Yanhua Long, Haibing Zhong, Lirong Dai. "The Description of iFlyTek Speech Lab System for NIST2009 Language Recognition Evaluation", pp. 157-161, ISCSLP, 2010.


Yanhua Long, Lirong Dai, Eryu Wang, Bin Ma, Wu Guo. "Non-negative matrix factorization based discriminative features for speaker verification", pp. 291-295, ISCSLP, 2010.


Wu Guo, Yanhua Long, Eryu Wang, et al. "IFly speech lab 2010 speaker recognition evaluation system description". NIST SRE 2010 system description paper.


Yanhua Long, Lirong Dai, Bin Ma, Wu Guo. "Effects of the Phonological Relevance in Speaker Verification", pp. 2130-2133, INTERSPEECH, 2010.


Wu Guo, Zhao Zhang, Yanhua Long, Lirong Dai. “N-gram Nearest Neighbor Algorithm for Voice Password System”, pp.4438-4441, ICASSP, 2010.


Yanhua Long, Bin Ma, Haizhou Li, Wu Guo, Eng Siong Chng, Lirong Dai. "Exploiting Prosodic Information for Speaker Recognition", pp. 4225-4228, ICASSP, 2009.


Wu Guo, Yanhua Long, Yijie Li, Lei Pan, Eryu Wang, Lirong Dai. "iFLY system for the NIST 2008 speaker recognition evaluation", pp. 4209-4212, ICASSP, 2009.


Yanhua Long, Wu Guo, Bin Ma, Eng Siong Chng, Donglai Zhu, Lirong Dai, Haizhou Li. “Subspace Construction and Selection for Speaker Recognition”, pp.1-4, ICICS, 2009.


Yanhua Long, Wu Guo, Lirong Dai. "A PCA Method Based on Speaker Session Variability", Journal of Pattern Recognition and Artificial Intelligence, 2009, 22(2): 270-274.


Yanhua Long, Wu Guo, Lirong Dai. "To Balance Training Data for SVM Based Speaker Verification", Journal of Chinese Information Processing, pp. 76-80, No. 5, Issue 3, 2008. (Chinese core journal)


Yanhua Long, Wu Guo, Lirong Dai. "An SIPCA-WCCN Method for SVM-based Speaker Verification System", pp. 1295-1299, ICALIP, 2008.


Yanhua Long, Wu Guo, Lirong Dai. "Interfusing the Confused Region Score of Speaker Verification Systems", pp. 1-4, ISCSLP, 2008.


Yanhua Long, Wu Guo, Lirong Dai. "Sequence Kernel for SVM based Speaker verification system", Journal of Tsinghua University (Science and Technology), pp. 688-692, Vol. 48, No. S1, 2008.


Academic Achievements (the following information comes from the university research management system)

Patents
  • [1] Yanhua Long, Hong Ye, Lei Zhou. A folder encryption method. Chinese patent, application pending (application No. 201410784456.1).
  • [2] Yanhua Long, Hong Ye, Haoran Wei. Method for personalized TV voice wake-up using voiceprint and speech recognition. Chinese patent, application pending (application No. 201410840544.9).
  • [3] Haoran Wei, Yanhua Long, Zhimin Feng, Hong Ye, Hongwei Mao. A voice endpoint detection method based on position information. Chinese patent, application pending (application No. 201710624269.0).

Teaching

Course information (academic year, semester, course name):
  • 2020-2021, Semester 1: Linear Algebra
  • 2019-2020, Semester 2: Digital Speech Processing
  • 2018-2019, Semester 2: Digital Speech Processing
  • 2018-2019, Semester 1: Linear Algebra

Honors and Awards


2015-2016: Shanghai Normal University "March 8th Red-Banner Pacesetter" award

2017: Awarded the honorary title of "Outstanding Young Academic Backbone (8th cohort)" by Shanghai Normal University

Professional Service
