Speech Enhancement Papers and Related Code: A Compilation
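Many years ago, the main approaches to speech enhancement were traditional ones, such as model-based and filter-based methods. Most of these have been studied thoroughly by earlier researchers and are quite mature, and they are still the methods commonly used in industry for front-end denoising. The noise suppression in the classic WebRTC, for example, currently uses a Wiener-filter-based algorithm. If you are interested, take a look at the C code of WebRTC's audio processing; I read through it recently and it made my head spin.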
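As a rough illustration of the Wiener-filter idea mentioned above, the textbook per-bin gain is G = SNR / (1 + SNR). The sketch below is not WebRTC's actual code; the function name `wiener_gain`, the power-subtraction SNR estimate, and the `gain_floor` parameter are assumptions made for this example.

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, gain_floor=0.1):
    """Textbook Wiener gain per frequency bin (illustrative, not WebRTC's code)."""
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    # Assumption: estimate the speech power by simple power subtraction.
    speech_power = np.maximum(noisy_power - noise_power, 0.0)
    snr = speech_power / np.maximum(noise_power, 1e-12)
    gain = snr / (1.0 + snr)
    # A gain floor limits how strongly any bin is attenuated.
    return np.maximum(gain, gain_floor)
```

The gain would be multiplied onto the noisy spectrum bin by bin; real systems typically smooth the SNR estimate over time, which this sketch leaves out.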
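Ever since deep neural networks achieved great success in computer vision, researchers in speech enhancement could not resist trying them too. The earliest were probably speech enhancement algorithms based on fully connected deep neural networks, which already gave good results. Since then academics have tried all kinds of networks; an incomplete list includes fully connected networks, convolutional neural networks, fully convolutional networks, dilated (atrous) convolutional networks, recurrent neural networks, LSTM, GRU, generative adversarial networks, Wasserstein GANs, conditional GANs, and more. There are now also speech enhancement methods based on speech synthesis (paper listed below, ICASSP 2019). Although many algorithms have appeared, it will still take some time before they can be used in industry. Almost without exception, these methods damage the speech spectrum to some degree. If you want to use them as a front end for speech recognition to raise recognition accuracy, I advise caution: the experimental results are likely to disappoint you. For a pure noise reduction system (for example headphones, hearing aids, and similar devices), I think they are usable, because listeners generally cannot hear the spectral loss in the audio.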
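Below are the speech enhancement algorithms I have collected, grouped by network type, together with related code.
The most cutting-edge application of deep learning to speech enhancement recently is probably the GAN: the generator serves as the enhancement network, and the discriminator distinguishes clean speech from enhanced speech.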
1. SEGAN: Speech Enhancement Generative Adversarial Network [related code]
2. Speech Enhancement Based on A New Architecture of Wasserstein Generative Adversarial Networks
3. Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification
4. Language and Noise Transfer in Speech Enhancement Generative Adversarial Network
5. Exploring speech enhancement with generative adversarial networks for robust speech recognition
6. Time-Frequency Masking-based Speech Enhancement using Generative Adversarial Network
7. Adversarial Feature-Mapping for Speech Enhancement [Microsoft AI and Research]
8. Sergan: Speech Enhancement Using Relativistic Generative Adversarial Networks with Gradient Penalty [ICASSP 2019] [related code]
9. CP-GAN: Context Pyramid Generative Adversarial Network for Speech Enhancement [ICASSP 2020]
10. PAGAN: A Phase-adapted Generative Adversarial Network for Speech Enhancement [ICASSP 2020] [related code]
11. Tdcgan: Temporal Dilated Convolutional Generative Adversarial Network for End-to-end Speech Enhancement [latest as of 2020/08/18]
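To make the generator and discriminator roles described for the GAN-based methods concrete, here is a minimal sketch of one possible pair of training objectives: a least-squares adversarial loss plus an L1 reconstruction term. It is not taken from any specific paper in the list; the function names and the `l1_weight` value are assumptions, and the discriminator scores are passed in as plain arrays rather than produced by a real network.

```python
import numpy as np

def discriminator_loss(d_clean, d_enhanced):
    """Least-squares loss: push scores for clean speech to 1, enhanced speech to 0."""
    return 0.5 * np.mean((d_clean - 1.0) ** 2) + 0.5 * np.mean(d_enhanced ** 2)

def generator_loss(d_enhanced, enhanced, clean, l1_weight=100.0):
    """The generator (enhancement network) tries to make the discriminator output 1.

    The L1 term keeps the enhanced waveform close to the clean reference;
    l1_weight is an illustrative value, not a recommendation.
    """
    adversarial = 0.5 * np.mean((d_enhanced - 1.0) ** 2)
    reconstruction = np.mean(np.abs(enhanced - clean))
    return adversarial + l1_weight * reconstruction
```

In a real system both losses would be minimized alternately with an autodiff framework; the NumPy version only shows what each side is optimizing.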
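On the convolutional neural network side, there are methods based on fully convolutional networks and on redundant convolutional networks, processing speech either in the time domain or in the frequency domain. Paper links follow: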
1. Single channel speech enhancement using convolutional neural network [related code]
2. A Fully Convolutional Neural Network for Speech Enhancement
3. Raw Waveform-based Speech Enhancement by Fully Convolutional Networks
4. Speech Denoising with Deep Feature Losses [related code]
5. A New Framework for Supervised Speech Enhancement in the Time Domain
6. A Wavenet for Speech Denoising [related code]
7. Fully Convolutional Recurrent Network for Speech Enhancement [ICASSP 2020]
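DNN-based methods mainly process speech in the frequency domain: the short-time spectrum is obtained with the short-time Fourier transform (STFT), the spectrum is processed, and the enhanced speech is reconstructed using the phase of the noisy speech. A smaller number of approaches combine DNNs with traditional speech enhancement methods, replacing the features in the traditional pipeline with a DNN; that is basically the recipe. Paper links follow: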
1. Speech Enhancement In Multiple-Noise Conditions using Deep Neural Networks
2. NMF-based Speech Enhancement Incorporating Deep Neural Network [related code]
3. A Novel Single Channel Speech Enhancement Based on Joint Deep Neural Network and Wiener Filter
4. An Experimental Study on Speech Enhancement Based on Deep Neural Networks [related code: C++] [related code: Python] [related code: MATLAB]
5. A Regression Approach to Speech Enhancement Based on Deep Neural Networks [related code: see 4]
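The frequency-domain pipeline described for the DNN methods (STFT, process the magnitude, reconstruct with the noisy phase) can be sketched roughly as follows. This is a minimal NumPy illustration, not code from any of the papers; `estimate_magnitude` is a hypothetical stand-in for a trained DNN, and the frame size and hop are arbitrary choices.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Frame the signal with a periodic Hann window and take the FFT of each frame."""
    window = np.hanning(n_fft + 1)[:-1]
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=512, hop=128):
    """Weighted overlap-add inverse of the stft above."""
    window = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    out = np.zeros(n_fft + hop * (len(frames) - 1))
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    # Samples at the very edges have almost no window coverage and are not recovered.
    return out / np.maximum(norm, 1e-8)

def enhance_with_noisy_phase(noisy, estimate_magnitude):
    """Modify only the magnitude; reuse the phase of the noisy input."""
    spec = stft(noisy)
    magnitude, phase = np.abs(spec), np.angle(spec)
    enhanced_magnitude = estimate_magnitude(magnitude)  # hypothetical DNN goes here
    return istft(enhanced_magnitude * np.exp(1j * phase))
```

Passing `lambda m: m` as `estimate_magnitude` returns the input unchanged away from the edges, which is a quick sanity check that the analysis and synthesis steps match.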
Articles on RNN- or LSTM-based speech enhancement:
1. Multiple-target deep learning for LSTM-RNN based speech enhancement [related code]
2. Densely Connected Progressive Learning for LSTM-Based Speech Enhancement (ICASSP 2018) [related code]
ICASSP 2019 saw speech synthesis applied to speech enhancement, which is a novel idea.
1. Speech Denoising by Parametric Resynthesis
To be continuously updated...