A Simple Tutorial on Extracting MFCCs with openSMILE (Mac)

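openSMILE is software designed specifically for extracting audio features. Introductions and installation guides are easy to find online, so I will not repeat them here. While working out how to use openSMILE, however, I found very few tutorials on actually using it, so I am writing up my own experience on this blog in the hope that others can avoid the detours I took.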

During installation I kept running into conflicts with Visual Studio, so I tried installing openSMILE on macOS instead and wrote a shell script to extract the features.

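In openSMILE, the parameters that determine which features are extracted are all stored in the configuration file (.config) you use, including the frame size, the number of MFCC coefficients, whether deltas are included, and so on. Beginners can simply modify one of the official configuration files to extract the feature values their own model needs.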
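Because an openSMILE configuration file is plain text made of [instanceName:cComponentType] sections containing key = value lines (as in the official file reproduced later in this post), small changes can also be scripted. The following is only a rough sketch: set_option is a hypothetical helper of my own, not part of openSMILE, and it handles nothing beyond simple key = value lines.

```python
import re


def set_option(config_text, section, key, value):
    """Return config_text with `key` set to `value` inside [section:...].

    Hypothetical helper: only simple `key = value` lines are handled, and
    KeyError is raised if the section or key is not found.
    """
    out, in_section, done = [], False, False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Section headers look like [instanceName:cComponentType]
            in_section = stripped.startswith("[" + section + ":")
        elif in_section and re.match(r"%s\s*=" % re.escape(key), stripped):
            line = "%s = %s" % (key, value)
            done = True
        out.append(line)
    if not done:
        raise KeyError("%s not found in section %s" % (key, section))
    return "\n".join(out)
```

For example, set_option(text, "mfcc", "lastMfcc", 19) would change only the lastMfcc line of the [mfcc:cMfcc] section and leave every other section untouched.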

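By default, openSMILE stores the extracted features in the "compatibility HTK format". I did not find a suitable Python function for reading this file type, so in the end I used the readhtk() function from a MATLAB voice tool package (most likely the VOICEBOX toolbox, which provides a function of that name). I will explain how to build the model in the next post.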
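For readers who would rather stay in Python, an HTK parameter file is simple enough to parse by hand. The sketch below assumes the standard uncompressed HTK layout: a 12-byte big-endian header (number of frames, sample period in units of 100 ns, bytes per frame, parameter kind) followed by big-endian 32-bit floats. The name read_htk is hypothetical and does not come from openSMILE or any library, and I have not checked it against every variant openSMILE can write.

```python
import struct


def read_htk(path):
    """Read an uncompressed HTK parameter file (hypothetical helper).

    Returns (frames, sample_period, parm_kind), where frames is a list
    holding one list of floats per frame.
    """
    with open(path, "rb") as f:
        header = f.read(12)
        if len(header) != 12:
            raise ValueError("file too short for an HTK header")
        # Big-endian: int32 nSamples, int32 sampPeriod, int16 sampSize, int16 parmKind
        n_frames, sample_period, frame_bytes, parm_kind = struct.unpack(">iihh", header)
        n_coefs = frame_bytes // 4  # assumes 4-byte floats, i.e. no compression
        frames = []
        for _ in range(n_frames):
            raw = f.read(frame_bytes)
            if len(raw) != frame_bytes:
                raise ValueError("unexpected end of file")
            frames.append(list(struct.unpack(">%df" % n_coefs, raw)))
    return frames, sample_period, parm_kind
```

The returned list can be passed to numpy.array(...) if a matrix is more convenient; the pure-struct version is used here only to keep the sketch free of dependencies.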

Let us first take a look at the official configuration file MFCC12_0_f.

///////////////////////////////

///// > openSMILE configuration file to extract MFCC features < //////

///// HTK target kind: MFCC_0_D_A, numCeps=12 //////////

///////////

///// All rights reserved. See file COPYING for details. //////

///////////////////////////////

;

; This section is always required in openSMILE configuration files

; it configures the componentManager and gives a list of all components which are to be loaded

; The order in which the components are listed should match

; the order of the data flow for most efficient processing

;

///////////////////////////////

[componentInstances:cComponentManager]

instance[dataMemory].type=cDataMemory

\{shared/standard_f.inc}

[componentInstances:cComponentManager]

; audio framer

instance[frame].type=cFramer

; speech pre-emphasis (on a per frame basis as HTK does it)

instance[pe].type=cVectorPreemphasis

; apply a window function to pre-emphasised frames

instance[win].type=cWindower

; transform to the frequency domain using FFT

instance[fft].type=cTransformFFT

; compute magnitude of the complex fft from the previous component

instance[fftmag].type=cFFTmagphase

; compute Mel-bands from magnitude spectrum

instance[melspec].type=cMelspec

; compute MFCC from Mel-band spectrum

instance[mfcc].type=cMfcc

; compute delta coefficients from mfcc and energy

instance[delta].type=cDeltaRegression

; compute acceleration coefficients from delta coefficients of mfcc and energy

instance[accel].type=cDeltaRegression

; run single threaded (nThreads=1)

; NOTE: a single thread is more efficient for processing small files, since multi-threaded processing involves more

; overhead during startup, which will make the system slower in the end

nThreads=1

; do not show any internal dataMemory level settings

; (if you want to see them set the value to 1, 2, 3, or 4, depending on the amount of detail you wish)

printLevelStats=0

/////////////////////////////////

///////// component configuration ////////////

/////////////////////////////////

; the following sections configure the components listed above

; a help on configuration parameters can be obtained with

; SMILExtract -H

; or

; SMILExtract -H configTypeName (= componentTypeName)

/////////////////////////////////

[frame:cFramer]

reader.dmLevel=wave

writer.dmLevel=frames

noPostEOIprocessing = 1

copyInputName = 1

frameSize = 0.0250

frameStep = 0.010

frameMode = fixed

frameCenterSpecial = left

[pe:cVectorPreemphasis]

reader.dmLevel=frames

writer.dmLevel=framespe

k = 0.97

de = 0

[win:cWindower]

reader.dmLevel=framespe

writer.dmLevel=winframes

copyInputName = 1

processArrayFields = 1

; hamming window

winFunc = ham

; no gain, no offset

gain = 1.0

offset = 0

[fft:cTransformFFT]

reader.dmLevel=winframes

writer.dmLevel=fft

copyInputName = 1

processArrayFields = 1

inverse = 0

; for compatibility with 2.2.0 and older versions

zeroPadSymmetric = 0

[fftmag:cFFTmagphase]

reader.dmLevel=fft

writer.dmLevel=fftmag

copyInputName = 1

processArrayFields = 1

inverse = 0

magnitude = 1

phase = 0

[melspec:cMelspec]

reader.dmLevel=fftmag

writer.dmLevel=melspec

copyInputName = 1

processArrayFields = 1

; htk compatible sample value scaling

htkcompatible = 1

nBands = 26

; use power spectrum instead of magnitude spectrum

usePower = 1

lofreq = 0

hifreq = 8000

specScale = mel

inverse = 0

[mfcc:cMfcc]

reader.dmLevel=melspec

writer.dmLevel=ft0

copyInputName = 1

processArrayFields = 1

firstMfcc = 0

lastMfcc = 12

cepLifter = 22.0

htkcompatible = 1

[delta:cDeltaRegression]

reader.dmLevel=ft0

writer.dmLevel=ft0de

nameAppend = de

copyInputName = 1

noPostEOIprocessing = 0

deltawin=2

blocksize=1

[accel:cDeltaRegression]

reader.dmLevel=ft0de

writer.dmLevel=ft0dede

nameAppend = de

copyInputName = 1

noPostEOIprocessing = 0

deltawin=2

blocksize=1

//////////////////////////

/////// data output configuration //////

//////////////////////////

[componentInstances:cComponentManager]

instance[audspec_lldconcat].type=cVectorConcat

[audspec_lldconcat:cVectorConcat]

reader.dmLevel = ft0;ft0de;ft0dede

writer.dmLevel = lld

includeSingleElementFields = 1

\{shared/standard_data_f.inc}

//---------------------- END -------------------------///

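The file is roughly divided into three parts:

The first part introduces the parameters: it lists the MFCC-related components used in this file and the names they are given in it;

The second part sets the parameter values;

The third part configures the output and can generally be left unchanged.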
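To make the numbers in the second part concrete, the sketch below works out what they imply. The 16 kHz sampling rate is my assumption (the config does not state one, although hifreq = 8000 would sit exactly at the Nyquist frequency of a 16 kHz file), and mfcc_layout is a hypothetical name. The exact number of frames openSMILE produces for a given file is not computed here, since it depends on how the framer handles the ends of the signal.

```python
def mfcc_layout(sample_rate, frame_size=0.025, frame_step=0.010,
                first_mfcc=0, last_mfcc=12, n_derivatives=2):
    """Derive frame sizes and the feature dimension from the config values."""
    samples_per_frame = round(frame_size * sample_rate)
    samples_per_step = round(frame_step * sample_rate)
    n_static = last_mfcc - first_mfcc + 1        # coefficients 0..12 -> 13
    n_features = n_static * (1 + n_derivatives)  # static + delta + acceleration
    return samples_per_frame, samples_per_step, n_static, n_features


print(mfcc_layout(16000))  # (400, 160, 13, 39)
```

Under that assumption, each 25 ms frame covers 400 samples, frames start every 160 samples, and the 13 static coefficients plus the outputs of the delta and accel components give 39 values per frame, which matches the MFCC_0_D_A target kind named in the file header.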

Once the configuration file is set up the way you need it, the extraction itself is very simple.

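First, open a blank document and set the input directory, the output directory, and the openSMILE directory. In this example every file in the input directory is a .wav file, so the step of checking the file format is skipped. The script then changes to the openSMILE directory and loops over the files in the input directory, extracting features from each one and writing the result to the output directory under a name of the form xx.mfcc.htk. Save the document with the .sh extension; to run it, simply drag the file into a Terminal window.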

The example code is as follows.

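#!/bin/bash
# Batch-extract MFCC features from every .wav file in the input directory.
# SMILExtract is assumed to be reachable through PATH.
PATH=$PATH:$HOME/bin

dir=/Users/lemon/Documents/wan/test              # input directory (.wav files)
OPATH=/Users/lemon/Documents/wanzhi/test_mfcc    # output directory
os=/Users/wan/Downloads/opensmile-2.3.0          # openSMILE directory

# The config path below is relative to the openSMILE directory.
cd "$os" || exit 1

for wav in "$dir"/*.wav; do
    # Strip the directory and the .wav suffix so the output is xx.mfcc.htk.
    name=$(basename "$wav" .wav)
    SMILExtract -C config/MFCC12_0_f -I "$wav" -O "$OPATH/$name.mfcc.htk"
    echo "$name.wav is extracted"
done

echo "work finished!"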

At this point, if we look in the output directory, we will find the extracted MFCC features stored there in .htk format.
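For anyone who prefers to drive the extraction from Python instead of a shell script, the same loop can be sketched with the standard library. Only the SMILExtract options that already appear in this post (-C, -I, -O) are used; build_commands and run_all are hypothetical names, and SMILExtract is again assumed to be reachable through PATH.

```python
import subprocess
from pathlib import Path


def build_commands(in_dir, out_dir, config, exe="SMILExtract"):
    """Build one SMILExtract command per .wav file (nothing is executed)."""
    commands = []
    for wav in sorted(Path(in_dir).glob("*.wav")):
        out = Path(out_dir) / (wav.stem + ".mfcc.htk")
        commands.append([exe, "-C", str(config), "-I", str(wav), "-O", str(out)])
    return commands


def run_all(commands, cwd=None):
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, check=True)
        print(cmd[4], "is extracted")
```

Separating command construction from execution makes it easy to print and inspect the commands first; passing the openSMILE directory as cwd to run_all plays the role of the cd step in a shell script.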

The next post will explain how to import this data into MATLAB for classification.
