Using the Montreal Forced Aligner (MFA) to force-align audio with phonemes
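Environment
Ubuntu 18.04.4 LTS
Preparation
1. Download the Linux build of MFA (this walkthrough uses Version 1.1.0 Beta 2).
2. Download the pretrained Mandarin acoustic model (the mandarin.zip used in the alignment steps).
3. Download the pronunciation dictionary.
4. Put the (.wav) audio files and the (.lab) orthographic transcription files in a folder named data. Each (.lab) file holds the transcription of one utterance as space-separated pinyin syllables with tone numbers, for example: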
lv4 shi4 yang2 chun1 yan1 jing3 da4 kuai4 wen2 zhang1 de5 di3 se4 si4 yue4 de5 lin2 luan2 geng4 shi4 lv4 de5 xian1 huo2 xiu4 mei4 shi1 yi4 ang4 ran2
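Before aligning, it can help to sanity-check the corpus. The sketch below is not part of MFA. It assumes a flat data folder in which every .wav has a .lab with the same base name, and a lexicon in which each line starts with the word, followed by its phones separated by whitespace. The function names `find_unpaired`, `load_lexicon_words`, and `find_oov_tokens` are made up for this illustration.

```python
from pathlib import Path


def find_unpaired(data_dir):
    """Return base names that have a .wav without a .lab, or the reverse."""
    data_dir = Path(data_dir)
    wavs = {p.stem for p in data_dir.glob("*.wav")}
    labs = {p.stem for p in data_dir.glob("*.lab")}
    # Symmetric difference: names present in exactly one of the two sets.
    return sorted(wavs ^ labs)


def load_lexicon_words(lexicon_path):
    """Collect the first whitespace-separated field of each non-empty line.

    Assumption: the first field of a lexicon line is the word itself.
    """
    words = set()
    with open(lexicon_path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.split()
            if fields:
                words.add(fields[0])
    return words


def find_oov_tokens(data_dir, lexicon_path):
    """Map each .lab file name to its tokens that are missing from the lexicon."""
    known = load_lexicon_words(lexicon_path)
    missing = {}
    for lab in sorted(Path(data_dir).glob("*.lab")):
        tokens = lab.read_text(encoding="utf-8").split()
        unknown = [token for token in tokens if token not in known]
        if unknown:
            missing[lab.name] = unknown
    return missing
```

Calling `find_unpaired("data")` and `find_oov_tokens("data", "mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon")` from the directory that holds both should return an empty list and an empty dict for a clean corpus. Anything they report is worth fixing before running the aligner.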
Phoneme alignment
1. Extract the montreal-forced-aligner_ archive:
tar zxvf montreal-forced-aligner_
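2. Put mandarin.zip in the montreal-forced-aligner/pretrained_models directory, and put the data folder and the pronunciation dictionary mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon in the montreal-forced-aligner directory.
3. In a terminal, change into the montreal-forced-aligner directory and run the mfa_align script, passing the corpus directory, the dictionary, the acoustic model, and the output directory, in that order: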
./bin/mfa_align data mandarin-for-montreal-forced-aligner-pre-trained-model.lexicon pretrained_models/mandarin.zip result
4. The result directory now contains the aligned (.TextGrid) files, for example:
File type="ooTextFile"
Object class ="TextGrid"
xmin = 0.0
xmax = 9.8125
tiers? <exists>
size = 2
item []:
item [1]:
class ="IntervalTier"
name ="words"
xmin = 0.0
xmax = 9.8125
intervals: size = 37
intervals [1]:
xmin = 0.0
xmax = 0.840
text =""
intervals [2]:
xmin = 0.840
xmax = 1.170
text ="lv4"
intervals [3]:
xmin = 1.170
xmax = 1.330
text ="shi4"
intervals [4]:
xmin = 1.330
xmax = 1.610
text ="yang2"
intervals [5]:
xmin = 1.610
xmax = 1.910
text ="chun1"
......
intervals [37]:
xmin = 8.920
xmax = 9.8125
text =""
item [2]:
class ="IntervalTier"
name ="phones"
xmin = 0.0
xmax = 9.8125
intervals: size = 74
intervals [1]:
xmin = 0.000
xmax = 0.840
text ="sil"
intervals [2]:
xmin = 0.840
xmax = 0.960
text ="l"
intervals [3]:
xmin = 0.960
xmax = 1.170
text ="v4"
intervals [4]:
xmin = 1.170
xmax = 1.290
text ="sh"
intervals [5]:
xmin = 1.290
xmax = 1.330
text ="ii4"
......
intervals [74]:
xmin = 9.790
xmax = 9.8125
text =""
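To use the alignment in later processing, the intervals have to be read back out of the TextGrid. The following is a minimal sketch for the long text format shown above, not a full Praat parser: it ignores escaped quotes inside labels and assumes every interval lists xmin, xmax, and text in that order. The names `parse_textgrid` and `spoken_intervals` are hypothetical helpers, and SAMPLE is a shortened excerpt of the output above.

```python
import re

# Each pattern tolerates both 'key ="value"' and 'key = "value"' spacing.
_NAME = re.compile(r'\bname\s*=\s*"([^"]*)"')
_BOUND = re.compile(r'\b(xmin|xmax)\s*=\s*([0-9.]+)')
_TEXT = re.compile(r'\btext\s*=\s*"([^"]*)"')
_INTERVAL = re.compile(r'\bintervals\s*\[\d+\]')

SAMPLE = '''
item [1]:
class ="IntervalTier"
name ="words"
intervals: size = 37
intervals [1]:
xmin = 0.0
xmax = 0.840
text =""
intervals [2]:
xmin = 0.840
xmax = 1.170
text ="lv4"
item [2]:
class ="IntervalTier"
name ="phones"
intervals: size = 74
intervals [1]:
xmin = 0.000
xmax = 0.840
text ="sil"
intervals [2]:
xmin = 0.840
xmax = 0.960
text ="l"
'''


def parse_textgrid(content):
    """Return {tier name: [(xmin, xmax, text), ...]} for a long-format TextGrid."""
    tiers = {}
    current = None       # interval list of the tier currently being read
    in_interval = False  # True after an 'intervals [n]:' header
    start = end = None
    for raw in content.splitlines():
        line = raw.strip()
        name = _NAME.search(line)
        if name:
            current = tiers.setdefault(name.group(1), [])
            in_interval = False
        if _INTERVAL.search(line):
            in_interval = True
            start = end = None
        if current is None or not in_interval:
            # File-level and tier-level xmin/xmax lines are skipped here.
            continue
        bound = _BOUND.search(line)
        if bound:
            if bound.group(1) == "xmin":
                start = float(bound.group(2))
            else:
                end = float(bound.group(2))
        text = _TEXT.search(line)
        if text and start is not None and end is not None:
            current.append((start, end, text.group(1)))
            in_interval = False
    return tiers


def spoken_intervals(tier, silence=("", "sil")):
    """Drop intervals whose label marks silence (empty text or 'sil')."""
    return [interval for interval in tier if interval[2] not in silence]
```

For example, `spoken_intervals(parse_textgrid(SAMPLE)["phones"])` keeps only the interval labelled "l". To process a real file, pass the text of a .TextGrid from the result directory to `parse_textgrid`. For anything beyond quick inspection, a dedicated TextGrid library is likely to be more robust than this regex-based reader.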