东方佑/acoustic_model

<!doctype html>
<html>
<head>
<meta charset='UTF-8'><meta name='viewport' content='width=device-width, initial-scale=1'>
<title>README</title><style type='text/css'>html {overflow-x: initial !important;}:root { --bg-color:#ffffff; --text-color:#333333; --select-text-bg-color:#B5D6FC; --select-text-font-color:auto; --monospace:"Lucida Console",Consolas,"Courier",monospace; }
html { font-size: 14px; background-color: var(--bg-color); color: var(--text-color); font-family: "Helvetica Neue", Helvetica, Arial, sans-serif; -webkit-font-smoothing: antialiased; }
body { margin: 0px; padding: 0px; height: auto; bottom: 0px; top: 0px; left: 0px; right: 0px; font-size: 1rem; line-height: 1.42857; overflow-x: hidden; background: inherit; }
iframe { margin: auto; }
a.url { word-break: break-all; }
a:active, a:hover { outline: 0px; }
.in-text-selection, ::selection { text-shadow: none; background: var(--select-text-bg-color); color: var(--select-text-font-color); }
#write { margin: 0px auto; height: auto; width: inherit; word-break: normal; word-wrap: break-word; position: relative; white-space: normal; overflow-x: visible; padding-top: 40px; }
#write.first-line-indent p { text-indent: 2em; }
#write.first-line-indent li p, #write.first-line-indent p * { text-indent: 0px; }
#write.first-line-indent li { margin-left: 2em; }
.for-image #write { padding-left: 8px; padding-right: 8px; }
body.typora-export { padding-left: 30px; padding-right: 30px; }
.typora-export .footnote-line, .typora-export li, .typora-export p { white-space: pre-wrap; }
@media screen and (max-width: 500px) {
  body.typora-export { padding-left: 0px; padding-right: 0px; }
  #write { padding-left: 20px; padding-right: 20px; }
  .CodeMirror-sizer { margin-left: 0px !important; }
  .CodeMirror-gutters { display: none !important; }
}
#write li > figure:first-child { margin-top: -20px; }
#write ol, #write ul { position: relative; }
img { max-width: 100%; vertical-align: middle; }
button, input, select, textarea { color: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; font-size: inherit; line-height: inherit; font-family: inherit; }
input[type="checkbox"], input[type="radio"] { line-height: normal; padding: 0px; }
*, ::after, ::before { box-sizing: border-box; }
#write h1, #write h2, #write h3, #write h4, #write h5, #write h6, #write p, #write pre { width: inherit; }
#write h1, #write h2, #write h3, #write h4, #write h5, #write h6, #write p { position: relative; }
h1, h2, h3, h4, h5, h6 { break-after: avoid-page; break-inside: avoid; orphans: 2; }
p { orphans: 4; }
h1 { font-size: 2rem; }
h2 { font-size: 1.8rem; }
h3 { font-size: 1.6rem; }
h4 { font-size: 1.4rem; }
h5 { font-size: 1.2rem; }
h6 { font-size: 1rem; }
.md-math-block, .md-rawblock, h1, h2, h3, h4, h5, h6, p { margin-top: 1rem; margin-bottom: 1rem; }
.hidden { display: none; }
.md-blockmeta { color: rgb(204, 204, 204); font-weight: 700; font-style: italic; }
a { cursor: pointer; }
sup.md-footnote { padding: 2px 4px; background-color: rgba(238, 238, 238, 0.7); color: rgb(85, 85, 85); border-radius: 4px; cursor: pointer; }
sup.md-footnote a, sup.md-footnote a:hover { color: inherit; text-transform: inherit; text-decoration: inherit; }
#write input[type="checkbox"] { cursor: pointer; width: inherit; height: inherit; }
figure { overflow-x: auto; margin: 1.2em 0px; max-width: calc(100% + 16px); padding: 0px; }
figure > table { margin: 0px !important; }
tr { break-inside: avoid; break-after: auto; }
thead { display: table-header-group; }
table { border-collapse: collapse; border-spacing: 0px; width: 100%; overflow: auto; break-inside: auto; text-align: left; }
table.md-table td { min-width: 32px; }
.CodeMirror-gutters { border-right: 0px; background-color: inherit; }
.CodeMirror-linenumber { user-select: none; }
.CodeMirror { text-align: left; }
.CodeMirror-placeholder { opacity: 0.3; }
.CodeMirror pre { padding: 0px 4px; }
.CodeMirror-lines { padding: 0px; }
div.hr:focus { cursor: none; }
#write pre { white-space: pre-wrap; }
#write.fences-no-line-wrapping pre { white-space: pre; }
#write pre.ty-contain-cm { white-space: normal; }
.CodeMirror-gutters { margin-right: 4px; }
.md-fences { font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; overflow: visible; white-space: pre; background: inherit; position: relative !important; }
.md-diagram-panel { width: 100%; margin-top: 10px; text-align: center; padding-top: 0px; padding-bottom: 8px; overflow-x: auto; }
#write .md-fences.mock-cm { white-space: pre-wrap; }
.md-fences.md-fences-with-lineno { padding-left: 0px; }
#write.fences-no-line-wrapping .md-fences.mock-cm { white-space: pre; overflow-x: auto; }
.md-fences.mock-cm.md-fences-with-lineno { padding-left: 8px; }
.CodeMirror-line, twitterwidget { break-inside: avoid; }
.footnotes { opacity: 0.8; font-size: 0.9rem; margin-top: 1em; margin-bottom: 1em; }
.footnotes + .footnotes { margin-top: 0px; }
.md-reset { margin: 0px; padding: 0px; border: 0px; outline: 0px; vertical-align: top; background: 0px 0px; text-decoration: none; text-shadow: none; float: none; position: static; width: auto; height: auto; white-space: nowrap; cursor: inherit; -webkit-tap-highlight-color: transparent; line-height: normal; font-weight: 400; text-align: left; box-sizing: content-box; direction: ltr; }
li div { padding-top: 0px; }
blockquote { margin: 1rem 0px; }
li .mathjax-block, li p { margin: 0.5rem 0px; }
li { margin: 0px; position: relative; }
blockquote > :last-child { margin-bottom: 0px; }
blockquote > :first-child, li > :first-child { margin-top: 0px; }
.footnotes-area { color: rgb(136, 136, 136); margin-top: 0.714rem; padding-bottom: 0.143rem; white-space: normal; }
#write .footnote-line { white-space: pre-wrap; }
@media print {
  body, html { border: 1px solid transparent; height: 99%; break-after: avoid; break-before: avoid; }
  #write { margin-top: 0px; padding-top: 0px; border-color: transparent !important; }
  .typora-export * { -webkit-print-color-adjust: exact; }
  html.blink-to-pdf { font-size: 13px; }
  .typora-export #write { padding-left: 32px; padding-right: 32px; padding-bottom: 0px; break-after: avoid; }
  .typora-export #write::after { height: 0px; }
  @page { margin: 20mm 0px; }
}
.footnote-line { margin-top: 0.714em; font-size: 0.7em; }
a img, img a { cursor: pointer; }
pre.md-meta-block { font-size: 0.8rem; min-height: 0.8rem; white-space: pre-wrap; background: rgb(204, 204, 204); display: block; overflow-x: hidden; }
p > .md-image:only-child:not(.md-img-error) img, p > img:only-child { display: block; margin: auto; }
p > .md-image:only-child { display: inline-block; width: 100%; }
#write .MathJax_Display { margin: 0.8em 0px 0px; }
.md-math-block { width: 100%; }
.md-math-block:not(:empty)::after { display: none; }
[contenteditable="true"]:active, [contenteditable="true"]:focus { outline: 0px; box-shadow: none; }
.md-task-list-item { position: relative; list-style-type: none; }
.task-list-item.md-task-list-item { padding-left: 0px; }
.md-task-list-item > input { position: absolute; top: 0px; left: 0px; margin-left: -1.2em; margin-top: calc(1em - 10px); border: none; }
.math { font-size: 1rem; }
.md-toc { min-height: 3.58rem; position: relative; font-size: 0.9rem; border-radius: 10px; }
.md-toc-content { position: relative; margin-left: 0px; }
.md-toc-content::after, .md-toc::after { display: none; }
.md-toc-item { display: block; color: rgb(65, 131, 196); }
.md-toc-item a { text-decoration: none; }
.md-toc-inner:hover { text-decoration: underline; }
.md-toc-inner { display: inline-block; cursor: pointer; }
.md-toc-h1 .md-toc-inner { margin-left: 0px; font-weight: 700; }
.md-toc-h2 .md-toc-inner { margin-left: 2em; }
.md-toc-h3 .md-toc-inner { margin-left: 4em; }
.md-toc-h4 .md-toc-inner { margin-left: 6em; }
.md-toc-h5 .md-toc-inner { margin-left: 8em; }
.md-toc-h6 .md-toc-inner { margin-left: 10em; }
@media screen and (max-width: 48em) {
  .md-toc-h3 .md-toc-inner { margin-left: 3.5em; }
  .md-toc-h4 .md-toc-inner { margin-left: 5em; }
  .md-toc-h5 .md-toc-inner { margin-left: 6.5em; }
  .md-toc-h6 .md-toc-inner { margin-left: 8em; }
}
a.md-toc-inner { font-size: inherit; font-style: inherit; font-weight: inherit; line-height: inherit; }
.footnote-line a:not(.reversefootnote) { color: inherit; }
.md-attr { display: none; }
.md-fn-count::after { content: "."; }
code, pre, samp, tt { font-family: var(--monospace); }
kbd { margin: 0px 0.1em; padding: 0.1em 0.6em; font-size: 0.8em; color: rgb(36, 39, 41); background: rgb(255, 255, 255); border: 1px solid rgb(173, 179, 185); border-radius: 3px; box-shadow: rgba(12, 13, 14, 0.2) 0px 1px 0px, rgb(255, 255, 255) 0px 0px 0px 2px inset; white-space: nowrap; vertical-align: middle; }
.md-comment { color: rgb(162, 127, 3); opacity: 0.8; font-family: var(--monospace); }
code { text-align: left; vertical-align: initial; }
a.md-print-anchor { white-space: pre !important; border-width: initial !important; border-style: none !important; border-color: initial !important; display: inline-block !important; position: absolute !important; width: 1px !important; right: 0px !important; outline: 0px !important; background: 0px 0px !important; text-decoration: initial !important; text-shadow: initial !important; }
.md-inline-math .MathJax_SVG .noError { display: none !important; }
.html-for-mac .inline-math-svg .MathJax_SVG { vertical-align: 0.2px; }
.md-math-block .MathJax_SVG_Display { text-align: center; margin: 0px; position: relative; text-indent: 0px; max-width: none; max-height: none; min-height: 0px; min-width: 100%; width: auto; overflow-y: hidden; display: block !important; }
.MathJax_SVG_Display, .md-inline-math .MathJax_SVG_Display { width: auto; margin: inherit; display: inline-block !important; }
.MathJax_SVG .MJX-monospace { font-family: var(--monospace); }
.MathJax_SVG .MJX-sans-serif { font-family: sans-serif; }
.MathJax_SVG { display: inline; font-style: normal; font-weight: 400; line-height: normal; zoom: 90%; text-indent: 0px; text-align: left; text-transform: none; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; }
.MathJax_SVG * { transition: none; }
.MathJax_SVG_Display svg { vertical-align: middle !important; margin-bottom: 0px !important; }
.os-windows.monocolor-emoji .md-emoji { font-family: "Segoe UI Symbol", sans-serif; }
.md-diagram-panel > svg { max-width: 100%; }
[lang="mermaid"] svg, [lang="flow"] svg { max-width: 100%; }
[lang="mermaid"] .node text { font-size: 1rem; }
table tr th { border-bottom: 0px; }
video { max-width: 100%; display: block; margin: 0px auto; }
iframe { max-width: 100%; width: 100%; border: none; }
.highlight td, .highlight tr { border: 0px; }


.CodeMirror { height: auto; }
.CodeMirror.cm-s-inner { background: inherit; }
.CodeMirror-scroll { overflow-y: hidden; overflow-x: auto; z-index: 3; }
.CodeMirror-gutter-filler, .CodeMirror-scrollbar-filler { background-color: rgb(255, 255, 255); }
.CodeMirror-gutters { border-right: 1px solid rgb(221, 221, 221); background: inherit; white-space: nowrap; }
.CodeMirror-linenumber { padding: 0px 3px 0px 5px; text-align: right; color: rgb(153, 153, 153); }
.cm-s-inner .cm-keyword { color: rgb(119, 0, 136); }
.cm-s-inner .cm-atom, .cm-s-inner.cm-atom { color: rgb(34, 17, 153); }
.cm-s-inner .cm-number { color: rgb(17, 102, 68); }
.cm-s-inner .cm-def { color: rgb(0, 0, 255); }
.cm-s-inner .cm-variable { color: rgb(0, 0, 0); }
.cm-s-inner .cm-variable-2 { color: rgb(0, 85, 170); }
.cm-s-inner .cm-variable-3 { color: rgb(0, 136, 85); }
.cm-s-inner .cm-string { color: rgb(170, 17, 17); }
.cm-s-inner .cm-property { color: rgb(0, 0, 0); }
.cm-s-inner .cm-operator { color: rgb(152, 26, 26); }
.cm-s-inner .cm-comment, .cm-s-inner.cm-comment { color: rgb(170, 85, 0); }
.cm-s-inner .cm-string-2 { color: rgb(255, 85, 0); }
.cm-s-inner .cm-meta { color: rgb(85, 85, 85); }
.cm-s-inner .cm-qualifier { color: rgb(85, 85, 85); }
.cm-s-inner .cm-builtin { color: rgb(51, 0, 170); }
.cm-s-inner .cm-bracket { color: rgb(153, 153, 119); }
.cm-s-inner .cm-tag { color: rgb(17, 119, 0); }
.cm-s-inner .cm-attribute { color: rgb(0, 0, 204); }
.cm-s-inner .cm-header, .cm-s-inner.cm-header { color: rgb(0, 0, 255); }
.cm-s-inner .cm-quote, .cm-s-inner.cm-quote { color: rgb(0, 153, 0); }
.cm-s-inner .cm-hr, .cm-s-inner.cm-hr { color: rgb(153, 153, 153); }
.cm-s-inner .cm-link, .cm-s-inner.cm-link { color: rgb(0, 0, 204); }
.cm-negative { color: rgb(221, 68, 68); }
.cm-positive { color: rgb(34, 153, 34); }
.cm-header, .cm-strong { font-weight: 700; }
.cm-del { text-decoration: line-through; }
.cm-em { font-style: italic; }
.cm-link { text-decoration: underline; }
.cm-error { color: red; }
.cm-invalidchar { color: red; }
.cm-constant { color: rgb(38, 139, 210); }
.cm-defined { color: rgb(181, 137, 0); }
div.CodeMirror span.CodeMirror-matchingbracket { color: rgb(0, 255, 0); }
div.CodeMirror span.CodeMirror-nonmatchingbracket { color: rgb(255, 34, 34); }
.cm-s-inner .CodeMirror-activeline-background { background: inherit; }
.CodeMirror { position: relative; overflow: hidden; }
.CodeMirror-scroll { height: 100%; outline: 0px; position: relative; box-sizing: content-box; background: inherit; }
.CodeMirror-sizer { position: relative; }
.CodeMirror-gutter-filler, .CodeMirror-hscrollbar, .CodeMirror-scrollbar-filler, .CodeMirror-vscrollbar { position: absolute; z-index: 6; display: none; }
.CodeMirror-vscrollbar { right: 0px; top: 0px; overflow: hidden; }
.CodeMirror-hscrollbar { bottom: 0px; left: 0px; overflow: hidden; }
.CodeMirror-scrollbar-filler { right: 0px; bottom: 0px; }
.CodeMirror-gutter-filler { left: 0px; bottom: 0px; }
.CodeMirror-gutters { position: absolute; left: 0px; top: 0px; padding-bottom: 30px; z-index: 3; }
.CodeMirror-gutter { white-space: normal; height: 100%; box-sizing: content-box; padding-bottom: 30px; margin-bottom: -32px; display: inline-block; }
.CodeMirror-gutter-wrapper { position: absolute; z-index: 4; background: 0px 0px !important; border: none !important; }
.CodeMirror-gutter-background { position: absolute; top: 0px; bottom: 0px; z-index: 4; }
.CodeMirror-gutter-elt { position: absolute; cursor: default; z-index: 4; }
.CodeMirror-lines { cursor: text; }
.CodeMirror pre { border-radius: 0px; border-width: 0px; background: 0px 0px; font-family: inherit; font-size: inherit; margin: 0px; white-space: pre; word-wrap: normal; color: inherit; z-index: 2; position: relative; overflow: visible; }
.CodeMirror-wrap pre { word-wrap: break-word; white-space: pre-wrap; word-break: normal; }
.CodeMirror-code pre { border-right: 30px solid transparent; width: fit-content; }
.CodeMirror-wrap .CodeMirror-code pre { border-right: none; width: auto; }
.CodeMirror-linebackground { position: absolute; left: 0px; right: 0px; top: 0px; bottom: 0px; z-index: 0; }
.CodeMirror-linewidget { position: relative; z-index: 2; overflow: auto; }
.CodeMirror-wrap .CodeMirror-scroll { overflow-x: hidden; }
.CodeMirror-measure { position: absolute; width: 100%; height: 0px; overflow: hidden; visibility: hidden; }
.CodeMirror-measure pre { position: static; }
.CodeMirror div.CodeMirror-cursor { position: absolute; visibility: hidden; border-right: none; width: 0px; }
.CodeMirror div.CodeMirror-cursor { visibility: hidden; }
.CodeMirror-focused div.CodeMirror-cursor { visibility: inherit; }
.cm-searching { background: rgba(255, 255, 0, 0.4); }
@media print {
  .CodeMirror div.CodeMirror-cursor { visibility: hidden; }
}







 .typora-export li, .typora-export p, .typora-export,  .footnote-line {white-space: normal;} 
</style>
</head>
<body class='typora-export' >
<div  id='write'  class = 'is-node'><div class='md-toc' mdtype='toc'><p class="md-toc-content"><span class="md-toc-item md-toc-h1" data-ref="n2"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n2">RamoSpeech Open-Source Acoustic Model</a></span><span class="md-toc-item md-toc-h2" data-ref="n3"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n3">Introduction</a></span><span class="md-toc-item md-toc-h2" data-ref="n5"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n5">Model</a></span><span class="md-toc-item md-toc-h2" data-ref="n13"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n13">Dependencies</a></span><span class="md-toc-item md-toc-h2" data-ref="n33"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n33">Running the Training Code</a></span><span class="md-toc-item md-toc-h2" data-ref="n43"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n43">Running the Prediction Code</a></span><span class="md-toc-item md-toc-h2" data-ref="n45"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n45">Obtaining the Data</a></span><span class="md-toc-item md-toc-h2" data-ref="n47"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n47">Loss Usage</a></span><span class="md-toc-item md-toc-h2" data-ref="n56"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n56">Decoder</a></span><span class="md-toc-item md-toc-h2" data-ref="n62"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n62">Credits</a></span></p></div><h1><a name='header-n2' class='md-header-anchor '></a>RamoSpeech Open-Source Acoustic Model</h1><h2><a name='header-n3' class='md-header-anchor '></a>Introduction</h2><p>RamoSpeech is an open-source Automatic Speech Recognition (ASR) framework released by <a href='https://github.com/ramosmy'>ramosmy</a>. This repository holds the framework's acoustic model, which is written in <a href='https://github.com/pytorch/pytorch'>PyTorch</a> and built on the earlier DeepCNN + DeepLSTM + FC + CTC architecture.</p><h2><a name='header-n5' class='md-header-anchor '></a>Model</h2><ol start='' ><li><p>DFCNN</p><pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="bash" 
style="break-inside: unset;"><div class="CodeMirror cm-s-inner CodeMirror-wrap" lang="bash"><div style="overflow: hidden; position: relative; width: 3px; height: 0px; top: 0px; left: 4px;"><textarea autocorrect="off" autocapitalize="off" spellcheck="false" tabindex="0" style="position: absolute; bottom: -1em; padding: 0px; width: 1000px; height: 1em; outline: none;"></textarea></div><div class="CodeMirror-scrollbar-filler" cm-not-content="true"></div><div class="CodeMirror-gutter-filler" cm-not-content="true"></div><div class="CodeMirror-scroll" tabindex="-1"><div class="CodeMirror-sizer" style="margin-left: 0px; margin-bottom: 0px; border-right-width: 0px; padding-right: 0px; padding-bottom: 0px;"><div style="position: relative; top: 0px;"><div class="CodeMirror-lines" role="presentation"><div role="presentation" style="position: relative; outline: none;"><div class="CodeMirror-measure"><span><span>​</span>x</span></div><div class="CodeMirror-measure"></div><div style="position: relative; z-index: 1;"></div><div class="CodeMirror-code" role="presentation" style=""><div class="CodeMirror-activeline" style="position: relative;"><div class="CodeMirror-activeline-background CodeMirror-linebackground"></div><div class="CodeMirror-gutter-background CodeMirror-activeline-gutter" style="left: 0px; width: 0px;"></div><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">AcousticModel(</span></pre></div><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  (dropout): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.5, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  (conv1): Sequential(</span></pre><pre class=" CodeMirror-line " role="presentation"><span 
role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv1_conv1): Conv2d(1, <span class="cm-number">32</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">bias</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv1_norm1): BatchNorm2d(32, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv1_relu1): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv1_dropout1): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv1_conv2): Conv2d(32, <span class="cm-number">32</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span 
class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv1_norm2): BatchNorm2d(32, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv1_relu2): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv1_maxpool): MaxPool2d<span class="cm-def">(kernel_size</span><span class="cm-operator">=</span><span class="cm-number">2</span>, <span class="cm-def">stride</span><span class="cm-operator">=</span><span class="cm-number">2</span>, <span class="cm-def">padding</span><span class="cm-operator">=</span><span class="cm-number">0</span>, <span class="cm-def">dilation</span><span class="cm-operator">=</span><span class="cm-number">1</span>, <span class="cm-def">ceil_mode</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv1_dropout2): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  )</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  (conv2): Sequential(</span></pre><pre class=" CodeMirror-line " role="presentation"><span 
role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv2_conv1): Conv2d(32, <span class="cm-number">64</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv2_norm1): BatchNorm2d(64, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv2_relu1): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv2_dropout1): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv2_conv2): Conv2d(64, <span class="cm-number">64</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " 
role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv2_norm2): BatchNorm2d(64, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv2_relu2): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv2_maxpool): MaxPool2d<span class="cm-def">(kernel_size</span><span class="cm-operator">=</span><span class="cm-number">2</span>, <span class="cm-def">stride</span><span class="cm-operator">=</span><span class="cm-number">2</span>, <span class="cm-def">padding</span><span class="cm-operator">=</span><span class="cm-number">0</span>, <span class="cm-def">dilation</span><span class="cm-operator">=</span><span class="cm-number">1</span>, <span class="cm-def">ceil_mode</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv2_dropout2): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  )</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  (conv3): Sequential(</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv3_conv1): 
Conv2d(64, <span class="cm-number">128</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv3_relu1): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv3_dropout1): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.2, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv3_norm1): BatchNorm2d(128, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv3_conv2): Conv2d(128, <span class="cm-number">128</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  
(conv3_norm2): BatchNorm2d(128, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv3_relu2): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv3_maxpool): MaxPool2d<span class="cm-def">(kernel_size</span><span class="cm-operator">=</span><span class="cm-number">2</span>, <span class="cm-def">stride</span><span class="cm-operator">=</span><span class="cm-number">2</span>, <span class="cm-def">padding</span><span class="cm-operator">=</span><span class="cm-number">0</span>, <span class="cm-def">dilation</span><span class="cm-operator">=</span><span class="cm-number">1</span>, <span class="cm-def">ceil_mode</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv3_dropout2): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.2, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  )</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  (conv4): Sequential(</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv4_conv1): Conv2d(128, <span class="cm-number">128</span>, <span class="cm-def">kernel_size</span><span 
class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv4_norm1): BatchNorm2d(128, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv4_relu1): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv4_dropout1): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.2, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv4_conv2): Conv2d(128, <span class="cm-number">128</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv4_relu2): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span 
role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv4_conv3): Conv2d(128, <span class="cm-number">128</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv4_norm2): BatchNorm2d(128, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv4_relu3): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp;  (conv4_dropout2): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.2, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  )</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  (fc1): Linear<span class="cm-def">(in_features</span><span class="cm-operator">=</span><span class="cm-number">3200</span>, <span class="cm-def">out_features</span><span class="cm-operator">=</span><span class="cm-number">128</span>, <span class="cm-def">bias</span><span class="cm-operator">=</span>True)</span></pre><pre 
class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  (fc2): Linear<span class="cm-def">(in_features</span><span class="cm-operator">=</span><span class="cm-number">256</span>, <span class="cm-def">out_features</span><span class="cm-operator">=</span><span class="cm-number">128</span>, <span class="cm-def">bias</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  (fc3): Linear<span class="cm-def">(in_features</span><span class="cm-operator">=</span><span class="cm-number">128</span>, <span class="cm-def">out_features</span><span class="cm-operator">=</span><span class="cm-number">1215</span>, <span class="cm-def">bias</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">  (rnn): LSTM(128, <span class="cm-number">128</span>, <span class="cm-def">num_layers</span><span class="cm-operator">=</span><span class="cm-number">4</span>, <span class="cm-def">batch_first</span><span class="cm-operator">=</span>True, <span class="cm-def">dropout</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">bidirectional</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">[<span class="cm-string">'wang3'</span>, <span class="cm-string">'luo4'</span>, <span class="cm-string">'shang4'</span>, <span class="cm-string">'yi1'</span>, <span class="cm-string">'zhang1'</span>, <span class="cm-string">'yong3'</span>, <span class="cm-string">'jia1'</span>, <span class="cm-string">'qiao2'</span>, <span class="cm-string">'tou2'</span>, <span 
class="cm-string">'mo3'</span>, <span class="cm-string">'ji4'</span>, <span class="cm-string">'fan4'</span>, <span class="cm-string">'dian4'</span>, <span class="cm-string">'de'</span>, <span class="cm-string">'jie2'</span>, <span class="cm-string">'zhang4'</span>, <span class="cm-string">'dan1'</span>, <span class="cm-string">'shi2'</span>, <span class="cm-string">'fen1'</span>, <span class="cm-string">'yin3'</span>, <span class="cm-string">'ren2'</span>, <span class="cm-string">'zhu4'</span>, <span class="cm-string">'mu4'</span>]</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">[<span class="cm-string">'wang3'</span>, <span class="cm-string">'luo4'</span>, <span class="cm-string">'shang4'</span>, <span class="cm-string">'yi1'</span>, <span class="cm-string">'zhang1'</span>, <span class="cm-string">'yong3'</span>, <span class="cm-string">'jia1'</span>, <span class="cm-string">'qiao2'</span>, <span class="cm-string">'tou2'</span>, <span class="cm-string">'guo2'</span>, <span class="cm-string">'ji4'</span>, <span class="cm-string">'fan4'</span>, <span class="cm-string">'dian4'</span>, <span class="cm-string">'de'</span>, <span class="cm-string">'jie2'</span>, <span class="cm-string">'zhang4'</span>, <span class="cm-string">'dan1'</span>, <span class="cm-string">'shi2'</span>, <span class="cm-string">'fen1'</span>, <span class="cm-string">'yin3'</span>, <span class="cm-string">'ren2'</span>, <span class="cm-string">'zhu4'</span>, <span class="cm-string">'mu4'</span>]</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">Prediction using <span class="cm-number">1</span>.78259s</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"><span cm-text="">​</span></span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 
0.1px;">&nbsp;psdz-SYS-4028GR-TR&nbsp;&nbsp;yufeng&nbsp;&nbsp;(e)&nbsp;speech&nbsp;&nbsp;~&nbsp;&nbsp;RamoSpeech&nbsp;&nbsp;<span class="cm-builtin">sh</span> speech2text.sh </span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">Using padding as &lt;PAD&gt; as <span class="cm-number">0</span></span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">Using unknown as &lt;UNK&gt; as <span class="cm-number">1</span></span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">Handling data_config/aishell_train.txt</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">120098it [00:00, <span class="cm-number">234720</span>.48it/s]</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">Handling data_config/thchs_train.txt</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">10000it [00:00, <span class="cm-number">127377</span>.65it/s]</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">loading test data:</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"><span class="cm-number">100</span>%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| <span class="cm-number">7176</span>/7176 [00:00&lt;00:00, <span class="cm-number">259723</span>.57it/s]</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">Prediction using <span class="cm-number">1</span>.63606s</span></pre><pre class=" CodeMirror-line " role="presentation"><span 
role="presentation" style="padding-right: 0.1px;">[2019-08-01 <span class="cm-number">14</span>:41:19,653 INFO] Translating shard <span class="cm-number">0</span>.</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"><span cm-text="">​</span></span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">SENT <span class="cm-number">1</span>: [<span class="cm-string">'shang4'</span>, <span class="cm-string">'hai3'</span>, <span class="cm-string">'zhi2'</span>, <span class="cm-string">'wu4'</span>, <span class="cm-string">'yuan2'</span>, <span class="cm-string">'you3'</span>, <span class="cm-string">'ge4'</span>, <span class="cm-string">'zhong3'</span>, <span class="cm-string">'ge4'</span>, <span class="cm-string">'yang4'</span>, <span class="cm-string">'de'</span>, <span class="cm-string">'zhi2'</span>, <span class="cm-string">'wu4'</span>]</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">PRED <span class="cm-number">1</span>: 上 海 植 物 园 有 各 种 各 样 的 植 物</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">PRED SCORE: <span class="cm-attribute">-0</span>.9450</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">PRED AVG SCORE: <span class="cm-attribute">-0</span>.0727, PRED PPL: <span class="cm-number">1</span>.0754</span></pre></div></div></div></div></div><div style="position: absolute; height: 0px; width: 1px; border-bottom: 0px solid transparent; top: 1224px;"></div><div class="CodeMirror-gutters" style="display: none; height: 1224px;"></div></div></div></pre></li><li><p>Coming soon...</p></li></ol><p>Coming soon!</p><h2><a name='header-n13' class='md-header-anchor '></a>Dependencies</h2><figure><table><thead><tr><th style='text-align:center;' ><strong><a 
href='https://github.com/horovod/horovod'>Horovod</a></strong></th><th style='text-align:center;' ><strong>A popular, general-purpose distributed deep learning framework; see the Horovod GitHub repository for details</strong></th></tr></thead><tbody><tr><td style='text-align:center;' ><strong><a href='https://github.com/pytorch/pytorch'>PyTorch</a></strong></td><td style='text-align:center;' ><strong>The deep learning framework used by this repository</strong></td></tr></tbody></table></figure><p>These are the core libraries used to build the model. To run the model on your machine, you also need:</p><ol start='' ><li>tensorboardX</li><li>tqdm</li><li>scipy, numpy</li></ol><p>To install these dependencies, run:</p><pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="bash"><div class="CodeMirror cm-s-inner CodeMirror-wrap" lang="bash"><div style="overflow: hidden; position: relative; width: 3px; height: 0px; top: 0px; left: 4px;"><textarea autocorrect="off" autocapitalize="off" spellcheck="false" tabindex="0" style="position: absolute; bottom: -1em; padding: 0px; width: 1000px; height: 1em; outline: none;"></textarea></div><div class="CodeMirror-scrollbar-filler" cm-not-content="true"></div><div class="CodeMirror-gutter-filler" cm-not-content="true"></div><div class="CodeMirror-scroll" tabindex="-1"><div class="CodeMirror-sizer" style="margin-left: 0px; margin-bottom: 0px; border-right-width: 0px; padding-right: 0px; padding-bottom: 0px;"><div style="position: relative; top: 0px;"><div class="CodeMirror-lines" role="presentation"><div role="presentation" style="position: relative; outline: none;"><div class="CodeMirror-measure"><pre><span>xxxxxxxxxx</span></pre></div><div class="CodeMirror-measure"></div><div style="position: relative; z-index: 1;"></div><div class="CodeMirror-code" role="presentation"><div class="CodeMirror-activeline" style="position: relative;"><div class="CodeMirror-activeline-background CodeMirror-linebackground"></div><div class="CodeMirror-gutter-background CodeMirror-activeline-gutter" style="left: 0px; width: 0px;"></div><pre class=" CodeMirror-line " role="presentation"><span role="presentation" 
style="padding-right: 0.1px;">pip install <span class="cm-attribute">-r</span> requirments.txt</span></pre></div></div></div></div></div></div><div style="position: absolute; height: 0px; width: 1px; border-bottom: 0px solid transparent; top: 17px;"></div><div class="CodeMirror-gutters" style="display: none; height: 17px;"></div></div></div></pre><p>We recommend creating a fresh environment with virtualenv; alternatively, you can create one with conda:</p><pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="bash"><div class="CodeMirror cm-s-inner CodeMirror-wrap" lang="bash"><div style="overflow: hidden; position: relative; width: 3px; height: 0px; top: 0px; left: 4px;"><textarea autocorrect="off" autocapitalize="off" spellcheck="false" tabindex="0" style="position: absolute; bottom: -1em; padding: 0px; width: 1000px; height: 1em; outline: none;"></textarea></div><div class="CodeMirror-scrollbar-filler" cm-not-content="true"></div><div class="CodeMirror-gutter-filler" cm-not-content="true"></div><div class="CodeMirror-scroll" tabindex="-1"><div class="CodeMirror-sizer" style="margin-left: 0px; margin-bottom: 0px; border-right-width: 0px; padding-right: 0px; padding-bottom: 0px;"><div style="position: relative; top: 0px;"><div class="CodeMirror-lines" role="presentation"><div role="presentation" style="position: relative; outline: none;"><div class="CodeMirror-measure"><pre><span>xxxxxxxxxx</span></pre></div><div class="CodeMirror-measure"></div><div style="position: relative; z-index: 1;"></div><div class="CodeMirror-code" role="presentation"><div class="CodeMirror-activeline" style="position: relative;"><div class="CodeMirror-activeline-background CodeMirror-linebackground"></div><div class="CodeMirror-gutter-background CodeMirror-activeline-gutter" style="left: 0px; width: 0px;"></div><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">conda create <span class="cm-attribute">-n</span> YOUR_NEW_ENV <span class="cm-def">python</span><span 
class="cm-operator">=</span><span class="cm-number">3</span>.7</span></pre></div></div></div></div></div></div><div style="position: absolute; height: 0px; width: 1px; border-bottom: 0px solid transparent; top: 17px;"></div><div class="CodeMirror-gutters" style="display: none; height: 17px;"></div></div></div></pre><h2><a name='header-n33' class='md-header-anchor '></a>Running the training code</h2><ol start='' ><li><p>DFCNN</p><p>To train without loading a pretrained model:</p><pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="bash"><div class="CodeMirror cm-s-inner CodeMirror-wrap" lang="bash"><div style="overflow: hidden; position: relative; width: 3px; height: 0px; top: 0px; left: 4px;"><textarea autocorrect="off" autocapitalize="off" spellcheck="false" tabindex="0" style="position: absolute; bottom: -1em; padding: 0px; width: 1000px; height: 1em; outline: none;"></textarea></div><div class="CodeMirror-scrollbar-filler" cm-not-content="true"></div><div class="CodeMirror-gutter-filler" cm-not-content="true"></div><div class="CodeMirror-scroll" tabindex="-1"><div class="CodeMirror-sizer" style="margin-left: 0px; margin-bottom: 0px; border-right-width: 0px; padding-right: 0px; padding-bottom: 0px;"><div style="position: relative; top: 0px;"><div class="CodeMirror-lines" role="presentation"><div role="presentation" style="position: relative; outline: none;"><div class="CodeMirror-measure"><pre><span>xxxxxxxxxx</span></pre></div><div class="CodeMirror-measure"></div><div style="position: relative; z-index: 1;"></div><div class="CodeMirror-code" role="presentation"><div class="CodeMirror-activeline" style="position: relative;"><div class="CodeMirror-activeline-background CodeMirror-linebackground"></div><div class="CodeMirror-gutter-background CodeMirror-activeline-gutter" style="left: 0px; width: 0px;"></div><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">horovodrun <span class="cm-attribute">-np</span> 
YOUR_WORKER_NUMBERS <span class="cm-attribute">-H</span> localhost:YOUR_WORKER_NUMBERS python train.py \</span></pre></div><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; &nbsp; &nbsp; &nbsp;<span class="cm-attribute">--data_type</span> YOUR_DATA_TYPE <span class="cm-attribute">--model_path</span> YOUR_MODEL_PATH <span class="cm-attribute">--model_name</span> YOUR_MODEL_NAME \</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; &nbsp; &nbsp; &nbsp;<span class="cm-attribute">--gpu_rank</span> YOUR_WORKER_NUMBERS <span class="cm-attribute">--epochs</span> <span class="cm-number">1000</span> <span class="cm-attribute">--save_step</span> <span class="cm-number">20</span> <span class="cm-attribute">--batch_size</span> YOUR_BATCH_SIZE</span></pre></div></div></div></div></div><div style="position: absolute; height: 0px; width: 1px; border-bottom: 0px solid transparent; top: 51px;"></div><div class="CodeMirror-gutters" style="display: none; height: 51px;"></div></div></div></pre><p>To load a pretrained model, add --load_model.</p><p>YOUR_WORKER_NUMBERS is the number of worker processes (the value passed to -np).</p><p>YOUR_DATA_TYPE selects the dataset: all, thchs, or aishell (primewords and st-cmds will be added later).</p><p>Set GPU_RANK (--gpu_rank) according to the number of GPUs you have.</p></li></ol><h2><a name='header-n43' class='md-header-anchor '></a>Running the prediction code</h2><p>Coming soon!</p><h2><a name='header-n45' class='md-header-anchor '></a>Getting the data</h2><p>Coming soon!</p><h2><a name='header-n47' class='md-header-anchor '></a>Loss functions</h2><ol start='' ><li>The standard loss in speech recognition is <a href='ftp://ftp.idsia.ch/pub/juergen/icml2006.pdf'>ctc_loss</a>, and this repository mainly uses ctc_loss for sequence modeling. Fortunately, PyTorch 1.1.0 ships a built-in ctc_loss that is easy to use, so it is not described further here.</li><li>Cross Entropy Loss is a popular and effective loss for classification with one-hot targets. However, the outputs of the output layer do not correspond one-to-one to the target pinyin; the mapping is many-to-one, so plain CrossEntropy is of little use here. We therefore drew on a recent CVPR 2019 paper, <a href='https://arxiv.org/pdf/1904.08364.pdf'>Aggregation Cross-Entropy for Sequence 
Recognition</a>: following the paper, we adapted its released code to the ASR setting; see ace.py (not yet tested).</li><li>Attention mechanism</li></ol><h2><a name='header-n56' class='md-header-anchor '></a>Decoders</h2><ol start='' ><li>This repository includes a simple BeamSearch decoder; see BeamSearch.py. It is slow, taking about 25 seconds to decode (BeamWidth=10), so it is not recommended.</li><li>This repository also provides a decoder based on the one from Baidu DeepSpeech2; see ctcDecode.py. Baidu's decoder is character-level; this repository modifies that file to make it word-level. It is much faster, taking about 1.5 s at BeamWidth=30. Readers can compare the two themselves.</li></ol><h2><a name='header-n62' class='md-header-anchor '></a>Acknowledgements</h2><ol start='' ><li><a href='https://github.com/nl8590687/ASRT_SpeechRecognition'>ASRT_SpeechRecognition</a> Thanks to AiLemon for the open-source ASR code, which was a valuable reference when I built the base model.</li><li><a href='https://github.com/PaddlePaddle/DeepSpeech'>DeepSpeech2, Baidu</a></li><li><a href='https://arxiv.org/pdf/1904.08364.pdf'>Aggregation Cross-Entropy for Sequence Recognition</a></li></ol><p>&nbsp;</p></div>
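<p>The many-to-one mapping between output frames and target pinyin mentioned in the loss section, and the decoders discussed above, can be illustrated with a minimal sketch of greedy (best-path) CTC decoding. This is not the repository's BeamSearch.py or ctcDecode.py: it is a simpler baseline that takes the best label per frame, merges consecutive repeats, and drops blanks. The blank index of 0, the function name <code>ctc_greedy_decode</code>, and the toy vocabulary are assumptions made for illustration only.</p>

```python
from typing import List, Sequence

BLANK = 0  # assumption: index of the CTC blank symbol; the repository may use another index


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], blank: int = BLANK) -> List[int]:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    decoded: List[int] = []
    previous = blank
    for scores in frame_scores:
        # Pick the highest-scoring label for this frame.
        best = max(range(len(scores)), key=lambda i: scores[i])
        # Emit a label only when it differs from the previous frame and is not blank.
        if best != previous and best != blank:
            decoded.append(best)
        previous = best
    return decoded


# Hypothetical toy vocabulary and scores, for illustration only.
vocab = ["_", "shang4", "hai3"]
frames = [
    [0.10, 0.80, 0.10],  # shang4
    [0.20, 0.70, 0.10],  # shang4 again (repeat, merged)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.80],  # hai3
    [0.10, 0.10, 0.80],  # hai3 again (repeat, merged)
]
pinyin = [vocab[i] for i in ctc_greedy_decode(frames)]
print(pinyin)
```

<p>Five frames collapse to two pinyin tokens here, which is the many-to-one behaviour that makes a frame-wise cross entropy unsuitable. A beam search decoder keeps several candidate prefixes per frame instead of only the single best label, which is why its cost grows with BeamWidth.</p>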
</body>
</html>
MIT License Copyright (c) 2019 ramosmy Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

About

This is a sub-repository, still under construction, for building an acoustic model for Mandarin speech recognition.