
东方佑/acoustic_model

<!doctype html>
<html>
<head>
<meta charset='UTF-8'><meta name='viewport' content='width=device-width, initial-scale=1'>
<title>README</title><style type='text/css'>html {overflow-x: initial !important;}:root { --bg-color:#ffffff; --text-color:#333333; --select-text-bg-color:#B5D6FC; --select-text-font-color:auto; --monospace:"Lucida Console",Consolas,"Courier",monospace; }
html { font-size: 14px; background-color: var(--bg-color); color: var(--text-color); font-family: "Helvetica Neue", Helvetica, Arial, sans-serif; -webkit-font-smoothing: antialiased; }
body { margin: 0px; padding: 0px; height: auto; bottom: 0px; top: 0px; left: 0px; right: 0px; font-size: 1rem; line-height: 1.42857; overflow-x: hidden; background: inherit; }
iframe { margin: auto; }
a.url { word-break: break-all; }
a:active, a:hover { outline: 0px; }
.in-text-selection, ::selection { text-shadow: none; background: var(--select-text-bg-color); color: var(--select-text-font-color); }
#write { margin: 0px auto; height: auto; width: inherit; word-break: normal; word-wrap: break-word; position: relative; white-space: normal; overflow-x: visible; padding-top: 40px; }
#write.first-line-indent p { text-indent: 2em; }
#write.first-line-indent li p, #write.first-line-indent p * { text-indent: 0px; }
#write.first-line-indent li { margin-left: 2em; }
.for-image #write { padding-left: 8px; padding-right: 8px; }
body.typora-export { padding-left: 30px; padding-right: 30px; }
.typora-export .footnote-line, .typora-export li, .typora-export p { white-space: pre-wrap; }
@media screen and (max-width: 500px) {
body.typora-export { padding-left: 0px; padding-right: 0px; }
#write { padding-left: 20px; padding-right: 20px; }
.CodeMirror-sizer { margin-left: 0px !important; }
.CodeMirror-gutters { display: none !important; }
}
#write li > figure:first-child { margin-top: -20px; }
#write ol, #write ul { position: relative; }
img { max-width: 100%; vertical-align: middle; }
button, input, select, textarea { color: inherit; font-style: inherit; font-variant: inherit; font-weight: inherit; font-stretch: inherit; font-size: inherit; line-height: inherit; font-family: inherit; }
input[type="checkbox"], input[type="radio"] { line-height: normal; padding: 0px; }
*, ::after, ::before { box-sizing: border-box; }
#write h1, #write h2, #write h3, #write h4, #write h5, #write h6, #write p, #write pre { width: inherit; }
#write h1, #write h2, #write h3, #write h4, #write h5, #write h6, #write p { position: relative; }
h1, h2, h3, h4, h5, h6 { break-after: avoid-page; break-inside: avoid; orphans: 2; }
p { orphans: 4; }
h1 { font-size: 2rem; }
h2 { font-size: 1.8rem; }
h3 { font-size: 1.6rem; }
h4 { font-size: 1.4rem; }
h5 { font-size: 1.2rem; }
h6 { font-size: 1rem; }
.md-math-block, .md-rawblock, h1, h2, h3, h4, h5, h6, p { margin-top: 1rem; margin-bottom: 1rem; }
.hidden { display: none; }
.md-blockmeta { color: rgb(204, 204, 204); font-weight: 700; font-style: italic; }
a { cursor: pointer; }
sup.md-footnote { padding: 2px 4px; background-color: rgba(238, 238, 238, 0.7); color: rgb(85, 85, 85); border-radius: 4px; cursor: pointer; }
sup.md-footnote a, sup.md-footnote a:hover { color: inherit; text-transform: inherit; text-decoration: inherit; }
#write input[type="checkbox"] { cursor: pointer; width: inherit; height: inherit; }
figure { overflow-x: auto; margin: 1.2em 0px; max-width: calc(100% + 16px); padding: 0px; }
figure > table { margin: 0px !important; }
tr { break-inside: avoid; break-after: auto; }
thead { display: table-header-group; }
table { border-collapse: collapse; border-spacing: 0px; width: 100%; overflow: auto; break-inside: auto; text-align: left; }
table.md-table td { min-width: 32px; }
.CodeMirror-gutters { border-right: 0px; background-color: inherit; }
.CodeMirror-linenumber { user-select: none; }
.CodeMirror { text-align: left; }
.CodeMirror-placeholder { opacity: 0.3; }
.CodeMirror pre { padding: 0px 4px; }
.CodeMirror-lines { padding: 0px; }
div.hr:focus { cursor: none; }
#write pre { white-space: pre-wrap; }
#write.fences-no-line-wrapping pre { white-space: pre; }
#write pre.ty-contain-cm { white-space: normal; }
.CodeMirror-gutters { margin-right: 4px; }
.md-fences { font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; overflow: visible; white-space: pre; background: inherit; position: relative !important; }
.md-diagram-panel { width: 100%; margin-top: 10px; text-align: center; padding-top: 0px; padding-bottom: 8px; overflow-x: auto; }
#write .md-fences.mock-cm { white-space: pre-wrap; }
.md-fences.md-fences-with-lineno { padding-left: 0px; }
#write.fences-no-line-wrapping .md-fences.mock-cm { white-space: pre; overflow-x: auto; }
.md-fences.mock-cm.md-fences-with-lineno { padding-left: 8px; }
.CodeMirror-line, twitterwidget { break-inside: avoid; }
.footnotes { opacity: 0.8; font-size: 0.9rem; margin-top: 1em; margin-bottom: 1em; }
.footnotes + .footnotes { margin-top: 0px; }
.md-reset { margin: 0px; padding: 0px; border: 0px; outline: 0px; vertical-align: top; background: 0px 0px; text-decoration: none; text-shadow: none; float: none; position: static; width: auto; height: auto; white-space: nowrap; cursor: inherit; -webkit-tap-highlight-color: transparent; line-height: normal; font-weight: 400; text-align: left; box-sizing: content-box; direction: ltr; }
li div { padding-top: 0px; }
blockquote { margin: 1rem 0px; }
li .mathjax-block, li p { margin: 0.5rem 0px; }
li { margin: 0px; position: relative; }
blockquote > :last-child { margin-bottom: 0px; }
blockquote > :first-child, li > :first-child { margin-top: 0px; }
.footnotes-area { color: rgb(136, 136, 136); margin-top: 0.714rem; padding-bottom: 0.143rem; white-space: normal; }
#write .footnote-line { white-space: pre-wrap; }
@media print {
body, html { border: 1px solid transparent; height: 99%; break-after: avoid; break-before: avoid; }
#write { margin-top: 0px; padding-top: 0px; border-color: transparent !important; }
.typora-export * { -webkit-print-color-adjust: exact; }
html.blink-to-pdf { font-size: 13px; }
.typora-export #write { padding-left: 32px; padding-right: 32px; padding-bottom: 0px; break-after: avoid; }
.typora-export #write::after { height: 0px; }
@page { margin: 20mm 0px; }
}
.footnote-line { margin-top: 0.714em; font-size: 0.7em; }
a img, img a { cursor: pointer; }
pre.md-meta-block { font-size: 0.8rem; min-height: 0.8rem; white-space: pre-wrap; background: rgb(204, 204, 204); display: block; overflow-x: hidden; }
p > .md-image:only-child:not(.md-img-error) img, p > img:only-child { display: block; margin: auto; }
p > .md-image:only-child { display: inline-block; width: 100%; }
#write .MathJax_Display { margin: 0.8em 0px 0px; }
.md-math-block { width: 100%; }
.md-math-block:not(:empty)::after { display: none; }
[contenteditable="true"]:active, [contenteditable="true"]:focus { outline: 0px; box-shadow: none; }
.md-task-list-item { position: relative; list-style-type: none; }
.task-list-item.md-task-list-item { padding-left: 0px; }
.md-task-list-item > input { position: absolute; top: 0px; left: 0px; margin-left: -1.2em; margin-top: calc(1em - 10px); border: none; }
.math { font-size: 1rem; }
.md-toc { min-height: 3.58rem; position: relative; font-size: 0.9rem; border-radius: 10px; }
.md-toc-content { position: relative; margin-left: 0px; }
.md-toc-content::after, .md-toc::after { display: none; }
.md-toc-item { display: block; color: rgb(65, 131, 196); }
.md-toc-item a { text-decoration: none; }
.md-toc-inner:hover { text-decoration: underline; }
.md-toc-inner { display: inline-block; cursor: pointer; }
.md-toc-h1 .md-toc-inner { margin-left: 0px; font-weight: 700; }
.md-toc-h2 .md-toc-inner { margin-left: 2em; }
.md-toc-h3 .md-toc-inner { margin-left: 4em; }
.md-toc-h4 .md-toc-inner { margin-left: 6em; }
.md-toc-h5 .md-toc-inner { margin-left: 8em; }
.md-toc-h6 .md-toc-inner { margin-left: 10em; }
@media screen and (max-width: 48em) {
.md-toc-h3 .md-toc-inner { margin-left: 3.5em; }
.md-toc-h4 .md-toc-inner { margin-left: 5em; }
.md-toc-h5 .md-toc-inner { margin-left: 6.5em; }
.md-toc-h6 .md-toc-inner { margin-left: 8em; }
}
a.md-toc-inner { font-size: inherit; font-style: inherit; font-weight: inherit; line-height: inherit; }
.footnote-line a:not(.reversefootnote) { color: inherit; }
.md-attr { display: none; }
.md-fn-count::after { content: "."; }
code, pre, samp, tt { font-family: var(--monospace); }
kbd { margin: 0px 0.1em; padding: 0.1em 0.6em; font-size: 0.8em; color: rgb(36, 39, 41); background: rgb(255, 255, 255); border: 1px solid rgb(173, 179, 185); border-radius: 3px; box-shadow: rgba(12, 13, 14, 0.2) 0px 1px 0px, rgb(255, 255, 255) 0px 0px 0px 2px inset; white-space: nowrap; vertical-align: middle; }
.md-comment { color: rgb(162, 127, 3); opacity: 0.8; font-family: var(--monospace); }
code { text-align: left; vertical-align: initial; }
a.md-print-anchor { white-space: pre !important; border-width: initial !important; border-style: none !important; border-color: initial !important; display: inline-block !important; position: absolute !important; width: 1px !important; right: 0px !important; outline: 0px !important; background: 0px 0px !important; text-decoration: initial !important; text-shadow: initial !important; }
.md-inline-math .MathJax_SVG .noError { display: none !important; }
.html-for-mac .inline-math-svg .MathJax_SVG { vertical-align: 0.2px; }
.md-math-block .MathJax_SVG_Display { text-align: center; margin: 0px; position: relative; text-indent: 0px; max-width: none; max-height: none; min-height: 0px; min-width: 100%; width: auto; overflow-y: hidden; display: block !important; }
.MathJax_SVG_Display, .md-inline-math .MathJax_SVG_Display { width: auto; margin: inherit; display: inline-block !important; }
.MathJax_SVG .MJX-monospace { font-family: var(--monospace); }
.MathJax_SVG .MJX-sans-serif { font-family: sans-serif; }
.MathJax_SVG { display: inline; font-style: normal; font-weight: 400; line-height: normal; zoom: 90%; text-indent: 0px; text-align: left; text-transform: none; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; }
.MathJax_SVG * { transition: none; }
.MathJax_SVG_Display svg { vertical-align: middle !important; margin-bottom: 0px !important; }
.os-windows.monocolor-emoji .md-emoji { font-family: "Segoe UI Symbol", sans-serif; }
.md-diagram-panel > svg { max-width: 100%; }
[lang="mermaid"] svg, [lang="flow"] svg { max-width: 100%; }
[lang="mermaid"] .node text { font-size: 1rem; }
table tr th { border-bottom: 0px; }
video { max-width: 100%; display: block; margin: 0px auto; }
iframe { max-width: 100%; width: 100%; border: none; }
.highlight td, .highlight tr { border: 0px; }
.CodeMirror { height: auto; }
.CodeMirror.cm-s-inner { background: inherit; }
.CodeMirror-scroll { overflow-y: hidden; overflow-x: auto; z-index: 3; }
.CodeMirror-gutter-filler, .CodeMirror-scrollbar-filler { background-color: rgb(255, 255, 255); }
.CodeMirror-gutters { border-right: 1px solid rgb(221, 221, 221); background: inherit; white-space: nowrap; }
.CodeMirror-linenumber { padding: 0px 3px 0px 5px; text-align: right; color: rgb(153, 153, 153); }
.cm-s-inner .cm-keyword { color: rgb(119, 0, 136); }
.cm-s-inner .cm-atom, .cm-s-inner.cm-atom { color: rgb(34, 17, 153); }
.cm-s-inner .cm-number { color: rgb(17, 102, 68); }
.cm-s-inner .cm-def { color: rgb(0, 0, 255); }
.cm-s-inner .cm-variable { color: rgb(0, 0, 0); }
.cm-s-inner .cm-variable-2 { color: rgb(0, 85, 170); }
.cm-s-inner .cm-variable-3 { color: rgb(0, 136, 85); }
.cm-s-inner .cm-string { color: rgb(170, 17, 17); }
.cm-s-inner .cm-property { color: rgb(0, 0, 0); }
.cm-s-inner .cm-operator { color: rgb(152, 26, 26); }
.cm-s-inner .cm-comment, .cm-s-inner.cm-comment { color: rgb(170, 85, 0); }
.cm-s-inner .cm-string-2 { color: rgb(255, 85, 0); }
.cm-s-inner .cm-meta { color: rgb(85, 85, 85); }
.cm-s-inner .cm-qualifier { color: rgb(85, 85, 85); }
.cm-s-inner .cm-builtin { color: rgb(51, 0, 170); }
.cm-s-inner .cm-bracket { color: rgb(153, 153, 119); }
.cm-s-inner .cm-tag { color: rgb(17, 119, 0); }
.cm-s-inner .cm-attribute { color: rgb(0, 0, 204); }
.cm-s-inner .cm-header, .cm-s-inner.cm-header { color: rgb(0, 0, 255); }
.cm-s-inner .cm-quote, .cm-s-inner.cm-quote { color: rgb(0, 153, 0); }
.cm-s-inner .cm-hr, .cm-s-inner.cm-hr { color: rgb(153, 153, 153); }
.cm-s-inner .cm-link, .cm-s-inner.cm-link { color: rgb(0, 0, 204); }
.cm-negative { color: rgb(221, 68, 68); }
.cm-positive { color: rgb(34, 153, 34); }
.cm-header, .cm-strong { font-weight: 700; }
.cm-del { text-decoration: line-through; }
.cm-em { font-style: italic; }
.cm-link { text-decoration: underline; }
.cm-error { color: red; }
.cm-invalidchar { color: red; }
.cm-constant { color: rgb(38, 139, 210); }
.cm-defined { color: rgb(181, 137, 0); }
div.CodeMirror span.CodeMirror-matchingbracket { color: rgb(0, 255, 0); }
div.CodeMirror span.CodeMirror-nonmatchingbracket { color: rgb(255, 34, 34); }
.cm-s-inner .CodeMirror-activeline-background { background: inherit; }
.CodeMirror { position: relative; overflow: hidden; }
.CodeMirror-scroll { height: 100%; outline: 0px; position: relative; box-sizing: content-box; background: inherit; }
.CodeMirror-sizer { position: relative; }
.CodeMirror-gutter-filler, .CodeMirror-hscrollbar, .CodeMirror-scrollbar-filler, .CodeMirror-vscrollbar { position: absolute; z-index: 6; display: none; }
.CodeMirror-vscrollbar { right: 0px; top: 0px; overflow: hidden; }
.CodeMirror-hscrollbar { bottom: 0px; left: 0px; overflow: hidden; }
.CodeMirror-scrollbar-filler { right: 0px; bottom: 0px; }
.CodeMirror-gutter-filler { left: 0px; bottom: 0px; }
.CodeMirror-gutters { position: absolute; left: 0px; top: 0px; padding-bottom: 30px; z-index: 3; }
.CodeMirror-gutter { white-space: normal; height: 100%; box-sizing: content-box; padding-bottom: 30px; margin-bottom: -32px; display: inline-block; }
.CodeMirror-gutter-wrapper { position: absolute; z-index: 4; background: 0px 0px !important; border: none !important; }
.CodeMirror-gutter-background { position: absolute; top: 0px; bottom: 0px; z-index: 4; }
.CodeMirror-gutter-elt { position: absolute; cursor: default; z-index: 4; }
.CodeMirror-lines { cursor: text; }
.CodeMirror pre { border-radius: 0px; border-width: 0px; background: 0px 0px; font-family: inherit; font-size: inherit; margin: 0px; white-space: pre; word-wrap: normal; color: inherit; z-index: 2; position: relative; overflow: visible; }
.CodeMirror-wrap pre { word-wrap: break-word; white-space: pre-wrap; word-break: normal; }
.CodeMirror-code pre { border-right: 30px solid transparent; width: fit-content; }
.CodeMirror-wrap .CodeMirror-code pre { border-right: none; width: auto; }
.CodeMirror-linebackground { position: absolute; left: 0px; right: 0px; top: 0px; bottom: 0px; z-index: 0; }
.CodeMirror-linewidget { position: relative; z-index: 2; overflow: auto; }
.CodeMirror-wrap .CodeMirror-scroll { overflow-x: hidden; }
.CodeMirror-measure { position: absolute; width: 100%; height: 0px; overflow: hidden; visibility: hidden; }
.CodeMirror-measure pre { position: static; }
.CodeMirror div.CodeMirror-cursor { position: absolute; visibility: hidden; border-right: none; width: 0px; }
.CodeMirror div.CodeMirror-cursor { visibility: hidden; }
.CodeMirror-focused div.CodeMirror-cursor { visibility: inherit; }
.cm-searching { background: rgba(255, 255, 0, 0.4); }
@media print {
.CodeMirror div.CodeMirror-cursor { visibility: hidden; }
}
.typora-export li, .typora-export p, .typora-export, .footnote-line {white-space: normal;}
</style>
</head>
<body class='typora-export' >
<div id='write' class = 'is-node'><div class='md-toc' mdtype='toc'><p class="md-toc-content"><span class="md-toc-item md-toc-h1" data-ref="n2"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n2">RamoSpeech Open-Source Acoustic Model</a></span><span class="md-toc-item md-toc-h2" data-ref="n3"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n3">Introduction</a></span><span class="md-toc-item md-toc-h2" data-ref="n5"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n5">Model</a></span><span class="md-toc-item md-toc-h2" data-ref="n13"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n13">Dependencies</a></span><span class="md-toc-item md-toc-h2" data-ref="n33"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n33">Running the Training Code</a></span><span class="md-toc-item md-toc-h2" data-ref="n43"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n43">Running the Prediction Code</a></span><span class="md-toc-item md-toc-h2" data-ref="n45"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n45">Obtaining the Data</a></span><span class="md-toc-item md-toc-h2" data-ref="n47"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n47">Loss Function</a></span><span class="md-toc-item md-toc-h2" data-ref="n56"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n56">Decoder</a></span><span class="md-toc-item md-toc-h2" data-ref="n62"><a class="md-toc-inner" style="cursor: pointer;" href="#header-n62">Credits</a></span></p></div><h1><a name='header-n2' class='md-header-anchor '></a>RamoSpeech Open-Source Acoustic Model</h1><h2><a name='header-n3' class='md-header-anchor '></a>Introduction</h2><p>RamoSpeech is an Automatic Speech Recognition (ASR) framework open-sourced by <a href='https://github.com/ramosmy'>ramosmy</a>. This repository holds the framework's acoustic model, which is written in <a href='https://github.com/pytorch/pytorch'>PyTorch</a> and built on an earlier architecture: DeepCNN + DeepLSTM + FC + CTC.</p><h2><a name='header-n5' class='md-header-anchor '></a>Model</h2><ol start='' ><li><p>DFCNN</p><pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="bash" 
style="break-inside: unset;"><div class="CodeMirror cm-s-inner CodeMirror-wrap" lang="bash"><div style="overflow: hidden; position: relative; width: 3px; height: 0px; top: 0px; left: 4px;"><textarea autocorrect="off" autocapitalize="off" spellcheck="false" tabindex="0" style="position: absolute; bottom: -1em; padding: 0px; width: 1000px; height: 1em; outline: none;"></textarea></div><div class="CodeMirror-scrollbar-filler" cm-not-content="true"></div><div class="CodeMirror-gutter-filler" cm-not-content="true"></div><div class="CodeMirror-scroll" tabindex="-1"><div class="CodeMirror-sizer" style="margin-left: 0px; margin-bottom: 0px; border-right-width: 0px; padding-right: 0px; padding-bottom: 0px;"><div style="position: relative; top: 0px;"><div class="CodeMirror-lines" role="presentation"><div role="presentation" style="position: relative; outline: none;"><div class="CodeMirror-measure"><span><span></span>x</span></div><div class="CodeMirror-measure"></div><div style="position: relative; z-index: 1;"></div><div class="CodeMirror-code" role="presentation" style=""><div class="CodeMirror-activeline" style="position: relative;"><div class="CodeMirror-activeline-background CodeMirror-linebackground"></div><div class="CodeMirror-gutter-background CodeMirror-activeline-gutter" style="left: 0px; width: 0px;"></div><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">AcousticModel(</span></pre></div><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> (dropout): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.5, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> (conv1): Sequential(</span></pre><pre class=" CodeMirror-line " role="presentation"><span 
role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv1_conv1): Conv2d(1, <span class="cm-number">32</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">bias</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv1_norm1): BatchNorm2d(32, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv1_relu1): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv1_dropout1): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv1_conv2): Conv2d(32, <span class="cm-number">32</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span 
class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv1_norm2): BatchNorm2d(32, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv1_relu2): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv1_maxpool): MaxPool2d<span class="cm-def">(kernel_size</span><span class="cm-operator">=</span><span class="cm-number">2</span>, <span class="cm-def">stride</span><span class="cm-operator">=</span><span class="cm-number">2</span>, <span class="cm-def">padding</span><span class="cm-operator">=</span><span class="cm-number">0</span>, <span class="cm-def">dilation</span><span class="cm-operator">=</span><span class="cm-number">1</span>, <span class="cm-def">ceil_mode</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv1_dropout2): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> )</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> (conv2): Sequential(</span></pre><pre class=" CodeMirror-line " role="presentation"><span 
role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv2_conv1): Conv2d(32, <span class="cm-number">64</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv2_norm1): BatchNorm2d(64, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv2_relu1): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv2_dropout1): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv2_conv2): Conv2d(64, <span class="cm-number">64</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " 
role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv2_norm2): BatchNorm2d(64, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv2_relu2): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv2_maxpool): MaxPool2d<span class="cm-def">(kernel_size</span><span class="cm-operator">=</span><span class="cm-number">2</span>, <span class="cm-def">stride</span><span class="cm-operator">=</span><span class="cm-number">2</span>, <span class="cm-def">padding</span><span class="cm-operator">=</span><span class="cm-number">0</span>, <span class="cm-def">dilation</span><span class="cm-operator">=</span><span class="cm-number">1</span>, <span class="cm-def">ceil_mode</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv2_dropout2): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> )</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> (conv3): Sequential(</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv3_conv1): Conv2d(64, 
<span class="cm-number">128</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv3_relu1): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv3_dropout1): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.2, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv3_norm1): BatchNorm2d(128, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv3_conv2): Conv2d(128, <span class="cm-number">128</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv3_norm2): 
BatchNorm2d(128, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv3_relu2): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv3_maxpool): MaxPool2d<span class="cm-def">(kernel_size</span><span class="cm-operator">=</span><span class="cm-number">2</span>, <span class="cm-def">stride</span><span class="cm-operator">=</span><span class="cm-number">2</span>, <span class="cm-def">padding</span><span class="cm-operator">=</span><span class="cm-number">0</span>, <span class="cm-def">dilation</span><span class="cm-operator">=</span><span class="cm-number">1</span>, <span class="cm-def">ceil_mode</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv3_dropout2): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.2, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> )</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> (conv4): Sequential(</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv4_conv1): Conv2d(128, <span class="cm-number">128</span>, <span class="cm-def">kernel_size</span><span 
class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv4_norm1): BatchNorm2d(128, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv4_relu1): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv4_dropout1): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.2, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv4_conv2): Conv2d(128, <span class="cm-number">128</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv4_relu2): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span 
role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv4_conv3): Conv2d(128, <span class="cm-number">128</span>, <span class="cm-def">kernel_size</span><span class="cm-operator">=</span>(3, <span class="cm-number">3</span>), <span class="cm-def">stride</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>), <span class="cm-def">padding</span><span class="cm-operator">=</span>(1, <span class="cm-number">1</span>))</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv4_norm2): BatchNorm2d(128, <span class="cm-def">eps</span><span class="cm-operator">=</span>1e-05, <span class="cm-def">momentum</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">affine</span><span class="cm-operator">=</span>True, <span class="cm-def">track_running_stats</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv4_relu3): ReLU()</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; (conv4_dropout2): Dropout<span class="cm-def">(p</span><span class="cm-operator">=</span><span class="cm-number">0</span>.2, <span class="cm-def">inplace</span><span class="cm-operator">=</span>False)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> )</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> (fc1): Linear<span class="cm-def">(in_features</span><span class="cm-operator">=</span><span class="cm-number">3200</span>, <span class="cm-def">out_features</span><span class="cm-operator">=</span><span class="cm-number">128</span>, <span class="cm-def">bias</span><span class="cm-operator">=</span>True)</span></pre><pre class=" 
CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> (fc2): Linear<span class="cm-def">(in_features</span><span class="cm-operator">=</span><span class="cm-number">256</span>, <span class="cm-def">out_features</span><span class="cm-operator">=</span><span class="cm-number">128</span>, <span class="cm-def">bias</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> (fc3): Linear<span class="cm-def">(in_features</span><span class="cm-operator">=</span><span class="cm-number">128</span>, <span class="cm-def">out_features</span><span class="cm-operator">=</span><span class="cm-number">1215</span>, <span class="cm-def">bias</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> (rnn): LSTM(128, <span class="cm-number">128</span>, <span class="cm-def">num_layers</span><span class="cm-operator">=</span><span class="cm-number">4</span>, <span class="cm-def">batch_first</span><span class="cm-operator">=</span>True, <span class="cm-def">dropout</span><span class="cm-operator">=</span><span class="cm-number">0</span>.1, <span class="cm-def">bidirectional</span><span class="cm-operator">=</span>True)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">)</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">[<span class="cm-string">'wang3'</span>, <span class="cm-string">'luo4'</span>, <span class="cm-string">'shang4'</span>, <span class="cm-string">'yi1'</span>, <span class="cm-string">'zhang1'</span>, <span class="cm-string">'yong3'</span>, <span class="cm-string">'jia1'</span>, <span class="cm-string">'qiao2'</span>, <span class="cm-string">'tou2'</span>, <span 
class="cm-string">'mo3'</span>, <span class="cm-string">'ji4'</span>, <span class="cm-string">'fan4'</span>, <span class="cm-string">'dian4'</span>, <span class="cm-string">'de'</span>, <span class="cm-string">'jie2'</span>, <span class="cm-string">'zhang4'</span>, <span class="cm-string">'dan1'</span>, <span class="cm-string">'shi2'</span>, <span class="cm-string">'fen1'</span>, <span class="cm-string">'yin3'</span>, <span class="cm-string">'ren2'</span>, <span class="cm-string">'zhu4'</span>, <span class="cm-string">'mu4'</span>]</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">[<span class="cm-string">'wang3'</span>, <span class="cm-string">'luo4'</span>, <span class="cm-string">'shang4'</span>, <span class="cm-string">'yi1'</span>, <span class="cm-string">'zhang1'</span>, <span class="cm-string">'yong3'</span>, <span class="cm-string">'jia1'</span>, <span class="cm-string">'qiao2'</span>, <span class="cm-string">'tou2'</span>, <span class="cm-string">'guo2'</span>, <span class="cm-string">'ji4'</span>, <span class="cm-string">'fan4'</span>, <span class="cm-string">'dian4'</span>, <span class="cm-string">'de'</span>, <span class="cm-string">'jie2'</span>, <span class="cm-string">'zhang4'</span>, <span class="cm-string">'dan1'</span>, <span class="cm-string">'shi2'</span>, <span class="cm-string">'fen1'</span>, <span class="cm-string">'yin3'</span>, <span class="cm-string">'ren2'</span>, <span class="cm-string">'zhu4'</span>, <span class="cm-string">'mu4'</span>]</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">Prediction using <span class="cm-number">1</span>.78259s</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"><span cm-text=""></span></span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 
0.1px;">&nbsp;psdz-SYS-4028GR-TR&nbsp;&nbsp;yufeng&nbsp;&nbsp;(e)&nbsp;speech&nbsp;&nbsp;~&nbsp;&nbsp;RamoSpeech&nbsp;&nbsp;<span class="cm-builtin">sh</span> speech2text.sh </span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">Using padding as &lt;PAD&gt; as <span class="cm-number">0</span></span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">Using unknown as &lt;UNK&gt; as <span class="cm-number">1</span></span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">Handling data_config/aishell_train.txt</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">120098it [00:00, <span class="cm-number">234720</span>.48it/s]</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">Handling data_config/thchs_train.txt</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">10000it [00:00, <span class="cm-number">127377</span>.65it/s]</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">loading test data:</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"><span class="cm-number">100</span>%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| <span class="cm-number">7176</span>/7176 [00:00&lt;00:00, <span class="cm-number">259723</span>.57it/s]</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">Prediction using <span class="cm-number">1</span>.63606s</span></pre><pre class=" CodeMirror-line " role="presentation"><span 
role="presentation" style="padding-right: 0.1px;">[2019-08-01 <span class="cm-number">14</span>:41:19,653 INFO] Translating shard <span class="cm-number">0</span>.</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"><span cm-text=""></span></span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">SENT <span class="cm-number">1</span>: [<span class="cm-string">'shang4'</span>, <span class="cm-string">'hai3'</span>, <span class="cm-string">'zhi2'</span>, <span class="cm-string">'wu4'</span>, <span class="cm-string">'yuan2'</span>, <span class="cm-string">'you3'</span>, <span class="cm-string">'ge4'</span>, <span class="cm-string">'zhong3'</span>, <span class="cm-string">'ge4'</span>, <span class="cm-string">'yang4'</span>, <span class="cm-string">'de'</span>, <span class="cm-string">'zhi2'</span>, <span class="cm-string">'wu4'</span>]</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">PRED <span class="cm-number">1</span>: 上 海 植 物 园 有 各 种 各 样 的 植 物</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">PRED SCORE: <span class="cm-attribute">-0</span>.9450</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">PRED AVG SCORE: <span class="cm-attribute">-0</span>.0727, PRED PPL: <span class="cm-number">1</span>.0754</span></pre></div></div></div></div></div><div style="position: absolute; height: 0px; width: 1px; border-bottom: 0px solid transparent; top: 1224px;"></div><div class="CodeMirror-gutters" style="display: none; height: 1224px;"></div></div></div></pre></li><li><p>Coming soon...</p></li></ol><p>Coming soon!</p><h2><a name='header-n13' class='md-header-anchor '></a>Dependencies</h2><figure><table><thead><tr><th style='text-align:center;' ><strong><a 
href='https://github.com/horovod/horovod'>Horovod</a></strong></th><th style='text-align:center;' ><strong>A widely used distributed deep learning framework; see the Horovod GitHub repository for details</strong></th></tr></thead><tbody><tr><td style='text-align:center;' ><strong><a href='https://github.com/pytorch/pytorch'>PyTorch</a></strong></td><td style='text-align:center;' ><strong>The deep learning framework used in this repository</strong></td></tr></tbody></table></figure><p>These are the core libraries needed to build the model. To run the model on your machine, you also need:</p><ol start='' ><li>tensorboardX</li><li>tqdm</li><li>scipy, numpy</li></ol><p>To install the packages above, run:</p><pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="bash"><div class="CodeMirror cm-s-inner CodeMirror-wrap" lang="bash"><div style="overflow: hidden; position: relative; width: 3px; height: 0px; top: 0px; left: 4px;"><textarea autocorrect="off" autocapitalize="off" spellcheck="false" tabindex="0" style="position: absolute; bottom: -1em; padding: 0px; width: 1000px; height: 1em; outline: none;"></textarea></div><div class="CodeMirror-scrollbar-filler" cm-not-content="true"></div><div class="CodeMirror-gutter-filler" cm-not-content="true"></div><div class="CodeMirror-scroll" tabindex="-1"><div class="CodeMirror-sizer" style="margin-left: 0px; margin-bottom: 0px; border-right-width: 0px; padding-right: 0px; padding-bottom: 0px;"><div style="position: relative; top: 0px;"><div class="CodeMirror-lines" role="presentation"><div role="presentation" style="position: relative; outline: none;"><div class="CodeMirror-measure"><pre><span>xxxxxxxxxx</span></pre></div><div class="CodeMirror-measure"></div><div style="position: relative; z-index: 1;"></div><div class="CodeMirror-code" role="presentation"><div class="CodeMirror-activeline" style="position: relative;"><div class="CodeMirror-activeline-background CodeMirror-linebackground"></div><div class="CodeMirror-gutter-background CodeMirror-activeline-gutter" style="left: 0px; width: 0px;"></div><pre class=" CodeMirror-line " role="presentation"><span role="presentation" 
style="padding-right: 0.1px;">pip install <span class="cm-attribute">-r</span> requirments.txt</span></pre></div></div></div></div></div></div><div style="position: absolute; height: 0px; width: 1px; border-bottom: 0px solid transparent; top: 17px;"></div><div class="CodeMirror-gutters" style="display: none; height: 17px;"></div></div></div></pre><p>We recommend creating a fresh environment with virtualenv; alternatively, you can create one with conda:</p><pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="bash"><div class="CodeMirror cm-s-inner CodeMirror-wrap" lang="bash"><div style="overflow: hidden; position: relative; width: 3px; height: 0px; top: 0px; left: 4px;"><textarea autocorrect="off" autocapitalize="off" spellcheck="false" tabindex="0" style="position: absolute; bottom: -1em; padding: 0px; width: 1000px; height: 1em; outline: none;"></textarea></div><div class="CodeMirror-scrollbar-filler" cm-not-content="true"></div><div class="CodeMirror-gutter-filler" cm-not-content="true"></div><div class="CodeMirror-scroll" tabindex="-1"><div class="CodeMirror-sizer" style="margin-left: 0px; margin-bottom: 0px; border-right-width: 0px; padding-right: 0px; padding-bottom: 0px;"><div style="position: relative; top: 0px;"><div class="CodeMirror-lines" role="presentation"><div role="presentation" style="position: relative; outline: none;"><div class="CodeMirror-measure"><pre><span>xxxxxxxxxx</span></pre></div><div class="CodeMirror-measure"></div><div style="position: relative; z-index: 1;"></div><div class="CodeMirror-code" role="presentation"><div class="CodeMirror-activeline" style="position: relative;"><div class="CodeMirror-activeline-background CodeMirror-linebackground"></div><div class="CodeMirror-gutter-background CodeMirror-activeline-gutter" style="left: 0px; width: 0px;"></div><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">conda create <span class="cm-attribute">-n</span> YOUR_NEW_ENV <span class="cm-def">python</span><span 
class="cm-operator">=</span><span class="cm-number">3</span>.7</span></pre></div></div></div></div></div></div><div style="position: absolute; height: 0px; width: 1px; border-bottom: 0px solid transparent; top: 17px;"></div><div class="CodeMirror-gutters" style="display: none; height: 17px;"></div></div></div></pre><h2><a name='header-n33' class='md-header-anchor '></a>Running the training code</h2><ol start='' ><li><p>DFCNN</p><p>To train without loading a pretrained model:</p><pre spellcheck="false" class="md-fences md-end-block ty-contain-cm modeLoaded" lang="bash"><div class="CodeMirror cm-s-inner CodeMirror-wrap" lang="bash"><div style="overflow: hidden; position: relative; width: 3px; height: 0px; top: 0px; left: 4px;"><textarea autocorrect="off" autocapitalize="off" spellcheck="false" tabindex="0" style="position: absolute; bottom: -1em; padding: 0px; width: 1000px; height: 1em; outline: none;"></textarea></div><div class="CodeMirror-scrollbar-filler" cm-not-content="true"></div><div class="CodeMirror-gutter-filler" cm-not-content="true"></div><div class="CodeMirror-scroll" tabindex="-1"><div class="CodeMirror-sizer" style="margin-left: 0px; margin-bottom: 0px; border-right-width: 0px; padding-right: 0px; padding-bottom: 0px;"><div style="position: relative; top: 0px;"><div class="CodeMirror-lines" role="presentation"><div role="presentation" style="position: relative; outline: none;"><div class="CodeMirror-measure"><pre><span>xxxxxxxxxx</span></pre></div><div class="CodeMirror-measure"></div><div style="position: relative; z-index: 1;"></div><div class="CodeMirror-code" role="presentation"><div class="CodeMirror-activeline" style="position: relative;"><div class="CodeMirror-activeline-background CodeMirror-linebackground"></div><div class="CodeMirror-gutter-background CodeMirror-activeline-gutter" style="left: 0px; width: 0px;"></div><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;">horovodrun <span class="cm-attribute">-np</span> 
YOUR_WORKER_NUMBERS <span class="cm-attribute">-H</span> localhost:YOUR_WORKER_NUMBERS python train.py \</span></pre></div><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; &nbsp; &nbsp; &nbsp;<span class="cm-attribute">--data_type</span> YOUR_DATA_TYPE <span class="cm-attribute">--model_path</span> YOUR_MODEL_PATH <span class="cm-attribute">--model_name</span> YOUR_MODEL_NAME \</span></pre><pre class=" CodeMirror-line " role="presentation"><span role="presentation" style="padding-right: 0.1px;"> &nbsp; &nbsp; &nbsp; &nbsp;<span class="cm-attribute">--gpu_rank</span> YOUR_WORKER_NUMBERS <span class="cm-attribute">--epochs</span> <span class="cm-number">1000</span> <span class="cm-attribute">--save_step</span> <span class="cm-number">20</span> <span class="cm-attribute">--batch_size</span> YOUR_BATCH_SIZE</span></pre></div></div></div></div></div><div style="position: absolute; height: 0px; width: 1px; border-bottom: 0px solid transparent; top: 51px;"></div><div class="CodeMirror-gutters" style="display: none; height: 51px;"></div></div></div></pre><p>To load a pretrained model, add --load_model.</p><p>YOUR_WORKER_NUMBERS is the number of workers (the value passed to horovodrun -np).</p><p>YOUR_DATA_TYPE selects the dataset: all, thchs, or aishell (primewords and st-cmds will be added later).</p><p>Choose the --gpu_rank value according to the number of GPUs you have.</p></li></ol><h2><a name='header-n43' class='md-header-anchor '></a>Running the prediction code</h2><p>Coming soon!</p><h2><a name='header-n45' class='md-header-anchor '></a>Getting the data</h2><p>Coming soon!</p><h2><a name='header-n47' class='md-header-anchor '></a>Loss functions</h2><ol start='' ><li>The standard loss in speech recognition is <a href='ftp://ftp.idsia.ch/pub/juergen/icml2006.pdf'>ctc_loss</a>, and this repository mainly uses it for sequence modeling. PyTorch 1.1.0 ships a built-in ctc_loss that is easy to use, so it is not described further here.</li><li>Cross-entropy loss is a popular choice for multi-class classification with one-hot targets. However, the outputs of the output layer do not correspond one-to-one to the target pinyin tokens; the mapping is many-to-one, so plain CrossEntropy is of little use here. We therefore drew on a CVPR 2019 paper, <a href='https://arxiv.org/pdf/1904.08364.pdf'>Aggregation Cross-Entropy for Sequence 
Recognition</a>, and adapted its released code to the ASR setting; see ace.py (not yet tested).</li><li>Attention mechanism</li></ol><h2><a name='header-n56' class='md-header-anchor '></a>Decoders</h2><ol start='' ><li>This repository includes a simple beam search decoder; see BeamSearch.py. It is slow, however: decoding takes about 25 seconds with BeamWidth=10, so it is not recommended.</li><li>This repository also provides a decoder taken from Baidu DeepSpeech2; see ctcDecode.py. Baidu's version is character-level; this repository modifies the file to make it word-level. The speedup is significant: decoding takes about 1.5 s with BeamWidth=30. Readers can compare the two decoders for themselves.</li></ol><h2><a name='header-n62' class='md-header-anchor '></a>References</h2><ol start='' ><li><a href='https://github.com/nl8590687/ASRT_SpeechRecognition'>ASRT_SpeechRecognition</a> Thanks to AiLemon for the open-source ASR code, which was a very useful reference when I built the baseline model.</li><li><a href='https://github.com/PaddlePaddle/DeepSpeech'>DeepSpeech2, Baidu</a></li><li><a href='https://arxiv.org/pdf/1904.08364.pdf'>Aggregation Cross-Entropy for Sequence Recognition</a></li></ol><p>&nbsp;</p></div>
</body>
</html>