#!/usr/bin/env python3
# -*- coding: utf-8 -*-


# Basic configuration
class basic_config:
    def __init__(self):
        self.raw_data_dir = './raw_data'       # directory holding the raw data; may contain multiple files
        self.tokenized_data_dir = './data'     # directory for the normalized (tokenized) data
        self.log_dir = './log'
        self.model_save_dir = './out'          # directory where trained models are saved
        # Language of the source and target text: 'cn' for Chinese, 'en' for English
        self.source_language_type = 'cn'
        self.target_language_type = 'cn'
        self.special_symbol = True             # whether tokens matching "<[a-zA-Z]+>" are treated as special symbols
        self.source_language_lower = True      # whether to lowercase English text
        self.target_language_lower = True
        # Vocabulary
        self.vocab_remains = ['<pad>', '<unk>', '<s>', '</s>']
        self.max_source_length = 64
        self.max_target_length = 64
        # Data
        self.num_epochs = 50
        self.batch_size = 2048                 # number of tokens (characters) per training batch
        self.boundary_scale = 2                # growth factor of the length buckets (see the bucketing sketch below)
        self.min_boundary = 4                  # shortest bucket boundary

        # Parameters for the base Transformer model.
        # Model params
        self.initializer_gain = 1.0            # used in trainable variable initialization
        self.hidden_size = 512                 # model dimension in the hidden layers
        self.num_hidden_layers = 6             # number of layers in the encoder and decoder stacks
        self.num_heads = 8                     # number of heads in multi-headed attention
        self.filter_size = 2048                # inner layer dimensionality of the feed-forward network
        # Dropout values (only used when training)
        self.layer_postprocess_dropout = 0.2
        self.attention_dropout = 0.2
        self.relu_dropout = 0.2
        # Training params
        self.label_smoothing = 0.1
        self.learning_rate = 1.0
        self.learning_rate_decay_rate = 1.0
        self.learning_rate_warmup_steps = 16000  # see the learning-rate schedule sketch below
        # Optimizer params
        self.optimizer_adam_beta1 = 0.9
        self.optimizer_adam_beta2 = 0.997
        self.optimizer_adam_epsilon = 1e-09
        # Default prediction params
        self.extra_decode_length = 50
        self.beam_size = 4
        self.alpha = 0.6                       # length-normalization strength in beam search (see sketch below)
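
# --- Illustrative sketch (not part of the original repo) ---------------------
# batch_size above counts tokens, and boundary_scale / min_boundary suggest
# length-bucketed batching as in the reference TensorFlow Transformer input
# pipeline. The helper below is a hedged guess at how such bucket boundaries
# could be derived; the function name and exact rule are assumptions, not the
# repo's actual code.
def example_bucket_boundaries(max_length=64, min_boundary=4, boundary_scale=2):
    """Return bucket upper bounds, e.g. [4, 8, 16, 32, 64] for the defaults above."""
    bounds = []
    x = min_boundary
    while x < max_length:
        bounds.append(x)
        x = max(x + 1, int(x * boundary_scale))
    bounds.append(max_length)
    return bounds
# Under this scheme each bucket would hold roughly batch_size // boundary
# examples, so every batch carries a similar total number of tokens.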
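
# --- Illustrative sketch (not part of the original repo) ---------------------
# learning_rate, learning_rate_warmup_steps and hidden_size match the inputs of
# the Noam schedule commonly paired with the base Transformer: scale by
# hidden_size**-0.5, linear warmup, then inverse-square-root decay. This is a
# hedged sketch of that schedule, not necessarily the exact code used here.
def example_learning_rate(step, cfg=None):
    cfg = cfg or basic_config()
    lr = cfg.learning_rate * cfg.hidden_size ** -0.5
    lr *= min(1.0, step / cfg.learning_rate_warmup_steps)    # linear warmup
    lr *= max(step, cfg.learning_rate_warmup_steps) ** -0.5  # rsqrt decay
    return lr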
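
# --- Illustrative sketch (not part of the original repo) ---------------------
# alpha is the length-normalization exponent applied when scoring beam-search
# hypotheses. The GNMT-style penalty below is the form typically used with this
# parameter; treat it as an assumption about the decoder, not a quote of it.
def example_length_normalized_score(log_prob, length, alpha=0.6):
    """Higher alpha favours longer outputs; alpha = 0 disables normalization."""
    length_penalty = ((5.0 + length) / 6.0) ** alpha
    return log_prob / length_penalty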