1 Star 0 Fork 0

wangye707/work

加入 Gitee
与超过 1200万 开发者一起发现、参与优秀开源项目,私有仓库也完全免费 :)
免费加入
该仓库未声明开源许可证文件(LICENSE),使用请关注具体项目描述及其代码上游依赖。
克隆/下载
reduce.py 1.11 KB
一键复制 编辑 原始数据 按行查看 历史
wangye707 提交于 2019-06-14 15:04 . Add files via upload
#!D:/workplace/python
# -*- coding: utf-8 -*-
# @File : reduce.py
# @Author: WangYe
# @Date : 2018/3/5
# @Software: PyCharm
from operator import itemgetter
import sys
current_word = None
current_count = 0
word = None
# input comes from STDIN
for line in sys.stdin:
# remove leading and trailing whitespace
line = line.strip()
# parse the input we got from mapper.py
word, count = line.split('\t', 1)
# convert count (currently a string) to int
try:
count = int(count)
except ValueError:
# count was not a number, so silently
# ignore/discard this line
continue
# this IF-switch only works because Hadoop sorts map output
# by key (here: word) before it is passed to the reducer
if current_word == word:
current_count += count
else:
if current_word:
# write result to STDOUT
print ('%s\t%s' % (current_word, current_count))
current_count = count
current_word = word
# do not forget to output the last word if needed!
if current_word == word:
print ('%s\t%s' % (current_word, current_count))
马建仓 AI 助手
尝试更多
代码解读
代码找茬
代码优化
Python
1
https://gitee.com/wangye707/work.git
git@gitee.com:wangye707/work.git
wangye707
work
work
master

搜索帮助