Hadoop Lab: HDFS and MapReduce Operations
I. Objectives
1. Build a cluster on virtual machines and deploy Hadoop.
2. Perform HDFS file operations and program against the HDFS file API.
3. Develop, publish, and invoke MapReduce parallel programs.
II. Lab Content
1. Build a virtual-machine cluster and deploy Hadoop
Use VMware, CentOS-7, and Xshell (or SecureCRT) to build the cluster and deploy Hadoop; for the detailed steps, see
www.bilibili/video/BV1Kf4y1z7Nw?p=1
2. HDFS File Operations
(1) Verify HDFS file commands on the distributed file system
[-ls <path>]  // list all files in the directory at the target path
[-du <path>]  // show, in bytes, the size of every file in the directory, or the size of the file itself if path is a file
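For example, assuming the cluster from step 1 is running and a /user directory exists (the path here is only a placeholder), the two commands can be exercised from the shell as follows:
hdfs dfs -ls /user
hdfs dfs -du /user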
(2) HDFS file API operations
Call the HDFS file API to access files in the distributed file system, e.g., to create, modify, and delete them. Source code:
package mapreduce;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.log4j.BasicConfigurator;
public class test {
    public static void main(String[] args) {
        try {
            Configuration conf = new Configuration();
            // Obtain a handle to the file system named in the cluster configuration
            FileSystem fs = FileSystem.get(conf);
            String filename = "hdfs://node01:8020/";
            // Create the file and write a short message into it
            FSDataOutputStream os = fs.create(new Path(filename));
            byte[] buff = "hello world!".getBytes();
            os.write(buff, 0, buff.length);
            os.close();
            System.out.println("Create " + filename);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Run results:
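Besides creation, the description above also mentions modifying and deleting files. A minimal sketch of reading the file back and then deleting it, under the same configuration (the class name and path are only placeholders), could look like this:
package mapreduce;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class TestReadDelete {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("hdfs://node01:8020/test.txt"); // placeholder path
        // Read the file contents back and print them
        FSDataInputStream is = fs.open(path);
        byte[] buff = new byte[1024];
        int len = is.read(buff);
        if (len > 0) {
            System.out.println(new String(buff, 0, len));
        }
        is.close();
        // Delete the file (the second argument enables recursive deletion)
        fs.delete(path, true);
        fs.close();
    }
}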
3. MapReduce Parallel Program Development
1) Find the maximum temperature for each year
Source code:
package mapreduce;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class Temperature {
    static class TempMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            System.out.print("Before Mapper: " + key + ", " + value);
            // The year occupies the first four characters of each line; the temperature starts at index 8
            String line = value.toString();
            String year = line.substring(0, 4);
            int temperature = Integer.parseInt(line.substring(8));
            context.write(new Text(year), new IntWritable(temperature));
            System.out.println("======" + "After Mapper:" + new Text(year) + ", " + new IntWritable(temperature));
        }
    }
    static class TempReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int maxValue = Integer.MIN_VALUE;
            StringBuffer sb = new StringBuffer();
            // Scan every temperature recorded for this year and keep the maximum
            for (IntWritable value : values) {
                maxValue = Math.max(maxValue, value.get());
                sb.append(value).append(",");
            }
            System.out.print("Before Reduce: " + key + ", " + sb.toString());
            context.write(key, new IntWritable(maxValue));
            System.out.println("======" + "After Reduce: " + key + ", " + maxValue);
        }
    }
    public static void main(String[] args) throws Exception {
        String dst = "hdfs://node01:8020/user/";
        String dstOut = "hdfs://node01:8020/user/zzy/output";
        Configuration hadoopConfig = new Configuration();
        // Bind the hdfs:// and file:// URI schemes to their FileSystem implementations
        hadoopConfig.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
        hadoopConfig.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
        Job job = new Job(hadoopConfig);
        // job.setJarByClass(NewMaxTemperature.class);
        FileInputFormat.addInputPath(job, new Path(dst));
        FileOutputFormat.setOutputPath(job, new Path(dstOut));
        job.setMapperClass(TempMapper.class);
        job.setReducerClass(TempReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.waitForCompletion(true);
        System.out.println("Finished");
    }
}
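The mapper takes the year from the first four characters of each line (substring(0, 4)) and parses the temperature from the ninth character onward (substring(8)), so the input under /user is assumed to consist of fixed-width records such as the following illustrative lines (hypothetical values, not the actual experiment data):
2014010114
2014010216
2015010649
With input of this shape the reducer emits one maximum per year, e.g. 2014 with 16 and 2015 with 49. To run on the cluster, the class is typically packaged into a jar (with job.setJarByClass(Temperature.class) enabled so the job's classes can be located) and submitted with hadoop jar <jar-file> mapreduce.Temperature; the result file then appears under hdfs://node01:8020/user/zzy/output.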
Run results:
