Hadoop temporary file recovery mechanism
Hadoop provides a mechanism for recovering the temporary files created during the execution of MapReduce jobs. These files hold the intermediate data generated during the map and reduce phases of a job. By default, Hadoop deletes them once the job completes successfully; if the job fails or is interrupted, however, they may be left behind on the cluster.
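To make that lifecycle concrete, here is a minimal job driver sketch. The /user/alice paths are placeholders, and the comments describe the default FileOutputCommitter behaviour, under which task output is staged in a _temporary subdirectory of the job's output path until the job commits.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TempFileDemo {
    public static void main(String[] args) throws Exception {
        // Identity map and reduce are used here, since only the
        // temporary-file layout is of interest.
        Job job = Job.getInstance(new Configuration(), "temp-file-demo");
        job.setJarByClass(TempFileDemo.class);
        FileInputFormat.addInputPath(job, new Path("/user/alice/demo-in"));
        // While tasks run, FileOutputCommitter stages their output under
        //   /user/alice/demo-out/_temporary/...
        // When the job commits, task files are promoted into demo-out and
        // the _temporary tree is deleted; if the job dies first, the tree
        // can be left behind, which is the situation the recovery
        // mechanism addresses.
        FileOutputFormat.setOutputPath(job, new Path("/user/alice/demo-out"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}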
To prevent these orphaned temporary files from accumulating on the cluster, Hadoop provides a recovery mechanism built around a temporary file recovery directory. When a MapReduce job starts, Hadoop creates this directory on the local filesystem of the jobtracker and uses it to store the temporary files created while the job runs.
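The directories involved are configurable. The sketch below is a hedged example, assuming an MRv1-era (JobTracker) deployment; the property names date from that generation of Hadoop, and the paths are illustrative values rather than defaults.

import org.apache.hadoop.conf.Configuration;

public class TempDirSettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Base directory from which Hadoop derives most other temporary paths.
        conf.set("hadoop.tmp.dir", "/data/hadoop/tmp");
        // Comma-separated local directories where intermediate map output is spilled.
        conf.set("mapred.local.dir", "/data/1/mapred/local,/data/2/mapred/local");
        // Filesystem root under which per-job staging directories are created.
        conf.set("mapreduce.jobtracker.staging.root.dir", "/user");
        System.out.println("tmp dir = " + conf.get("hadoop.tmp.dir"));
    }
}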
If the job fails or is interrupted, the jobtracker will attempt to recover the temporary files from the temporary file recovery directory. It does this by scanning the directory for files whose names match the temporary files created during the execution of the job. When it finds a matching file, it copies the file to its own local filesystem and then deletes it from the recovery directory.
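The scan-and-copy step can be pictured with the illustrative sketch below. This is not a real JobTracker API: the recoverTempFiles helper, the recovery directory layout, and the job-id matching rule are all assumptions made for illustration. Only the FileSystem and FileUtil calls are standard Hadoop client API.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class TempFileRecovery {
    // Hypothetical helper: copy files belonging to the failed job out of
    // the recovery directory, then delete the originals.
    static void recoverTempFiles(Configuration conf, Path recoveryDir,
                                 Path localDir, String jobId) throws IOException {
        FileSystem fs = recoveryDir.getFileSystem(conf);
        FileSystem local = FileSystem.getLocal(conf);
        for (FileStatus stat : fs.listStatus(recoveryDir)) {
            // Task attempt files embed the job id,
            // e.g. attempt_<jobId>_m_000000_0 (assumed naming).
            if (stat.getPath().getName().contains(jobId)) {
                FileUtil.copy(fs, stat.getPath(), local,
                              new Path(localDir, stat.getPath().getName()),
                              /* deleteSource = */ true, conf);
            }
        }
    }
}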
Once the jobtracker has recovered the temporary files, it attempts to restart the job from the point at which it failed or was interrupted.
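Whether a restarted daemon resumes an in-flight job rather than discarding it is governed by recovery settings. A sketch, assuming the MRv1 property mapred.jobtracker.restart.recover and its MRv2 counterpart yarn.app.mapreduce.am.job.recovery.enable:

import org.apache.hadoop.conf.Configuration;

public class RecoveryConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // MRv1: let a restarted JobTracker recover jobs that were running
        // when it went down instead of discarding them.
        conf.setBoolean("mapred.jobtracker.restart.recover", true);
        // MRv2 equivalent: allow a restarted MapReduce ApplicationMaster
        // to resume completed tasks rather than re-running the whole job.
        conf.setBoolean("yarn.app.mapreduce.am.job.recovery.enable", true);
    }
}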
The temporary file recovery mechanism is a valuable feature: it helps prevent orphaned temporary files from accumulating on the cluster, and it can improve overall job throughput by reducing the time spent rerunning work for jobs that failed or were interrupted.