Java – Hadoop: can you silently discard failed map tasks?
I am using Hadoop MapReduce to process a large amount of data. The problem is that, occasionally, a corrupted file causes a map task to fail, for example with a Java heap space error.
If possible, I would like to abandon whatever that map task is doing, kill it, and carry on with the rest of the job. I don't mind losing the data; I just don't want the whole MapReduce job to fail.
Is this possible in Hadoop?
Solution
You can change the mapred.max.map.failures.percent parameter. The default value is 0. Increasing it allows a certain percentage of map tasks to fail without causing the job to fail.
You can set this parameter in mapred-site.xml (where it applies to all jobs) or job by job (which is probably safer).
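As a rough sketch of the cluster-wide option, the entry in mapred-site.xml might look like the fragment below. The value 5 (percent) is only an illustrative assumption, not a figure from the answer; pick a threshold that matches how much data loss you can tolerate. On Hadoop 2 and later the property is named mapreduce.map.failures.maxpercent, and the old name is kept as a deprecated alias.

```xml
<!-- mapred-site.xml: place inside the existing <configuration> element -->
<property>
  <name>mapred.max.map.failures.percent</name>
  <!-- 5 is an illustrative value (assumption): up to 5% of map tasks may fail -->
  <value>5</value>
</property>
```

For the per-job route, if the job driver is run through ToolRunner, the same property can typically be passed on the command line as -D mapred.max.map.failures.percent=5, which leaves other jobs on the cluster unaffected.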