[HADOOP] Debugging why a Hadoop job fails with varying input

There is a Hadoop job I'm trying to run, and when I specify the input as 28 iterations of my toy data, everything works perfectly; but when I crank it up to 29, it all crashes.

My guess is that there is nothing wrong with the logic of the code, since it works with 28 iterations but not with 29.

Here is what one such iteration of the input data looks like (the "iterations" are not to be confused with the input file; rather, the trailing 0101 denotes how many of those long numbers have been prepended). A small sketch of how such a line can be assembled follows the trace:

> evm --debug --code 7f00000000000000000000000000000000000000000000000000000000000000027f00000000000000000000000000000000000000000000000000000000000000027f00000000000000000000000000000000000000000000000000000000000000020101 run
opAdd      3903
opAdd      425
opStop       15
#### TRACE ####
PUSH32          pc=00000000 gas=9999999997 cost=3

PUSH32          pc=00000033 gas=9999999994 cost=3
Stack:
00000000  0000000000000000000000000000000000000000000000000000000000000002

PUSH32          pc=00000066 gas=9999999991 cost=3
Stack:
00000000  0000000000000000000000000000000000000000000000000000000000000002
00000001  0000000000000000000000000000000000000000000000000000000000000002

ADD             pc=00000099 gas=9999999988 cost=3
Stack:
00000000  0000000000000000000000000000000000000000000000000000000000000002
00000001  0000000000000000000000000000000000000000000000000000000000000002
00000002  0000000000000000000000000000000000000000000000000000000000000002

ADD             pc=00000100 gas=9999999985 cost=3
Stack:
00000000  0000000000000000000000000000000000000000000000000000000000000004
00000001  0000000000000000000000000000000000000000000000000000000000000002

STOP            pc=00000101 gas=9999999985 cost=0
Stack:
00000000  0000000000000000000000000000000000000000000000000000000000000006

#### LOGS ####
0x%  
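
As an aside, here is a minimal sketch (not part of the job itself) of how such an input line can be assembled. It assumes, as the trace suggests, that each iteration is a PUSH32 (opcode 7f) of the value 2 and that each trailing 01 byte is an ADD:

public class EvmInputBuilder {

    // PUSH32 of the value 2: opcode 7f followed by a 32-byte big-endian 2
    private static String push32Two() {
        StringBuilder sb = new StringBuilder("7f");
        for (int i = 0; i < 31; i++) sb.append("00");  // 31 leading zero bytes
        return sb.append("02").toString();             // low byte = 0x02
    }

    // N pushed operands followed by N-1 ADD (01) opcodes, matching the trace above
    public static String build(int iterations) {
        StringBuilder code = new StringBuilder();
        for (int i = 0; i < iterations; i++) code.append(push32Two());
        for (int i = 0; i < iterations - 1; i++) code.append("01");
        return code.toString();
    }

    public static void main(String[] args) {
        // build(3) reproduces the example above; 28 vs. 29 only changes the count
        System.out.println(build(3));
    }
}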

The code for the job being run looks like this:

import java.io.*;
import java.util.ArrayList;
import java.io.IOException;
import java.util.StringTokenizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.log4j.Logger;

public class ExecutionTimeTracker {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable>{

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    private final Logger LOG = org.apache.log4j.Logger.getLogger(this.getClass());

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());

      FileSplit fileSplit = (FileSplit)context.getInputSplit();
      String filename = fileSplit.getPath().getName();

      //to write the file name as key
      Text text = new Text();
      text.set(filename);
      LOG.warn("fileName: " + filename);

      try {
          // command execution
          String evmDir = "/home/ubuntu/go/src/github.com/ethereum/go-ethereum/build/bin/evm";
          String command = evmDir + " --debug --code " + value.toString() + " run";
          Process proc = Runtime.getRuntime().exec(command);
          BufferedReader stdInput = new BufferedReader(new InputStreamReader(proc.getInputStream()));

          // output data struct
          ArrayList<String> consoleOutput = new ArrayList<String>();
          String s = null;
          while ((s = stdInput.readLine()) != null) {
              consoleOutput.add(s);
          }
          for (String p : consoleOutput) {
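              // each timing line looks like "opAdd      3903": group(1) is the
              // opcode name and group(3) the number that is written as the value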
              Pattern pattern = Pattern.compile("([A-Za-z]+)([ \t]+)(\\d+)");
              Matcher matcher = pattern.matcher(p);
              while (matcher.find()) {
                  String groupThree = matcher.group(3);
                  IntWritable writeValue = new IntWritable(Integer.parseInt(groupThree));
                  context.write(text, writeValue);
              }
          }
          // close to prevent memory leak
          stdInput.close();
      } catch (IOException e) {
          LOG.warn("Exception Encountered!");
          LOG.warn(e);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "ExecutionTimeTracker");
    job.setJarByClass(ExecutionTimeTracker.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

The output for a successful job is shown below:

17/10/13 02:17:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/10/13 02:17:11 INFO client.RMProxy: Connecting to ResourceManager at master/172.31.46.70:8032
17/10/13 02:17:11 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/10/13 02:17:11 INFO input.FileInputFormat: Total input files to process : 1
17/10/13 02:17:12 INFO mapreduce.JobSubmitter: number of splits:1
17/10/13 02:17:12 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1507833515636_0006
17/10/13 02:17:12 INFO impl.YarnClientImpl: Submitted application application_1507833515636_0006
17/10/13 02:17:12 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1507833515636_0006/
17/10/13 02:17:12 INFO mapreduce.Job: Running job: job_1507833515636_0006
17/10/13 02:17:22 INFO mapreduce.Job: Job job_1507833515636_0006 running in uber mode : true
17/10/13 02:17:22 INFO mapreduce.Job:  map 100% reduce 0%
17/10/13 02:17:25 INFO mapreduce.Job:  map 100% reduce 100%
17/10/13 02:17:26 INFO mapreduce.Job: Job job_1507833515636_0006 completed successfully
17/10/13 02:17:26 INFO mapreduce.Job: Counters: 52
    File System Counters
        FILE: Number of bytes read=64
        FILE: Number of bytes written=112
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=4042
        HDFS: Number of bytes written=295682
        HDFS: Number of read operations=35
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=8
    Job Counters 
        Launched map tasks=1
        Launched reduce tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=501
        Total time spent by all reduces in occupied slots (ms)=2415
        TOTAL_LAUNCHED_UBERTASKS=2
        NUM_UBER_SUBMAPS=1
        NUM_UBER_SUBREDUCES=1
        Total time spent by all map tasks (ms)=501
        Total time spent by all reduce tasks (ms)=2415
        Total vcore-milliseconds taken by all map tasks=501
        Total vcore-milliseconds taken by all reduce tasks=2415
        Total megabyte-milliseconds taken by all map tasks=513024
        Total megabyte-milliseconds taken by all reduce tasks=2472960
    Map-Reduce Framework
        Map input records=1
        Map output records=56
        Map output bytes=448
        Map output materialized bytes=16
        Input split bytes=96
        Combine input records=56
        Combine output records=1
        Reduce input groups=1
        Reduce shuffle bytes=16
        Reduce input records=1
        Reduce output records=1
        Spilled Records=2
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=42
        CPU time spent (ms)=2060
        Physical memory (bytes) snapshot=971251712
        Virtual memory (bytes) snapshot=5902385152
        Total committed heap usage (bytes)=745537536
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=1903
    File Output Format Counters 
        Bytes Written=10

The full log from the slave node that performed this job can be found here.

Here is the output of the failed job:

17/10/12 20:42:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/10/12 20:42:42 INFO client.RMProxy: Connecting to ResourceManager at master/xxx.xxx.xxx.xxx:8032
17/10/12 20:42:42 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/10/12 20:42:42 INFO input.FileInputFormat: Total input files to process : 1
17/10/12 20:42:43 INFO mapreduce.JobSubmitter: number of splits:1
17/10/12 20:42:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1507833515636_0005
17/10/12 20:42:44 INFO impl.YarnClientImpl: Submitted application application_1507833515636_0005
17/10/12 20:42:44 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1507833515636_0005/
17/10/12 20:42:44 INFO mapreduce.Job: Running job: job_1507833515636_0005
17/10/12 20:42:49 INFO mapreduce.Job: Job job_1507833515636_0005 running in uber mode : true
17/10/12 20:42:49 INFO mapreduce.Job:  map 0% reduce 0%
17/10/12 20:43:01 INFO mapreduce.Job:  map 67% reduce 0%
17/10/12 20:53:19 INFO mapreduce.Job:  map 100% reduce 100%
17/10/12 20:53:19 INFO mapreduce.Job: Job job_1507833515636_0005 failed with state FAILED due to: Task failed task_1507833515636_0005_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

17/10/12 20:53:19 INFO mapreduce.Job: Counters: 18
    Job Counters 
        Failed map tasks=1
        Killed reduce tasks=1
        Launched map tasks=1
        Launched reduce tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=629774
        Total time spent by all reduces in occupied slots (ms)=1
        TOTAL_LAUNCHED_UBERTASKS=1
        NUM_UBER_SUBMAPS=1
        Total time spent by all map tasks (ms)=629774
        Total time spent by all reduce tasks (ms)=1
        Total vcore-milliseconds taken by all map tasks=629774
        Total vcore-milliseconds taken by all reduce tasks=1
        Total megabyte-milliseconds taken by all map tasks=644888576
        Total megabyte-milliseconds taken by all reduce tasks=1024
    Map-Reduce Framework
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0

The full error log output, as recorded by the slave node that ran the task, can be found here.

Since these jobs run in uber mode, most of the potential causes of this should already be ruled out. However, I haven't been able to put my finger on the specific problem yet, so I'm open to any suggestions and insights! :)

Maybe it has something to do with the memory bounds of each individual container?
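
If container memory is the suspect, one quick experiment is sketched below (not a diagnosis; the property names are standard MapReduce configuration keys, and the values are only illustrative): raise the per-task container memory and, optionally, run one comparison job with uber mode switched off. Whether either change affects the failure would at least narrow things down.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class MemoryTuningSketch {

  public static Job configureJob() throws Exception {
    Configuration conf = new Configuration();

    // Request larger containers for the map/reduce tasks (example values only).
    conf.setInt("mapreduce.map.memory.mb", 4096);
    conf.setInt("mapreduce.reduce.memory.mb", 4096);
    conf.set("mapreduce.map.java.opts", "-Xmx3276m");

    // Optionally run one comparison job with uber mode off, so the tasks
    // no longer execute inside the ApplicationMaster's container.
    conf.setBoolean("mapreduce.job.ubertask.enable", false);

    // Everything else stays exactly as in main() above.
    return Job.getInstance(conf, "ExecutionTimeTracker");
  }
}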

Here is what my configuration files look like:

mapred-site.xml:

<configuration>
  <property>
     <name>mapreduce.framework.name</name>
     <value>yarn</value>
  </property>
  <property>
     <name>mapreduce.job.ubertask.enable</name>
     <value>true</value>
  </property>
</configuration>

yarn-site.xml:

<configuration>
  <property>
     <name>yarn.nodemanager.aux-services</name>
     <value>mapreduce_shuffle</value>
  </property>
  <property>
     <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>40960</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
  <property>
     <name>yarn.nodemanager.vmem-check-enabled</name>
     <value>false</value>
     <description>Whether virtual memory limits will be enforced for containers</description>
  </property>
</configuration>

hdfs-site.xml:

<configuration>
  <property>
     <name>dfs.replication</name>
     <value>1</value>
  </property>
  <property>
     <name>dfs.namenode.name.dir</name>
     <value>file:/usr/local/hadoop_work/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:/usr/local/hadoop_work/hdfs/namesecondary</value>
  </property>
  <property>
     <name>dfs.datanode.data.dir</name>
     <value>file:/usr/local/hadoop_work/hdfs/datanode</value>
  </property>
  <property>
    <name>dfs.secondary.http.address</name>
    <value>xxx.xxx.xxx.xxx:50090</value>
  </property>
  <property>
     <name>dfs.block.size</name>
     <value>134217728</value>
     <description>Block size</description>
  </property>
</configuration>

Solution

    from https://stackoverflow.com/questions/46721969/debugging-why-a-hadoop-job-fails-with-varying-input by cc-by-sa and MIT license