Multiple ways to write the driver of a Hadoop program: which one to choose?

I have noticed that there are multiple ways to write the driver method of a Hadoop program.

The following method is given in the Yahoo Hadoop tutorial:

  public void run(String inputPath, String outputPath) throws Exception {
    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName("wordcount");

    // the keys are words (strings)
    conf.setOutputKeyClass(Text.class);
    // the values are counts (ints)
    conf.setOutputValueClass(IntWritable.class);

    conf.setMapperClass(MapClass.class);
    conf.setReducerClass(Reduce.class);

    FileInputFormat.addInputPath(conf, new Path(inputPath));
    FileOutputFormat.setOutputPath(conf, new Path(outputPath));

    JobClient.runJob(conf);
  }

This method is given in the O'Reilly book Hadoop: The Definitive Guide (2012):

public static void main(String[] args) throws Exception {
  if (args.length != 2) {
    System.err.println("Usage: MaxTemperature <input path> <output path>");
    System.exit(-1);
  }
  Job job = new Job();
  job.setJarByClass(MaxTemperature.class);
  job.setJobName("Max temperature");
  FileInputFormat.addInputPath(job, new Path(args[0]));
  FileOutputFormat.setOutputPath(job, new Path(args[1]));
  job.setMapperClass(MaxTemperatureMapper.class);
  job.setReducerClass(MaxTemperatureReducer.class);
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(IntWritable.class);
  System.exit(job.waitForCompletion(true) ? 0 : 1);
}

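While trying the program given in the O'Reilly book, I found that the constructors of the Job class are deprecated. Since the O'Reilly book is based on Hadoop 2 (YARN), I was surprised to see that it uses deprecated constructors.

I would like to know which method everyone uses?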

Answers

==============================
1. I use the former approach. If we go with overriding the run() method, we can use hadoop jar options such as -D, -libjars, -files, etc. All of these are very much needed in almost any Hadoop project. I am not sure whether we can use them through the main() method.

==============================

2. Slightly different from your first (Yahoo) block - you should use the ToolRunner / Tool classes, which take advantage of the GenericOptionsParser (as noted in Eswara's answer).

A template pattern would be something like:

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ToolExample extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // old API
        JobConf jobConf = new JobConf(getConf());

        // new API (in Hadoop 2, Job.getInstance replaces the deprecated Job constructors)
        Job job = Job.getInstance(getConf());

        // rest of your config here

        // determine success / failure (depending on your choice of old / new api)
        // return 0 for success, non-zero for an error
        return 0;
    }

    public static void main(String args[]) throws Exception {
        System.exit(ToolRunner.run(new ToolExample(), args));
    }
}
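
To make the flow concrete, here is a minimal plain-Java sketch of what this pattern does conceptually: generic options are peeled off the command line into a configuration, and only the remaining arguments reach run(). It is an illustration under loud assumptions, not Hadoop code. MiniToolRunnerSketch, MiniTool, Parsed and parse are hypothetical names invented here, the configuration is just a Map, and only the -D key=value form is handled, whereas the real GenericOptionsParser also understands options such as -libjars and -files.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration only; none of these types are Hadoop classes. */
public class MiniToolRunnerSketch {

    /** Hypothetical stand-in for the role played by Hadoop's Tool interface. */
    interface MiniTool {
        int run(Map<String, String> conf, List<String> args);
    }

    /** Result of splitting a command line into options and remaining arguments. */
    static final class Parsed {
        final Map<String, String> conf = new HashMap<>();
        final List<String> remaining = new ArrayList<>();
    }

    /**
     * Consumes leading "-D key=value" or "-Dkey=value" options and stops at the
     * first argument that is not a -D option (a simplification for this sketch).
     */
    static Parsed parse(String[] args) {
        Parsed parsed = new Parsed();
        int i = 0;
        while (i < args.length) {
            String pair;
            if (args[i].equals("-D") && i + 1 < args.length) {
                pair = args[i + 1];
                i += 2;
            } else if (args[i].startsWith("-D") && args[i].length() > 2) {
                pair = args[i].substring(2);
                i += 1;
            } else {
                break;
            }
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + pair);
            }
            parsed.conf.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        // Everything after the options belongs to the application itself.
        for (; i < args.length; i++) {
            parsed.remaining.add(args[i]);
        }
        return parsed;
    }

    /** Parses the options, then hands the configuration and leftover args to the tool. */
    static int run(MiniTool tool, String[] args) {
        Parsed parsed = parse(args);
        return tool.run(parsed.conf, parsed.remaining);
    }

    public static void main(String[] args) {
        // Sample command line; the property name is only an example value.
        String[] sample = {"-D", "mapreduce.job.reduces=2", "input", "output"};
        MiniTool tool = (conf, rest) -> {
            if (rest.size() != 2) {
                System.err.println("Usage: [-D key=value ...] <input path> <output path>");
                return 2;
            }
            System.out.println("conf=" + conf + ", input=" + rest.get(0) + ", output=" + rest.get(1));
            return 0;
        };
        System.out.println("exit code: " + run(tool, sample));
    }
}
```

The point of the design is that the driver's run() never has to parse -D itself: it only sees the input and output paths, while the settings arrive through the configuration. That is the convenience the first answer describes.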

from https://stackoverflow.com/questions/16184227/multiple-ways-to-write-driver-of-hadoop-program-which-one-to-choose by cc-by-sa and MIT license
