[HADOOP] How do I submit a mapreduce job to a remote cluster configured with YARN?
I am trying to run a simple mapreduce program from Eclipse. Here is my program.
package wordcount;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class WordCount {
    public static void main(String[] args) throws Exception {
        // Point the client at the remote cluster (Cloudera quickstart VM).
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://quickstart.cloudera:8020");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.address", "quickstart.cloudera:8032");
        conf.set("yarn.app.mapreduce.am.staging-dir", "/user");

        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCount.class);

        // Jars added to the job classpath to work around the ClassNotFoundExceptions.
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/hadoop-mapreduce-client-app-2.6.0-cdh5.7.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/hadoop-yarn-common-2.6.0-cdh5.7.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/hadoop-common-2.6.0-cdh5.7.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/hadoop-yarn-api-2.6.0-cdh5.7.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.7.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/hadoop-mapreduce-client-common-2.6.0-cdh5.7.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/commons-logging-1.2.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/guava-15.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/commons-collections-3.2.2.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/protobuf-java-2.5.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/commons-configuration-1.7.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/commons-lang-2.6.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/log4j-1.2.16.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/slf4j-api-1.7.5.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/slf4j-log4j12-1.7.5.jar"));

        // WordCountMapper and WordCountReducer live in the same package (not shown here).
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path("/user/cloudera/prasad/test.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/user/cloudera/prasad/wordout2"));

        // Submit the job, wait for it to finish, and exit non-zero if it fails.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
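When I first ran the program above, ClassNotFoundExceptions appeared in the container logs, so I added all the corresponding jars, as shown in the program. The container logs no longer show any errors, but the mapreduce job still fails.
The resource manager, however, shows the error below: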
Exception from container-launch with container ID: container_1473338609943_0003_01_000001 and exit code: 1
ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
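A note on reading the error above: the container ID ends in 000001, which on YARN is normally the first container of the application attempt, i.e. the one running the MapReduce ApplicationMaster rather than a map or reduce task. The application ID can be derived from the container ID and passed to `yarn logs -applicationId` to pull the aggregated logs, assuming log aggregation is enabled on the cluster. The following is only a small sketch of that derivation. The class and method names (`ContainerIdInfo`, `applicationId`, `isFirstContainer`) are hypothetical, and the pattern only covers the ID format shown in this post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ContainerIdInfo {
    // Assumed layout: container_<clusterTimestamp>_<appSequence>_<attempt>_<containerNumber>
    // (matches the ID in this post; other ID variants are not handled).
    private static final Pattern CONTAINER_ID =
            Pattern.compile("container_(\\d+)_(\\d+)_(\\d+)_(\\d+)");

    private static Matcher parse(String containerId) {
        Matcher m = CONTAINER_ID.matcher(containerId);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized container ID: " + containerId);
        }
        return m;
    }

    // Builds the application ID from the cluster timestamp and application sequence number.
    public static String applicationId(String containerId) {
        Matcher m = parse(containerId);
        return "application_" + m.group(1) + "_" + m.group(2);
    }

    // True when this is container number 1 of the attempt (normally the ApplicationMaster).
    public static boolean isFirstContainer(String containerId) {
        return Long.parseLong(parse(containerId).group(4)) == 1L;
    }

    public static void main(String[] args) {
        String id = "container_1473338609943_0003_01_000001";
        // prints "yarn logs -applicationId application_1473338609943_0003"
        System.out.println("yarn logs -applicationId " + applicationId(id));
        // prints "first container of attempt: true"
        System.out.println("first container of attempt: " + isFirstContainer(id));
    }
}
```

If the failing container really is the ApplicationMaster, the place to look is probably the classpath and configuration the AM is launched with on the cluster side, not the mapper or reducer code.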
When I click on the application logs, nothing is shown except the following messages:
Log Type: stderr
Log Upload Time: Thu Sep 08 05:26:35 -0700 2016
Log Length: 243
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.impl.MetricsSystemImpl).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Log Type: stdout
Log Upload Time: Thu Sep 08 05:26:35 -0700 2016
Log Length: 0
Please let me know what is wrong with my program.
Below is a screenshot of the libraries I am using.
Solution
from https://stackoverflow.com/questions/39396103/how-to-sumit-a-mapreduce-job-to-remote-cluster-configured-with-yarn by cc-by-sa and MIT license