[HADOOP] Writing to Hive from MapReduce (initialize HCatOutputFormat)

I wrote an MR job that loads data from HBase and dumps it into Hive. Connecting to HBase works fine, but when I try to save the data into the Hive table, I get the following error message:

 Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.JavaMain], main() threw exception, org.apache.hive.hcatalog.common.HCatException : 2004 : HCatOutputFormat not initialized, setOutput has to be called
  org.apache.oozie.action.hadoop.JavaMainException: org.apache.hive.hcatalog.common.HCatException : 2004 : HCatOutputFormat not initialized, setOutput has to be called
  at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:58)
  at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:38)
  at org.apache.oozie.action.hadoop.JavaMain.main(JavaMain.java:36)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
  Caused by: org.apache.hive.hcatalog.common.HCatException : 2004 : HCatOutputFormat not initialized, setOutput has to be called
  at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getJobInfo(HCatBaseOutputFormat.java:118)
  at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getTableSchema(HCatBaseOutputFormat.java:61)
  at com.nrholding.t0_mr.main.DumpProductViewsAggHive.run(DumpProductViewsAggHive.java:254)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
  at com.nrholding.t0_mr.main.DumpProductViewsAggHive.main(DumpProductViewsAggHive.java:268)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:55)
  ... 15 more

I have already checked that:

Here is my run method:

@Override
public int run(String[] args) throws Exception {

    // Create configuration
    Configuration conf = this.getConf();
    String databaseName = null;
    String tableName = "test";

    // Parse arguments
    String[] otherArgs = new GenericOptionsParser(conf,args).getRemainingArgs();
    getParams(otherArgs);

    // Better to pass the ZooKeeper quorum as a CLI parameter:
    //   -D hbase.zookeeper.quorum=<zookeeper servers>
    conf.set("hbase.zookeeper.quorum",
            "cz-dc1-s-132.mall.local,cz-dc1-s-133.mall.local,"
            + "cz-dc1-s-134.mall.local,cz-dc1-s-135.mall.local,"
            + "cz-dc1-s-136.mall.local");

    // Create job
    Job job = Job.getInstance(conf, NAME);
    job.setJarByClass(DumpProductViewsAggHive.class);


    // Setup MapReduce job
    job.setReducerClass(Reducer.class);
    //job.setNumReduceTasks(0); // If reducer is not needed

    // Specify key / value
    job.setOutputKeyClass(Writable.class);
    job.setOutputValueClass(DefaultHCatRecord.class);

    // Input
    getInput(null, dateFrom, dateTo, job, caching, table);

    // Output
    // Ignore the key for the reducer output; emitting an HCatalog record as value
    job.setOutputFormatClass(HCatOutputFormat.class);

    HCatOutputFormat.setOutput(job, OutputJobInfo.create(databaseName, tableName, null));
    HCatSchema s = HCatOutputFormat.getTableSchema(conf);
    System.err.println("INFO: output schema explicitly set for writing:" + s);
    HCatOutputFormat.setSchema(job, s);

    // Execute job and return status
    return job.waitForCompletion(true) ? 0 : 1;
}

Does anyone have an idea how to fix this? Thanks!

Solutions

  1. Use:

    HCatSchema s = HCatOutputFormat.getTableSchema(job.getConfiguration());
    
  2. OK, I used the deprecated method:

    HCatSchema s = HCatOutputFormat.getTableSchema(job);
    

    instead of:

    HCatSchema s = HCatOutputFormat.getTableSchema(conf);
    

    and it seems to work.

  from https://stackoverflow.com/questions/24558943/writing-to-hive-from-mapreduce-initialize-hcatoutputformat by cc-by-sa and MIT license
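Both fixes work for the same underlying reason: `Job.getInstance(conf, NAME)` takes a copy of the Configuration it is given, and `HCatOutputFormat.setOutput(job, ...)` stores the serialized output description in the job's copy. Reading the schema back through the original `conf` therefore finds nothing and raises the 2004 "not initialized" error. A minimal plain-Java sketch of that copy semantics, with `java.util.Properties` standing in for Hadoop's `Configuration` and a made-up key name, purely for illustration:

```java
import java.util.Properties;

// Properties stands in for Hadoop's Configuration; "hcat.output.info" is an
// illustrative key name, not the real one used by HCatalog.
public class ConfigCopyDemo {

    // Equivalent of Job.getInstance(conf): the job works on a COPY.
    static Properties newJobConfiguration(Properties original) {
        Properties copy = new Properties();
        copy.putAll(original);
        return copy;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        Properties jobConf = newJobConfiguration(conf);

        // HCatOutputFormat.setOutput(job, ...) writes into the job's copy only:
        jobConf.setProperty("hcat.output.info", "serialized-output-job-info");

        // Reading through the original conf: not initialized (null) --
        // this is what getTableSchema(conf) was doing.
        System.out.println(conf.getProperty("hcat.output.info"));

        // Reading through the job's own configuration: initialized --
        // this is what getTableSchema(job.getConfiguration()) does.
        System.out.println(jobConf.getProperty("hcat.output.info"));
    }
}
```

This is why the order of calls in the question was fine; only the object the schema was read from was wrong.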