Storing Apache Hadoop data output in a MySQL database

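I need to store the output of a map-reduce program in a database. Is there any way to do this?

If so, is it possible to store the output in multiple columns and tables, depending on the requirements?

Please suggest some solutions.

Thank you.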

Solution

==============================

1. There is a good example on this blog, and I have tried it. I quote the most important parts of the code.

First, you need to write a class that represents the data you want to store. The class must implement the DBWritable interface.

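import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;

public class DBOutputWritable implements Writable, DBWritable {
    private String name;
    private int count;

    public DBOutputWritable(String name, int count) {
        this.name = name;
        this.count = count;
    }

    // Left empty: the reducer hands this object straight to the database
    // writer, so Hadoop's binary serialization is not needed here.
    @Override
    public void readFields(DataInput in) throws IOException { }

    // Fills the fields from a result row (column positions are 1-based).
    @Override
    public void readFields(ResultSet rs) throws SQLException {
        name = rs.getString(1);
        count = rs.getInt(2);
    }

    // Left empty for the same reason as readFields(DataInput).
    @Override
    public void write(DataOutput out) throws IOException { }

    // Binds the fields to the INSERT statement, in output column order.
    @Override
    public void write(PreparedStatement ps) throws SQLException {
        ps.setString(1, name);
        ps.setInt(2, count);
    }
}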
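
The index passed to each setter in write(PreparedStatement) is the 1-based position of a placeholder in the INSERT statement, so it has to follow the column order configured for the output table. As a rough way to check that mapping without Hadoop or a database, the sketch below copies only the binding logic into a hypothetical stand-in class (WordCountRecord, which is not part of the original answer) and records the calls with a java.lang.reflect.Proxy in place of a real PreparedStatement. BindingCheck and recordBindings are likewise names invented for this illustration.

```java
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class BindingCheck {
    // Hypothetical stand-in for DBOutputWritable: same binding logic,
    // but without the Hadoop interfaces so it compiles on its own.
    static class WordCountRecord {
        private final String name;
        private final int count;

        WordCountRecord(String name, int count) {
            this.name = name;
            this.count = count;
        }

        void write(PreparedStatement ps) throws SQLException {
            ps.setString(1, name);  // first placeholder  -> "name" column
            ps.setInt(2, count);    // second placeholder -> "count" column
        }
    }

    // Runs write() against a recording proxy and returns the setter calls made.
    static List<String> recordBindings(WordCountRecord record) throws SQLException {
        List<String> calls = new ArrayList<>();
        PreparedStatement ps = (PreparedStatement) Proxy.newProxyInstance(
                BindingCheck.class.getClassLoader(),
                new Class<?>[] { PreparedStatement.class },
                (proxy, method, args) -> {
                    // Only two-argument setters are expected in this sketch.
                    calls.add(method.getName() + "(" + args[0] + ", " + args[1] + ")");
                    return null;
                });
        record.write(ps);
        return calls;
    }

    public static void main(String[] args) throws SQLException {
        // Prints: [setString(1, hadoop), setInt(2, 3)]
        System.out.println(recordBindings(new WordCountRecord("hadoop", 3)));
    }
}
```

If a column is added to the table later, the same kind of check makes it easy to spot a setter that was bound to the wrong position.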

In the Reducer, create objects of the class defined above.

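import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class Reduce extends Reducer<Text, IntWritable, DBOutputWritable, NullWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
            throws IOException, InterruptedException {
        // Sum all counts emitted by the mappers for this key.
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }

        // Emit one row per key. Exceptions propagate so that a failed write
        // fails the task instead of being silently swallowed.
        ctx.write(new DBOutputWritable(key.toString(), sum), NullWritable.get());
    }
}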

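Finally, you need to configure the connection to the database (do not forget to add the DB connector to the classpath) and register the input/output data types of the mapper and the reducer.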

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBInputFormat;
import org.apache.hadoop.mapreduce.lib.db.DBOutputFormat;

public class Main {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        DBConfiguration.configureDB(conf,
                "com.mysql.jdbc.Driver",              // driver class (com.mysql.cj.jdbc.Driver for Connector/J 8+)
                "jdbc:mysql://localhost:3306/testDb", // db url
                "user",                               // username
                "password");                          // password

        Job job = Job.getInstance(conf);              // replaces the deprecated new Job(conf)
        job.setJarByClass(Main.class);
        job.setMapperClass(Map.class);                // your mapper - not shown in this example
        job.setReducerClass(Reduce.class);
        job.setMapOutputKeyClass(Text.class);         // mapper's KEYOUT
        job.setMapOutputValueClass(IntWritable.class); // mapper's VALUEOUT
        job.setOutputKeyClass(DBOutputWritable.class); // reducer's KEYOUT
        job.setOutputValueClass(NullWritable.class);   // reducer's VALUEOUT
        job.setInputFormatClass(...);
        job.setOutputFormatClass(DBOutputFormat.class);

        DBInputFormat.setInput(...);                  // only relevant when the input is read from a database

        DBOutputFormat.setOutput(
                job,
                "output",                             // output table name
                new String[] { "name", "count" });    // table columns

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
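
To see how the table name and column list passed to DBOutputFormat.setOutput relate to write(PreparedStatement), it can help to picture the parameterized INSERT that gets filled in for each record. The following is an illustrative sketch, not Hadoop's own code: buildInsert is a hypothetical helper that builds a statement of that general shape, with one ? placeholder per column in the order given.

```java
import java.util.Collections;

class InsertQuerySketch {
    // Hypothetical helper: builds "INSERT INTO <table> (<cols>) VALUES (?,...)"
    // with one placeholder per column, in the order the columns are listed.
    static String buildInsert(String table, String... columns) {
        if (columns.length == 0) {
            throw new IllegalArgumentException("at least one column is required");
        }
        String columnList = String.join(",", columns);
        String placeholders = String.join(",", Collections.nCopies(columns.length, "?"));
        return "INSERT INTO " + table + " (" + columnList + ") VALUES (" + placeholders + ")";
    }

    public static void main(String[] args) {
        // Prints: INSERT INTO output (name,count) VALUES (?,?)
        System.out.println(buildInsert("output", "name", "count"));
    }
}
```

Storing more columns then amounts to adding fields to the writable class, listing the extra column names, and binding them in the same order. Note that DBOutputFormat.setOutput takes a single table name.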

from https://stackoverflow.com/questions/18351475/storing-apache-hadoop-data-output-to-mysql-database by cc-by-sa and MIT license
