Reading a text file from the local system into HBase with MapReduce

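I need to load data from a text file with MapReduce. I have searched the web but have not found a solution that fits my task.

Is there a method or class that reads a text/CSV file from the system and stores the data in an HBase table?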

Solution

==============================

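1. To read the text file, it must first be in HDFS. You then need to specify the input format and the output format for the job. In the driver code for this answer, the input format is TextInputFormat, and TableMapReduceUtil.initTableReducerJob sets the output format to TableOutputFormat for the target table.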

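// Use the new-API classes from org.apache.hadoop.mapreduce.lib.input
// (FileInputFormat, TextInputFormat), not the old org.apache.hadoop.mapred ones.
Configuration conf = HBaseConfiguration.create();  // reads hbase-site.xml from the classpath
Job job = Job.getInstance(conf, "example");         // new Job(conf, ...) is deprecated in Hadoop 2+
job.setJarByClass(YourMapper.class);
FileInputFormat.addInputPath(job, new Path("PATH to text file"));  // path in HDFS
job.setInputFormatClass(TextInputFormat.class);
job.setMapperClass(YourMapper.class);
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
// Registers the reducer and configures TableOutputFormat for the target table.
TableMapReduceUtil.initTableReducerJob("hbase_table_name", YourReducer.class, job);
job.waitForCompletion(true);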

YourReducer must extend org.apache.hadoop.hbase.mapreduce.TableReducer.

Sample reducer code

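import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;

public class YourReducer extends TableReducer<Text, Text, Text> {
    // Placeholder names; replace with the column family and qualifier of your table.
    private final byte[] rawUpdateColumnFamily = Bytes.toBytes("colName");
    private final byte[] qualifier = Bytes.toBytes("colName");

    /**
     * Called once at the beginning of the task.
     */
    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // anything that needs to be done at the start of the reducer
    }

    @Override
    public void reduce(Text keyin, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        byte[] rowKey = Bytes.toBytes(keyin.toString());
        // Every value for a key goes to the same row and column, so the values end up
        // as versions of one cell (or overwrite each other within the same millisecond).
        for (Text val : values) {
            // put the data in the table
            Put put = new Put(rowKey);
            long explicitTimeInMs = System.currentTimeMillis();
            // addColumn is the HBase 1.0+ name; older releases used put.add(...) with the same arguments.
            put.addColumn(rawUpdateColumnFamily, qualifier, explicitTimeInMs, Bytes.toBytes(val.toString()));
            context.write(keyin, put);
        }
    }
}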
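
HBase stores row keys and cell values as raw bytes, so how the reducer converts a String to bytes matters. The following is a minimal plain-Java sketch, not part of the original answer, showing that an explicit UTF-8 encoding gives the same bytes on every machine. HBase's Bytes.toBytes(String) encodes as UTF-8, whereas String.getBytes() without an argument uses the platform default charset, which on older JVMs may not be UTF-8. The class name RowKeyBytes and its methods are hypothetical helpers for illustration only.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not an HBase class.
final class RowKeyBytes {
    // Encode with an explicit charset so the result does not depend on the JVM default.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same charset that was used for encoding.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String key = "user-\u00e9";           // five ASCII characters plus one accented letter
        byte[] raw = encode(key);
        System.out.println(raw.length);       // 7: the accented letter takes two bytes in UTF-8
        System.out.println(decode(raw).equals(key));  // true
    }
}
```

If the same table is read by clients on other machines, encoding and decoding with one fixed charset keeps row keys comparable across all of them.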

Sample mapper class

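// Nested inside the driver class. Needs java.io.IOException, java.util.StringTokenizer,
// org.apache.hadoop.io.LongWritable, org.apache.hadoop.io.Text and org.apache.hadoop.mapreduce.Mapper.
public static class YourMapper extends Mapper<LongWritable, Text, Text, Text> {
    // The output value type must match job.setMapOutputValueClass(Text.class) in the
    // driver and the reducer's input value type, so emit Text rather than IntWritable.
    private static final Text one = new Text("1");
    private final Text word = new Text();

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            context.write(word, one);
        }
    }
}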
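
The sample mapper tokenizes each line on whitespace, word-count style. For the CSV case in the question, each line would more likely be split into a row key and its field values before being emitted. The following is a minimal sketch of that parsing step in plain Java so it can be tried without a cluster. It assumes the first field is the row key and that fields contain no quoted or embedded commas; the names CsvRowParser and Row are hypothetical and not part of Hadoop or HBase.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not a Hadoop or HBase class.
final class CsvRowParser {
    /** Parsed form of one CSV line: a row key plus the remaining field values. */
    static final class Row {
        final String rowKey;
        final List<String> fields;

        Row(String rowKey, List<String> fields) {
            this.rowKey = rowKey;
            this.fields = fields;
        }
    }

    /** Returns null for blank lines or lines with an empty key so the caller can skip them. */
    static Row parse(String line) {
        if (line == null || line.trim().isEmpty()) {
            return null;
        }
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        String[] parts = line.split(",", -1);
        String key = parts[0].trim();
        if (key.isEmpty()) {
            return null;
        }
        List<String> fields = new ArrayList<>();
        for (int i = 1; i < parts.length; i++) {
            fields.add(parts[i].trim());
        }
        return new Row(key, fields);
    }

    public static void main(String[] args) {
        Row row = parse("row1, alice, 42");
        System.out.println(row.rowKey + " -> " + row.fields);  // prints "row1 -> [alice, 42]"
    }
}
```

Inside a mapper's map method, the row key could be written as the output key and each field as an output value, with the reducer then building one Put per row. Real CSV files with quoted fields would need a proper CSV library instead of String.split.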

from https://stackoverflow.com/questions/12246464/read-text-file-from-system-to-hbase-mapreduce by cc-by-sa and MIT license
