What is the command to get the active namenode for a nameservice in Hadoop?

The command:

hdfs haadmin -getServiceState machine-98

only works if you know the machine name. Is there a command like:

hdfs haadmin -getServiceState <nameservice>

that can tell me the IP / hostname of the active namenode?

Solutions

==============================

1. To print the namenodes, use this command:

hdfs getconf -namenodes

To print the secondary namenodes:

hdfs getconf -secondaryNameNodes

To print the backup namenodes:

hdfs getconf -backupNodes

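Note: These commands were tested with Hadoop 2.4.0.

Update 10-31-2014:

Here is a Python script that reads the NameNodes involved in Hadoop HA from the config file and determines which of them is active using the hdfs haadmin command. The script is not fully tested, because I do not have HA configured. I only tested the parsing, using a sample file based on the Hadoop HA documentation. Feel free to use and modify it as needed.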

#!/usr/bin/env python
# coding: UTF-8
import xml.etree.ElementTree as ET
import subprocess as SP

if __name__ == "__main__":
    hdfsSiteConfigFile = "/etc/hadoop/conf/hdfs-site.xml"

    root = ET.parse(hdfsSiteConfigFile).getroot()
    hasHadoopHAElement = False
    for prop in root.findall("property"):
        name = prop.findtext("name", "")
        if not name.startswith("dfs.ha.namenodes."):
            continue
        hasHadoopHAElement = True
        nameserviceId = name[len("dfs.ha.namenodes."):]
        nameNodes = [n.strip() for n in prop.findtext("value", "").split(",")]
        for node in nameNodes:
            # Look up this NameNode's RPC address (host only, port dropped)
            key = "dfs.namenode.rpc-address." + nameserviceId + "." + node
            nodeAddress = None
            for n in root.findall("property"):
                if n.findtext("name") == key:
                    nodeAddress = n.findtext("value", "").split(":")[0]

            # Ask haadmin for the state of this NameNode ID
            p = SP.Popen(["hdfs", "haadmin", "-getServiceState", node],
                         stdout=SP.PIPE, stderr=SP.PIPE, universal_newlines=True)
            out, err = p.communicate()
            if "active" in out.lower():
                print("Active NameNode: %s (%s)" % (node, nodeAddress))
            if err:
                print("Error executing Hadoop HA command: " + err)
        break  # only the first nameservice found is inspected
    if not hasHadoopHAElement:
        print("Hadoop High-Availability configuration not found!")
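The parsing step of the script can be tried without a cluster. The following is a minimal sketch, assuming an hdfs-site.xml that uses the dfs.ha.namenodes.<nameservice> and dfs.namenode.rpc-address.<nameservice>.<id> keys referenced above; the sample XML, the host names, and the function name ha_rpc_addresses are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample modeled on the HA keys the script above looks for.
SAMPLE_HDFS_SITE = """
<configuration>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>machine1.example.com:8020</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>machine2.example.com:8020</value></property>
</configuration>
"""


def ha_rpc_addresses(xml_text):
    """Return {nameservice: {namenode_id: rpc_address}} from hdfs-site.xml text."""
    root = ET.fromstring(xml_text)
    # Flatten all <property> entries into a name -> value dictionary.
    props = {p.findtext("name"): p.findtext("value") for p in root.findall("property")}
    prefix = "dfs.ha.namenodes."
    result = {}
    for name, value in props.items():
        if name and name.startswith(prefix):
            ns = name[len(prefix):]
            ids = [i.strip() for i in (value or "").split(",") if i.strip()]
            # A missing rpc-address key shows up as None rather than raising.
            result[ns] = {
                i: props.get("dfs.namenode.rpc-address.%s.%s" % (ns, i)) for i in ids
            }
    return result


if __name__ == "__main__":
    print(ha_rpc_addresses(SAMPLE_HDFS_SITE))
```

Each NameNode ID returned here (nn1, nn2 in the sample) is what hdfs haadmin -getServiceState expects as its argument.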

==============================
2. Found this:

https://gist.github.com/cnauroth/7ff52e9f80e7d856ddb3

This works out of the box on CDH5 namenodes, although I'm not sure whether other Hadoop distributions expose http://namenode:50070/jmx. If not, I believe it can be added by deploying Jolokia.

Example:
```
curl 'http://namenode1.example.com:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus'
{
  "beans" : [ {
    "name" : "Hadoop:service=NameNode,name=NameNodeStatus",
    "modelerType" : "org.apache.hadoop.hdfs.server.namenode.NameNode",
    "State" : "active",
    "NNRole" : "NameNode",
    "HostAndPort" : "namenode1.example.com:8020",
    "SecurityEnabled" : true,
    "LastHATransitionTime" : 1436283324548
  } ]
}
```
So by firing off one HTTP request to each namenode (this should be quick) we can figure out which one is active.
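The one-request-per-namenode idea could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names are hypothetical, the HTTP port 50070 is taken from the example above and may differ on other versions or distributions, and the fetch function is passed in so the logic can be exercised without a cluster.

```python
import json
from urllib.request import urlopen


def state_from_jmx(body):
    """Extract the HA state (e.g. "active", "standby") from a NameNodeStatus reply."""
    beans = json.loads(body).get("beans", [])
    return beans[0].get("State") if beans else None


def http_fetch(host, port=50070):
    """Fetch the NameNodeStatus bean over HTTP (port is an assumption)."""
    url = ("http://%s:%d/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus"
           % (host, port))
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def find_active(hosts, fetch=http_fetch):
    """Return the first host whose JMX reply reports State == "active", else None."""
    for host in hosts:
        try:
            if state_from_jmx(fetch(host)) == "active":
                return host
        except (OSError, ValueError):
            # Unreachable node or malformed JSON: move on to the next host.
            continue
    return None
```

Calling find_active(["namenode1.example.com", "namenode2.example.com"]) would query each host in turn; a node that is down is simply skipped.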

Also, if you hit the WebHDFS REST API on an inactive (standby) namenode, you get a 403 Forbidden and the following JSON:
```
{"RemoteException":{"exception":"StandbyException","javaClassName":"org.apache.hadoop.ipc.StandbyException","message":"Operation category READ is not supported in state standby"}}
```
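A client could use that reply to recognise a standby node. A minimal sketch, assuming the status code and body have already been obtained from some HTTP client; the function name is hypothetical and only the JSON fields shown above are relied on.

```python
import json


def is_standby_reply(status_code, body):
    """True if a WebHDFS reply is a 403 carrying a StandbyException."""
    if status_code != 403:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return False  # not JSON, so not the reply shown above
    if not isinstance(payload, dict):
        return False
    exc = payload.get("RemoteException")
    return isinstance(exc, dict) and exc.get("exception") == "StandbyException"
```

A 403 with any other body is treated as a genuine permission problem rather than as a standby node.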
==============================
3. You can also do this in bash with hdfs CLI calls. The caveat is that it takes a bit more time, since it makes several calls one after another, but some may find this preferable to a Python script.

This was tested with Hadoop 2.6.0.
```
get_active_nn(){
   ha_name=$1 # NameServiceID (required)
   ha_ns_nodes=$(hdfs getconf -confKey "dfs.ha.namenodes.${ha_name}")
   active=""
   # Check each NameNode ID in turn; stop at the first active one
   for node in ${ha_ns_nodes//,/ }; do
     state=$(hdfs haadmin -getServiceState "$node")
     if [ "$state" = "active" ]; then
       active=$(hdfs getconf -confKey "dfs.namenode.rpc-address.${ha_name}.${node}")
       break
     fi
   done
   if [ -z "$active" ]; then
     >&2 echo "ERROR: no active namenode found for ${ha_name}"
     return 1 # return rather than exit, so a sourcing shell is not terminated
   else
     echo "$active"
   fi
}
```
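If the sequential calls are too slow, the state checks could be issued concurrently instead. The following Python sketch is one way to do that, not part of the answer itself: the function names are hypothetical, and the state lookup is injectable so the logic can be tried without a cluster. It assumes, as the bash function does, that hdfs haadmin -getServiceState prints just the state on stdout.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def haadmin_state(node):
    """Run `hdfs haadmin -getServiceState <node>` and return its trimmed stdout."""
    done = subprocess.run(["hdfs", "haadmin", "-getServiceState", node],
                          stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
                          universal_newlines=True)
    return done.stdout.strip()


def active_nodes(nodes, get_state=haadmin_state):
    """Query all NameNode IDs at once and return those reporting "active"."""
    nodes = list(nodes)
    if not nodes:
        return []
    # One worker per node: each call mostly waits on the external process.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        states = list(pool.map(get_state, nodes))
    return [n for n, s in zip(nodes, states) if s == "active"]
```

With only two namenodes the saving is small, since each hdfs invocation still pays the JVM start-up cost; the calls merely overlap.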

==============================

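4. After reading all the existing answers, none seemed to combine the three steps involved: listing the namenodes of the nameservice, resolving each one to host:port, and checking the state of each node.

The solution below combines hdfs getconf calls with a JMX service call for the node status.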

#!/usr/bin/env python

from subprocess import check_output
import json, sys

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen  # Python 2

def get_name_nodes(clusterName):
    # List the NameNode IDs of the nameservice, then resolve each to host:port
    ha_ns_nodes = check_output(['hdfs', 'getconf', '-confKey',
        'dfs.ha.namenodes.' + clusterName], universal_newlines=True)
    nodes = ha_ns_nodes.strip().split(',')
    nodeHosts = []
    for n in nodes:
        nodeHosts.append(get_node_hostport(clusterName, n))

    return nodeHosts

def get_node_hostport(clusterName, nodename):
    hostPort = check_output(
        ['hdfs','getconf','-confKey',
         'dfs.namenode.rpc-address.{0}.{1}'.format(clusterName, nodename)],
        universal_newlines=True)
    return hostPort.strip()

def is_node_active(nn):
    # Query the NameNode's HTTP port (not its RPC port) for the HA state via JMX
    jmxPort = 50070
    host, port = nn.split(':')
    url = "http://{0}:{1}/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus".format(
            host, jmxPort)
    parsed = json.loads(urlopen(url).read().decode('utf-8'))

    # Fall back to an empty bean if the reply contains none
    return (parsed.get('beans') or [{}])[0].get('State', '') == 'active'

def get_active_namenode(clusterName):
    # Returns None if no node reports itself as active
    for n in get_name_nodes(clusterName):
        if is_node_active(n):
            return n

clusterName = (sys.argv[1] if len(sys.argv) > 1 else None)
if not clusterName:
    raise Exception("Specify cluster name.")

print('Cluster: {0}'.format(clusterName))
print("Nodes: {0}".format(get_name_nodes(clusterName)))
print("Active Name Node: {0}".format(get_active_namenode(clusterName)))

==============================
5. In the Java API, you can use HAUtil.getAddressOfActive(fileSystem).
==============================
6. In a high-availability Hadoop cluster there are two namenodes, one active and one standby.

To find the active namenode, run a test hdfs command against each namenode and pick the one for which the command succeeds.

The command below succeeds if the namenode is active and fails if it is a standby node.
```
hadoop fs -test -e hdfs://<Name node>/
```
Unix script:
```
active_node=''
if hadoop fs -test -e hdfs://<NameNode-1>/ ; then
active_node='<NameNode-1>'
elif hadoop fs -test -e hdfs://<NameNode-2>/ ; then
active_node='<NameNode-2>'
fi

echo "Active Dev Name node : $active_node"
```
==============================
7. You can also use a curl command to find the active and secondary namenodes.

==============================

8.

#!/usr/bin/python

import subprocess


def getActiveNameNode():
    # `hdfs getconf -namenodes` prints the NameNode hosts separated by whitespace
    out = subprocess.check_output(["hdfs", "getconf", "-namenodes"],
                                  universal_newlines=True)
    for val in out.split():
        # The test succeeds (exit code 0) only against the active NameNode
        process = subprocess.Popen(
            ["hadoop", "fs", "-test", "-e", "hdfs://" + val + "/"],
            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        process.communicate()
        if process.returncode == 0:
            return val
    return None


def main():
    out = getActiveNameNode()
    print(out)


if __name__ == '__main__':
    main()

from https://stackoverflow.com/questions/26648214/any-command-to-get-active-namenode-for-nameservice-in-hadoop by cc-by-sa and MIT license
