
Error when installing HDFS services #559

Open
1 task done
misteruly opened this issue May 20, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@misteruly

misteruly commented May 20, 2024

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

Installing DataNode on ddp1 fails:

can not find log file

Installing DataNode on ddp2 fails:

can not find log file

Installing NameNode on ddp2 fails:

2024-05-20 14:26:15,345 INFO ipc.Client: Retrying connect to server: ddp1/192.168.4.180:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-20 14:26:15,349 WARN namenode.NameNode: Encountered exception during format
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 2 successful responses:
192.168.4.182:8485: false
192.168.4.181:8485: false
1 exceptions thrown:
192.168.4.180:8485: Call From ddp2/192.168.4.181 to ddp1:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
	at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
	at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:305)
	at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:282)
	at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:1185)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:212)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1274)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1726)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1834)
2024-05-20 14:26:15,392 INFO namenode.FSNamesystem: Stopping services started for active state
2024-05-20 14:26:15,392 INFO namenode.FSNamesystem: Stopping services started for standby state
2024-05-20 14:26:15,392 ERROR namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 2 successful responses:
192.168.4.182:8485: false
192.168.4.181:8485: false
1 exceptions thrown:
192.168.4.180:8485: Call From ddp2/192.168.4.181 to ddp1:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
	at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
	at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:305)
	at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:282)
	at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:1185)
	at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:212)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1274)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1726)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1834)
2024-05-20 14:26:15,394 INFO util.ExitUtil: Exiting with status 1: org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 2 successful responses:
192.168.4.182:8485: false
192.168.4.181:8485: false
1 exceptions thrown:
192.168.4.180:8485: Call From ddp2/192.168.4.181 to ddp1:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
2024-05-20 14:26:15,395 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ddp2/192.168.4.181
************************************************************/
Last login: Mon May 20 14:16:00 CST 2024 on pts/0

[ERROR] 2024-05-20 14:26:15 TaskLogLogger-HDFS-NameNode:[197] - 
[INFO] 2024-05-20 15:36:16 TaskLogLogger-HDFS-NameNode:[86] - Remote package md5 is a307e097d66da00636e44cd32148a13a
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[91] - Local md5 is a307e097d66da00636e44cd32148a13a
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[82] - Start to configure service role NameNode
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[263] - size is :1
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[266] - config set value to /data/dfs/nn
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[263] - size is :1
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[266] - config set value to /data/dfs/dn
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[263] - size is :4
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[266] - config set value to RULE:[2:$1/$2@$0]([ndj]n\/.*@HADOOP\.COM)s/.*/hdfs/
RULE:[2:$1/$2@$0]([rn]m\/.*@HADOOP\.COM)s/.*/yarn/
RULE:[2:$1/$2@$0](jhs\/.*@HADOOP\.COM)s/.*/mapred/
DEFAULT
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 15:36:18 TaskLogLogger-HDFS-NameNode:[58] - Start to execute format namenode
[ERROR] 2024-05-20 15:39:18 TaskLogLogger-HDFS-NameNode:[70] - Namenode format failed
[INFO] 2024-05-20 15:44:21 TaskLogLogger-HDFS-NameNode:[182] - 2024-05-20 15:36:20,104 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ddp2/192.168.4.181
STARTUP_MSG:   args = [-format, smhadoop]
STARTUP_MSG:   version = 3.3.3

2024-05-20 15:36:20,725 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2024-05-20 15:36:20,737 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
2024-05-20 15:36:20,738 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2024-05-20 15:36:20,741 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2024-05-20 15:36:20,741 INFO blockmanagement.BlockManager: The block deletion will start around 2024 May 20 15:36:20
2024-05-20 15:36:20,742 INFO util.GSet: Computing capacity for map BlocksMap
2024-05-20 15:36:20,742 INFO util.GSet: VM type       = 64-bit
2024-05-20 15:36:20,743 INFO util.GSet: 2.0% max memory 7.7 GB = 157.0 MB
2024-05-20 15:36:20,743 INFO util.GSet: capacity      = 2^24 = 16777216 entries
2024-05-20 15:36:20,763 INFO blockmanagement.BlockManager: Storage policy satisfier is disabled
2024-05-20 15:36:20,763 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false
2024-05-20 15:36:20,769 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.999
2024-05-20 15:36:20,769 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
2024-05-20 15:36:20,769 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
2024-05-20 15:36:20,770 INFO blockmanagement.BlockManager: defaultReplication         = 3
2024-05-20 15:36:20,770 INFO blockmanagement.BlockManager: maxReplication             = 512
2024-05-20 15:36:20,770 INFO blockmanagement.BlockManager: minReplication             = 1
2024-05-20 15:36:20,770 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
2024-05-20 15:36:20,770 INFO blockmanagement.BlockManager: redundancyRecheckInterval  = 3000ms
2024-05-20 15:36:20,770 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
2024-05-20 15:36:20,770 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
2024-05-20 15:36:20,793 INFO namenode.FSDirectory: GLOBAL serial map: bits=29 maxEntries=536870911
2024-05-20 15:36:20,793 INFO namenode.FSDirectory: USER serial map: bits=24 maxEntries=16777215
2024-05-20 15:36:20,793 INFO namenode.FSDirectory: GROUP serial map: bits=24 maxEntries=16777215
2024-05-20 15:36:20,793 INFO namenode.FSDirectory: XATTR serial map: bits=24 maxEntries=16777215
2024-05-20 15:36:20,805 INFO util.GSet: Computing capacity for map INodeMap
2024-05-20 15:36:20,805 INFO util.GSet: VM type       = 64-bit
2024-05-20 15:36:20,805 INFO util.GSet: 1.0% max memory 7.7 GB = 78.5 MB
2024-05-20 15:36:20,805 INFO util.GSet: capacity      = 2^23 = 8388608 entries
2024-05-20 15:36:20,811 INFO namenode.FSDirectory: ACLs enabled? true
2024-05-20 15:36:20,811 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true
2024-05-20 15:36:20,811 INFO namenode.FSDirectory: XAttrs enabled? true
2024-05-20 15:36:20,812 INFO namenode.NameNode: Caching file names occurring more than 10 times
2024-05-20 15:36:20,817 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRootDescendant: true, maxSnapshotLimit: 65536
2024-05-20 15:36:20,819 INFO snapshot.SnapshotManager: SkipList is disabled
2024-05-20 15:36:20,823 INFO util.GSet: Computing capacity for map cachedBlocks
2024-05-20 15:36:20,823 INFO util.GSet: VM type       = 64-bit
2024-05-20 15:36:20,823 INFO util.GSet: 0.25% max memory 7.7 GB = 19.6 MB
2024-05-20 15:36:20,823 INFO util.GSet: capacity      = 2^21 = 2097152 entries
2024-05-20 15:36:20,832 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2024-05-20 15:36:20,832 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2024-05-20 15:36:20,832 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2024-05-20 15:36:20,838 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
2024-05-20 15:36:20,838 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2024-05-20 15:36:20,840 INFO util.GSet: Computing capacity for map NameNodeRetryCache
2024-05-20 15:36:20,840 INFO util.GSet: VM type       = 64-bit
2024-05-20 15:36:20,840 INFO util.GSet: 0.029999999329447746% max memory 7.7 GB = 2.4 MB
2024-05-20 15:36:20,840 INFO util.GSet: capacity      = 2^18 = 262144 entries
Re-format filesystem in Storage Directory root= /data/dfs/nn; location= null ? (Y or N) Killed
Last login: Mon May 20 15:30:01 CST 2024 on pts/0

[ERROR] 2024-05-20 15:44:21 TaskLogLogger-HDFS-NameNode:[197] - 
[INFO] 2024-05-20 15:46:13 TaskLogLogger-HDFS-NameNode:[86] - Remote package md5 is a307e097d66da00636e44cd32148a13a
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[91] - Local md5 is a307e097d66da00636e44cd32148a13a
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[82] - Start to configure service role NameNode
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[263] - size is :1
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[266] - config set value to /data/dfs/nn
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[263] - size is :1
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[266] - config set value to /data/dfs/dn
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[263] - size is :4
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[266] - config set value to RULE:[2:$1/$2@$0]([ndj]n\/.*@HADOOP\.COM)s/.*/hdfs/
RULE:[2:$1/$2@$0]([rn]m\/.*@HADOOP\.COM)s/.*/yarn/
RULE:[2:$1/$2@$0](jhs\/.*@HADOOP\.COM)s/.*/mapred/
DEFAULT
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 15:46:16 TaskLogLogger-HDFS-NameNode:[58] - Start to execute format namenode
[ERROR] 2024-05-20 15:49:16 TaskLogLogger-HDFS-NameNode:[70] - Namenode format failed

Installing ZKFC on ddp2 fails:

[INFO] 2024-05-20 13:37:23 TaskLogLogger-HDFS-NameNode:[86] - Remote package md5 is a307e097d66da00636e44cd32148a13a
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[91] - Local md5 is a307e097d66da00636e44cd32148a13a
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[82] - Start to configure service role NameNode
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[263] - size is :1
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[266] - config set value to /data/dfs/nn
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[263] - size is :1
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[266] - config set value to /data/dfs/dn
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[263] - size is :4
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[266] - config set value to RULE:[2:$1/$2@$0]([ndj]n\/.*@HADOOP\.COM)s/.*/hdfs/
RULE:[2:$1/$2@$0]([rn]m\/.*@HADOOP\.COM)s/.*/yarn/
RULE:[2:$1/$2@$0](jhs\/.*@HADOOP\.COM)s/.*/mapred/
DEFAULT
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[45] - Start to execute hdfs namenode -bootstrapStandby
[ERROR] 2024-05-20 13:37:39 TaskLogLogger-HDFS-NameNode:[54] - Namenode standby failed
[INFO] 2024-05-20 13:37:39 TaskLogLogger-HDFS-NameNode:[182] - 2024-05-20 13:37:27,815 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ddp3/192.168.4.182
STARTUP_MSG:   args = [-bootstrapStandby]
STARTUP_MSG:   version = 3.3.3
STARTUP_MSG:   classpath = /opt/datasophon/hadoop-3.3.3/etc/hadoop:/opt/datasophon/hadoop-3.3.3/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar:/opt/datasophon/hadoop-3.3.3/share/hadoop/common/lib/snappy-java-1.1.8.2.jar:/opt/datasophon/hadoop-3.3.3/share/hadoop/common/lib/jsr305-3.0.2.jar:/opt/datasophon/hadoop-3.3.3/share/hadoop/yarn/sources:/opt/datasophon/hadoop-3.3.3/share/hadoop/yarn/test:/opt/datasophon/hadoop-3.3.3/share/hadoop/yarn/timelineservice:/opt/datasophon/hadoop-3.3.3/share/hadoop/yarn/webapps:/opt/datasophon/hadoop-3.3.3/share/hadoop/yarn/yarn-service-examples:/opt/datasophon/hadoop-3.3.3/jmx/jmx_prometheus_javaagent-0.16.1.jar
STARTUP_MSG:   build = https://github.com/apache/hadoop.git -r d37586cbda38c338d9fe481addda5a05fb516f71; compiled by 'stevel' on 2022-05-09T16:36Z
STARTUP_MSG:   java = 1.8.0_333
************************************************************/
2024-05-20 13:37:27,821 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2024-05-20 13:37:27,906 INFO namenode.NameNode: createNameNode [-bootstrapStandby]
2024-05-20 13:37:28,084 INFO ha.BootstrapStandby: Found nn: nn1, ipc: ddp2/192.168.4.181:8020
2024-05-20 13:37:28,483 INFO common.Util: Assuming 'file' scheme for path /data/dfs/nn in configuration.
2024-05-20 13:37:28,496 INFO common.Util: Assuming 'file' scheme for path /data/dfs/nn in configuration.
2024-05-20 13:37:29,665 INFO ipc.Client: Retrying connect to server: ddp2/192.168.4.181:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-20 13:37:38,674 INFO ipc.Client: Retrying connect to server: ddp2/192.168.4.181:8020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-20 13:37:38,680 WARN ha.BootstrapStandby: Unable to fetch namespace information from remote NN at ddp2/192.168.4.181:8020: Call From ddp3/192.168.4.182 to ddp2:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
2024-05-20 13:37:38,680 ERROR ha.BootstrapStandby: Unable to fetch namespace information from any remote NN. Possible NameNodes: [RemoteNameNodeInfo [nnId=nn1, ipcAddress=ddp2/192.168.4.181:8020, httpAddress=http://ddp2:9870]]
2024-05-20 13:37:38,682 INFO util.ExitUtil: Exiting with status 2: ExitException
2024-05-20 13:37:38,683 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ddp3/192.168.4.182
************************************************************/

[ERROR] 2024-05-20 13:37:39 TaskLogLogger-HDFS-NameNode:[197] - 

Installing NameNode on ddp3 fails:

[INFO] 2024-05-20 13:37:23 TaskLogLogger-HDFS-NameNode:[86] - Remote package md5 is a307e097d66da00636e44cd32148a13a
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[91] - Local md5 is a307e097d66da00636e44cd32148a13a
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[82] - Start to configure service role NameNode
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[263] - size is :1
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[266] - config set value to /data/dfs/nn
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[263] - size is :1
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[266] - config set value to /data/dfs/dn
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[117] - Convert boolean and integer to string
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[263] - size is :4
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[266] - config set value to RULE:[2:$1/$2@$0]([ndj]n\/.*@HADOOP\.COM)s/.*/hdfs/
RULE:[2:$1/$2@$0]([rn]m\/.*@HADOOP\.COM)s/.*/yarn/
RULE:[2:$1/$2@$0](jhs\/.*@HADOOP\.COM)s/.*/mapred/
DEFAULT
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[181] - configure success
[INFO] 2024-05-20 13:37:25 TaskLogLogger-HDFS-NameNode:[45] - Start to execute hdfs namenode -bootstrapStandby
[ERROR] 2024-05-20 13:37:39 TaskLogLogger-HDFS-NameNode:[54] - Namenode standby failed
[INFO] 2024-05-20 13:37:39 TaskLogLogger-HDFS-NameNode:[182] - 2024-05-20 13:37:27,815 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ddp3/192.168.4.182
STARTUP_MSG:   args = [-bootstrapStandby]
STARTUP_MSG:   version = 3.3.3
STARTUP_MSG:   classpath = /opt/datasophon/hadoop-3.3.3/etc/hadoop:/opt/datasophon/hadoop-3.3.3/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar:/opt/datasophon/hadoop-3.3.3/share/hadoop/common/lib/snappy-3.3.3/share/hadoop/yarn/webapps:/opt/datasophon/hadoop-3.3.3/share/hadoop/yarn/yarn-service-examples:/opt/datasophon/hadoop-3.3.3/jmx/jmx_prometheus_javaagent-0.16.1.jar
STARTUP_MSG:   build = https://github.com/apache/hadoop.git -r d37586cbda38c338d9fe481addda5a05fb516f71; compiled by 'stevel' on 2022-05-09T16:36Z
STARTUP_MSG:   java = 1.8.0_333
************************************************************/
2024-05-20 13:37:27,821 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2024-05-20 13:37:27,906 INFO namenode.NameNode: createNameNode [-bootstrapStandby]
2024-05-20 13:37:28,084 INFO ha.BootstrapStandby: Found nn: nn1, ipc: ddp2/192.168.4.181:8020
2024-05-20 13:37:28,483 INFO common.Util: Assuming 'file' scheme for path /data/dfs/nn in configuration.
2024-05-20 13:37:28,496 INFO common.Util: Assuming 'file' scheme for path /data/dfs/nn in configuration.
2024-05-20 13:37:29,665 INFO ipc.Client: Retrying connect to server: ddp2/192.168.4.181:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-20 13:37:30,666 INFO ipc.Client: Retrying connect to server: ddp2/192.168.4.181:8020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-20 13:37:38,674 INFO ipc.Client: Retrying connect to server: ddp2/192.168.4.181:8020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-20 13:37:38,680 WARN ha.BootstrapStandby: Unable to fetch namespace information from remote NN at ddp2/192.168.4.181:8020: Call From ddp3/192.168.4.182 to ddp2:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
2024-05-20 13:37:38,680 ERROR ha.BootstrapStandby: Unable to fetch namespace information from any remote NN. Possible NameNodes: [RemoteNameNodeInfo [nnId=nn1, ipcAddress=ddp2/192.168.4.181:8020, httpAddress=http://ddp2:9870]]
2024-05-20 13:37:38,682 INFO util.ExitUtil: Exiting with status 2: ExitException
2024-05-20 13:37:38,683 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ddp3/192.168.4.182
************************************************************/

[ERROR] 2024-05-20 13:37:39 TaskLogLogger-HDFS-NameNode:[197] - 

Installing DataNode on ddp3 fails:

can not find log file

Installing ZKFC on ddp3 fails:

can not find log file

[root@ddp1 data]# jps
49890 QuorumPeerMain
2757 DataSophonApplicationServer
50316 Jps
3022 WorkerApplicationServer
[root@ddp1 data]#
[root@ddp1 data]#
[root@ddp1 data]# jps
49890 QuorumPeerMain
2757 DataSophonApplicationServer
50440 JournalNode
50985 Jps
3022 WorkerApplicationServer
[root@ddp1 data]#
[root@ddp1 data]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.4.180 ddp1
192.168.4.181 ddp2
192.168.4.182 ddp3

[root@ddp2 logs]# jps
33795 NameNode
33971 DFSZKFailoverController
34135 Jps
2844 WorkerApplicationServer
33725 JournalNode
19407 QuorumPeerMain
[root@ddp2 logs]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.4.180 ddp1
192.168.4.181 ddp2
192.168.4.182 ddp3

[root@ddp3 ~]# jps
19477 QuorumPeerMain
36393 Jps
10683 WorkerApplicationServer
36283 JournalNode
[root@ddp3 ~]#
[root@ddp3 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.4.180 ddp1
192.168.4.181 ddp2
192.168.4.182 ddp3


Everything is using the default configuration.
![image](https://github.com/datavane/datasophon/assets/31399968/ad6e4fc1-5f1d-4a0f-aca7-ecd247f2ec5d)


rm -rf  /data/tmp &&  rm -rf  /data/dfs  &&  rm -rf  /opt/datasophon/hadoop-3.3.3  &&  rm -rf  /opt/datasophon/hdfs   &&  rm -rf  /home/*

I deleted these directories and reinstalled, but I still get the same error.
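The jps output above shows the JournalNode on ddp1 was not yet running when the NameNode format was attempted, which matches the `Connection refused` to `ddp1:8485`. A minimal sketch for checking that every JournalNode in the quorum is reachable before retrying the format (hostnames and the default JournalNode port 8485 are taken from the logs in this issue; adjust for your cluster):

```shell
# check_jn: report whether a JournalNode host is listening on the given port.
# Uses bash's /dev/tcp redirection; `timeout` bounds the connection attempt.
check_jn() {
  local host="$1" port="${2:-8485}"
  if timeout 2 bash -c "echo > /dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "${host}:${port} reachable"
  else
    echo "${host}:${port} NOT reachable - start the JournalNode there first"
  fi
}

# Hosts from this issue's /etc/hosts output.
for h in ddp1 ddp2 ddp3; do check_jn "$h"; done
```

Only when all three report reachable does `hdfs namenode -format` have a JournalNode quorum to talk to.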




### What you expected to happen

HDFS installs successfully on the first attempt.

### How to reproduce

OS: Centos 7.9
Version: DataSophon-1.2.1

### Anything else

_No response_

### Version

dev

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
@misteruly misteruly added the bug Something isn't working label May 20, 2024
@ToolsOnKeys

Add the ipc.client.connect.max.retries and ipc.client.connect.retry.interval parameters.
Purpose: prevent ConnectException errors against the JournalNode service.
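For reference, a minimal sketch of what that change could look like in core-site.xml. The property names are standard Hadoop IPC client settings; the values below are illustrative, not the ones used by any particular DataSophon release:

```xml
<!-- Retry the IPC connection longer so operations such as NameNode format
     can wait for JournalNodes that are still starting up.
     Hadoop defaults: max.retries = 10, retry.interval = 1000 ms. -->
<property>
  <name>ipc.client.connect.max.retries</name>
  <value>50</value>
</property>
<property>
  <name>ipc.client.connect.retry.interval</name>
  <value>5000</value>
</property>
```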

@yuhuang123456

Add the ipc.client.connect.max.retries and ipc.client.connect.retry.interval parameters. Purpose: prevent ConnectException errors against the JournalNode service.

How do I apply this in version 1.2.1?

@ToolsOnKeys

Add the ipc.client.connect.max.retries and ipc.client.connect.retry.interval parameters. Purpose: prevent ConnectException errors against the JournalNode service.

How do I apply this in version 1.2.1?

See #569 for details.

@yuhuang123456

yuhuang123456 commented Jun 27, 2024

Add the ipc.client.connect.max.retries and ipc.client.connect.retry.interval parameters. Purpose: prevent ConnectException errors against the JournalNode service.

How do I apply this in version 1.2.1?

See #569 for details.

On version 1.2.1 I tried adding the retry values to core-site under HDFS, but I still get the same connection exception. I'm running Hadoop 3.3.6 locally, and copied the whitelist, blacklist, and fair-scheduler.xml files from 3.3.3's etc/hadoop directory into 3.3.6.
Also, the servers have no internet access, so all dependencies have to be installed offline; I'm not sure whether that is a factor.
