Hadoop基础

2024-04-13 00:40| 来源: 网络整理| 查看: 265

　　　　　　　　　　　　　　　　　　Hadoop基础-完全分布式模式部署yarn日志聚集功能

　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　作者：尹正杰

　　其实我们不用配置也可以在服务器后台通过命令行的形式查看相应的日志，但为了更方便查看日志，我们可以将其配置成通过webUI的形式访问日志，本篇博客会手把手的教你如何实操。如果你的集群配置比较低的话，并不建议开启日志，但是一般的大数据集群，服务器配置应该都不低，不过最好根据实际情况考虑。

一.查看日志信息

1>.通过web界面查看日志信息

2>.webUI默认是无法查看到日志信息

3>.通过命令行查看

　　日志默认存放在安装hadoop目录的logs文件夹中，其实我们不用配置web页面也可以查看相应的日志，下图就是通过命令行的形式查看日志。

二.配置日志聚集功能

1>.停止hadoop集群

[yinzhengjie@s101 ~]$ stop-dfs.sh Stopping namenodes on [s101 s105] s101: stopping namenode s105: stopping namenode s103: stopping datanode s104: stopping datanode s102: stopping datanode Stopping journal nodes [s102 s103 s104] s102: stopping journalnode s104: stopping journalnode s103: stopping journalnode Stopping ZK Failover Controllers on NN hosts [s101 s105] s101: stopping zkfc s105: stopping zkfc [yinzhengjie@s101 ~]$ 停止hdfs分布式文件系统（[yinzhengjie@s101 ~]$ stop-dfs.sh ）

[yinzhengjie@s101 ~]$ stop-yarn.sh stopping yarn daemons s101: stopping resourcemanager s105: no resourcemanager to stop s102: stopping nodemanager s104: stopping nodemanager s103: stopping nodemanager s102: nodemanager did not stop gracefully after 5 seconds: killing with kill -9 s104: nodemanager did not stop gracefully after 5 seconds: killing with kill -9 s103: nodemanager did not stop gracefully after 5 seconds: killing with kill -9 no proxyserver to stop [yinzhengjie@s101 ~]$ 停止yarn集群（[yinzhengjie@s101 ~]$ stop-yarn.sh ）

[yinzhengjie@s101 ~]$ mr-jobhistory-daemon.sh stop historyserver stopping historyserver [yinzhengjie@s101 ~]$ 停止yarn日志服务（[yinzhengjie@s101 ~]$ mr-jobhistory-daemon.sh stop historyserver）

[yinzhengjie@s101 ~]$ more `which xcall.sh` #!/bin/bash #@author :yinzhengjie #blog:http://www.cnblogs.com/yinzhengjie #EMAIL:[email protected] #判断用户是否传参 if [ $# -lt 1 ];then echo "请输入参数" exit fi #获取用户输入的命令 cmd=$@ for (( i=101;i.修改“yarn-site.xml”配置文件

1 [yinzhengjie@s101 ~]$ more /soft/hadoop/etc/hadoop/yarn-site.xml 2 3 4 5 yarn.resourcemanager.hostname 6 s101 7 8 9 yarn.nodemanager.aux-services 10 mapreduce_shuffle 11 12 13 14 yarn.nodemanager.pmem-check-enabled 15 false 16 17 18 19 yarn.nodemanager.vmem-check-enabled 20 false 21 22 23 24 yarn.log-aggregation-enable 25 true 26 27 28 29 30 yarn.log-aggregation.retain-seconds 31 604800 32 33 34 35 36 37 38 61 [yinzhengjie@s101 ~]$

3>.分发配置文件到各个节点

[yinzhengjie@s101 ~]$ more `which xrsync.sh` #!/bin/bash #@author :yinzhengjie #blog:http://www.cnblogs.com/yinzhengjie #EMAIL:[email protected] #判断用户是否传参 if [ $# -lt 1 ];then echo "请输入参数"; exit fi #获取文件路径 file=$@ #获取子路径 filename=`basename $file` #获取父路径 dirpath=`dirname $file` #获取完整路径 cd $dirpath fullpath=`pwd -P` #同步文件到DataNode for (( i=102;i.启动hadoop集群

[yinzhengjie@s101 ~]$ start-dfs.sh Starting namenodes on [s101 s105] s105: starting namenode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-namenode-s105.out s101: starting namenode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-namenode-s101.out s103: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s103.out s102: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s102.out s104: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s104.out Starting journal nodes [s102 s103 s104] s102: starting journalnode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-journalnode-s102.out s103: starting journalnode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-journalnode-s103.out s104: starting journalnode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-journalnode-s104.out Starting ZK Failover Controllers on NN hosts [s101 s105] s105: starting zkfc, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-zkfc-s105.out s101: starting zkfc, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-zkfc-s101.out [yinzhengjie@s101 ~]$ 启动hdfs分布式文件系统（[yinzhengjie@s101 ~]$ start-dfs.sh ）

[yinzhengjie@s101 ~]$ start-yarn.sh starting yarn daemons s101: starting resourcemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-resourcemanager-s101.out s105: starting resourcemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-resourcemanager-s105.out s102: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-nodemanager-s102.out s103: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-nodemanager-s103.out s104: starting nodemanager, logging to /soft/hadoop-2.7.3/logs/yarn-yinzhengjie-nodemanager-s104.out [yinzhengjie@s101 ~]$ 启动yarn集群（[yinzhengjie@s101 ~]$ start-yarn.sh ）

[yinzhengjie@s101 ~]$ mr-jobhistory-daemon.sh start historyserver starting historyserver, logging to /soft/hadoop-2.7.3/logs/mapred-yinzhengjie-historyserver-s101.out [yinzhengjie@s101 ~]$ 启动yarn日志服务（[yinzhengjie@s101 ~]$ mr-jobhistory-daemon.sh start historyserver）

[yinzhengjie@s101 ~]$ xcall.sh jps ============= s101 jps ============ 7648 DFSZKFailoverController 8210 Jps 7333 NameNode 8123 JobHistoryServer 7836 ResourceManager 命令执行成功 ============= s102 jps ============ 4131 DataNode 4217 JournalNode 4363 NodeManager 4491 Jps 2335 QuorumPeerMain 命令执行成功 ============= s103 jps ============ 4384 JournalNode 4529 NodeManager 4661 Jps 4298 DataNode 2333 QuorumPeerMain 命令执行成功 ============= s104 jps ============ 4324 JournalNode 4470 NodeManager 2328 QuorumPeerMain 4601 Jps 4238 DataNode 命令执行成功 ============= s105 jps ============ 4482 NameNode 4778 Jps 4590 DFSZKFailoverController 命令执行成功 [yinzhengjie@s101 ~]$ 检查集群是否启动成功（[yinzhengjie@s101 ~]$ xcall.sh jps）

5>.在yarn上执行MapReduce程序

[yinzhengjie@s101 ~]$ hdfs dfs -rm -R /yinzhengjie/data/output 18/08/21 09:22:52 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes. Deleted /yinzhengjie/data/output [yinzhengjie@s101 ~]$ [yinzhengjie@s101 ~]$ hadoop jar /soft/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /yinzhengjie/data/input /yinzhengjie/data/output 18/08/21 09:23:10 INFO client.RMProxy: Connecting to ResourceManager at s101/172.30.1.101:8032 18/08/21 09:23:11 INFO input.FileInputFormat: Total input paths to process : 1 18/08/21 09:23:11 INFO mapreduce.JobSubmitter: number of splits:1 18/08/21 09:23:12 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1534857666985_0001 18/08/21 09:23:12 INFO impl.YarnClientImpl: Submitted application application_1534857666985_0001 18/08/21 09:23:12 INFO mapreduce.Job: The url to track the job: http://s101:8088/proxy/application_1534857666985_0001/ 18/08/21 09:23:12 INFO mapreduce.Job: Running job: job_1534857666985_0001 18/08/21 09:23:29 INFO mapreduce.Job: Job job_1534857666985_0001 running in uber mode : false 18/08/21 09:23:29 INFO mapreduce.Job: map 0% reduce 0% 18/08/21 09:23:45 INFO mapreduce.Job: map 100% reduce 0% 18/08/21 09:23:56 INFO mapreduce.Job: map 100% reduce 100% 18/08/21 09:23:56 INFO mapreduce.Job: Job job_1534857666985_0001 completed successfully 18/08/21 09:23:56 INFO mapreduce.Job: Counters: 49 File System Counters FILE: Number of bytes read=4469 FILE: Number of bytes written=249687 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=3931 HDFS: Number of bytes written=3315 HDFS: Number of read operations=6 HDFS: Number of large read operations=0 HDFS: Number of write operations=2 Job Counters Launched map tasks=1 Launched reduce tasks=1 Data-local map tasks=1 Total time spent by all maps in occupied slots (ms)=13048 Total time spent by all reduces in occupied slots (ms)=7445 Total time spent by all map tasks (ms)=13048 Total time spent by all reduce tasks (ms)=7445 Total vcore-milliseconds taken by all map tasks=13048 Total vcore-milliseconds taken by all reduce tasks=7445 Total megabyte-milliseconds taken by all map tasks=13361152 Total megabyte-milliseconds taken by all reduce tasks=7623680 Map-Reduce Framework Map input records=104 Map output records=497 Map output bytes=5733 Map output materialized bytes=4469 Input split bytes=114 Combine input records=497 Combine output records=288 Reduce input groups=288 Reduce shuffle bytes=4469 Reduce input records=288 Reduce output records=288 Spilled Records=576 Shuffled Maps =1 Failed Shuffles=0 Merged Map outputs=1 GC time elapsed (ms)=98 CPU time spent (ms)=1180 Physical memory (bytes) snapshot=442728448 Virtual memory (bytes) snapshot=4217950208 Total committed heap usage (bytes)=290455552 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 File Input Format Counters Bytes Read=3817 File Output Format Counters Bytes Written=3315 [yinzhengjie@s101 ~]$

6>.查看yarn的记录信息

7>.查看历史日志

8>.查看日志信息

【本文地址】

公司简介

联系我们