Error when using HBase increment, help wanted

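Hadoop, HBase, and ZooKeeper error logs and some solutions
Source: Linux社区
Author: matraxa
A user collected these Hadoop, HBase, and ZooKeeper error logs, along with solutions for some of them, as a reference for future troubleshooting.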
Fixing a startup error on a hadoop-0.20.2 & hbase-0.90.1 cluster:
The problem is as follows:
org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch. (client = 42, server = 41)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:364)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:113)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:215)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:177)
……………………………………
00:14:41,550 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
My initial diagnosis is a version mismatch between hadoop-0.20.2 and hbase-0.90.1. The log shows the client side (HBase) speaking protocol version 42 while the server side (HDFS) speaks version 41. hbase-0.90.1/lib ships hadoop-core-0.20-append-r1056497.jar, which is based on hadoop-core-0.20, so replacing it with hadoop-0.20.2-core.jar is enough to fix the error.
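The jar swap described above could be scripted roughly as follows. This is a minimal sketch, not a procedure from the original post: the function name replace_hadoop_core_jar, the lib-backup directory, and both paths passed to it are assumptions for illustration. It moves any bundled hadoop-core-*.jar out of HBase's lib directory and copies in the jar that the Hadoop cluster itself runs.

```shell
# Hypothetical helper: swap the Hadoop jar bundled with HBase for the cluster's own jar.
#   $1 = HBase lib directory (assumed layout, e.g. hbase-0.90.1/lib)
#   $2 = path to the jar used by the running Hadoop cluster
replace_hadoop_core_jar() {
  hbase_lib=$1
  cluster_jar=$2
  backup_dir="$hbase_lib/../lib-backup"   # assumed backup location, next to lib/

  # Refuse to touch anything if the replacement jar is missing.
  [ -f "$cluster_jar" ] || { echo "missing jar: $cluster_jar" >&2; return 1; }

  mkdir -p "$backup_dir" || return 1

  # Move every bundled hadoop-core-*.jar out of the way instead of deleting it.
  for jar in "$hbase_lib"/hadoop-core-*.jar; do
    [ -e "$jar" ] || continue             # the glob matched nothing
    mv "$jar" "$backup_dir"/ || return 1
  done

  # Install the jar that matches the cluster's RPC protocol version.
  cp "$cluster_jar" "$hbase_lib"/
}
```

A call might look like replace_hadoop_core_jar "$HBASE_HOME/lib" "$HADOOP_HOME/hadoop-0.20.2-core.jar", where both environment variables are assumed to point at the install directories. The old jar is kept in the backup directory so the change can be reverted if the diagnosis turns out to be wrong.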
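Symptoms of HBase failing to start after a restart: While using HBase 0.20.2, we ran into two strange problems.
We built a cluster from several machines and installed and configured Hadoop and HBase by following the Hadoop/HBase "Getting Started" guides. After that, Hadoop and HBase started normally, and we could create tables and insert data.
However, when we opened the Master page at http://10.37.17.252:60010/master.jsp , we found the first problem: the region server section listed two region servers at 127.0.0.1, even though we had not configured the master as a region server in conf/regionservers:
Region Servers
Address Start Code Load
127.0.0.1:3321075 requests=0, regions=0, usedHeap=0, maxHeap=0
127.0.0.1:3321096 requests=0, regions=0, usedHeap=0, maxHeap=0
………………………………
Despite this oddity, HBase still seemed to work normally. The second problem appeared only when we tried to restart HBase: we ran bin/stop-hbase.sh and then the startup script bin/start-hbase.sh. This time, opening the master page at http://10.37.17.252:60010/master.jsp produced the following error:
HTTP ERROR: 500
Trying to contact region server null for region , row '', but failed after 3 attempts.
Exceptions: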
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy to /10.37.17.248:60020 after attempts=1
…………………………
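At this point we could still enter the HBase shell, but no operation would run. When we then tried to shut HBase down again, the master would not stop: the dots printed after "stop master" kept accumulating, and the master node never stopped. We had no choice but to kill the master process. That is how HBase went down.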
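Root cause of HBase failing to start after a restart: After investigating from several angles, I finally ran netstat -an to check port usage and found that on node WAMDM1 the region server's port 60020 was bound as:
127.0.0.1:60020
while on node WAMDM2 the region server's port 60020 was bound as:
10.37.17.249:60020
That looked suspicious, so I checked /etc/hosts and, sure enough, the hosts files on WAMDM1 and WAMDM2 differed. The hosts file on WAMDM1 contained:
127.0.0.1 WAMDM1 localhost.localdomain localhost
10.37.17.248 WAMDM1. WAMDM1
10.37.17.249 WAMDM2. WAMDM2
10.37.17.250 WAMDM3. WAMDM3
10.37.17.251 WAMDM4. WAMDM4
10.37.17.252 WAMDM5. WAMDM5
Note the first line. When configuring Hadoop/HBase we often use host names in place of IP addresses, but on the WAMDM1 machine the name WAMDM1 was mapped to 127.0.0.1, so communication between the master and the region server failed. This is also why messages like the following kept appearing in the logs and error output: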
Server at /10.37.17.248:60020 could not be reached after 1 tries, giving up.
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed setting up proxy to /10.37.17.248:60020 after attempts=1
        at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:424)
………………
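The two checks from the analysis above (the address the region server port is bound to, and host names mapped to a loopback address in /etc/hosts) can be sketched as small shell functions. Both function names are hypothetical, the netstat column layout is assumed to be the Linux net-tools one, and treating every name that starts with "localhost" or "ip6-" as legitimate is a simplifying assumption.

```shell
# Succeeds (exit 0) if stdin, in Linux `netstat -an` layout, contains a TCP
# listener bound to 127.0.0.1 on the given port.
# Assumed columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
listens_on_loopback() {
  awk -v port="$1" '
    $1 ~ /^tcp/ && $6 == "LISTEN" && $4 == ("127.0.0.1:" port) { found = 1 }
    END { exit !found }
  '
}

# Prints "address name" for every name in a hosts-format file ($1) that is
# mapped to a loopback address but is not a localhost/ip6-* alias.
loopback_mapped_names() {
  awk '
    { sub(/#.*/, "") }    # strip comments; awk re-splits the fields afterwards
    $1 ~ /^127\./ || $1 == "::1" {
      for (i = 2; i <= NF; i++)
        if ($i !~ /^(localhost|ip6-)/) print $1, $i
    }
  ' "$1"
}
```

On a node, these might be used as netstat -an | listens_on_loopback 60020 && echo "region server port is bound to loopback" and loopback_mapped_names /etc/hosts. For the WAMDM1 hosts file quoted above, the second function reports the pair 127.0.0.1 WAMDM1, which is exactly the mapping the analysis blames.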
Fix for HBase failing to start after a restart: I changed /etc/hosts on every machine to the following:
127.0.0.1 localhost
10.37.17.248 WAMDM1. WAMDM1
10.37.17.249 WAMDM2. WAMDM2
10.37.17.250 WAMDM3. WAMDM3
10.37.17.251 WAMDM4. WAMDM4
10.37.17.252 WAMDM5. WAMDM5
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
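To be safe, I also added a command to stop-hbase.sh that stops the region servers. I could not find anything online saying that stopping the region servers explicitly is necessary, nor any evidence that the script has a bug, but the change caused no problems in testing: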
"$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" --hosts "${HBASE_REGIONSERVERS}" stop regionserver
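This command must be placed before the stop master step. Whether things also work without it is something I will test further in the future.
With these changes, both the two region servers showing up as 127.0.0.1 and HBase dying on restart were completely resolved!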
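Lessons learned after fixing the restart failure: This episode taught us the following.
When configuring a distributed system, pay close attention to keeping the configuration consistent across machines, including host names (the hosts file), user names, and the various Hadoop/HBase configuration files. Any inconsistency should be checked very carefully and then unified. I have been bitten by this more than once, so please keep it in mind!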
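One way to act on this lesson is to compare copies of a configuration file collected from each node against a reference copy. The sketch below is only an illustration: the function name hosts_files_match is hypothetical, and it assumes the per-node copies have already been fetched to local files by some other means.

```shell
# Compare a reference file ($1) with every other file given.
# Prints one line per file that differs; exits non-zero if any differ.
hosts_files_match() {
  reference=$1
  shift
  status=0
  for candidate in "$@"; do
    if ! cmp -s "$reference" "$candidate"; then
      echo "differs from $reference: $candidate"
      status=1
    fi
  done
  return "$status"
}
```

For example, with copies saved under hypothetical names such as hosts.WAMDM1 through hosts.WAMDM5, running hosts_files_match hosts.WAMDM2 hosts.WAMDM1 hosts.WAMDM3 hosts.WAMDM4 hosts.WAMDM5 would have flagged the WAMDM1 copy in the situation described above. The same function works for any other file that should be identical cluster-wide.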