翁谌缜 发表于 2025-6-4 22:23:54

Zookeeper Java客户端连接慢、超时问题Ad-Hoc检查清单

TL;DR

排查思路:

[*]首先确认你的设备到zookeeper的连通性是OK的,可通过命令echo srvr | nc HOST 2181,检查是否可以正常打印节点信息。windows用户可以在命令行输入telnet HOST 2181连接后输入srvr然后回车。
[*]若步骤1检查OK,但就是连接慢或者超时,则通过启动进程连接zk期间执行jstack -l 获取线程dump信息进一步分析,若不想一一排查,可尝试下面方法快速试下:
[*]本地配置hosts文件,添加下面条目:127.0.0.1   localhost mbpro.local
::1         localhost mbpro.local注意:mbpro.local要替换为hostname命令的输出
[*]启动Java进程时配置系统属性:-Djava.net.preferIPv4Stack=true -Dzookeeper.sasl.client=false

case记录

Zookeeper Java客户端初始化打印日志记录环境信息,卡在InetAddress.getLocalHost().getCanonicalHostName()方法
"ZookeeperServiceUrlProvider-1" #1 prio=5 os_prio=0 tid=0x00000000033b0800 nid=0x432c runnable
   java.lang.Thread.State: RUNNABLE
        at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1253)
        at java.net.InetAddress.getHostFromNameService(InetAddress.java:634)
        at java.net.InetAddress.getCanonicalHostName(InetAddress.java:588)
        at org.apache.zookeeper.Environment.list(Environment.java:62)
        at org.apache.zookeeper.Environment.logEnv(Environment.java:98)
        at org.apache.zookeeper.ZooKeeper.<clinit>(ZooKeeper.java:97)
        at ...解决方法:本地配置hosts文件,添加下面条目
127.0.0.1   localhost mbpro.local
::1         localhost mbpro.local注意:mbpro.local要替换为hostname命令的输出
DNS解析响应慢导致zk客户端SendThread卡住较长时间
"ZookeeperServiceUrlProvider-1-SendThread()@9231" daemon prio=5 tid=0x1b nid=NA waiting
java.lang.Thread.State: WAITING
          at java.lang.Object.wait(Object.java:-1)
          at java.lang.Object.wait(Object.java:502)
          at java.net.InetAddress.checkLookupTable(InetAddress.java:1393)
          at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1310)
          at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
          at java.net.InetAddress.getAllByName0(InetAddress.java:1253)
          at java.net.InetAddress.getHostFromNameService(InetAddress.java:634)
          at java.net.InetAddress.getHostName(InetAddress.java:559)
          at java.net.InetAddress.getHostName(InetAddress.java:531)
          at java.net.InetSocketAddress$InetSocketAddressHolder.getHostName(InetSocketAddress.java:82)
          at java.net.InetSocketAddress$InetSocketAddressHolder.access$600(InetSocketAddress.java:56)
          at java.net.InetSocketAddress.getHostName(InetSocketAddress.java:345)
          at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:998)
          at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1060)解决方法:

[*]进一步排查DNS解析响应慢的问题
[*]切换至使用IP地址方式连接
客户端尝试SASL认证方式触发DNS解析,DNS解析响应慢导致超时
"ZookeeperServiceUrlProvider-1-SendThread(HOST:2181)" #34 daemon prio=5 os_prio=0 tid=0x0000024d6f94a000 nid=0xa984 runnable
   java.lang.Thread.State: RUNNABLE
      at java.net.Inet4AddressImpl.getHostByAddr(Native Method)
      at java.net.InetAddress$2.getHostByAddr(InetAddress.java:933)
      at java.net.InetAddress.getHostFromNameService(InetAddress.java:618)
      at java.net.InetAddress.getHostName(InetAddress.java:560)
      at java.net.InetAddress.getHostName(InetAddress.java:532)
      at java.net.InetSocketAddress$InetSocketAddressHolder.getHostName(InetSocketAddress.java:82)
      at java.net.InetSocketAddress$InetSocketAddressHolder.access$600(InetSocketAddress.java:56)
      at java.net.InetSocketAddress.getHostName(InetSocketAddress.java:345)
      at org.apache.zookeeper.SaslServerPrincipal$WrapperInetSocketAddress.getHostName(SaslServerPrincipal.java:105)
      at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:59)
      at org.apache.zookeeper.SaslServerPrincipal.getServerPrincipal(SaslServerPrincipal.java:41)
      at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1161)
      at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1211)

   Locked ownable synchronizers:
      - None解决方法:启动Java进程时配置-Dzookeeper.sasl.client=false禁用SASL(如果你不知道SASL是什么,那你大概率并不需要它)
详见:SASL Client-Server mutual authentication
参考链接


[*]https://stackoverflow.com/a/39698914
关于作者

作者 萧易客 一线深耕消息中间件,RPC框架多年,欢迎评论区或通过邮件交流。
微信公众号: 萧易客
github id: shawyeok

来源:程序园用户自行投稿发布,如果侵权,请联系站长删除
免责声明:如果侵犯了您的权益,请联系站长,我们会及时删除侵权内容,谢谢合作!
页: [1]
查看完整版本: Zookeeper Java客户端连接慢、超时问题Ad-Hoc检查清单