| Issue Type: | Bug |
|---|---|
| Priority: | 3 - Elevated |
| Status: | Resolved |
| Created at: | 2017-02-08T17:43:24.000Z |
| Updated at: | 2019-04-22T23:44:27.714Z |
| Created by: | Former user |
| Reported by: | Former user |
| Assigned to: | Former user |
Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2017-02-27T18:41:45.000Z)
After some major networking issues, we were getting alarms like this one:
```json
{
  "v": 1,
  "type": "probe",
  "user": "8a70155f-752f-eb91-a755-f4edbee5c99f",
  "probeUuid": "41b83717-42b5-4ad2-b71e-96533b805e12",
  "clear": false,
  "data": {
    "message": "Command failed (exit status: 1).",
    "value": 3,
    "details": {
      "cmd": "/bin/bash -c 'if [[ -f /opt/smartdc/minnow/bin/check-minnow ]]; then /opt/smartdc/minnow/bin/check-minnow; exit $?; fi; let delta=$(date +%s)-$(PATH=/opt/smartdc/minnow/build/node/bin:/opt/smartdc/minnow/node_modules/.bin:$PATH findobjects -h $(cat /opt/smartdc/minnow/etc/config.json | json moray.host) manta_storage hostname=$(hostname)* | json -e _mtime=_mtime/1000 -e _mtime=~~_mtime _mtime) ; test $delta -lt 900'",
      "exitStatus": 1,
      "signal": null,
      "stdout": "",
      "stderr": "check-minnow.js: record \"090e6300-cdd2-413a-b5e6-9e74fe3a996a.stor.us-east.scloud.host\" is too old (found timestamp \"2017-02-08T16:28:36.429Z\", expecting one from at least \"2017-02-08T16:34:05.572Z\"\n"
    }
  },
  "machine": "090e6300-cdd2-413a-b5e6-9e74fe3a996a",
  "uuid": "0504760b-d2a1-ec77-c6bd-c9dd1e74aca7",
  "time": 1486572545589,
  "agent": "090e6300-cdd2-413a-b5e6-9e74fe3a996a",
  "agentAlias": "storage.us-east.scloud.host-090e6300",
  "relay": "00000000-0000-0000-0000-0cc47adec00c"
}
```
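For reference, the shell pipeline in that probe's `cmd` boils down to a simple staleness check: fetch the storage record's `_mtime`, compute its age in seconds, and fail if it's 900 seconds (15 minutes) or more. A minimal sketch of the same predicate (the `findobjects` lookup is elided; `minnowRecordIsStale` is an illustrative name, not minnow's actual code):

```javascript
// Sketch of the probe's staleness check: a minnow heartbeat record is
// "too old" when its _mtime is at least 900 seconds (15 minutes) in the
// past.  _mtime is in milliseconds, matching the alarm's "/1000" above.
function minnowRecordIsStale(record, nowMs, gracePeriodSec = 900) {
  const deltaSec = (nowMs - record._mtime) / 1000;
  return deltaSec >= gracePeriodSec;
}

// The record from this incident: found 2017-02-08T16:28:36.429Z,
// checked around 2017-02-08T16:49:05Z (the alarm's "time" field),
// well past the grace period.
const record = { _mtime: Date.parse('2017-02-08T16:28:36.429Z') };
console.log(minnowRecordIsStale(record, Date.parse('2017-02-08T16:49:05Z')));
// true
```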
That message reports that the minnow record is too old (about 6 minutes older than the 15-minute grace period). Over on that minnow, I'm seeing these entries in the minnow log:
```
[2017-02-08T17:03:10.668Z] ERROR: minnow/46973 on 090e6300-cdd2-413a-b5e6-9e74fe3a996a: moray: update failed
    VError: moray client ("10.136.1.41:55825" to "10.128.0.171:2021"): request failed: server error: connect ECONNREFUSED
        at FastClientRequest.<anonymous> (/opt/smartdc/minnow/node_modules/moray/lib/rpc.js:105:19)
        at FastClientRequest.g (events.js:180:16)
        at FastClientRequest.emit (events.js:95:17)
        at Object._onImmediate (/opt/smartdc/minnow/node_modules/moray/node_modules/fast/lib/fast_client.js:601:11)
        at processImmediate [as _immediateCallback] (timers.js:363:15)
    Caused by: FastRequestError: request failed: server error: connect ECONNREFUSED
        at FastClient.requestFail (/opt/smartdc/minnow/node_modules/moray/node_modules/fast/lib/fast_client.js:582:22)
        at FastClient.onMessage (/opt/smartdc/minnow/node_modules/moray/node_modules/fast/lib/fast_client.js:753:7)
        at FastMessageDecoder.onDecoderMessage (/opt/smartdc/minnow/node_modules/moray/node_modules/fast/lib/fast_client.js:431:48)
        at FastMessageDecoder.emit (events.js:95:17)
        at FastMessageDecoder.<anonymous> (_stream_readable.js:765:14)
        at FastMessageDecoder.emit (events.js:92:17)
        at emitReadable_ (_stream_readable.js:427:10)
        at emitReadable (_stream_readable.js:423:5)
        at readableAddChunk (_stream_readable.js:166:9)
        at FastMessageDecoder.Readable.push (_stream_readable.js:128:10)
        at FastMessageDecoder.Transform.push (_stream_transform.js:140:32)
    Caused by: FastServerError: server error: connect ECONNREFUSED
        at FastClient.onMessage (/opt/smartdc/minnow/node_modules/moray/node_modules/fast/lib/fast_client.js:753:28)
        at FastMessageDecoder.onDecoderMessage (/opt/smartdc/minnow/node_modules/moray/node_modules/fast/lib/fast_client.js:431:48)
        at FastMessageDecoder.emit (events.js:95:17)
        at FastMessageDecoder.<anonymous> (_stream_readable.js:765:14)
        at FastMessageDecoder.emit (events.js:92:17)
        at emitReadable_ (_stream_readable.js:427:10)
        at emitReadable (_stream_readable.js:423:5)
        at readableAddChunk (_stream_readable.js:166:9)
        at FastMessageDecoder.Readable.push (_stream_readable.js:128:10)
        at FastMessageDecoder.Transform.push (_stream_transform.js:140:32)
        at FastMessageDecoder.decode (/opt/smartdc/minnow/node_modules/moray/node_modules/fast/lib/fast_protocol.js:454:8)
    Caused by: Error: connect ECONNREFUSED
        at errnoException (net.js:901:11)
        at Object.afterConnect [as oncomplete] (net.js:892:19)
```
I misread this at first: it's not minnow getting ECONNREFUSED from moray; rather it's the moray server responding to the RPC request with an error indicating that it (moray) had gotten ECONNREFUSED.
The pid for that moray in that zone is:
```
[root@d63cfaae (moray) ~]$ svcs -p moray-2021
STATE          STIME    FMRI
online         Feb_06   svc:/smartdc/application/moray:moray-2021
               Feb_06      5374 node
```
It's not stuck or anything:
```
[root@d63cfaae (moray) ~]$ pstack 5374
5374:   node main.js -f etc/config.json -p 2021
----------------- lwp# 1 / thread# 1 --------------------
 fe9819b7 portfs   (6, 5, 8043c04, 400, 1, 8047c04)
 083f1a1b uv__io_poll (8ef54e0, 194, f4240, 0, 8ef54f8, 0) + 18b
 083e5b1a uv_run   (8ef54e0, 0, 8047d14, 0, 5, 8fc8f60) + da
 0828b25d _ZN4node5StartEiPPc (6, 8047d14, 8047cc8, 8047ce8, 82802a4, feffb0a8) + 16d
 0829b97b main     (6, 8047d14, 8047d30, 8047d08, 81efc62, 82802b0) + 1b
 081efcc3 _start   (6, 8047e04, 8047e09, 8047e11, 8047e14, 8047e24) + 83
```
But here's a big red flag:
```
[root@d63cfaae (moray) ~]$ moraystat.d
       ------PG------ --------------MORAY---------------
  PID  CONN QLEN  OPS GETS FIND PUTS UPDS DELS BTCH RIDX
 5382     0    0    0    0   15    5    0    0    1    0
 5374     0    0    0    0   18    5    0    0    1    0
 5378    12    0 1568    0  383    3    0    0    5    0
 5380    16    2 1835    0  452    3    0    0    5    0
 5382     0    0    0    0   19    3    0    0    1    0
 5374     0    0    0    0   19    7    0    0    1    0
 5380    13    0 1838    0  437    5    0    0   17    0
 5378    16    0 1536    0  367    2    0    0   16    0
 5382     0    0    0    0   16    4    0    0    3    0
 5374     0    0    0    0   17    5    0    0    3    0
 5378    12    0 1486    0  367    4    0    0    0    0
 5380    15    0 1741    1  418    7    0    0   15    0
 5382     0    0    0    0   11    3    0    0    0    0
 5374     0    0    0    0   15    5    0    0    1    0
 5378    11    0 1417    0  350    2    0    0    3    0
 5380    15    0 1812    1  446    3    0    0    3    0
 5382     0    0    0    0   15    5    0    0    0    0
 5374     0    0    0    1   17    3    0    0    2    0
 5378     8    0 1514    2  358    4    0    0   18    0
 5380    15    0 1843    0  441    1    0    0   18    0
```
Two of the four moray processes are doing tons of work, while the other two are doing very little and have no working connections to PostgreSQL.
Our bum process is spewing this to its log:
```
[2017-02-08T17:22:44.467Z] FATAL: moray/pgpool/5374 on d63cfaae-4535-40b1-904c-4c5160ea6aa5: unable to create PG client
    Error: connect ECONNREFUSED
        at errnoException (net.js:901:11)
        at Object.afterConnect [as oncomplete] (net.js:892:19)
```
Unfortunately, it's not logging the IP it's trying to connect to, but we can guess from the netstat output:
```
[root@d63cfaae (moray) ~]$ netstat -a -f inet -P tcp -n | grep -v ESTABLISHED

TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q    State
-------------------- -------------------- ----- ------ ----- ------ -----------
      *.2020               *.*                0      0 1048576      0 LISTEN
      *.2023               *.*                0      0 1048576      0 LISTEN
      *.2024               *.*                0      0 1048576      0 LISTEN
      *.2022               *.*                0      0 1048576      0 LISTEN
      *.2021               *.*                0      0 1048576      0 LISTEN
10.128.0.171.36922   10.128.0.139.5432        0      0 1049740      0 SYN_SENT
10.128.0.171.50991   10.128.0.139.5432        0      0 1049740      0 SYN_SENT
10.128.0.171.49023   10.128.0.139.5432        0      0 1049740      0 SYN_SENT
10.128.0.171.47552   10.128.0.139.5432        0      0 1049740      0 SYN_SENT
...
```
All of the SYN_SENT connections are to 10.128.0.139. From a postgres zone for this shard, we can see that that's the deposed peer for this shard, so ECONNREFUSED is not surprising:
```
[root@67a2dda3 (postgres) ~]$ manatee-adm peers
ROLE     PEERNAME                              IP
primary  967a8538-357f-4687-8f8d-266301bfe42e  10.132.0.136
sync     c5d5239c-d720-42e3-89d8-0abeba0f51f7  10.136.0.148
deposed  67a2dda3-5e1b-4c75-a522-2dd80d99064d  10.128.0.139
```
But why didn't Moray notice that the cluster state has changed?
Do the busted moray processes have established connections to ZooKeeper? I picked pid 5374. Its "pfiles" output shows an fd to one of the ZooKeeper peers:
```
 524: S_IFSOCK mode:0666 dev:568,0 ino:53053 uid:0 gid:0 rdev:0,0
      O_RDWR|O_NONBLOCK FD_CLOEXEC
        SOCK_STREAM
        SO_SNDBUF(128000),SO_RCVBUF(1049800)
        sockname: AF_INET 10.128.0.171  port: 44776
        peername: AF_INET 10.132.0.129  port: 2181
```
Here's that connection in netstat(1) on the Moray side:
```
[root@d63cfaae (moray) ~]$ netstat -f inet -P tcp -n | grep 10.132.0.129
10.128.0.171.54965   10.132.0.129.2181    1049792      0 1049800      0 ESTABLISHED
10.128.0.171.44776   10.132.0.129.2181    1049792      0 1049800      0 ESTABLISHED
[root@d63cfaae (moray) ~]$
```
But there's no such connection from ZooKeeper:
```
[root@aad205d9 (nameservice) ~]$ netstat -f inet -P tcp -n -a | grep 10.128.0.171
10.132.0.129.2181    10.128.0.171.54965   1049792      0 1049800      0 ESTABLISHED
[root@aad205d9 (nameservice) ~]$
```
There are log entries from ZooKeeper last night related to this client, though:
```
2017-02-08 04:33:17,977 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@213] - Accepted socket connection from /10.128.0.171:44776
2017-02-08 04:33:17,977 [myid:3] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@792] - Connection request from old client /10.128.0.171:44776; will be dropped if server is in r-o mode
2017-02-08 04:33:17,977 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@838] - Client attempting to establish new session at /10.128.0.171:44776
2017-02-08 04:33:17,978 [myid:3] - INFO  [CommitProcessor:3:ZooKeeperServer@604] - Established session 0x359943906c30692 with negotiated timeout 40000 for client /10.128.0.171:44776
2017-02-08 04:34:00,002 [myid:3] - INFO  [CommitProcessor:3:NIOServerCnxn@1000] - Closed socket connection for client /10.128.0.171:44776 which had sessionid 0x359943906c30692
```
On the Moray side, there's a gap in log entries from this pid from 2017-02-08T03:39:07.043Z to 2017-02-08T04:40:47.658Z. We don't know why (though moray logs little under normal operation, so we can't tell whether entries are missing or there simply weren't any). But we do have kernel state for this connection. First, a fixed point mapping lbolt to wall-clock time:
```
> *lb_info::print -t lbolt_info_t; !date
lbolt_info_t {
    union id = {
        cyclic_id_t lbi_cyclic_id = 0xffffd062de650d40
        int64_t lbi_id_pad = 0xffffd062de650d40
    }
    int64_t lbi_thresh_calls = 0x4b
    int64_t lbi_thresh_interval = 0xa
    int64_t lbi_debug_ts = 0
    int64_t lbi_debug_time = 0x775eb74d
    int64_t lbi_internal = 0x192a17be2
    uint32_t lbi_token = 0
    boolean_t lbi_cyc_deactivate = 0x1 (B_TRUE)
    int64_t lbi_cyc_deac_start = 0x192a17be1
}
February 8, 2017 07:51:28 PM UTC
```
Now, let's get the kernel state for this file descriptor:
```
> 0t5374::pid2proc | ::pfiles ! grep -w 524
 524  SOCK ffffdcd1b2285000 socket: AF_INET 10.128.0.171 44776 remote: AF_INET 10.132.0.129 2181
```
I'm going to dump everything about this for future reference, but only a few of these fields are useful:
> ffffdcd1b2285000::print vnode_t v_data | ::print -t struct sonode so_proto_handle | ::print -t conn_t conn_t { kmutex_t conn_lock = { void *[1] _opaque = [ 0 ] } uint32_t conn_ref = 0x3 uint32_t conn_flags = 0x40001001 union conn_proto_priv = { tcp_t *cp_tcp = 0xffffddf6bb394a40 struct udp_s *cp_udp = 0xffffddf6bb394a40 struct icmp_s *cp_icmp = 0xffffddf6bb394a40 struct rts_s *cp_rts = 0xffffddf6bb394a40 struct iptun_s *cp_iptun = 0xffffddf6bb394a40 struct sctp_s *cp_sctp = 0xffffddf6bb394a40 void *cp_priv = 0xffffddf6bb394a40 } kcondvar_t conn_cv = { ushort_t _opaque = 0 } uint8_t conn_proto = 0x6 edesc_rpf conn_recv = tcp_input_data edesc_rpf conn_recvicmp = tcp_icmp_input edesc_vpf conn_verifyicmp = tcp_verifyicmp ip_xmit_attr_t *conn_ixa = 0xffffdcd1b0df1a00 crb_t conn_recv_ancillary = { union crbu = { uint32_t crbu_all = 0 struct crbb = { unsigned int:1 crbb_recvdstaddr :1 = 0 unsigned int:1 crbb_recvopts :1 = 0 unsigned int:1 crbb_recvif :1 = 0 unsigned int:1 crbb_recvslla :1 = 0 unsigned int:1 crbb_recvttl :1 = 0 unsigned int:1 crbb_ip_recvpktinfo :1 = 0 unsigned int:1 crbb_ipv6_recvhoplimit :1 = 0 unsigned int:1 crbb_ipv6_recvhopopts :1 = 0 unsigned int:1 crbb_ipv6_recvdstopts :1 = 0 unsigned int:1 crbb_ipv6_recvrthdr :1 = 0 unsigned int:1 crbb_old_ipv6_recvdstopts :1 = 0 unsigned int:1 crbb_ipv6_recvrthdrdstopts :1 = 0 unsigned int:1 crbb_ipv6_recvtclass :1 = 0 unsigned int:1 crbb_recvucred :1 = 0 unsigned int:1 crbb_timestamp :1 = 0 } } } squeue_t *conn_sqp = 0xffffd063194ba6c0 uint_t conn_state_flags = 0 int conn_lingertime = 0 unsigned int:1 conn_on_sqp :1 = 0 unsigned int:1 conn_linger :1 = 0 unsigned int:1 conn_useloopback :1 = 0 unsigned int:1 conn_broadcast :1 = 0 unsigned int:1 conn_reuseaddr :1 = 0 unsigned int:1 conn_keepalive :1 = 0 unsigned int:1 conn_multi_router :1 = 0 unsigned int:1 conn_unspec_src :1 = 0 unsigned int:1 conn_policy_cached :1 = 0x1 unsigned int:1 conn_in_enforce_policy :1 = 0 unsigned int:1 conn_out_enforce_policy :1 = 0 
unsigned int:1 conn_debug :1 = 0 unsigned int:1 conn_ipv6_v6only :1 = 0 unsigned int:1 conn_oobinline :1 = 0 unsigned int:1 conn_dgram_errind :1 = 0 unsigned int:1 conn_exclbind :1 = 0 unsigned int:1 conn_mdt_ok :1 = 0 unsigned int:1 conn_allzones :1 = 0 unsigned int:1 conn_ipv6_recvpathmtu :1 = 0 unsigned int:1 conn_mcbc_bind :1 = 0 unsigned int:1 conn_reuseport :1 = 0 unsigned int:11 conn_pad_to_bit_31 :11 = 0 boolean_t conn_blocked = 0 (0) squeue_t *conn_initial_sqp = 0xffffd063194ba6c0 squeue_t *conn_final_sqp = 0 ill_t *conn_dhcpinit_ill = 0 ipsec_latch_t *conn_latch = 0 struct ipsec_policy_s *conn_latch_in_policy = 0 struct ipsec_action_s *conn_latch_in_action = 0 uint_t conn_bound_if = 0 queue_t *conn_rq = 0 queue_t *conn_wq = 0 dev_t conn_dev = 0 vmem_t *conn_minor_arena = 0 ip_helper_stream_info_t *conn_helper_info = 0 cred_t *conn_cred = 0xffffd063257091a0 pid_t conn_cpid = 0x14fe uint64_t conn_open_time = 0x117fa21e5 connf_t *conn_g_fanout = 0xffffd0906ef5ea30 struct conn_s *conn_g_next = 0xffffd1d179e5b840 struct conn_s *conn_g_prev = 0xffffd0beeebac180 struct ipsec_policy_head_s *conn_policy = 0 in6_addr_t conn_bound_addr_v6 = :: connf_t *conn_fanout = 0xfffff0ad54c8b300 struct conn_s *conn_next = 0 struct conn_s *conn_prev = 0 struct connua_v6addr = { in6_addr_t connua_laddr = ::ffff:10.128.0.171 in6_addr_t connua_faddr = ::ffff:10.132.0.129 } in6_addr_t conn_saddr_v6 = ::ffff:10.128.0.171 union u_port = { uint32_t connu_ports2 = 0xe8ae8508 struct connu_ports = { in_port_t connu_fport = 0x8508 in_port_t connu_lport = 0xe8ae } } uint_t conn_incoming_ifindex = 0 ill_t *conn_oper_pending_ill = 0 krwlock_t conn_ilg_lock = { void *[1] _opaque = [ 0 ] } ilg_t *conn_ilg = 0 kcondvar_t conn_refcv = { ushort_t _opaque = 0 } struct conn_s *conn_drain_next = 0 struct conn_s *conn_drain_prev = 0 idl_t *conn_idl = 0 mblk_t *conn_ipsec_opt_mp = 0 zoneid_t conn_zoneid = 0 int conn_rtaware = 0 kcondvar_t conn_sq_cv = { ushort_t _opaque = 0 } sock_upcalls_t 
*conn_upcalls = so_upcalls sock_upper_handle_t conn_upper_handle = 0xffffdcd5933cf7b0 unsigned int:2 conn_mlp_type :2 = 0 unsigned int:1 conn_anon_mlp :1 = 0 unsigned int:1 conn_anon_port :1 = 0 unsigned int:2 conn_mac_mode :2 = 0 unsigned int:1 conn_anon_priv_bind :1 = 0 unsigned int:1 conn_zone_is_global :1 = 0 unsigned int:1 conn_isvrrp :1 = 0 unsigned int:23 conn_spare :23 = 0 boolean_t conn_flow_cntrld = 0 (0) netstack_t *conn_netstack = 0xffffd090ce6ea900 ushort_t conn_ipversion = 0x4 sa_family_t conn_family = 0x2 uint_t conn_so_type = 0x2 uint_t conn_sndbuf = 0x1f400 uint_t conn_rcvbuf = 0x1004c8 uint_t conn_wroff = 0x78 uint_t conn_sndlowat = 0x3200 uint_t conn_rcvlowat = 0 uint8_t conn_default_ttl = 0x40 uint32_t conn_flowinfo = 0 in6_addr_t conn_v6lastdst = :: ushort_t conn_lastipversion = 0 in_port_t conn_lastdstport = 0 uint32_t conn_lastflowinfo = 0 uint_t conn_lastscopeid = 0 uint_t conn_lastsrcid = 0 in6_addr_t conn_v6lastsrc = :: ip_pkt_t conn_xmit_ipp = { uint_t ipp_fields = 0 in6_addr_t ipp_addr = :: uint_t ipp_unicast_hops = 0x40 uint_t ipp_hoplimit = 0 uint_t ipp_hopoptslen = 0 uint_t ipp_rthdrdstoptslen = 0 uint_t ipp_rthdrlen = 0 uint_t ipp_dstoptslen = 0 uint_t ipp_fraghdrlen = 0 ip6_hbh_t *ipp_hopopts = 0 ip6_dest_t *ipp_rthdrdstopts = 0 ip6_rthdr_t *ipp_rthdr = 0 ip6_dest_t *ipp_dstopts = 0 ip6_frag_t *ipp_fraghdr = 0 uint8_t ipp_tclass = 0 uint8_t ipp_type_of_service = 0 uint_t ipp_ipv4_options_len = 0 uint8_t *ipp_ipv4_options = 0 uint_t ipp_label_len_v4 = 0 uint8_t *ipp_label_v4 = 0 uint_t ipp_label_len_v6 = 0 uint8_t *ipp_label_v6 = 0 } uint8_t *conn_ht_iphc = 0xffffe3a84d7ac830 uint_t conn_ht_iphc_allocated = 0x50 uint_t conn_ht_iphc_len = 0x34 uint8_t *conn_ht_ulp = 0xffffe3a84d7ac844 uint_t conn_ht_ulp_len = 0x20 uint32_t conn_sum = 0 uint32_t conn_ioctlref = 0 } > ffffdcd1b2285000::print vnode_t v_data | ::print -t struct sonode so_proto_handle | ::print -t conn_t conn_proto_priv.cp_tcp[] tcp_t conn_proto_priv.cp_tcp = { struct 
tcp_s *conn_proto_priv.cp_tcp->tcp_time_wait_next = 0 struct tcp_s *conn_proto_priv.cp_tcp->tcp_time_wait_prev = 0 int64_t conn_proto_priv.cp_tcp->tcp_time_wait_expire = 0 struct conn_s *conn_proto_priv.cp_tcp->tcp_connp = 0xffffddf6bb394740 tcp_stack_t *conn_proto_priv.cp_tcp->tcp_tcps = 0xffffd0bee63a5200 int32_t conn_proto_priv.cp_tcp->tcp_state = 0 int32_t conn_proto_priv.cp_tcp->tcp_rcv_ws = 0x5 int32_t conn_proto_priv.cp_tcp->tcp_snd_ws = 0x5 uint32_t conn_proto_priv.cp_tcp->tcp_ts_recent = 0x178f8126 clock_t conn_proto_priv.cp_tcp->tcp_rto = 0x6c8 int64_t conn_proto_priv.cp_tcp->tcp_last_rcv_lbolt = 0x117fa25df uint32_t conn_proto_priv.cp_tcp->tcp_rto_initial = 0x3e8 uint32_t conn_proto_priv.cp_tcp->tcp_rto_min = 0x190 uint32_t conn_proto_priv.cp_tcp->tcp_rto_max = 0xea60 uint32_t conn_proto_priv.cp_tcp->tcp_snxt = 0xf1ba6d31 uint32_t conn_proto_priv.cp_tcp->tcp_swnd = 0x1004c0 uint32_t conn_proto_priv.cp_tcp->tcp_mss = 0x5a8 uint32_t conn_proto_priv.cp_tcp->tcp_iss = 0xf1ba6d00 uint32_t conn_proto_priv.cp_tcp->tcp_rnxt = 0xc4418b76 uint32_t conn_proto_priv.cp_tcp->tcp_rwnd = 0x1004c8 mblk_t *conn_proto_priv.cp_tcp->tcp_xmit_head = 0 mblk_t *conn_proto_priv.cp_tcp->tcp_xmit_last = 0xffffd0907fb78600 mblk_t *conn_proto_priv.cp_tcp->tcp_xmit_tail = 0 uint32_t conn_proto_priv.cp_tcp->tcp_unsent = 0 uint32_t conn_proto_priv.cp_tcp->tcp_xmit_tail_unsent = 0 uint32_t conn_proto_priv.cp_tcp->tcp_suna = 0xf1ba6d31 uint32_t conn_proto_priv.cp_tcp->tcp_rexmit_nxt = 0xf1ba6d31 uint32_t conn_proto_priv.cp_tcp->tcp_rexmit_max = 0xf1ba6d31 uint32_t conn_proto_priv.cp_tcp->tcp_cwnd = 0xb50 int32_t conn_proto_priv.cp_tcp->tcp_cwnd_cnt = 0 uint32_t conn_proto_priv.cp_tcp->tcp_ibsegs = 0 uint32_t conn_proto_priv.cp_tcp->tcp_obsegs = 0 uint32_t conn_proto_priv.cp_tcp->tcp_naglim = 0x1 uint32_t conn_proto_priv.cp_tcp->tcp_valid_bits = 0 timeout_id_t conn_proto_priv.cp_tcp->tcp_timer_tid = 0 uchar_t conn_proto_priv.cp_tcp->tcp_timer_backoff = 0 int64_t 
conn_proto_priv.cp_tcp->tcp_last_recv_time = 0x117fa2a51 uint32_t conn_proto_priv.cp_tcp->tcp_init_cwnd = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_urp_last_valid :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_hard_binding :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_fin_acked :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_fin_rcvd :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_fin_sent :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_ordrel_done :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_detached :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_zero_win_probe :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_loopback :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_localnet :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_syn_defense :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_set_timer :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_active_open :1 = 0x1 unsigned int:1 conn_proto_priv.cp_tcp->tcp_rexmit :1 = 0x1 unsigned int:1 conn_proto_priv.cp_tcp->tcp_snd_sack_ok :1 = 0x1 unsigned int:1 conn_proto_priv.cp_tcp->tcp_hwcksum :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_ip_forward_progress :1 = 0x1 unsigned int:1 conn_proto_priv.cp_tcp->tcp_ecn_ok :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_ecn_echo_on :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_ecn_cwr_sent :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_cwr :1 = 0 unsigned int:11 conn_proto_priv.cp_tcp->tcp_pad_to_bit31 :11 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_snd_ts_ok :1 = 0x1 unsigned int:1 conn_proto_priv.cp_tcp->tcp_snd_ws_ok :1 = 0x1 unsigned int:1 conn_proto_priv.cp_tcp->tcp_reserved_port :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_in_free_list :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_snd_zcopy_on :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_snd_zcopy_aware :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_xmit_zc_clean :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_wait_for_eagers :1 = 0 unsigned int:1 
conn_proto_priv.cp_tcp->tcp_accept_error :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_send_discon_ind :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_cork :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_tconnind_started :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_lso :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_is_wnd_shrnk :1 = 0 unsigned int:18 conn_proto_priv.cp_tcp->tcp_pad_to_bit_31 :18 = 0 uint32_t conn_proto_priv.cp_tcp->tcp_initial_pmtu = 0x5dc mblk_t *conn_proto_priv.cp_tcp->tcp_reass_head = 0 mblk_t *conn_proto_priv.cp_tcp->tcp_reass_tail = 0 tcp_sack_info_t conn_proto_priv.cp_tcp->tcp_sack_info = { int32_t tcp_pipe = 0 tcp_seq tcp_fack = 0 tcp_seq tcp_sack_snxt = 0 int32_t tcp_max_sack_blk = 0x3 int32_t tcp_num_sack_blk = 0 sack_blk_t [5] tcp_sack_list = [ sack_blk_t { tcp_seq begin = 0 tcp_seq end = 0 }, sack_blk_t { tcp_seq begin = 0 tcp_seq end = 0 }, sack_blk_t { tcp_seq begin = 0 tcp_seq end = 0 }, sack_blk_t { tcp_seq begin = 0 tcp_seq end = 0 }, sack_blk_t { tcp_seq begin = 0 tcp_seq end = 0 }, ] int32_t tcp_num_notsack_blk = 0 uint32_t tcp_cnt_notsack_list = 0 notsack_blk_t *tcp_notsack_list = 0 } mblk_t *conn_proto_priv.cp_tcp->tcp_rcv_list = 0 mblk_t *conn_proto_priv.cp_tcp->tcp_rcv_last_head = 0 mblk_t *conn_proto_priv.cp_tcp->tcp_rcv_last_tail = 0 uint32_t conn_proto_priv.cp_tcp->tcp_rcv_cnt = 0 uint32_t conn_proto_priv.cp_tcp->tcp_cwnd_ssthresh = 0xb50 uint32_t conn_proto_priv.cp_tcp->tcp_cwnd_max = 0x1004c8 uint32_t conn_proto_priv.cp_tcp->tcp_csuna = 0xf1ba6d31 clock_t conn_proto_priv.cp_tcp->tcp_rtt_sa = 0x121e clock_t conn_proto_priv.cp_tcp->tcp_rtt_sd = 0x3f5 clock_t conn_proto_priv.cp_tcp->tcp_rtt_update = 0x1 clock_t conn_proto_priv.cp_tcp->tcp_ms_we_have_waited = 0 uint32_t conn_proto_priv.cp_tcp->tcp_swl1 = 0xc4418b9e uint32_t conn_proto_priv.cp_tcp->tcp_swl2 = 0xf1ba6d31 uint32_t conn_proto_priv.cp_tcp->tcp_rack = 0xc4418b76 uint32_t conn_proto_priv.cp_tcp->tcp_rack_cnt = 0 uint32_t 
conn_proto_priv.cp_tcp->tcp_rack_cur_max = 0 uint32_t conn_proto_priv.cp_tcp->tcp_rack_abs_max = 0x2 timeout_id_t conn_proto_priv.cp_tcp->tcp_ack_tid = 0 timeout_id_t conn_proto_priv.cp_tcp->tcp_push_tid = 0 uint32_t conn_proto_priv.cp_tcp->tcp_max_swnd = 0x1004c0 struct tcp_s *conn_proto_priv.cp_tcp->tcp_listener = 0 uint32_t conn_proto_priv.cp_tcp->tcp_irs = 0xc4418b75 uint32_t conn_proto_priv.cp_tcp->tcp_fss = 0xf1ba6cff uint32_t conn_proto_priv.cp_tcp->tcp_urg = 0 clock_t conn_proto_priv.cp_tcp->tcp_first_timer_threshold = 0x2710 clock_t conn_proto_priv.cp_tcp->tcp_second_timer_threshold = 0x493e0 clock_t conn_proto_priv.cp_tcp->tcp_first_ctimer_threshold = 0x2710 clock_t conn_proto_priv.cp_tcp->tcp_second_ctimer_threshold = 0x2bf20 uint32_t conn_proto_priv.cp_tcp->tcp_urp_last = 0 mblk_t *conn_proto_priv.cp_tcp->tcp_urp_mp = 0 mblk_t *conn_proto_priv.cp_tcp->tcp_urp_mark_mp = 0 int conn_proto_priv.cp_tcp->tcp_conn_req_cnt_q0 = 0 int conn_proto_priv.cp_tcp->tcp_conn_req_cnt_q = 0 int conn_proto_priv.cp_tcp->tcp_conn_req_max = 0 t_scalar_t conn_proto_priv.cp_tcp->tcp_conn_req_seqnum = 0 kmutex_t conn_proto_priv.cp_tcp->tcp_eager_lock = { void *[1] _opaque = [ 0 ] } struct tcp_s *conn_proto_priv.cp_tcp->tcp_eager_next_q = 0 struct tcp_s *conn_proto_priv.cp_tcp->tcp_eager_last_q = 0 struct tcp_s *conn_proto_priv.cp_tcp->tcp_eager_next_q0 = 0 struct tcp_s *conn_proto_priv.cp_tcp->tcp_eager_prev_q0 = 0 boolean_t conn_proto_priv.cp_tcp->tcp_conn_def_q0 = 0 (0) union conn_proto_priv.cp_tcp->tcp_conn = { mblk_t *tcp_eager_conn_ind = 0 mblk_t *tcp_opts_conn_req = 0 } uint32_t conn_proto_priv.cp_tcp->tcp_syn_rcvd_timeout = 0 int32_t conn_proto_priv.cp_tcp->tcp_ka_last_intrvl = 0 timeout_id_t conn_proto_priv.cp_tcp->tcp_ka_tid = 0 uint32_t conn_proto_priv.cp_tcp->tcp_ka_interval = 0x6ddd00 uint32_t conn_proto_priv.cp_tcp->tcp_ka_rinterval = 0 uint32_t conn_proto_priv.cp_tcp->tcp_ka_abort_thres = 0x75300 uint32_t conn_proto_priv.cp_tcp->tcp_ka_cnt = 0 int32_t 
conn_proto_priv.cp_tcp->tcp_client_errno = 0 ipha_t *conn_proto_priv.cp_tcp->tcp_ipha = 0xffffe3a84d7ac830 ip6_t *conn_proto_priv.cp_tcp->tcp_ip6h = 0 tcpha_t *conn_proto_priv.cp_tcp->tcp_tcpha = 0xffffe3a84d7ac844 uint16_t conn_proto_priv.cp_tcp->tcp_last_sent_len = 0x30 uint16_t conn_proto_priv.cp_tcp->tcp_last_recv_len = 0 uint16_t conn_proto_priv.cp_tcp->tcp_dupack_cnt = 0 kmutex_t *conn_proto_priv.cp_tcp->tcp_acceptor_lockp = 0 mblk_t *conn_proto_priv.cp_tcp->tcp_ordrel_mp = 0 t_uscalar_t conn_proto_priv.cp_tcp->tcp_acceptor_id = 0 int conn_proto_priv.cp_tcp->tcp_ipsec_overhead = 0 uint_t conn_proto_priv.cp_tcp->tcp_recvifindex = 0 uint_t conn_proto_priv.cp_tcp->tcp_recvhops = 0 uint_t conn_proto_priv.cp_tcp->tcp_recvtclass = 0 ip6_hbh_t *conn_proto_priv.cp_tcp->tcp_hopopts = 0 ip6_dest_t *conn_proto_priv.cp_tcp->tcp_dstopts = 0 ip6_dest_t *conn_proto_priv.cp_tcp->tcp_rthdrdstopts = 0 ip6_rthdr_t *conn_proto_priv.cp_tcp->tcp_rthdr = 0 uint_t conn_proto_priv.cp_tcp->tcp_hopoptslen = 0 uint_t conn_proto_priv.cp_tcp->tcp_dstoptslen = 0 uint_t conn_proto_priv.cp_tcp->tcp_rthdrdstoptslen = 0 uint_t conn_proto_priv.cp_tcp->tcp_rthdrlen = 0 mblk_t *conn_proto_priv.cp_tcp->tcp_timercache = 0xffffd0909077d148 kmutex_t conn_proto_priv.cp_tcp->tcp_closelock = { void *[1] _opaque = [ 0 ] } kcondvar_t conn_proto_priv.cp_tcp->tcp_closecv = { ushort_t _opaque = 0 } uint8_t conn_proto_priv.cp_tcp->tcp_closed = 0 uint8_t conn_proto_priv.cp_tcp->tcp_closeflags = 0 mblk_t conn_proto_priv.cp_tcp->tcp_closemp = { struct msgb *b_next = 0 struct msgb *b_prev = 0 struct msgb *b_cont = 0 unsigned char *b_rptr = 0 unsigned char *b_wptr = 0 struct datab *b_datap = 0 unsigned char b_band = 0 unsigned char b_tag = 0 unsigned short b_flag = 0 queue_t *b_queue = 0 } timeout_id_t conn_proto_priv.cp_tcp->tcp_linger_tid = 0 struct tcp_s *conn_proto_priv.cp_tcp->tcp_acceptor_hash = 0 struct tcp_s **conn_proto_priv.cp_tcp->tcp_ptpahn = 0 struct tcp_s *conn_proto_priv.cp_tcp->tcp_bind_hash = 
0xffffd0bf08f733c0 struct tcp_s *conn_proto_priv.cp_tcp->tcp_bind_hash_port = 0 struct tcp_s **conn_proto_priv.cp_tcp->tcp_ptpbhn = 0xffffd09087e9fa00 struct tcp_rg_s *conn_proto_priv.cp_tcp->tcp_rg_bind = 0 uint_t conn_proto_priv.cp_tcp->tcp_maxpsz_multiplier = 0xa uint32_t conn_proto_priv.cp_tcp->tcp_lso_max = 0 uint32_t conn_proto_priv.cp_tcp->tcp_ofo_fin_seq = 0 uint32_t conn_proto_priv.cp_tcp->tcp_cwr_snd_max = 0 struct tcp_s *conn_proto_priv.cp_tcp->tcp_saved_listener = 0 uint32_t conn_proto_priv.cp_tcp->tcp_in_ack_unsent = 0 struct tcp_s *conn_proto_priv.cp_tcp->tcp_loopback_peer = 0 mblk_t *conn_proto_priv.cp_tcp->tcp_fused_sigurg_mp = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_fused :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_unfusable :1 = 0 unsigned int:1 conn_proto_priv.cp_tcp->tcp_fused_sigurg :1 = 0 unsigned int:29 conn_proto_priv.cp_tcp->tcp_fuse_to_bit_31 :29 = 0 kmutex_t conn_proto_priv.cp_tcp->tcp_non_sq_lock = { void *[1] _opaque = [ 0 ] } boolean_t conn_proto_priv.cp_tcp->tcp_issocket = 0x1 (B_TRUE) uint32_t conn_proto_priv.cp_tcp->tcp_squeue_bytes = 0 boolean_t conn_proto_priv.cp_tcp->tcp_closemp_used = 0 (0) struct tcp_s *conn_proto_priv.cp_tcp->tcp_eager_prev_drop_q0 = 0 struct tcp_s *conn_proto_priv.cp_tcp->tcp_eager_next_drop_q0 = 0 boolean_t conn_proto_priv.cp_tcp->tcp_flow_stopped = 0 (0) uint32_t conn_proto_priv.cp_tcp->tcp_snxt_shrunk = 0 sock_connid_t conn_proto_priv.cp_tcp->tcp_connid = 0x1 mblk_t *conn_proto_priv.cp_tcp->tcp_rsrv_mp = 0xffffd0a4b5f6f7a0 kmutex_t conn_proto_priv.cp_tcp->tcp_rsrv_mp_lock = { void *[1] _opaque = [ 0 ] } struct tcp_listen_cnt_s *conn_proto_priv.cp_tcp->tcp_listen_cnt = 0 timeout_id_t conn_proto_priv.cp_tcp->tcp_reass_tid = 0 uint32_t conn_proto_priv.cp_tcp->tcp_fin_wait_2_flush_interval = 0xea60 }
The key fields are conn_open_time = 0x117fa21e5 and tcp_last_recv_time = 0x117fa2a51. We can compute how long ago each of these was by subtracting it from the lbi_internal value we printed above, minus lbi_debug_time (because that appears to be how the kernel computes lbolt):
```
> 0x192a17be2-0x775eb74d-0x117fa2a51=E
                55089732
```
and we can subtract this from the timestamp we recorded above for lbolt 0x192a17be2-0x775eb74d.
```
> new Date(new Date('February 8, 2017 07:51:28 PM UTC').getTime() - 55089732).toISOString()
'2017-02-08T04:33:18.268Z'
```
From this, we have that the connection was opened close to 2017-02-08T04:33:16.112Z and last received a packet close to 2017-02-08T04:33:18.268Z. That's reasonably consistent with the ZooKeeper logs above, which showed that ZK accepted the connection at 2017-02-08 04:33:17.977, and it timed it out after 40 seconds of inactivity at 2017-02-08T04:34:00.002Z.
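As a sanity check, the same arithmetic can be repeated for `conn_open_time` in a few lines of Node, using the `!date` output captured above as the fixed point:

```javascript
// Repeating the lbolt arithmetic above for conn_open_time (0x117fa21e5)
// instead of tcp_last_recv_time.  Values come from the mdb output above.
const lbiInternal = 0x192a17be2;   // lbi_internal from lb_info
const lbiDebugTime = 0x775eb74d;   // lbi_debug_time from lb_info
const connOpenTime = 0x117fa21e5;  // conn_open_time from the conn_t

// Current lbolt is (lbi_internal - lbi_debug_time); the delta to a
// recorded lbolt value is in ticks (1ms each here).
const msAgo = lbiInternal - lbiDebugTime - connOpenTime;

// The `!date` fixed point: February 8, 2017 07:51:28 PM UTC.
const now = Date.UTC(2017, 1, 8, 19, 51, 28);
console.log(new Date(now - msAgo).toISOString());
// 2017-02-08T04:33:16.112Z
```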
It looks from this like the moray process established a connection around 04:33:16Z, but then didn't do anything with it. 40 seconds later, ZK timed it out and closed it, but the Moray side never found out, which suggests a partition. The fact that the Moray side didn't separately time out the connection suggests that it had not sent any unacknowledged data over it, which basically always indicates a bug for a ZooKeeper client (which is supposed to heartbeat every few seconds).
We have a core file from this moray process. Hopefully that will shed some light on why this process isn't trying to send anything over that connection.
Also of note: tcp_ibsegs and tcp_obsegs are both 0, but tcp_last_sent_len is 48, and we're not sure what these mean.
In the core dump, we can see that the current zk client ConnectionManager is stuck waiting for the ConnectResponse frame:
```
> ::findjsobjects -c ConnectionManager | ::findjsobjects | ::jsprint state socket.destroyed socket._connecting socket._bytesDispatched socket.bytesRead connectTimeoutHandler.ontimeout
1 false false 48 0 null
```
This is a ConnectionManager that's in STATE_CONNECTING (1), whose socket is neither destroyed nor connecting, and whose connect timeout has been cleared. It has sent 48 bytes (the size of a ConnectRequest) and received none, so it's definitely in the indefinite stuck state where it will send no more traffic until it receives a ConnectResponse from ZK.
FWIW I've added a zk-stuck analyzer to thoth to identify gcores of processes that are in this state.
Alex took a look at the core file and confirmed that this appears to be the same root cause as MANATEE-288: the node zookeeper client is stuck in the state transition where it has a connected TCP socket but has not received a response from ZooKeeper to its L7 connect request.
I checked whether the shard 1 Moray processes in east1b and east1c were affected by this. I think I logged in, ran moraystat.d, and compared it to the output above. All four processes had working connections to PostgreSQL. As a result, I was able to mitigate the problem on shard 1 by taking the east1a zone out of service, running "svcadm disable -s registrar" there.
After doing that, we found we were still seeing request failures at muskie, reporting ECONNREFUSED errors from node-fast. That's consistent with the same issue, and there's no reason to expect it would be limited to shard 1; these failures appeared to be spread across all shards. I went to see which Moray processes were affected using this command:
```
[root@headnode (us-east-1c) ~]# manta-oneach -s moray 'grep -c ECONNREFUSED /var/log/moray.log'
SERVICE          ZONE     OUTPUT
moray            02cbb82e 0
moray            187de16b 0
moray            1ea718ae 0
moray            23a267ae 0
moray            246790af 0
moray            2e130d3a 0
moray            2fc31cf5 0
moray            36dfa71a 0
moray            3b603936 0
moray            3e71cad1 0
moray            40a6317c 0
moray            46a37d98 0
moray            4783cdfe 0
moray            54d6acea 0
moray            6769cb7d 0
moray            8757af0d 0
moray            9bedb32a 0
moray            9cf77575 0
moray            a249f546 0
moray            a25c4080 0
moray            a8855e3f 0
moray            b70034da 0
moray            c2ff63c0 0
moray            cfa28605 0
moray            f4bcc227 0
```
east1b looked just like east1c -- no errors in any process -- but east1a showed one moray process in each of four zones:
[root@headnode (us-east-1a) ~]# manta-oneach -s moray 'grep -c ECONNREFUSED /var/log/moray.log'
SERVICE          ZONE     OUTPUT
moray            0089fcd7 0
moray            08b021cf 0
moray            0a351171 867
moray            17b37a11 0
moray            1d11226b 87
moray            2bd7a375 0
moray            2d309372 0
moray            2dd7cb5b 228
moray            2eb18f5c 0
moray            4228c63b 0
moray            4e6e76f0 0
moray            59fd11a8 167
moray            9187bb05 0
moray            97d2dea1 0
moray            98d70f0e 0
moray            ae75f0f5 0
moray            b5b915ab 0
moray            c62ea126 0
moray            cf21edd3 0
moray            d373891f 0
moray            d63cfaae 0
moray            d7fd4e9d 0
moray            dd07da56 0
moray            e143ca38 0
moray            f2b69942 0
Specifically, it's these pids:
[root@headnode (us-east-1a) ~]# manta-oneach -s moray 'grep ECONNREFUSED /var/log/moray.log | json -ga pid msg | sort | uniq -c'
SERVICE          ZONE     OUTPUT
moray            0089fcd7
moray            08b021cf
moray            0a351171    2000 4832 unable to create PG client
moray            17b37a11
moray            1d11226b    4171 16018 unable to create PG client
moray            2bd7a375
moray            2d309372
moray            2dd7cb5b    4811 22252 unable to create PG client
moray            2eb18f5c
moray            4228c63b
moray            4e6e76f0
moray            59fd11a8    3081 54191 unable to create PG client
moray            9187bb05
moray            97d2dea1
moray            98d70f0e
moray            ae75f0f5
moray            b5b915ab
moray            c62ea126
moray            cf21edd3
moray            d373891f
moray            d63cfaae
moray            d7fd4e9d
moray            dd07da56
moray            e143ca38
moray            f2b69942
I modified the invocation to give me the full zonename:
[root@headnode (us-east-1a) ~]# manta-oneach -s moray 'grep -c ECONNREFUSED /var/log/moray.log > /dev/null && zonename'
SERVICE          ZONE     OUTPUT
moray            0089fcd7
moray            08b021cf
moray            0a351171 0a351171-0080-443e-9e0e-a26f99f56dac
moray            17b37a11
moray            1d11226b 1d11226b-bae1-4667-a3cb-20775a981f01
moray            2bd7a375
moray            2d309372
moray            2dd7cb5b 2dd7cb5b-3a38-443a-b04b-826645263e41
moray            2eb18f5c
moray            4228c63b
moray            4e6e76f0
moray            59fd11a8 59fd11a8-3556-4990-92b2-e293deaadd74
moray            9187bb05
moray            97d2dea1
moray            98d70f0e
moray            ae75f0f5
moray            b5b915ab
moray            c62ea126
moray            cf21edd3
moray            d373891f
moray            d63cfaae
moray            d7fd4e9d
moray            dd07da56
moray            e143ca38
moray            f2b69942
Then I disabled registrar on each of these to take them out of service:
[root@headnode (us-east-1a) ~]# manta-oneach -z 0a351171-0080-443e-9e0e-a26f99f56dac,1d11226b-bae1-4667-a3cb-20775a981f01,2dd7cb5b-3a38-443a-b04b-826645263e41,59fd11a8-3556-4990-92b2-e293deaadd74 'svcadm disable registrar'
SERVICE          ZONE     OUTPUT
moray            0a351171
moray            1d11226b
moray            2dd7cb5b
moray            59fd11a8
Shortly thereafter, the error rate reported by mlive went to about zero.
For reference, these were shards 13, 20, 22, and 16, respectively. One question: why did requests to many shards appear to fail, if only these four were broken? A PUT request requires metadata operations on two shards: that of the parent directory and that of the object itself, but the log only records one of them. So a failure logged against a healthy shard may actually have been caused by the other, broken shard involved in the same request.
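A back-of-the-envelope sketch of how four bad shards can taint a much larger fraction of PUTs. The shard count and independence assumption below are illustrative only, not taken from this deployment: if each PUT touches two shards chosen roughly independently and uniformly, then with 4 bad shards out of an assumed 22, about a third of all PUTs hit at least one bad shard.

```shell
# Illustrative arithmetic only: assumes 22 metadata shards total (not
# confirmed for this deployment) and that the two shards a PUT touches
# (parent directory and object) are chosen independently and uniformly.
awk 'BEGIN {
    total = 22; bad = 4;
    ok = (total - bad) / total;   # P(a single lookup lands on a good shard)
    printf("affected PUT fraction: %.2f\n", 1 - ok * ok);
}'
# prints: affected PUT fraction: 0.33
```

Under those assumptions, the error rate observed at muskie would be several times larger than the fraction of broken shards, which is consistent with failures appearing "spread across all shards."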
There are a number of additional tickets to file related to this issue, mostly to improve debugging.
I've restarted the affected moray processes and re-enabled registrar.
So I've come up with a horrifying way to reproduce this in the lab easily:
First, set this up on a nameservice instance (it will stop the entire ZooKeeper process for 1 second every time a new client connects!):
svcs -p zookeeper | awk '$3 == "java" { print $2; exit(0); }' | xargs \
    dtrace -w \
    -n 'pid$1::accept:return { newsock[arg1] = 1; }' \
    -n 'pid$1::write:entry /newsock[arg0] == 1 && (arg2 == 40 || arg2 == 41)/ { newsock[arg0] = 0; stop(); system("sleep 1; prun %d", pid); }' \
    -n 'pid$1::close:entry /newsock[arg0] == 1/ { newsock[arg0] = 0; }'
Then do this to a moray:
ipdadm remove drop && svcadm disable moray-2021 && sleep 5 && \
    svcadm enable moray-2021 && sleep 0.9 && \
    svcs -p moray-2021 | awk '$3 == "node" { print $2 }' | xargs pfiles | grep 2181 && \
    ipdadm drop 100 && sleep 300 && ipdadm remove drop
(Make sure the grep shows the IP of the nameservice you started DTrace on. If it doesn't, Ctrl+C this and try again until you get the right one.)
Now, after ~5 minutes, with a little luck you'll have a fully reconnected (and stuck) ZooKeeper client in your moray-2021 process:
> ::findjsobjects -c ConnectionManager | ::findjsobjects | ::jsprint
state      socket.destroyed  socket._connecting  socket._bytesDispatched  socket.bytesRead  connectTimeoutHandler.ontimeout
undefined  undefined         undefined           undefined                undefined         undefined
1          false             false               48                       0                 null
33564671   undefined         undefined           undefined               undefined          undefined
Now that it's stuck, all you have to do is trigger a failover in one of your manatee clusters (exercise left to the reader) and ta-da!
The fix for this (part 1): https://cr.joyent.us/#/c/1529/
Testing:
node-manatee commit c2ad9ae (branch master, by Alex Wilson)
MORAY-398 moray caught connecting to deposed peer
Reviewed by: Cody Peter Mello <cody.mello@joyent.com>
Approved by: David Pacheco <dap@joyent.com>
moray commit 257556c (branch master, by Alex Wilson)
MORAY-398 moray caught connecting to deposed peer
Reviewed by: David Pacheco <dap@joyent.com>
Reviewed by: Cody Peter Mello <cody.mello@joyent.com>
Approved by: David Pacheco <dap@joyent.com>