OS-6435: cred reference count leak leads to zone livelock


Issue Type:Bug
Priority:2 - Critical
Created at:2017-11-01T18:39:38.000Z
Updated at:2023-06-09T14:25:49.589Z


Created by:Former user
Reported by:Former user
Assigned to:Former user


Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2017-11-07T21:55:59.248Z)

Fix Versions

2017-11-09 Edge (Release Date: 2017-11-09)

We encountered a system where a zone could not be destroyed. According
to CNAPI, this was due to a task timeout. Looking at the system in
question, there are some interesting things about zoneadm:

[root@RA515435 (us-sw-1) /var/adm]# ptree $(pgrep -x zoneadm)
4047  /usr/bin/ctrun -l child -o noorphan /usr/vm/sbin/vmadmd
  4048  /usr/node/bin/node --abort_on_uncaught_exception /usr/vm/sbin/vmadmd
    12224 /usr/sbin/zoneadm -u 8d5bc9a7-838c-4b7a-c7da-941ec4a23f4d boot -X
15658 /usr/node/bin/node --abort_on_uncaught_exception /usr/sbin/vmadm start 8d5bc9a7
  15743 /usr/sbin/zoneadm -u 8d5bc9a7-838c-4b7a-c7da-941ec4a23f4d boot -X
16619 /usr/node/bin/node --abort_on_uncaught_exception /usr/sbin/vmadm start 8d5bc9a7
  16699 /usr/sbin/zoneadm -u 8d5bc9a7-838c-4b7a-c7da-941ec4a23f4d boot -X
23595 /usr/node/bin/node --abort_on_uncaught_exception /usr/sbin/vmadm delete 8d5bc9a
  23603 /usr/sbin/zoneadm -u 8d5bc9a7-838c-4b7a-c7da-941ec4a23f4d halt -X
27286 /usr/node/bin/node --abort_on_uncaught_exception /usr/sbin/vmadm delete 8d5bc9a
  27340 /usr/sbin/zoneadm -u 8d5bc9a7-838c-4b7a-c7da-941ec4a23f4d halt -X
28260 /usr/node/bin/node --abort_on_uncaught_exception /usr/sbin/vmadm delete 8d5bc9a
  28265 /usr/sbin/zoneadm -u 8d5bc9a7-838c-4b7a-c7da-941ec4a23f4d halt -X
42018 /usr/node/bin/node --abort_on_uncaught_exception /usr/sbin/vmadm delete 8d5bc9a
  42023 /usr/sbin/zoneadm -u 8d5bc9a7-838c-4b7a-c7da-941ec4a23f4d halt -X
98978 /usr/node/bin/node --abort_on_uncaught_exception /usr/sbin/vmadm delete 8d5bc9a
  98996 /usr/sbin/zoneadm -u 8d5bc9a7-838c-4b7a-c7da-941ec4a23f4d halt -X
30042 /usr/node/bin/node --abort_on_uncaught_exception /usr/sbin/vmadm start f8abe08e
  30128 /usr/sbin/zoneadm -u f8abe08e-d4c0-ccf2-96f3-c0908995be72 boot -X

Note how we have a combination of zone boots and halts. Let's take 15743
as an example. Per the ptree output, it's trying to boot the same zone
that 12224 is trying to boot. The zoneadmd for this zone is 12226. Let's
see what that zoneadmd is actually doing:

> 0t12226::pid2proc | ::walk thread | ::stacks
THREAD           STATE    SOBJ                COUNT
fffffea39c80c820 SLEEP    CV                      9

fffffeebdc7c8800 SLEEP    CV                      1

ffffff0a506e3c20 SLEEP    CV                      1

ffffff8bb82d7040 SLEEP    SHUTTLE                 1

So, interestingly, it's blocked trying to create the zone and allocate
an ID. If we look at the code, the first things to check are how many
zones there are and whether any IDs are still being used by netstacks.

> ::walk zone ! wc -l
> *netstack_head::list netstack_t netstack_next ! wc -l

So we have equal numbers of zones and netstacks, which means we're not
hitting the async netstack reference case. So the question is where the
IDs went. While reading through the zone_create() code, I saw an
interesting thing mentioned: in some cases the zone is freed when a
final cred reference is freed. With that in mind, I decided to walk the
cred cache and group the zones that exist.

> ::walk cred_cache | ::print cred_t cr_zone ! sort | uniq | wc -l

Well, that's suspicious. I next put together a list of all the zones
referenced there by running the following command, and then did some
additional analysis:

> ::walk cred_cache | ::printf "%p\n" cred_t cr_zone ! sort | uniq > /var/tmp/rm/zonelist
> ::cat /var/tmp/rm/zonelist | ::printf "%s\n" zone_t zone_name ! sort | uniq -c
mdb: failed to read pointer at 0: no mapping for address
mdb: failed to print member 'zone_name'
   1 00c5db23-bb1c-4a50-948c-362f165230dc
   1 05aa165f-9bc7-ca30-f2f3-d4e36cce97a8
   1 180bcac4-f73a-484b-ad43-a78d1476b290
   1 1b3f849d-fab4-c77b-cbf9-9555851c5377
   1 20f8c763-af35-c8d2-fb0a-e235254ae52d
   1 2a97f13a-7510-e87a-9b32-adcf9c18121a
   1 40526b4c-c796-6abd-9f52-9717a83069f0
   1 6ac565b0-75b5-cf29-e0a7-c4b747ddb2ce
   1 71a339c8-072c-4b37-8a27-bcd15e80694c
 110 8015613e-7656-651f-ace0-a6332dc5aa6d
   1 8807738d-fb0b-4aa9-969f-b79d1ad62de0
7486 8d5bc9a7-838c-4b7a-c7da-941ec4a23f4d
   1 96c824b5-0f51-47a1-96f4-d7f1ca030451
   1 d80db6ce-ab7a-465e-be9b-83413b871147
   1 dd091608-430f-c4ef-ce27-fd9dbd0be456
   1 ddf6eb2b-eb6a-6665-813c-bcf5706aa666
   1 e73c1dac-c1f8-e25c-de4f-aa6f4f7b8e24
   1 f1fa76d5-701c-ef6f-97dd-9ab059022c75
2384 f2c24edb-2a05-c6e0-b905-9b6ccaae5513
   1 f338d368-f226-eded-d2d4-c77417c2c923
   1 f9dd9f41-a675-ef76-ff0e-a5684bcae121
   1 global
> ::cat /var/tmp/rm/zonelist | ::print zone_t zone_brand ! sort | uniq -c
mdb: failed to read zone_brand pointer at 230: no mapping for address
9987 zone_brand = lx_brand
  12 zone_brand = native_brand

Now that's rather suspicious. So we had a few zones that ended up
leaking: three zone names account for 9980 of the 9999 zone_t
structures referenced from the cred cache. I also assembled a list of
all the cred_t structures that matched those three zones. With that in
hand, I started looking at the data in the cred_t. They all had the same
basic id set.
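
As a cross-check, the per-name tally and the per-brand tally above
should describe the same set of zone_t structures. The Python sketch
below adds up the counts copied from the mdb output. The helper tally()
and the choice to collapse the 19 single-entry names into one number are
shorthand for this sketch, and the "zone-a"/"zone-b" input is
hypothetical.

```python
from collections import Counter


def tally(names):
    """Rough equivalent of `sort | uniq -c`: occurrences per zone name."""
    return Counter(names)


# Counts copied from the mdb output above.
repeated = {
    "8015613e-7656-651f-ace0-a6332dc5aa6d": 110,
    "8d5bc9a7-838c-4b7a-c7da-941ec4a23f4d": 7486,
    "f2c24edb-2a05-c6e0-b905-9b6ccaae5513": 2384,
}
single_entries = 19  # names listed exactly once, "global" included
by_brand = {"lx_brand": 9987, "native_brand": 12}

total_by_name = sum(repeated.values()) + single_entries
total_by_brand = sum(by_brand.values())
print(total_by_name, total_by_brand)  # 9999 9999
print(sum(repeated.values()))         # 9980

# Hypothetical input, only to show what tally() does.
sample = tally(["zone-a", "zone-b", "zone-a", "global"])
print(sample.most_common(1))          # [('zone-a', 2)]
```

Both tallies come to 9999, so the two failed reads reported by mdb
appear to be the same NULL cr_zone entry rather than a missing zone.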

Based on other work that folks did, we were able to find that these
zones had ended up restarting quite a lot. Based on this, there seems to
be something related to zones coming and going that's causing us to
leak a cred_t and never free it.
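
To illustrate how one leaked cred reference per boot could end up
blocking zone creation, here is a toy Python model. It is not the
illumos code: ToyZone, ToyKernel, reboot_cycles(), and the tiny ID limit
are all assumptions made for illustration, and the toy zone_create()
returns None where the real allocator would sleep.

```python
class ToyZone:
    def __init__(self, zone_id, name):
        self.zone_id = zone_id
        self.name = name
        self.cred_refs = 0


class ToyKernel:
    """Bounded ID space; an ID is returned only when the zone is freed."""

    def __init__(self, max_ids):
        self.free_ids = list(range(1, max_ids + 1))
        self.live = {}  # zone_id -> ToyZone, kept until the zone is freed

    def zone_create(self, name):
        if not self.free_ids:
            return None  # assumption: the real allocator blocks instead
        zone = ToyZone(self.free_ids.pop(0), name)
        self.live[zone.zone_id] = zone
        return zone

    def cred_hold(self, zone):
        zone.cred_refs += 1

    def cred_rele(self, zone):
        zone.cred_refs -= 1
        if zone.cred_refs == 0:
            # Final cred reference dropped: free the zone and its ID.
            del self.live[zone.zone_id]
            self.free_ids.append(zone.zone_id)


def reboot_cycles(kernel, name, cycles, leak):
    """Boot and halt `name` repeatedly; return how many boots succeeded."""
    booted = 0
    for _ in range(cycles):
        zone = kernel.zone_create(name)
        if zone is None:
            break
        booted += 1
        kernel.cred_hold(zone)
        kernel.cred_hold(zone)
        kernel.cred_rele(zone)
        if not leak:
            kernel.cred_rele(zone)  # skipped when leaking: zone stays pinned
    return booted


healthy = ToyKernel(max_ids=4)
print(reboot_cycles(healthy, "zone-a", 10, leak=False))  # 10

leaky = ToyKernel(max_ids=4)
print(reboot_cycles(leaky, "zone-a", 10, leak=True))     # 4
print(len(leaky.live))                                   # 4
```

In the leaky run every halted instance keeps its ID, so a zone that
restarts often enough exhausts the ID space on its own, which is
consistent with the thousands of stale entries for a single zone name
seen above.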


Comment by Former user
Created at 2017-11-07T16:58:30.728Z

The leak is 100% reproducible for both lx and native zones. I tested the fix on my SmartOS VM. I rebooted both lx and native zones about 10 times each, then shut them down. At the end there is only the global zone being referenced by creds in the cred_cache (as it should be).

Comment by Jira Bot
Created at 2017-11-07T21:43:32.683Z

illumos-joyent commit 7354012d871a98cfeba6ab962af30b16d0455e5f (branch master, by Jerry Jelinek)

OS-6435 cred reference count leak leads to zone livelock
Reviewed by: Patrick Mooney <patrick.mooney@joyent.com>
Approved by: Patrick Mooney <patrick.mooney@joyent.com>