OS-8638

Want ability to specify zfs_arc_max

Status:
Resolved
Created:
2025-03-24T14:44:42.319-0400
Updated:
2025-12-15T10:00:06.869-0500

Description

A report from a regular illumos user suggests:

I checked the ARC size, and it was too high:

# kstat -p zfs:0:arcstats:size
zfs:0:arcstats:size     15032385536

So I decided to set the ZFS ARC Max, which was new to me, but after going over the docs, all I needed to do was

# echo 'set zfs:zfs_arc_max = 4294967296' >> /etc/system

and then finally reboot

It’s harder to modify /etc/system on SmartOS, so we will have to figure out the best approach here.
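
For quick reference, the sizes quoted above converted to GiB (plain shell arithmetic, 1 GiB = 2^30 bytes; nothing SmartOS-specific):

```shell
# Observed ARC size and the requested cap from above, in GiB.
observed=$(( 15032385536 >> 30 ))   # kstat zfs:0:arcstats:size
cap=$(( 4294967296 >> 30 ))         # proposed zfs_arc_max
echo "${observed} GiB observed, ${cap} GiB cap"
```

So the ARC had grown to 14 GiB on this machine, and the tunable caps it at 4 GiB.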

Comments (12)

Dan McDonald commented on 2025-11-04T10:07:28.254-0500:

I’ve prototyped modifications to arc.c in the kernel, introduced a new ioctl(ZFS_IOC_ARC), and a rudimentary zfscache(8) command that works as follows:

zfscache → prints arc max/min from three sources: current, system-default, /etc/system tuned.
zfscache -l BYTES -u BYTES → sets arc_c_min (-l) and arc_c_max (-u). 0 means don’t change, unless it’s `-l 0 -u 0xffffffffffffffff` (UINT64_MAX), which hard-resets to the system default. The kernel change records the system-default values so they can be restored.

Testing notes will follow.

Dan McDonald commented on 2025-11-04T10:32:04.708-0500:

The test machine for this first batch is an 8GiB NUC (alder-lake). It boots with an /etc/system entry setting `zfs_arc_max` to 2GiB.

The new zfscache command reports:

[root@alder-lake ~]# zfscache
arc_c_min: 121477760
arc_c_max: 2147483648
system default arc_c_min: 121477760
system default arc_c_max: 6700834816
/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 2147483648
[root@alder-lake ~]# 


For future status checking, I use the triple of:

zfscache ; arcstat ; kstat -p zfs::arcstats:c_max zfs::arcstats:c_min zfs::arcstats:c
[root@alder-lake ~]# zfscache ; arcstat ; kstat -p zfs::arcstats:c_max zfs::arcstats:c_min zfs::arcstats:c 
arc_c_min: 121477760
arc_c_max: 2147483648
system default arc_c_min: 121477760
system default arc_c_max: 6700834816
/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 2147483648
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c  
15:31:42   96K   16K     17   16K   17   250   98   16K   17    98M  2.0G  
zfs:0:arcstats:c_max    2147483648
zfs:0:arcstats:c_min    121477760
zfs:0:arcstats:c        2147483648
[root@alder-lake ~]# 

Dan McDonald commented on 2025-11-04T10:55:01.642-0500:

Let’s load up the ARC. I have a large file that’ll do the trick:

[root@alder-lake ~]# ls -lt /var/crash/volatile/
total 122074059
-rw-r--r--   1 root     root     45833834496 Nov  4 05:18 vmcore.2
-rw-r--r--   1 root     root     2398199 Nov  4 05:10 unix.2
-rw-r--r--   1 root     root     16846290944 Nov  4 05:09 vmdump.2
[root@alder-lake ~]# du -h /var/crash/volatile/*
2.38M	/var/crash/volatile/unix.2
42.5G	/var/crash/volatile/vmcore.2
15.7G	/var/crash/volatile/vmdump.2
[root@alder-lake ~]# zfscache ; arcstat ; kstat -p zfs::arcstats:c_max zfs::arcstats:c_min zfs::arcstats:c 
arc_c_min: 121477760
arc_c_max: 2147483648
system default arc_c_min: 121477760
system default arc_c_max: 6700834816
/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 2147483648
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c  
15:32:54   97K   16K     17   16K   17   250   98   16K   17    98M  2.0G  
zfs:0:arcstats:c_max	2147483648
zfs:0:arcstats:c_min	121477760
zfs:0:arcstats:c	2147483648
[root@alder-lake ~]# digest -a md5 /var/crash/volatile/vmdump.2 
d2974cc04b346c31fdc75aa9e762951d
[root@alder-lake ~]# zfscache ; arcstat ; kstat -p zfs::arcstats:c_max zfs::arcstats:c_min zfs::arcstats:c 
arc_c_min: 121477760
arc_c_max: 2147483648
system default arc_c_min: 121477760
system default arc_c_max: 6700834816
/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 2147483648
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c  
15:34:59  746K  147K     19   18K    2  128K   99   17K    5   2.0G  2.0G  
zfs:0:arcstats:c_max	2147483648
zfs:0:arcstats:c_min	121477760
zfs:0:arcstats:c	2147483648
[root@alder-lake ~]# 

So I’ve filled up the ARC quite a bit.

Let’s lower it!

[root@alder-lake ~]# zfscache -u $(( 1536 * 1024 * 1024 )) ; arcstat ; kstat -p zfs::arcstats:c_max zfs::arcstats:c_min zfs::arcstats:c 
arc_c_min: 121477760
arc_c_max: 1610612736
system default arc_c_min: 121477760
system default arc_c_max: 6700834816
/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 2147483648
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c  
15:36:20  747K  147K     19   18K    2  128K   99   17K    5   2.0G  1.5G  
zfs:0:arcstats:c_max	1610612736
zfs:0:arcstats:c_min	121477760
zfs:0:arcstats:c	1610612736
[root@alder-lake ~]# 

So the ARC size is still 2.0GiB. If I re-digest the vmdump file, that may hit things in the cache, but it’s so big, maybe not?

[root@alder-lake ~]# zfscache ; arcstat ; kstat -p zfs::arcstats:c_max zfs::arcstats:c_min zfs::arcstats:c 
arc_c_min: 121477760
arc_c_max: 1610612736
system default arc_c_min: 121477760
system default arc_c_max: 6700834816
/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 2147483648
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c  
15:44:49  1.4M  279K     19   21K    1  257K   99   21K    3   2.0G  1.5G  
zfs:0:arcstats:c_max	1610612736
zfs:0:arcstats:c_min	121477760
zfs:0:arcstats:c	1610612736
[root@alder-lake ~]# 

Hmmm, arcsz stays the same.

[root@alder-lake ~]# vmstat 1 4
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr bk bk lf rm   in   sy   cs us sy id
 0 0 0 13310792 5361096 470 1129 124 14 51 0 31771 -0 263 8 -1184 2895 2733 4515 2 2 97
 0 0 0 12745464 4766252 3 45 0  0  0  0  0  0  0  0  0 2505  843  588  0  0 100
 0 0 0 12745392 4766176 0 5  0  0  0  0  0  0  0  0  0 2467  694  556  0  0 100
 0 0 0 12745392 4766176 0 5  0  0  0  0  0  0  0  0  0 2331  347  420  0  0 100
[root@alder-lake ~]# 

4-ish GiB… no memory pressure. Let’s try a different large file!

[root@alder-lake ~]# digest -a md5 /var/crash/volatile/vmcore.2 
04b8f97a3715bf290b79f2f90eeccb2a
[root@alder-lake ~]# zfscache ; arcstat ; kstat -p zfs::arcstats:c_max zfs::arcstats:c_min zfs::arcstats:c 
arc_c_min: 121477760
arc_c_max: 1610612736
system default arc_c_min: 121477760
system default arc_c_max: 6700834816
/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 2147483648
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c  
15:52:24  3.2M  629K     19   23K    0  605K   99   23K    1   2.0G  1.5G  
zfs:0:arcstats:c_max	1610612736
zfs:0:arcstats:c_min	121477760
zfs:0:arcstats:c	1610612736
[root@alder-lake ~]# 

Okay then, let’s apply some pressure. mdb-ing a kernel dump and running ::kgrep works rather well.

[root@alder-lake ~]# mdb -k /var/crash/volatile/unix.2 /var/crash/volatile/vmcore.2 
mdb: warning: dump is from SunOS 5.11 joyent_20241112T202951Z; dcmds and macros may not match kernel implementation
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc apix scsi_vhci ufs ip hook neti sockfs arp usba xhci smbios stmf_sbd stmf zfs mm lofs random idm sata crypto fcp fctl cpc logindmux ptm kvm sppp nsmb smbsrv klmmod nfs vmm ]
> $C
fffffbe31dcf2a50 vpanic()
fffffbe31dcf2a60 pageout_deadman+0x62()
fffffbe31dcf2ad0 clock+0x7b3()
fffffbe31dcf2b60 cyclic_softint+0xe1(fffffffffbc3e000, 1)
fffffbe31dcf2b80 cbe_softclock+0x23(0, 0)
fffffbe31dcf2bd0 av_dispatch_softvect+0x72(a)
fffffbe31dcf2c00 apix_dispatch_softint+0x35(0, 0)
fffffbe31e138840 switch_sp_and_call+0x15()
fffffbe31e138890 apix_do_softint+0x5a(fffffbe31e138900)
fffffbe31e1388f0 apix_do_interrupt+0x2bf(fffffbe31e138900, 1)
fffffbe31e138900 _interrupt+0xc3()
fffffbe31e138a40 checkpage+0x65(fffffbe1a80b7448, 1)
fffffbe31e138b10 pageout_scanner+0x1e5(2)
fffffbe31e138b20 thread_start+0xb()
> fffffffffbc3e000::kgrep
^C
> 
[root@alder-lake ~]# zfscache ; arcstat ; kstat -p zfs::arcstats:c_max zfs::arcstats:c_min zfs::arcstats:c 
arc_c_min: 121477760
arc_c_max: 1610612736
system default arc_c_min: 121477760
system default arc_c_max: 6700834816
/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 2147483648
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c  
15:54:20  5.7M  668K     11   25K    0  643K   99   24K    0   1.4G  1.4G  
zfs:0:arcstats:c_max	1610612736
zfs:0:arcstats:c_min	121477760
zfs:0:arcstats:c	1454194688
[root@alder-lake ~]# 

Dan McDonald commented on 2025-11-07T11:32:36.567-0500:

Okay, so some additional design/implementation notes:

- zfscache(8) now has a -p {profile} option; -p with no argument lists the available profiles. Profiles are quick shortcuts enabling the next bullet.
- fs-joyent will now look for arc_profile=X in bootparams and act upon it. This allows Triton CNAPI to put an arc_profile into the kernel arguments for a CN, or even into the default.
- The list of profiles is:

[root@alder-lake ~]# zfscache -p
Available profiles
==================
illumos
reset-system-defaults
reset-etc-system
illumos-low
balanced
compute-hvm
compute-hvm-64
[root@alder-lake ~]# 

There are two SPECIAL values usable as (min=0, max=SPECIAL): UINT64_MAX means reset-system-defaults, and (UINT64_MAX - 1) means reset-etc-system.

The fs-joyent handling means the precedence rules are:
- arc_profile in bootparams overrides anything below
- /etc/system overrides anything below
- Otherwise, the default illumos-gate initialization applies.

At any time you can use zfscache(8) to change things. If you shrink the limits, the ARC itself may exceed them until memory pressure is applied. If you raise the limits, the ARC will not grow until cache demand calls for it AND there is no system-wide memory pressure. Both behaviors need testing for verification.
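
The bootparams handling could be sketched roughly as follows. This is a hypothetical illustration of the flow, not the actual fs-joyent script; the helper name and the stand-in input are mine:

```shell
# Hypothetical sketch of the fs-joyent arc_profile hook; the real
# platform script may differ in detail and error handling.
get_arc_profile() {
    # Extract the value of arc_profile=... from bootparams-style output.
    awk -F= '$1 == "arc_profile" { print $2 }'
}

# Stand-in input; on a real system this would be: bootparams | get_arc_profile
profile=$(echo 'arc_profile=balanced' | get_arc_profile)
echo "$profile"

# With a profile in hand, the hook would then run something like:
#   zfscache -p "$profile" || echo "Failed to set ZFS ARC profile $profile"
```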

Nahum Shalman commented on 2025-11-10T10:29:36.420-0500:

A minimal man page or some self-documentation with a -h flag or whatever would be really nice.

Dan McDonald commented on 2025-11-10T15:48:51.116-0500:

There have been changes due to feedback from code reviews. Updated output is here in the testing notes, and prior output should be viewed as pre-review.

Testing notes.

1.) Simple smoke tests on an 8GiB NUC. Notice how /etc/system is overridden by the presence of an arc_profile in bootparams.

[root@alder-lake ~]# reboot
Connection to alder-lake closed by remote host.
Connection to alder-lake closed.
kebe(~)[255]% ssh root@alder-lake
(root@alder-lake) Password: 
SmartOS (build: 20251110T171533Z)
[root@alder-lake ~]# cat /zones/boot/custom/loader.conf.local 
etc_system_load=YES
etc_system_type=file
etc_system_name=/bootfs/etc/system
etc_system_flags="name=/etc/system"
arc_profile="balanced"
[root@alder-lake ~]# diff /etc/system /zones/boot/bootfs/etc/system
162a163,165
> 
> * XXX KEBE WAS HERE!
> set zfs:zfs_arc_max=2147483648
[root@alder-lake ~]# zfscache
arc_c_min: 121477760
arc_c_max: 3887288320

system default arc_c_min: 121477760
system default arc_c_max: 6700834816
system arc_init()-time of maximum available memory (allmem): 7774576640

/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 2147483648
[root@alder-lake ~]# bootparams | grep arc_profile
arc_profile=balanced
[root@alder-lake ~]# zfscache -h
Usage:
zfscache                   (prints ZFS ARC parameters)
zfscache -h                (prints this message)
zfscache -l BYTES -u BYTES (sets ZFS ARC lower-bound/c_min (-l)
                            and upper-bound/c_max size.
                            NOTE: BYTES is a signed 64-bit
                            value, negative values are reserved)
zfscache -p                (prints available ZFS ARC profiles)
zfscache -p PROFILE        (sets ZFS ARC to PROFILE's specs)
[root@alder-lake ~]# zfscache -p
Available profiles
==================
illumos
  ARC defaults from illumos-gate:
    - Minimum will be either 64MiB, 1GiB, or 1/64 of allowable
      memory if it fits between those two.
    - Maximum will be all allowable memory save 1GiB, or minimum.

reset-system-defaults
  SmartOS ARC defaults: currently the same as illumos-gate.

reset-etc-system
  Use values in /etc/system tunables zfs_arc_min and zfs_arc_max,
  where 0 means keep the existing value.

illumos-low
  ARC defaults for lower-physical-memory situations in illumos,
  or to give some small space to HVMs:
    - Minimum matches "illumos" above
    - Maximum is higher of minimum or 75% of allowable memory.

balanced
  ARC defaults trying to balance native workloads and HVMs:
    - Minimum matches "illumos" above
    - Maximum is higher of minimum or 50% of allowable memory.

compute-hvm
  ARC defaults favoring HVMs:
    - Minimum matches "illumos" above
    - Maximum is higher of minimum or 1/8 of allowable memory.

compute-hvm-64
  ARC defaults favoring HVMs with a 64GiB cap on maximum:
    - Minimum matches "illumos" above
    - Maximum is higher of minimum or 1/8 of allowable memory,
      but capped at 64GiB.

[root@alder-lake ~]# zfscache -l 8192
zfscache: Requested minimum 8192 is too small.

[root@alder-lake ~]# zfscache -u $((16384 * 1024 * 1024))
zfscache: Requested maximum 17179869184 is too large.

[root@alder-lake ~]# zfscache -p restore-etc-system
zfscache: Profile name "restore-etc-system" not found.

Usage:
zfscache                   (prints ZFS ARC parameters)
zfscache -h                (prints this message)
zfscache -l BYTES -u BYTES (sets ZFS ARC lower-bound/c_min (-l)
                            and upper-bound/c_max size.
                            NOTE: BYTES is a signed 64-bit
                            value, negative values are reserved)
zfscache -p                (prints available ZFS ARC profiles)
zfscache -p PROFILE        (sets ZFS ARC to PROFILE's specs)
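
As a cross-check of the profile descriptions above, here is a minimal sketch of the min/max arithmetic using the NUC’s allmem as printed by zfscache (7774576640 bytes). These formulas restate the documented rules; the kernel’s exact clamping may differ:

```shell
# Cross-check of the documented profile arithmetic against the 8GiB NUC.
allmem=7774576640
floor=$(( 64 * 1024 * 1024 ))       # 64MiB lower clamp on the minimum
ceil=$(( 1024 * 1024 * 1024 ))      # 1GiB upper clamp on the minimum

min=$(( allmem / 64 ))              # 1/64 of allowable memory...
if [ "$min" -lt "$floor" ]; then min=$floor; fi
if [ "$min" -gt "$ceil" ]; then min=$ceil; fi

illumos_max=$(( allmem - ceil ))    # all allowable memory save 1GiB
balanced_max=$(( allmem / 2 ))      # 50% of allowable memory

echo "min=$min illumos_max=$illumos_max balanced_max=$balanced_max"
```

This reproduces the values seen in the zfscache output above: arc_c_min 121477760, system default arc_c_max 6700834816, and the balanced maximum 3887288320.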

2.) Use of zfscache(8) to increase arc_c_max:

[root@alder-lake ~]# zfscache -p reset-etc-system
arc_c_min: 121477760
arc_c_max: 2147483648

system default arc_c_min: 121477760
system default arc_c_max: 6700834816
system arc_init()-time of maximum available memory (allmem): 7774576640

/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 2147483648
[root@alder-lake ~]# arcstat ; kstat zfs::arcstats:c\*
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c  
20:42:08  123K   23K     18   23K   18   268   98   22K   18    98M  2.0G  
module: zfs                             instance: 0     
name:   arcstats                        class:    misc
        c                               2147483648
        c_max                           2147483648
        c_min                           121477760
        compressed_size                 36634624
        crtime                          22.518814065

[root@alder-lake ~]# 

3.) After loading the ARC, shrink arc_c_max.

[root@alder-lake ~]# mdb -ke arc_warm/X
arc_warm:
arc_warm:       1               
[root@alder-lake ~]# zfscache -p reset-etc-system
arc_c_min: 121477760
arc_c_max: 2147483648

system default arc_c_min: 121477760
system default arc_c_max: 6700834816
system arc_init()-time of maximum available memory (allmem): 7774576640

/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 2147483648
[root@alder-lake ~]# arcstat ; kstat zfs::arcstats:c\*
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c  
20:47:40  774K  153K     19   24K    3  128K   99   24K    6   2.0G  2.0G  
module: zfs                             instance: 0     
name:   arcstats                        class:    misc
	c                               2147483648
	c_max                           2147483648
	c_min                           121477760
	compressed_size                 2064143872
	crtime                          22.518814065

[root@alder-lake ~]# 

4.) Using Triton’s CNAPI to send an arc_profile to a 128GiB compute node:

[root@moe (kebecloud) ~]# sdc-cnapi /boot/00000000-0000-0000-0000-002590fb5868
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 274
Date: Fri, 07 Nov 2025 00:45:55 GMT
Server: cnapi/1.26.6
x-request-id: f9bef745-63b0-4c3b-b78b-b0630c1fbeba
x-response-time: 15
x-server-name: 6e3af517-0360-46df-bd54-c75830760bd8
Connection: keep-alive

{
  "platform": "20251106T230712Z",
  "kernel_args": {
    "hostname": "larry",
    "rabbitmq": "guest:guest:192.168.4.17:5672",
    "smt_enabled": true,
    "rabbitmq_dns": "guest:guest:rabbitmq.kebecloud.work.kebe.com:5672"
  },
  "kernel_flags": {},
  "boot_modules": [],
  "default_console": "serial",
  "serial": "ttyb"
}
[root@moe (kebecloud) ~]# sdc-cnapi /boot/00000000-0000-0000-0000-002590fb5868  -X POST -d '{ "kernel_args": { "arc_profile": "balanced" } }'
HTTP/1.1 204 No Content
Date: Fri, 07 Nov 2025 00:46:02 GMT
Server: cnapi/1.26.6
x-request-id: a3e4e097-3d5c-4bdc-8883-11c45c66a1a4
x-response-time: 82
x-server-name: 6e3af517-0360-46df-bd54-c75830760bd8
Connection: keep-alive


[root@moe (kebecloud) ~]# sdc-cnapi /boot/00000000-0000-0000-0000-002590fb5868
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 299
Date: Fri, 07 Nov 2025 00:46:04 GMT
Server: cnapi/1.26.6
x-request-id: e5cec8df-5068-4f5a-9361-7223ebe18ff6
x-response-time: 13
x-server-name: 6e3af517-0360-46df-bd54-c75830760bd8
Connection: keep-alive

{
  "platform": "20251106T230712Z",
  "kernel_args": {
    "hostname": "larry",
    "rabbitmq": "guest:guest:192.168.4.17:5672",
    "smt_enabled": true,
    "arc_profile": "balanced",
    "rabbitmq_dns": "guest:guest:rabbitmq.kebecloud.work.kebe.com:5672"
  },
  "kernel_flags": {},
  "boot_modules": [],
  "default_console": "serial",
  "serial": "ttyb"
}
[root@moe (kebecloud) ~]# 

Then let’s see the results on the booted CN:

[root@larry (kebecloud) ~]# bootparams | grep arc_profile
arc_profile=balanced
[root@larry (kebecloud) ~]# zfscache 
arc_c_min: 1073741824
arc_c_max: 68378949632

system default arc_c_min: 1073741824
system default arc_c_max: 135684157440
system arc_init()-time of maximum available memory (allmem): 136757899264

/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 0
[root@larry (kebecloud) ~]# arcstat ; kstat -p zfs::arcstats:c\*
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c  
20:16:50  912M  150M     16  149M   16  740K  100  148M   17    47G   63G  
zfs:0:arcstats:c        68378949632
zfs:0:arcstats:c_max    68378949632
zfs:0:arcstats:c_min    1073741824
zfs:0:arcstats:class    misc
zfs:0:arcstats:compressed_size  44368336896
zfs:0:arcstats:crtime   3024163.294449157
[root@larry (kebecloud) ~]# 

Dan McDonald commented on 2025-11-13T16:13:59.129-0500:

Demonstration of /boot/default being inherited by a new compute node, thanks to mass-1, an all-VMware, one-machine, three-node Triton.

First let’s set up /boot/default on the headnode and boot the pristine new CN, `cn-2`:

[root@headnode (mass-1) ~]# sdc-server list
HOSTNAME             UUID                                 VERSION    SETUP    STATUS      RAM  ADMIN_IP       
headnode             564d5aa9-490e-fd3a-e08d-cad4f4a5d7b7     7.0     true   running    16383  172.16.64.2    
cn-1                 564df9cb-7b48-9ede-b7a5-c8ad3e7fb3d3     7.0     true   running     4095  172.16.64.33   
[root@headnode (mass-1) ~]# sdc-cnapi /boot/default | json -H
{
  "platform": "20251113T010957Z",
  "kernel_args": {
    "rabbitmq": "guest:guest:172.16.64.15:5672",
    "smt_enabled": true,
    "rabbitmq_dns": "guest:guest:rabbitmq.mass-1.work.kebe.com:5672"
  },
  "kernel_flags": {},
  "boot_modules": [],
  "default_console": "serial",
  "serial": "ttyb"
}
[root@headnode (mass-1) ~]# sdc-cnapi /boot/564d5aa9-490e-fd3a-e08d-cad4f4a5d7b7  | json -H
{
  "platform": "20251113T010957Z",
  "kernel_args": {
    "hostname": "headnode",
    "rabbitmq": "guest:guest:172.16.64.15:5672",
    "smt_enabled": true,
    "rabbitmq_dns": "guest:guest:rabbitmq.mass-1.work.kebe.com:5672"
  },
  "kernel_flags": {},
  "boot_modules": [],
  "default_console": "serial",
  "serial": "ttyb"
}
[root@headnode (mass-1) ~]# sdc-cnapi /boot/564df9cb-7b48-9ede-b7a5-c8ad3e7fb3d3  | json -H
{
  "platform": "20251113T010957Z",
  "kernel_args": {
    "hostname": "cn-1",
    "rabbitmq": "guest:guest:172.16.64.15:5672",
    "smt_enabled": true,
    "rabbitmq_dns": "guest:guest:rabbitmq.mass-1.work.kebe.com:5672"
  },
  "kernel_flags": {},
  "boot_modules": [],
  "default_console": "serial",
  "serial": "ttyb"
}
[root@headnode (mass-1) ~]# sdc-cnapi /boot/default -X POST -d '{ "kernel_args": { "arc_profile": "compute-hvm" } }'
HTTP/1.1 204 No Content
Date: Thu, 13 Nov 2025 20:52:20 GMT
Server: cnapi/1.26.6
x-request-id: af67a182-19cd-458d-97e9-b9f37f647976
x-response-time: 119
x-server-name: ffd57b00-c45c-48f1-a7fd-c832e1d11a07
Connection: keep-alive


[root@headnode (mass-1) ~]# sdc-cnapi /boot/default | json -H
{
  "platform": "20251113T010957Z",
  "kernel_args": {
    "rabbitmq": "guest:guest:172.16.64.15:5672",
    "arc_profile": "compute-hvm",
    "smt_enabled": true,
    "rabbitmq_dns": "guest:guest:rabbitmq.mass-1.work.kebe.com:5672"
  },
  "kernel_flags": {},
  "boot_modules": [],
  "default_console": "serial",
  "serial": "ttyb"
}
[root@headnode (mass-1) ~]# echo "New server to be cn-2 is booting..."
New server to be cn-2 is booting...
[root@headnode (mass-1) ~]# sdc-server list
HOSTNAME             UUID                                 VERSION    SETUP    STATUS      RAM  ADMIN_IP       
00-0c-29-7a-a8-ff    564d0626-c15f-78d0-056e-d4cc6b7aa8ff     7.0    false   running     4095  172.16.64.32   
headnode             564d5aa9-490e-fd3a-e08d-cad4f4a5d7b7     7.0     true   running    16383  172.16.64.2    
cn-1                 564df9cb-7b48-9ede-b7a5-c8ad3e7fb3d3     7.0     true   running     4095  172.16.64.33   
[root@headnode (mass-1) ~]# 

NOTE that the “compute-hvm” profile, per above, caps the maximum at 1/8 of allowable memory; “allowable memory” is the “allmem” in the output of zfscache.
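
The same arithmetic applied to this small CN (allmem 3749335040, per the zfscache output on the larval CN) shows both the 1/8 cap and the 64MiB floor on the minimum kicking in. Again, this is a sketch of the documented rules, not the kernel source:

```shell
# compute-hvm on a ~4GiB CN: allmem as printed by zfscache.
allmem=3749335040
min=$(( allmem / 64 ))                # 58583360, under the 64MiB floor
if [ "$min" -lt $(( 64 * 1024 * 1024 )) ]; then
    min=$(( 64 * 1024 * 1024 ))       # floor the minimum at 64MiB
fi
hvm_max=$(( allmem / 8 ))             # 1/8 of allowable memory
echo "min=$min hvm_max=$hvm_max"
```

These match the arc_c_min of 67108864 and arc_c_max of 468666880 reported on the new CN.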

Now let’s look at the larval CN before it’s even configured. You can log in using the password in SINGLE_USER_ROOT_PASSWORD.txt that ships with the platform image:

00-0c-29-7a-a8-ff ttyb login: root 
Password: 
2025-11-13T20:56:23+00:00 00-0c-29-7a-a8-ff login: [ID 644210 auth.notice] ROOT LOGIN /dev/term/b
SmartOS (build: 20251113T010957Z)
[root@00-0c-29-7a-a8-ff ~]# zfscache
arc_c_min: 67108864
arc_c_max: 468666880

system default arc_c_min: 67108864
system default arc_c_max: 2812001280
system arc_init()-time of maximum available memory (allmem): 3749335040

/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 0
[root@00-0c-29-7a-a8-ff ~]# bootparams | grep arc_profile
arc_profile=compute-hvm
[root@00-0c-29-7a-a8-ff ~]# echo Before setup
Before setup
[root@00-0c-29-7a-a8-ff ~]# 

Looks like it’s already been sent the ARC profile via /boot/default. Let’s set it up now and come back to it.

NOTE that on the CN, this displays mid-setup:

Compute node, installing config files...                done
adding volume: swap                                     done
Failed to set ZFS ARC profile compute-hvm

This is a manifestation of TRITON-2330, but fortunately it is more sound than fury in this case.

First, the headnode’s POV for setup and post-setup:

[root@headnode (mass-1) ~]# sdc-server setup 564d0626-c15f-78d0-056e-d4cc6b7aa8ff hostname=cn2
[root@headnode (mass-1) ~]# sdc-server list
HOSTNAME             UUID                                 VERSION    SETUP    STATUS      RAM  ADMIN_IP       
headnode             564d5aa9-490e-fd3a-e08d-cad4f4a5d7b7     7.0     true   running    16383  172.16.64.2    
cn-1                 564df9cb-7b48-9ede-b7a5-c8ad3e7fb3d3     7.0     true   running     4095  172.16.64.33   
cn2                  564d0626-c15f-78d0-056e-d4cc6b7aa8ff     7.0  running   running     4095  172.16.64.32   
[root@headnode (mass-1) ~]# echo wait a bit first...
wait a bit first...
[root@headnode (mass-1) ~]# sdc-server list
HOSTNAME             UUID                                 VERSION    SETUP    STATUS      RAM  ADMIN_IP       
cn2                  564d0626-c15f-78d0-056e-d4cc6b7aa8ff     7.0     true   running     4095  172.16.64.32   
headnode             564d5aa9-490e-fd3a-e08d-cad4f4a5d7b7     7.0     true   running    16383  172.16.64.2    
cn-1                 564df9cb-7b48-9ede-b7a5-c8ad3e7fb3d3     7.0     true   running     4095  172.16.64.33   
[root@headnode (mass-1) ~]# sdc-cnapi /boot/564d0626-c15f-78d0-056e-d4cc6b7aa8ff | json -H
{
  "platform": "20251113T010957Z",
  "kernel_args": {
    "hostname": "cn2",
    "rabbitmq": "guest:guest:172.16.64.15:5672",
    "arc_profile": "compute-hvm",
    "smt_enabled": true,
    "rabbitmq_dns": "guest:guest:rabbitmq.mass-1.work.kebe.com:5672"
  },
  "kernel_flags": {},
  "boot_modules": [],
  "default_console": "serial",
  "serial": "ttyb"
}
[root@headnode (mass-1) ~]# sdc-cnapi /boot/564d0626-c15f-78d0-056e-d4cc6b7aa8ff | json -H
{
  "platform": "20251113T010957Z",
  "kernel_args": {
    "hostname": "cn2",
    "rabbitmq": "guest:guest:172.16.64.15:5672",
    "arc_profile": "compute-hvm",
    "smt_enabled": true,
    "rabbitmq_dns": "guest:guest:rabbitmq.mass-1.work.kebe.com:5672"
  },
  "kernel_flags": {},
  "boot_modules": [],
  "default_console": "serial",
  "serial": "ttyb"
}
[root@headnode (mass-1) ~]# 

And from the newly-minted cn2:

cn2 ttyb login: root
Password: 
Last login: Thu Nov 13 21:00:42 on term/b
2025-11-13T21:07:07+00:00 cn2 login: [ID 644210 auth.notice] ROOT LOGIN /dev/term/b
SmartOS (build: 20251113T010957Z)
[root@cn2 (mass-1) ~]# bootparams | grep arc_profile
arc_profile=compute-hvm
[root@cn2 (mass-1) ~]# zfscache
arc_c_min: 67108864
arc_c_max: 468666880

system default arc_c_min: 67108864
system default arc_c_max: 2812001280
system arc_init()-time of maximum available memory (allmem): 3749335040

/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 0
[root@cn2 (mass-1) ~]# 

Dan McDonald commented on 2025-11-13T16:15:49.023-0500:

To remove the arc_profile from default:

[root@headnode (mass-1) ~]# sdc-cnapi /boot/default -X POST -d '{ "kernel_args": { "arc_profile": null } }'
HTTP/1.1 204 No Content
Date: Thu, 13 Nov 2025 21:15:24 GMT
Server: cnapi/1.26.6
x-request-id: aa4718e7-da44-4715-b6bf-acacee37a809
x-response-time: 123
x-server-name: ffd57b00-c45c-48f1-a7fd-c832e1d11a07
Connection: keep-alive


[root@headnode (mass-1) ~]# sdc-cnapi /boot/default | json -H
{
  "platform": "20251113T010957Z",
  "kernel_args": {
    "rabbitmq": "guest:guest:172.16.64.15:5672",
    "smt_enabled": true,
    "rabbitmq_dns": "guest:guest:rabbitmq.mass-1.work.kebe.com:5672"
  },
  "kernel_flags": {},
  "boot_modules": [],
  "default_console": "serial",
  "serial": "ttyb"
}
[root@headnode (mass-1) ~]# 

Dan McDonald commented on 2025-11-13T16:16:51.115-0500:

One can do the same for cn2:

[root@headnode (mass-1) ~]# sdc-cnapi /boot/564d0626-c15f-78d0-056e-d4cc6b7aa8ff -X POST -d '{ "kernel_args": { "arc_profile": null } }'
HTTP/1.1 204 No Content
Date: Thu, 13 Nov 2025 21:16:26 GMT
Server: cnapi/1.26.6
x-request-id: 5a15a791-3d23-4d60-a0dc-0396381550b4
x-response-time: 252
x-server-name: ffd57b00-c45c-48f1-a7fd-c832e1d11a07
Connection: keep-alive


[root@headnode (mass-1) ~]# sdc-cnapi /boot/564d0626-c15f-78d0-056e-d4cc6b7aa8ff | json -H
{
  "platform": "20251113T010957Z",
  "kernel_args": {
    "hostname": "cn2",
    "rabbitmq": "guest:guest:172.16.64.15:5672",
    "smt_enabled": true,
    "rabbitmq_dns": "guest:guest:rabbitmq.mass-1.work.kebe.com:5672"
  },
  "kernel_flags": {},
  "boot_modules": [],
  "default_console": "serial",
  "serial": "ttyb"
}
[root@headnode (mass-1) ~]# 

Dan McDonald commented on 2025-11-13T17:07:35.101-0500:

And if you don’t wish to reboot cn2, you can run zfscache at runtime. A reason this issue is not yet closed is that we need more runtime testing, even if initial successes have been… well… successful. This tiny CN with no load on it won’t need much adjustment anyway, but here it is.

[root@cn2 (mass-1) ~]# zfscache ; arcstat ; kstat zfs::arcstats:c\*
arc_c_min: 67108864
arc_c_max: 468666880

system default arc_c_min: 67108864
system default arc_c_max: 2812001280
system arc_init()-time of maximum available memory (allmem): 3749335040

/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 0
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c  
22:06:51  762K  109K     14  108K   14  1.1K   99  104K   14   361M  446M  
module: zfs                             instance: 0     
name:   arcstats                        class:    misc
        c                               468666880
        c_max                           468666880
        c_min                           67108864
        compressed_size                 271239168
        crtime                          31.054700949

[root@cn2 (mass-1) ~]# zfscache -p illumos ; arcstat ; kstat zfs::arcstats:c\*
arc_c_min: 67108864
arc_c_max: 2675593216

system default arc_c_min: 67108864
system default arc_c_max: 2812001280
system arc_init()-time of maximum available memory (allmem): 3749335040

/etc/system zfs_arc_min: 0
/etc/system zfs_arc_max: 0
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c  
22:07:01  764K  109K     14  108K   14  1.1K   99  104K   14   361M  446M  
module: zfs                             instance: 0     
name:   arcstats                        class:    misc
        c                               468666880
        c_max                           2675593216
        c_min                           67108864
        compressed_size                 271240192
        crtime                          31.054700949

[root@cn2 (mass-1) ~]# 

Dan McDonald commented on 2025-11-17T08:58:08.598-0500:

A reminder: a reason this issue is not yet closed is that we need more runtime testing, even if initial successes have been… well… successful.

Dan McDonald commented on 2025-12-15T09:55:19.887-0500:

Closing THIS bug, as it covers the mechanism, not a change of defaults. A new (and linked) TRITON- issue will be opened to discuss whether defaults should change in Triton.