KMEM_CACHE_CREATE(9F) Kernel Functions for Drivers KMEM_CACHE_CREATE(9F)

NAME


kmem_cache_create, kmem_cache_alloc, kmem_cache_free, kmem_cache_destroy,
kmem_cache_set_move - kernel memory cache allocator operations

SYNOPSIS


#include <sys/types.h>
#include <sys/kmem.h>

kmem_cache_t *kmem_cache_create(char *name, size_t bufsize,
size_t align, int (*constructor)(void *, void *, int),
void (*destructor)(void *, void *), void (*reclaim)(void *),
void *private, void *vmp, int cflags);


void kmem_cache_destroy(kmem_cache_t *cp);


void *kmem_cache_alloc(kmem_cache_t *cp, int kmflag);


void kmem_cache_free(kmem_cache_t *cp, void *obj);


void kmem_cache_set_move(kmem_cache_t *cp, kmem_cbrc_t (*move)(void *,
void *, size_t, void *));


[Synopsis for callback functions:]


int (*constructor)(void *buf, void *user_arg, int kmflags);


void (*destructor)(void *buf, void *user_arg);


kmem_cbrc_t (*move)(void *old, void *new, size_t bufsize,
void *user_arg);


INTERFACE LEVEL


illumos DDI specific (illumos DDI)

PARAMETERS


The parameters for the kmem_cache_* functions are as follows:

name
Descriptive name of a kstat(9S) structure of class
kmem_cache. Names longer than 31 characters are
truncated.


bufsize
Size, in bytes, of each object the cache manages.


align
Required object alignment.


constructor
Pointer to an object constructor function. Parameters are
defined below.


destructor
Pointer to an object destructor function. Parameters are
defined below.


reclaim
Drivers should pass NULL.


private
Pass-through argument for constructor/destructor.


vmp
Drivers should pass NULL.


cflags
Drivers must pass 0.


kmflag
Possible flags are:

KM_SLEEP
Allow sleeping (blocking) until memory is
available.


KM_NOSLEEP
Return NULL immediately if memory is not
available, but after an aggressive
reclaiming attempt. Any mention of
KM_NOSLEEP without mentioning
KM_NOSLEEP_LAZY (see below) applies to both
values.


KM_NOSLEEP_LAZY
Return NULL immediately if memory is not
available, without the aggressive
reclaiming attempt. This is actually two
flags combined: (KM_NOSLEEP |
KM_NORMALPRI), the latter flag indicating
not to attempt reclamation before giving up
and returning NULL.


KM_PUSHPAGE
Allow the allocation to use reserved
memory.


obj
Pointer to the object allocated by kmem_cache_alloc().


move
Pointer to an object relocation function. Parameters are
defined below.


The parameters for the callback constructor function are as follows:

void *buf
Pointer to the object to be constructed.


void *user_arg
The private parameter from the call to
kmem_cache_create(); it is typically a pointer to the
soft-state structure.


int kmflags
Propagated kmflag values.


The parameters for the callback destructor function are as follows:

void *buf
Pointer to the object to be destroyed.


void *user_arg
The private parameter from the call to
kmem_cache_create(); it is typically a pointer to the
soft-state structure.


The parameters for the callback move() function are as follows:

void *old
Pointer to the object to be moved.


void *new
Pointer to the object that serves as the copy
destination for the contents of the old parameter.


size_t bufsize
Size of the object to be moved.


void *user_arg
The private parameter from the call to
kmem_cache_create(); it is typically a pointer to the
soft-state structure.


DESCRIPTION


In many cases, the cost of initializing and destroying an object exceeds
the cost of allocating and freeing memory for it. The functions described
here address this condition.


Object caching is a technique for dealing with objects that:

o are frequently allocated and freed, and

o have setup and initialization costs.


The idea is to allow the allocator and its clients to cooperate to
preserve the invariant portion of an object's initial state, or
constructed state, between uses, so it does not have to be destroyed and
re-created every time the object is used. For example, an object
containing a mutex only needs to have mutex_init() applied once, the
first time the object is allocated. The object can then be freed and
reallocated many times without incurring the expense of mutex_destroy()
and mutex_init() each time. An object's embedded locks, condition
variables, reference counts, lists of other objects, and read-only data
all generally qualify as constructed state. The essential requirement is
that the client must free the object (using kmem_cache_free()) in its
constructed state. The allocator cannot enforce this, so programming
errors will lead to hard-to-find bugs.
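

The effect of preserving constructed state can be seen in a small userland
model. The following is only a sketch, not the kernel allocator: toy_cache_t,
toy_alloc(), toy_free(), and toy_cycle() are hypothetical names, a simple
free list stands in for the allocator's internal layers, and the lock_ready
field stands in for a lock initialized by mutex_init().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object: lock_ready stands in for an initialized mutex. */
struct toy_obj {
    int lock_ready;         /* constructed state, preserved between uses */
    int refcnt;             /* must be 0 again when the object is freed */
    struct toy_obj *next;   /* free-list linkage, used only while cached */
};

/* Hypothetical toy cache: a free list plus a constructor-call counter. */
typedef struct toy_cache {
    struct toy_obj *free_list;
    int ctor_calls;
} toy_cache_t;

void
toy_ctor(toy_cache_t *cp, struct toy_obj *op)
{
    op->lock_ready = 1;     /* the expensive one-time setup */
    op->refcnt = 0;
    cp->ctor_calls++;
}

/* Return a constructed object, constructing only when the cache is empty. */
struct toy_obj *
toy_alloc(toy_cache_t *cp)
{
    struct toy_obj *op = cp->free_list;

    if (op != NULL) {
        cp->free_list = op->next;
        return (op);
    }
    op = malloc(sizeof (*op));
    if (op != NULL)
        toy_ctor(cp, op);
    return (op);
}

/* The caller must hand the object back in its constructed state. */
void
toy_free(toy_cache_t *cp, struct toy_obj *op)
{
    assert(op->lock_ready == 1 && op->refcnt == 0);
    op->next = cp->free_list;
    cp->free_list = op;
}

/* Allocate and free n times from one cache; report constructor calls. */
int
toy_cycle(int n)
{
    toy_cache_t cache = { NULL, 0 };
    int i;

    for (i = 0; i < n; i++) {
        struct toy_obj *op = toy_alloc(&cache);

        if (op == NULL)
            return (-1);
        op->refcnt++;       /* use the object */
        op->refcnt--;       /* restore the constructed state */
        toy_free(&cache, op);
    }
    free(cache.free_list);  /* drain the single cached object, if any */
    return (cache.ctor_calls);
}
```

In this model, any number of allocate and free cycles against the same cache
runs the constructor once, provided the object is always returned with
refcnt back at 0.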


A driver should call kmem_cache_create() at the time of _init(9E) or
attach(9E), and call the corresponding kmem_cache_destroy() at the time
of _fini(9E) or detach(9E).


kmem_cache_create() creates a cache of objects, each of size bufsize
bytes, aligned on an align boundary. Drivers not requiring a specific
alignment can pass 0. name identifies the cache for statistics and
debugging. constructor and destructor convert plain memory into objects
and back again; constructor can fail if it needs to allocate memory but
cannot. private is a parameter passed to the constructor and destructor
callbacks to support parameterized caches (for example, a pointer to an
instance of the driver's soft-state structure). To facilitate debugging,
kmem_cache_create() creates a kstat(9S) structure of class kmem_cache and
name name. It returns an opaque pointer to the object cache.


kmem_cache_alloc() gets an object from the cache. The object will be in
its constructed state. kmflag has either KM_SLEEP or KM_NOSLEEP set,
indicating whether it is acceptable to wait for memory if none is
currently available.


A small pool of reserved memory is available to allow the system to
progress toward the goal of freeing additional memory while in a low
memory situation. The KM_PUSHPAGE flag enables use of this reserved
memory pool on an allocation. This flag can be used by drivers that
implement strategy(9E) on memory allocations associated with a single I/O
operation. The driver guarantees that the I/O operation will complete (or
time out) and, on completion, that the memory will be returned. The
KM_PUSHPAGE flag should be used only in kmem_cache_alloc() calls. All
allocations from a given cache should be consistent in their use of the
flag. A driver that adheres to these restrictions can guarantee progress
in a low memory situation without resorting to complex private allocation
and queuing schemes. If KM_PUSHPAGE is specified, KM_SLEEP can also be
used without causing deadlock.


kmem_cache_free() returns an object to the cache. The object must be in
its constructed state.


kmem_cache_destroy() destroys the cache and releases all associated
resources. All allocated objects must have been previously freed.


kmem_cache_set_move() registers a function that the allocator may call to
move objects from sparsely allocated pages of memory so that the system
can reclaim pages that are tied up by the client. Since caching objects
of the same size and type already makes severe memory fragmentation
unlikely, there is generally no need to register such a function. The
idea is to make it possible to limit worst-case fragmentation in caches
that exhibit a tendency to become highly fragmented. Only clients that
allocate a mix of long- and short-lived objects from the same cache are
prone to exhibit this tendency, making them candidates for a move()
callback.


The move() callback supplies the client with two addresses: the allocated
object that the allocator wants to move and a buffer selected by the
allocator for the client to use as the copy destination. The new
parameter is an allocated, constructed object ready to receive the
contents of the old parameter. The bufsize parameter supplies the size of
the object, in case a single move function handles multiple caches whose
objects differ only in size. Finally, the private parameter passed to the
constructor and destructor is also passed to the move() callback.


Only the client knows about its own data and when it is a good time to
move it. The client cooperates with the allocator to return unused
memory to the system, and the allocator accepts this help at the client's
convenience. When asked to move an object, the client can respond with
any of the following:

typedef enum kmem_cbrc {
    KMEM_CBRC_YES,
    KMEM_CBRC_NO,
    KMEM_CBRC_LATER,
    KMEM_CBRC_DONT_NEED,
    KMEM_CBRC_DONT_KNOW
} kmem_cbrc_t;


The client must not explicitly free either of the objects passed to the
move() callback, since the allocator wants to free them directly to the
slab layer (bypassing the per-CPU magazine layer). The response tells the
allocator which of the two object parameters to free:

KMEM_CBRC_YES
The client moved the object; the allocator frees
the old parameter.


KMEM_CBRC_NO
The client refused to move the object; the
allocator frees the new parameter (the unused copy
destination).


KMEM_CBRC_LATER
The client is using the object and cannot move it
now; the allocator frees the new parameter (the
unused copy destination). The client should use
KMEM_CBRC_LATER instead of KMEM_CBRC_NO if the
object is likely to become movable soon.


KMEM_CBRC_DONT_NEED
The client no longer needs the object; the
allocator frees both the old and new parameters.
This response is the client's opportunity to be a
model citizen and give back as much as it can.


KMEM_CBRC_DONT_KNOW
The client does not know about the object because:

a)
the client has just allocated the object and
has not yet put it wherever it expects to
find known objects


b)
the client has removed the object from
wherever it expects to find known objects
and is about to free the object


c)
the client has freed the object

In all of these cases, the allocator frees
the new parameter (the unused copy destination)
and searches for the old parameter in the magazine
layer. If the object is found, it is removed from
the magazine layer and freed to the slab layer so
that it will no longer tie up an entire page of
memory.
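

The responses above can be illustrated with a deliberately simplified move()
callback. This is a sketch under several assumptions, not a recommended
implementation: object_t, struct owner, and their fields are hypothetical,
the enum is copied locally so that the code builds outside the kernel, and
all locking is omitted. It also assumes that o_owner is either NULL or a
valid pointer, which a real callback cannot assume for an object that may
already have been freed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local copy of the enum shown above, so the sketch builds in userland. */
typedef enum kmem_cbrc {
    KMEM_CBRC_YES,
    KMEM_CBRC_NO,
    KMEM_CBRC_LATER,
    KMEM_CBRC_DONT_NEED,
    KMEM_CBRC_DONT_KNOW
} kmem_cbrc_t;

struct owner;

/* Hypothetical client object, tracked by its owner through one pointer. */
typedef struct object {
    struct owner *o_owner;  /* NULL while the client does not track it */
    int o_holds;            /* nonzero while the object is in use */
    int o_stale;            /* nonzero if the contents are not needed */
    int o_data;
} object_t;

struct owner {
    object_t *ow_obj;       /* the client's only pointer to the object */
};

kmem_cbrc_t
object_move(void *old, void *new, size_t bufsize, void *user_arg)
{
    object_t *op = old;
    object_t *np = new;
    struct owner *ow = op->o_owner;

    (void) user_arg;
    assert(bufsize == sizeof (object_t));   /* sketch serves one cache */

    if (ow == NULL || ow->ow_obj != op)
        return (KMEM_CBRC_DONT_KNOW);   /* not, or no longer, tracked */
    if (op->o_stale) {
        ow->ow_obj = NULL;              /* drop the only reference */
        return (KMEM_CBRC_DONT_NEED);   /* allocator frees old and new */
    }
    if (op->o_holds != 0)
        return (KMEM_CBRC_LATER);       /* busy now, movable soon */

    memcpy(np, op, bufsize);            /* copy contents to new buffer */
    ow->ow_obj = np;                    /* repoint the client reference */
    return (KMEM_CBRC_YES);             /* allocator frees old */
}
```

The sketch never returns KMEM_CBRC_NO; a client would use that response for
an object it does not expect to become movable. In no branch does the
callback free old or new itself.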


Any object passed to the move() callback is guaranteed to have been
touched only by the allocator or by the client. Because memory patterns
applied by the allocator always set at least one of the two lowest order
bits, the bottom two bits of any pointer member (other than char * or
short *, which may not be 8-byte aligned on all platforms) are available
to the client for marking cached objects that the client is about to
free. This way, the client can recognize known objects in the move()
callback by the unmarked (valid) pointer value.
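

The marking technique described above might look like the following sketch.
The names item_t, i_container, PTR_MARK, and the three helper functions are
hypothetical; the only assumption taken from the text is that a valid
pointer member has both low-order bits clear.

```c
#include <assert.h>
#include <stdint.h>

#define PTR_MARK    ((uintptr_t)0x1)    /* hypothetical mark bit */
#define PTR_LOWBITS ((uintptr_t)0x3)    /* the two lowest-order bits */

struct container;                       /* opaque to this sketch */

typedef struct item {
    struct container *i_container;      /* pointer member used for marking */
} item_t;

/* An item is known to the client if both low bits of the pointer are clear. */
int
item_is_known(const item_t *ip)
{
    return (((uintptr_t)ip->i_container & PTR_LOWBITS) == 0);
}

/* Mark an item the client is about to free by setting a low-order bit. */
void
item_mark_dying(item_t *ip)
{
    assert(item_is_known(ip));
    ip->i_container =
        (struct container *)((uintptr_t)ip->i_container | PTR_MARK);
}

/* Recover the original pointer value from a possibly marked member. */
struct container *
item_container(const item_t *ip)
{
    return ((struct container *)
        ((uintptr_t)ip->i_container & ~PTR_LOWBITS));
}
```

A move() callback built on these helpers would return KMEM_CBRC_DONT_KNOW
whenever item_is_known() reports 0, before dereferencing the pointer.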


If the client refuses to move an object with either KMEM_CBRC_NO or
KMEM_CBRC_LATER, and that object later becomes movable, the client can
notify the allocator by calling kmem_cache_move_notify(). Alternatively,
the client can simply wait for the allocator to call back again with the
same object address. Responding KMEM_CBRC_NO even once or responding
KMEM_CBRC_LATER too many times for the same object makes the allocator
less likely to call back again for that object.

[Synopsis for notification function:]


void kmem_cache_move_notify(kmem_cache_t *cp, void *obj);


The parameters for the notification function are as follows:

cp
Pointer to the object cache.


obj
Pointer to the object that has become movable since an earlier
refusal to move it.


CONTEXT


Constructors can be invoked during any call to kmem_cache_alloc(), and
will run in that context. Similarly, destructors can be invoked during
any call to kmem_cache_free(), and can also be invoked during
kmem_cache_destroy(). Therefore, the functions that a constructor or
destructor invokes must be appropriate in that context. Furthermore, the
allocator may also call the constructor and destructor on objects still
under its control without client involvement.


kmem_cache_create() and kmem_cache_destroy() must not be called from
interrupt context. kmem_cache_create() can also block for available
memory.


kmem_cache_alloc() can be called from interrupt context only if the
KM_NOSLEEP flag is set. It can be called from user or kernel context with
any valid flag.


kmem_cache_free() can be called from user, kernel, or interrupt context.


kmem_cache_set_move() is called from the same context as
kmem_cache_create(), immediately after kmem_cache_create() and before
allocating any objects from the cache.


The registered move() callback is always invoked in the same global
callback thread dedicated to move requests, guaranteeing that no matter
how many clients register a move() function, the allocator never tries to
move more than one object at a time. Neither the allocator nor the client
can be assumed to know the object's whereabouts at the time of the
callback.

EXAMPLES


Example 1: Object Caching




Consider the following data structure:


struct foo {
    kmutex_t foo_lock;
    kcondvar_t foo_cv;
    struct bar *foo_barlist;
    int foo_refcnt;
};


Assume that a foo structure cannot be freed until there are no
outstanding references to it (foo_refcnt == 0) and all of its pending bar
events (whatever they are) have completed (foo_barlist == NULL). The life
cycle of a dynamically allocated foo would be something like this:


foo = kmem_alloc(sizeof (struct foo), KM_SLEEP);
mutex_init(&foo->foo_lock, ...);
cv_init(&foo->foo_cv, ...);
foo->foo_refcnt = 0;
foo->foo_barlist = NULL;
use foo;
ASSERT(foo->foo_barlist == NULL);
ASSERT(foo->foo_refcnt == 0);
cv_destroy(&foo->foo_cv);
mutex_destroy(&foo->foo_lock);
kmem_free(foo, sizeof (struct foo));


Notice that between each use of a foo object we perform a sequence of
operations that constitutes nothing but expensive overhead. All of this
overhead (that is, everything other than use foo above) can be eliminated
by object caching.


int
foo_constructor(void *buf, void *arg, int kmflags)
{
    struct foo *foo = buf;
    mutex_init(&foo->foo_lock, ...);
    cv_init(&foo->foo_cv, ...);
    foo->foo_refcnt = 0;
    foo->foo_barlist = NULL;
    return (0);
}

void
foo_destructor(void *buf, void *arg)
{
    struct foo *foo = buf;
    ASSERT(foo->foo_barlist == NULL);
    ASSERT(foo->foo_refcnt == 0);
    cv_destroy(&foo->foo_cv);
    mutex_destroy(&foo->foo_lock);
}

user_arg = ddi_get_soft_state(foo_softc, instance);
(void) snprintf(buf, KSTAT_STRLEN, "foo%d_cache",
    ddi_get_instance(dip));
foo_cache = kmem_cache_create(buf,
    sizeof (struct foo), 0,
    foo_constructor, foo_destructor,
    NULL, user_arg, NULL, 0);


To allocate, use, and free a foo object:


foo = kmem_cache_alloc(foo_cache, KM_SLEEP);
use foo;
kmem_cache_free(foo_cache, foo);


This makes foo allocation fast, because the allocator will usually do
nothing more than fetch an already-constructed foo from the cache.
foo_constructor and foo_destructor will be invoked only to populate and
drain the cache, respectively.


Example 2: Registering a Move Callback




To register a move() callback:


object_cache = kmem_cache_create(...);
kmem_cache_set_move(object_cache, object_move);


RETURN VALUES


If successful, the constructor function must return 0. If KM_NOSLEEP or
KM_NOSLEEP_LAZY is set and memory cannot be allocated without sleeping,
the constructor must return -1. If the constructor takes extraordinary
steps during a KM_NOSLEEP construction, it may not take those for a
KM_NOSLEEP_LAZY construction.


kmem_cache_create() returns a pointer to the allocated cache.


If successful, kmem_cache_alloc() returns a pointer to the allocated
object. If KM_NOSLEEP is set and memory cannot be allocated without
sleeping, kmem_cache_alloc() returns NULL.

ATTRIBUTES


See attributes(7) for descriptions of the following attributes:


+--------------------+-----------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+--------------------+-----------------+
|Interface Stability | Committed |
+--------------------+-----------------+

SEE ALSO


condvar(9F), kmem_alloc(9F), mutex(9F), kstat(9S)


Writing Device Drivers


The Slab Allocator: An Object-Caching Kernel Memory Allocator, Bonwick,
J.; USENIX Summer 1994 Technical Conference (1994).


Magazines and vmem: Extending the Slab Allocator to Many CPUs and
Arbitrary Resources, Bonwick, J. and Adams, J.; USENIX 2001 Technical
Conference (2001).

NOTES


The constructor must be immediately reversible by the destructor, since
the allocator may call the constructor and destructor on objects still
under its control at any time without client involvement.


The constructor must respect the kmflags argument by forwarding it to
allocations made inside the constructor, and must not ASSERT anything
about the given flags.
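

A constructor that allocates memory of its own could forward kmflags as in
the following sketch. So that the code builds outside the kernel, it uses
hypothetical stand-ins: sk_alloc() and sk_free() take the place of
kmem_alloc(9F) and kmem_free(9F), SK_SLEEP and SK_NOSLEEP take the place of
KM_SLEEP and KM_NOSLEEP with placeholder values, and sk_memory_low simulates
memory pressure. The foo_scratch member and FOO_SCRATCH_SIZE are likewise
invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Placeholder flag values; a driver uses the real flags from <sys/kmem.h>. */
#define SK_SLEEP    0x0
#define SK_NOSLEEP  0x1

#define FOO_SCRATCH_SIZE    256     /* hypothetical buffer size */

int sk_memory_low;      /* hypothetical switch: simulate memory pressure */

/* Stand-in allocator: fails only for no-sleep requests under pressure. */
void *
sk_alloc(size_t size, int kmflags)
{
    if (sk_memory_low && (kmflags & SK_NOSLEEP))
        return (NULL);
    return (malloc(size));
}

/* Stand-in for the matching free routine, which also takes the size. */
void
sk_free(void *buf, size_t size)
{
    (void) size;
    free(buf);
}

struct foo {
    char *foo_scratch;  /* allocated once, kept as constructed state */
};

int
foo_constructor(void *buf, void *user_arg, int kmflags)
{
    struct foo *foo = buf;

    (void) user_arg;
    /* Forward kmflags unchanged and assert nothing about their value. */
    foo->foo_scratch = sk_alloc(FOO_SCRATCH_SIZE, kmflags);
    if (foo->foo_scratch == NULL)
        return (-1);    /* nothing else was set up, so nothing to undo */
    return (0);
}

void
foo_destructor(void *buf, void *user_arg)
{
    struct foo *foo = buf;

    (void) user_arg;
    /* A constructed object always owns its scratch buffer. */
    assert(foo->foo_scratch != NULL);
    sk_free(foo->foo_scratch, FOO_SCRATCH_SIZE);
    foo->foo_scratch = NULL;
}
```

Because the constructor performs a single allocation and the destructor
releases exactly that allocation, each construction can be undone
immediately, as the first note above requires.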


The user argument forwarded to the constructor must be fully operational
before it is passed to kmem_cache_create().

February 18, 2015 KMEM_CACHE_CREATE(9F)