OS-6344: elide squeue wake-ups when prudent

Details

Issue Type:Improvement
Priority:4 - Normal
Status:Resolved
Created at:2017-09-13T19:48:19.000Z
Updated at:2017-11-07T16:47:51.784Z

People

Created by:Former user
Reported by:Former user
Assigned to:Former user

Resolution

Fixed: A fix for this issue is checked into the tree and tested.
(Resolution Date: 2017-09-21T13:32:05.000Z)

Fix Versions

2017-09-28 Battle Square (Release Date: 2017-09-28)

Related Issues

Description

As part of the testing for OS-6150, I looked into the circumstances that resulted in heavy squeue wake-ups once the arbitrary sq_wait delay was removed. During a localhost-targeted ab(1) test, a great many of the wake-ups turn out to be incurred during the squeue_synch_exit() call at the end of a TCP connect operation. The wake-up is needed to process a response packet that arrives as part of the connect.

A DTrace script can confirm this:

tcp_connect:entry
{
        self->c = (conn_t *)arg0;
}
tcp_connect:return
/self->c/
{
        self->c = 0;
}

squeue_synch_exit:entry
/self->c/
{
        this->s = self->c->conn_sqp;
        if (this->s->sq_first != NULL) {
                /* conn is stored in b_prev, proc in b_queue */
                this->conn = (conn_t *)this->s->sq_first->b_prev;
                this->proc = (void *)this->s->sq_first->b_queue;
                if (self->c == this->conn) {
                        if (this->s->sq_count == 1) {
                                @elide[this->proc] = count();
                        } else {
                                @optimize[this->proc, this->s->sq_count] = count();
                        }
                } else {
                        @wake[this->s->sq_count] = count();
                }
        } else {
                @nowake = count();
        }
}

END
{
        printa("No wake-up needed: %@u\n", @nowake);
        printa("Possible to elided wake-up:\n %a %@u\n", @elide);

        printf("Possible to optimized wake-up:\n");
        printa("%a %u %@u\n", @optimize);

        printf("Forced wake-up:\n");
        printa("%u %@u\n", @wake);
}

This yields the following results during an ab test run:

No wake-up needed: 168853
Possible to elided wake-up:
 ip`tcp_input_data 161273
Possible to optimized wake-up:
ip`tcp_input_data 3 2
ip`tcp_input_data 2 894
Forced wake-up:
4 8
3 168
1 3345
2 4720

The noted "possibilities" indicate conditions where the first item on the squeue to be processed during the wake-up belongs to the same connection that is synchronously exiting the squeue. In this run, 161273 of the 170410 exits that found work queued (about 95%) had exactly one such item on the squeue. In that case, it seems cheaper to process that single unit of data immediately on our way out of the squeue rather than wake the worker.
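
The decision described above can be sketched as a small user-space model. This is only an illustration under stated assumptions: the types and functions (model_squeue_t, model_item_t, model_enqueue, model_synch_exit) are hypothetical stand-ins, not the illumos squeue_t or mblk_t, and the actual change may differ in detail. The model classifies an exit the same way the DTrace script does: nothing queued, a single matching item that can be handled inline, or work that still needs the worker.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; not the illumos squeue_t or mblk_t. */
typedef struct model_item {
	struct model_item *next;
	const void *conn;	/* connection that owns this queued item */
} model_item_t;

typedef struct model_squeue {
	model_item_t *first;
	model_item_t *last;
	unsigned count;			/* items currently queued */
	unsigned processed_inline;	/* items handled by the exiting thread */
} model_squeue_t;

typedef enum { EXIT_NO_WAKE, EXIT_ELIDED, EXIT_WAKE } exit_result_t;

/* Append an item owned by 'conn' to the tail of the queue. */
static void
model_enqueue(model_squeue_t *sq, model_item_t *it, const void *conn)
{
	assert(sq != NULL && it != NULL);
	it->next = NULL;
	it->conn = conn;
	if (sq->last != NULL)
		sq->last->next = it;
	else
		sq->first = it;
	sq->last = it;
	sq->count++;
}

/*
 * Model of a synchronous exit by 'conn'.  If the head item belongs to the
 * exiting connection, handle that one item inline.  A worker wake-up is
 * only reported as needed when items remain afterwards.
 */
static exit_result_t
model_synch_exit(model_squeue_t *sq, const void *conn)
{
	assert(sq != NULL);
	if (sq->first == NULL)
		return (EXIT_NO_WAKE);	/* nothing queued at all */

	if (sq->first->conn == conn) {
		model_item_t *it = sq->first;

		sq->first = it->next;
		if (sq->first == NULL)
			sq->last = NULL;
		sq->count--;
		sq->processed_inline++;	/* stands in for real processing */
		if (sq->first == NULL)
			return (EXIT_ELIDED);	/* queue drained, no wake-up */
	}
	return (EXIT_WAKE);	/* remaining work goes to the worker */
}
```

In this model, an exit with one queued item for the same connection returns EXIT_ELIDED, while an exit that finds another connection's item at the head returns EXIT_WAKE and leaves the queue untouched.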

Benchmarking data should be able to confirm any delta in cost and/or performance.

Comments

Comment by Former user
Created at 2017-09-14T20:59:19.000Z

Further investigation revealed that the primary culprits for worker wake-ups during the ab(1) test are tcp_connect() and tcp_close(). The former incurs one during squeue_synch_exit(), since the SYN/ACK response for a localhost connection setup is often rapid enough to make it onto the squeue by the time the connecting thread goes to exit the synchronous context. The latter suffers a similar problem: a response is queued quickly (to the FIN rather than the SYN) during the non-synchronous SQUEUE_ENTER_ONE() call, which specifies SQ_NODRAIN.

In both cases, we can attempt to drain a single queued request if it matches the connection being acted upon. With the initial draft in place, I measured the sq_count depth at each point where this wake-up-eliding processing was potentially attempted (sq_count on the left, number of occurrences on the right):

        5                1
        4                8
        3             1954
        2            57862
        1          1857500
        0          3940144
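
To put the distribution above in proportion, the following sketch computes the share of samples at each depth. It assumes the left column is the sq_count depth and the right column is the number of observations. The helper names (samples_in_range, share_in_range) are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Counts copied from the table above, indexed by sq_count depth 0..5. */
static const unsigned long long depth_counts[] = {
	3940144ULL, 1857500ULL, 57862ULL, 1954ULL, 8ULL, 1ULL
};
#define	NDEPTHS	(sizeof (depth_counts) / sizeof (depth_counts[0]))

/* Sum the observations whose depth lies in [lo, hi], inclusive. */
static unsigned long long
samples_in_range(size_t lo, size_t hi)
{
	unsigned long long sum = 0;
	size_t i;

	assert(lo <= hi && hi < NDEPTHS);
	for (i = lo; i <= hi; i++)
		sum += depth_counts[i];
	return (sum);
}

/* Fraction of all observations whose depth lies in [lo, hi]. */
static double
share_in_range(size_t lo, size_t hi)
{
	return ((double)samples_in_range(lo, hi) /
	    (double)samples_in_range(0, NDEPTHS - 1));
}
```

Of the 5857469 samples in total, about 67.3% saw an empty squeue, about 31.7% saw exactly one queued item, and only about 1.0% saw two or more, so draining a single item covers nearly every case where anything was queued.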

Comment by Former user
Created at 2017-09-14T21:11:10.000Z

With the noted logic in place, performance in the ab(1) test jumps from ~15500 req/s to ~18000 req/s, an improvement of roughly 16%.


Comment by Former user
Created at 2017-09-19T00:06:58.000Z

This was tested using the same methodology as OS-6150: ab(1) measurements to track the decrease in latency for localhost TCP connections, plus high-throughput runs (via iperf) to check for undesirable side effects. The results were the same, save for the increased req/s noted above.


Comment by Former user
Created at 2017-09-20T21:41:08.000Z

illumos-joyent commit cc6a60c (branch master, by Patrick Mooney)

OS-6344 elide squeue wake-ups when prudent
Reviewed by: Jerry Jelinek <jerry.jelinek@joyent.com>
Reviewed by: Ryan Zezeski <ryan.zeseski@joyent.com>
Approved by: Jerry Jelinek <jerry.jelinek@joyent.com>