/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
 * or http://www.opensolaris.org/os/licensing.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 *
 * CDDL HEADER END
 */

/*
 * Copyright 2008 Sun Microsystems, Inc. All rights reserved.
 * Use is subject to license terms.
 */

/*
 * Big Theory Statement for turnstiles.
 *
 * Turnstiles provide blocking and wakeup support, including priority
 * inheritance, for synchronization primitives (e.g. mutexes and rwlocks).
 * Typical usage is as follows:
 *
 * To block on lock 'lp' for read access in foo_enter():
 *
 *	ts = turnstile_lookup(lp);
 *	[ If the lock is still held, set the waiters bit
 *	turnstile_block(ts, TS_READER_Q, lp, &foo_sobj_ops);
 *
 * To wake threads waiting for write access to lock 'lp' in foo_exit():
 *
 *	ts = turnstile_lookup(lp);
 *	[ Either drop the lock (change owner to NULL) or perform a direct
 *	[ handoff (change owner to one of the threads we're about to wake).
 *	[ If we're going to wake the last waiter, clear the waiters bit.
 *	turnstile_wakeup(ts, TS_WRITER_Q, nwaiters, new_owner or NULL);
 *
 * turnstile_lookup() returns holding the turnstile hash chain lock for lp.
 * Both turnstile_block() and turnstile_wakeup() drop the turnstile lock.
 * To abort a turnstile operation, the client must call turnstile_exit().
 *
 * Requirements of the client:
 *
 * (1)	The lock's waiters indicator may be manipulated *only* while
 *	holding the turnstile hash chain lock (i.e. under turnstile_lookup()).
 *
 * (2)	Once the lock is marked as having waiters, the owner may be
 *	changed *only* while holding the turnstile hash chain lock.
 *
 * (3)	The caller must never block on an unheld lock.
 *
 * Consequences of these assumptions include the following:
 *
 * (a)	It is impossible for a lock to be unheld but have waiters.
 *
 * (b)	The priority inheritance code can safely assume that an active
 *	turnstile's ts_inheritor never changes until the inheritor calls
 *	turnstile_pi_waive().
 *
 * These assumptions simplify the implementation of both turnstiles and
 * their clients.
 *
 * Background on priority inheritance:
 *
 * Priority inheritance allows a thread to "will" its dispatch priority
 * to all the threads blocking it, directly or indirectly. This prevents
 * situations called priority inversions in which a high-priority thread
 * needs a lock held by a low-priority thread, which cannot run because
 * of medium-priority threads. Without PI, the medium-priority threads
 * can starve out the high-priority thread indefinitely. With PI, the
 * low-priority thread becomes high-priority until it releases whatever
 * synchronization object the real high-priority thread is waiting for.
 *
 * How turnstiles work:
 *
 * All active turnstiles reside in a global hash table, turnstile_table[].
 * The address of a synchronization object determines its hash index.
 * Each hash chain is protected by its own dispatcher lock, acquired
 * by turnstile_lookup(). This lock protects the hash chain linkage, the
 * contents of all turnstiles on the hash chain, and the waiters bits of
 * every synchronization object in the system that hashes to the same chain.
 * Giving the lock such broad scope simplifies the interactions between
 * the turnstile code and its clients considerably. The blocking path
 * is rare enough that this has no impact on scalability. (If it ever
 * does, it's almost surely a second-order effect -- the real problem
 * is that some synchronization object is *very* heavily contended.)
 *
 * Each thread has an attached turnstile in case it needs to block.
 * A thread cannot block on more than one lock at a time, so one
 * turnstile per thread is the most we ever need. The first thread
 * to block on a lock donates its attached turnstile and adds it to
 * the appropriate hash chain in turnstile_table[]. This becomes the
 * "active turnstile" for the lock. Each subsequent thread that blocks
 * on the same lock discovers that the lock already has an active
 * turnstile, so it stashes its own turnstile on the active turnstile's
 * freelist. As threads wake up, the process is reversed.
 *
 * turnstile_block() puts the current thread to sleep on the active
 * turnstile for the desired lock, walks the blocking chain to apply
 * priority inheritance to everyone in its way, and yields the CPU.
 *
 * turnstile_wakeup() waives any priority the owner may have inherited
 * and wakes the specified number of waiting threads. If the caller is
 * doing direct handoff of ownership (rather than just dropping the lock),
 * the new owner automatically inherits priority from any existing waiters.
 */

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/thread.h>
#include <sys/proc.h>
#include <sys/debug.h>
#include <sys/cpuvar.h>
#include <sys/turnstile.h>
#include <sys/t_lock.h>
#include <sys/disp.h>
#include <sys/sobject.h>
#include <sys/cmn_err.h>
#include <sys/sysmacros.h>
#include <sys/lockstat.h>
#include <sys/lwp_upimutex_impl.h>
#include <sys/schedctl.h>
#include <sys/cpu.h>
#include <sys/sdt.h>
#include <sys/cpupart.h>

extern upib_t upimutextab[UPIMUTEX_TABSIZE];

#define	IS_UPI(sobj)	\
	((uintptr_t)(sobj) - (uintptr_t)upimutextab < sizeof (upimutextab))

/*
 * The turnstile hash table is partitioned into two halves: the lower half
 * is used for upimutextab[] locks, the upper half for everything else.
 * The reason for the distinction is that SOBJ_USER_PI locks present a
 * unique problem: the upimutextab[] lock passed to turnstile_block()
 * cannot be dropped until the calling thread has blocked on its
 * SOBJ_USER_PI lock and willed its priority down the blocking chain.
 * At that point, the caller's t_lockp will be one of the turnstile locks.
 * If mutex_exit() discovers that the upimutextab[] lock has waiters, it
 * must wake them, which forces a lock ordering on us: the turnstile lock
 * for the upimutextab[] lock will be acquired in mutex_vector_exit(),
 * which will eventually call into turnstile_pi_waive(), which will then
 * acquire the caller's thread lock, which in this case is the turnstile
 * lock for the SOBJ_USER_PI lock. In general, when two turnstile locks
 * must be held at the same time, the lock order must be the address order.
 * Therefore, to prevent deadlock in turnstile_pi_waive(), we must ensure
 * that upimutextab[] locks *always* hash to lower addresses than any
 * other locks. You think this is cheesy? Let's see you do better.
 */
#define	TURNSTILE_HASH_SIZE	128		/* must be power of 2 */
#define	TURNSTILE_HASH_MASK	(TURNSTILE_HASH_SIZE - 1)
#define	TURNSTILE_SOBJ_HASH(sobj)	\
	((((ulong_t)sobj >> 2) + ((ulong_t)sobj >> 9)) & TURNSTILE_HASH_MASK)
#define	TURNSTILE_SOBJ_BUCKET(sobj)		\
	((IS_UPI(sobj) ? 0 : TURNSTILE_HASH_SIZE) + TURNSTILE_SOBJ_HASH(sobj))
#define	TURNSTILE_CHAIN(sobj)	turnstile_table[TURNSTILE_SOBJ_BUCKET(sobj)]

typedef struct turnstile_chain {
	turnstile_t	*tc_first;	/* first turnstile on hash chain */
	disp_lock_t	tc_lock;	/* lock for this hash chain */
} turnstile_chain_t;

turnstile_chain_t	turnstile_table[2 * TURNSTILE_HASH_SIZE];

static	lock_t	turnstile_loser_lock;

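/*
 * Illustrative sketch only; not part of the original file. It models the
 * bucket selection performed by TURNSTILE_SOBJ_BUCKET() above in plain
 * userland C, to show why objects inside the user-PI table always land in
 * the lower half of the hash table (and therefore at lower chain-lock
 * addresses). Everything named demo_* or DEMO_* is hypothetical:
 * demo_upitab[] merely stands in for upimutextab[], and its size is an
 * arbitrary assumption. The range test relies on unsigned wraparound, as
 * IS_UPI() does: an address below the table wraps to a huge difference.
 */
#include <stddef.h>
#include <stdint.h>

#define	DEMO_HASH_SIZE	128		/* must be power of 2 */
#define	DEMO_HASH_MASK	(DEMO_HASH_SIZE - 1)

static char demo_upitab[4096];		/* stand-in for upimutextab[] */

/* Nonzero iff 'p' points into demo_upitab[]. */
static int
demo_is_upi(const void *p)
{
	return ((uintptr_t)p - (uintptr_t)demo_upitab < sizeof (demo_upitab));
}

/* Bucket index: [0, 128) for table objects, [128, 256) for all others. */
static size_t
demo_bucket(const void *p)
{
	uintptr_t a = (uintptr_t)p;
	size_t hash = ((a >> 2) + (a >> 9)) & DEMO_HASH_MASK;

	return ((demo_is_upi(p) ? 0 : DEMO_HASH_SIZE) + hash);
}
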
/*
 * Make 'inheritor' inherit priority from this turnstile.
 */
static void
turnstile_pi_inherit(turnstile_t *ts, kthread_t *inheritor, pri_t epri)
{
	ASSERT(THREAD_LOCK_HELD(inheritor));
	ASSERT(DISP_LOCK_HELD(&TURNSTILE_CHAIN(ts->ts_sobj).tc_lock));

	if (epri <= inheritor->t_pri)
		return;

	if (ts->ts_inheritor == NULL) {
		ts->ts_inheritor = inheritor;
		ts->ts_epri = epri;
		disp_lock_enter_high(&inheritor->t_pi_lock);
		ts->ts_prioinv = inheritor->t_prioinv;
		inheritor->t_prioinv = ts;
		disp_lock_exit_high(&inheritor->t_pi_lock);
	} else {
		/*
		 * 'inheritor' is already inheriting from this turnstile,
		 * so just adjust its priority.
		 */
		ASSERT(ts->ts_inheritor == inheritor);
		if (ts->ts_epri < epri)
			ts->ts_epri = epri;
	}

	if (epri > DISP_PRIO(inheritor))
		thread_change_epri(inheritor, epri);
}

/*
 * If turnstile is non-NULL, remove it from inheritor's t_prioinv list.
 * Compute new inherited priority, and return it.
 */
static pri_t
turnstile_pi_tsdelete(turnstile_t *ts, kthread_t *inheritor)
{
	turnstile_t **tspp, *tsp;
	pri_t new_epri = 0;

	disp_lock_enter_high(&inheritor->t_pi_lock);
	tspp = &inheritor->t_prioinv;
	while ((tsp = *tspp) != NULL) {
		if (tsp == ts)
			*tspp = tsp->ts_prioinv;
		else
			new_epri = MAX(new_epri, tsp->ts_epri);
		tspp = &tsp->ts_prioinv;
	}
	disp_lock_exit_high(&inheritor->t_pi_lock);
	return (new_epri);
}

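/*
 * Illustrative sketch only; not part of the original file. It isolates the
 * list walk used by turnstile_pi_tsdelete() above: one pass over a singly
 * linked list that unlinks a victim node (if any) through a pointer to the
 * previous link, while computing the maximum priority of every node that
 * remains. The demo_ts type and demo_tsdelete() are hypothetical stand-ins,
 * and the sketch omits all locking. Passing a NULL victim only recomputes
 * the maximum, which is how turnstile_pi_recalc() uses the real function.
 */
#include <stddef.h>

struct demo_ts {
	struct demo_ts	*next;	/* plays the role of ts_prioinv */
	int		epri;	/* plays the role of ts_epri */
};

static int
demo_tsdelete(struct demo_ts **headp, struct demo_ts *victim)
{
	struct demo_ts **tspp = headp, *tsp;
	int new_epri = 0;

	while ((tsp = *tspp) != NULL) {
		if (tsp == victim)
			*tspp = tsp->next;	/* unlink the victim */
		else if (tsp->epri > new_epri)
			new_epri = tsp->epri;	/* survivor: track max */
		tspp = &tsp->next;
	}
	return (new_epri);
}
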
/*
 * Remove turnstile from inheritor's t_prioinv list, compute
 * new priority, and change the inheritor's effective priority if
 * necessary. Keep in synch with turnstile_pi_recalc().
 */
static void
turnstile_pi_waive(turnstile_t *ts)
{
	kthread_t *inheritor = ts->ts_inheritor;
	pri_t new_epri;

	ASSERT(inheritor == curthread);

	thread_lock_high(inheritor);
	new_epri = turnstile_pi_tsdelete(ts, inheritor);
	if (new_epri != DISP_PRIO(inheritor))
		thread_change_epri(inheritor, new_epri);
	ts->ts_inheritor = NULL;
	if (DISP_MUST_SURRENDER(inheritor))
		cpu_surrender(inheritor);
	thread_unlock_high(inheritor);
}

/*
 * Compute caller's new inherited priority, and change its effective
 * priority if necessary. Necessary only for SOBJ_USER_PI, because of
 * its interruptibility characteristic.
 */
void
turnstile_pi_recalc(void)
{
	kthread_t *inheritor = curthread;
	pri_t new_epri;

	thread_lock(inheritor);
	new_epri = turnstile_pi_tsdelete(NULL, inheritor);
	if (new_epri != DISP_PRIO(inheritor))
		thread_change_epri(inheritor, new_epri);
	if (DISP_MUST_SURRENDER(inheritor))
		cpu_surrender(inheritor);
	thread_unlock(inheritor);
}

/*
 * Grab the lock protecting the hash chain for sobj
 * and return the active turnstile for sobj, if any.
 */
turnstile_t *
turnstile_lookup(void *sobj)
{
	turnstile_t *ts;
	turnstile_chain_t *tc = &TURNSTILE_CHAIN(sobj);

	disp_lock_enter(&tc->tc_lock);

	for (ts = tc->tc_first; ts != NULL; ts = ts->ts_next)
		if (ts->ts_sobj == sobj)
			break;

	return (ts);
}

/*
 * Drop the lock protecting the hash chain for sobj.
 */
void
turnstile_exit(void *sobj)
{
	disp_lock_exit(&TURNSTILE_CHAIN(sobj).tc_lock);
}

/*
 * When we apply priority inheritance, we must grab the owner's thread lock
 * while already holding the waiter's thread lock. If both thread locks are
 * turnstile locks, this can lead to deadlock: while we hold L1 and try to
 * grab L2, some unrelated thread may be applying priority inheritance to
 * some other blocking chain, holding L2 and trying to grab L1. The most
 * obvious solution -- do a lock_try() for the owner lock -- isn't quite
 * sufficient because it can cause livelock: each thread may hold one lock,
 * try to grab the other, fail, bail out, and try again, looping forever.
 * To prevent livelock we must define a winner, i.e. define an arbitrary
 * lock ordering on the turnstile locks. For simplicity we declare that
 * virtual address order defines lock order, i.e. if L1 < L2, then the
 * correct lock ordering is L1, L2. Thus the thread that holds L1 and
 * wants L2 should spin until L2 is available, but the thread that holds
 * L2 and can't get L1 on the first try must drop L2 and return failure.
 * Moreover, the losing thread must not reacquire L2 until the winning
 * thread has had a chance to grab it; to ensure this, the losing thread
 * must grab L1 after dropping L2, thus spinning until the winner is done.
 * Complicating matters further, note that the owner's thread lock pointer
 * can change (i.e. be pointed at a different lock) while we're trying to
 * grab it. If that happens, we must unwind our state and try again.
 *
 * On success, returns 1 with both locks held.
 * On failure, returns 0 with neither lock held.
 */
static int
turnstile_interlock(lock_t *wlp, lock_t *volatile *olpp)
{
	ASSERT(LOCK_HELD(wlp));

	for (;;) {
		volatile lock_t *olp = *olpp;

		/*
		 * If the locks are identical, there's nothing to do.
		 */
		if (olp == wlp)
			return (1);
		if (lock_try((lock_t *)olp)) {
			/*
			 * If 'olp' is still the right lock, return success.
			 * Otherwise, drop 'olp' and try the dance again.
			 */
			if (olp == *olpp)
				return (1);
			lock_clear((lock_t *)olp);
		} else {
			hrtime_t spin_time = 0;
			/*
			 * If we're grabbing the locks out of order, we lose.
			 * Drop the waiter's lock, and then grab and release
			 * the owner's lock to ensure that we won't retry
			 * until the winner is done (as described above).
			 */
			if (olp >= (lock_t *)turnstile_table && olp < wlp) {
				lock_clear(wlp);
				lock_set((lock_t *)olp);
				lock_clear((lock_t *)olp);
				return (0);
			}
			/*
			 * We're grabbing the locks in the right order,
			 * so spin until the owner's lock either becomes
			 * available or spontaneously changes.
			 */
			spin_time =
			    LOCKSTAT_START_TIME(LS_TURNSTILE_INTERLOCK_SPIN);
			while (olp == *olpp && LOCK_HELD(olp)) {
				if (panicstr)
					return (1);
				SMT_PAUSE();
			}
			LOCKSTAT_RECORD_TIME(LS_TURNSTILE_INTERLOCK_SPIN,
			    olp, spin_time);
		}
	}
}

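/*
 * Illustrative sketch only; not part of the original file. It reduces the
 * address-order rule in turnstile_interlock() above to a pure decision
 * function for the moment just after lock_try() on the owner's lock has
 * failed. No locks are taken here; the sketch only answers "who is the
 * loser?". The demo_* names are hypothetical, 'table' stands in for the
 * base address of turnstile_table[], and addresses are compared as
 * integers so that unrelated objects can be ordered portably.
 */
#include <stdint.h>

typedef enum demo_verdict {
	DEMO_SAME_LOCK,		/* locks identical: nothing to acquire */
	DEMO_SPIN,		/* in order: keep waiter's lock and spin */
	DEMO_BACK_OFF		/* out of order: drop waiter's lock, fail */
} demo_verdict_t;

static demo_verdict_t
demo_interlock_verdict(const void *table, const void *wlp, const void *olp)
{
	uintptr_t base = (uintptr_t)table;
	uintptr_t w = (uintptr_t)wlp;
	uintptr_t o = (uintptr_t)olp;

	if (o == w)
		return (DEMO_SAME_LOCK);
	/* Only a turnstile lock below the waiter's lock makes us the loser. */
	if (o >= base && o < w)
		return (DEMO_BACK_OFF);
	return (DEMO_SPIN);
}
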
/*
 * Block the current thread on a synchronization object.
 *
 * Turnstiles implement both kernel and user-level priority inheritance.
 * To avoid missed wakeups in the user-level case, lwp_upimutex_lock() calls
 * turnstile_block() holding the appropriate lock in the upimutextab (see
 * the block comment in lwp_upimutex_lock() for details). The held lock is
 * passed to turnstile_block() as the "mp" parameter, and will be dropped
 * after priority has been willed, but before the thread actually sleeps
 * (this locking behavior leads to some subtle ordering issues; see the
 * block comment on turnstile hashing for details). This _must_ be the only
 * lock held when calling turnstile_block() with a SOBJ_USER_PI sobj; holding
 * other locks can result in panics due to cycles in the blocking chain.
 *
 * turnstile_block() always succeeds for kernel synchronization objects.
 * For SOBJ_USER_PI locks the possible errors are EINTR for signals, and
 * EDEADLK for cycles in the blocking chain. A return code of zero indicates
 * *either* that the lock is now held, or that this is a spurious wake-up, or
 * that the lock can never be held due to an ENOTRECOVERABLE error.
 * It is up to lwp_upimutex_lock() to sort this all out.
 */

int
turnstile_block(turnstile_t *ts, int qnum, void *sobj, sobj_ops_t *sobj_ops,
    kmutex_t *mp, lwp_timer_t *lwptp)
{
	kthread_t *owner;
	kthread_t *t = curthread;
	proc_t *p = ttoproc(t);
	klwp_t *lwp = ttolwp(t);
	turnstile_chain_t *tc = &TURNSTILE_CHAIN(sobj);
	int error = 0;
	int loser = 0;

	ASSERT(DISP_LOCK_HELD(&tc->tc_lock));
	ASSERT(mp == NULL || IS_UPI(mp));
	ASSERT((SOBJ_TYPE(sobj_ops) == SOBJ_USER_PI) ^ (mp == NULL));

	thread_lock_high(t);

	if (ts == NULL) {
		/*
		 * This is the first thread to block on this sobj.
		 * Take its attached turnstile and add it to the hash chain.
		 */
		ts = t->t_ts;
		ts->ts_sobj = sobj;
		ts->ts_next = tc->tc_first;
		tc->tc_first = ts;
		ASSERT(ts->ts_waiters == 0);
	} else {
		/*
		 * Another thread has already donated its turnstile
		 * to block on this sobj, so ours isn't needed.
		 * Stash it on the active turnstile's freelist.
		 */
		turnstile_t *myts = t->t_ts;
		myts->ts_free = ts->ts_free;
		ts->ts_free = myts;
		t->t_ts = ts;
		ASSERT(ts->ts_sobj == sobj);
		ASSERT(ts->ts_waiters > 0);
	}

	/*
	 * Put the thread to sleep.
	 */
	ASSERT(t != CPU->cpu_idle_thread);
	ASSERT(CPU_ON_INTR(CPU) == 0);
	ASSERT(t->t_wchan0 == NULL && t->t_wchan == NULL);
	ASSERT(t->t_state == TS_ONPROC);

	if (SOBJ_TYPE(sobj_ops) == SOBJ_USER_PI) {
		curthread->t_flag |= T_WAKEABLE;
	}
	CL_SLEEP(t);		/* assign kernel priority */
	THREAD_SLEEP(t, &tc->tc_lock);
	t->t_wchan = sobj;
	t->t_sobj_ops = sobj_ops;
	DTRACE_SCHED(sleep);

	if (lwp != NULL) {
		lwp->lwp_ru.nvcsw++;
		(void) new_mstate(t, LMS_SLEEP);
		if (SOBJ_TYPE(sobj_ops) == SOBJ_USER_PI) {
			lwp->lwp_asleep = 1;
			lwp->lwp_sysabort = 0;
			/*
			 * make wchan0 non-zero to conform to the rule that
			 * threads blocking for user-level objects have a
			 * non-zero wchan0: this prevents spurious wake-ups
			 * by, for example, /proc.
			 */
			t->t_wchan0 = (caddr_t)1;
		}
	}
	ts->ts_waiters++;
	sleepq_insert(&ts->ts_sleepq[qnum], t);

	if (SOBJ_TYPE(sobj_ops) == SOBJ_MUTEX &&
	    SOBJ_OWNER(sobj_ops, sobj) == NULL)
		panic("turnstile_block(%p): unowned mutex", (void *)ts);

	/*
	 * Follow the blocking chain to its end, willing our priority to
	 * everyone who's in our way.
	 */
	while (t->t_sobj_ops != NULL &&
	    (owner = SOBJ_OWNER(t->t_sobj_ops, t->t_wchan)) != NULL) {
		if (owner == curthread) {
			if (SOBJ_TYPE(sobj_ops) != SOBJ_USER_PI) {
				panic("Deadlock: cycle in blocking chain");
			}
			/*
			 * If the cycle we've encountered ends in mp,
			 * then we know it isn't a 'real' cycle because
			 * we're going to drop mp before we go to sleep.
			 * Moreover, since we've come full circle we know
			 * that we must have willed priority to everyone
			 * in our way. Therefore, we can break out now.
			 */
			if (t->t_wchan == (void *)mp)
				break;

			if (loser)
				lock_clear(&turnstile_loser_lock);
			/*
			 * For SOBJ_USER_PI, a cycle is an application
			 * deadlock which needs to be communicated
			 * back to the application.
			 */
			thread_unlock_nopreempt(t);
			mutex_exit(mp);
			setrun(curthread);
			swtch(); /* necessary to transition state */
			curthread->t_flag &= ~T_WAKEABLE;
			if (lwptp->lwpt_id != 0)
				(void) lwp_timer_dequeue(lwptp);
			setallwatch();
			lwp->lwp_asleep = 0;
			lwp->lwp_sysabort = 0;
			return (EDEADLK);
		}
		if (!turnstile_interlock(t->t_lockp, &owner->t_lockp)) {
			/*
			 * If we failed to grab the owner's thread lock,
			 * turnstile_interlock() will have dropped t's
			 * thread lock, so at this point we don't even know
			 * that 't' exists anymore. The simplest solution
			 * is to restart the entire priority inheritance dance
			 * from the beginning of the blocking chain, since
			 * we *do* know that 'curthread' still exists.
			 * Application of priority inheritance is idempotent,
			 * so it's OK that we're doing it more than once.
			 * Note also that since we've dropped our thread lock,
			 * we may already have been woken up; if so, our
			 * t_sobj_ops will be NULL, the loop will terminate,
			 * and the call to swtch() will be a no-op. Phew.
			 *
			 * There is one further complication: if two (or more)
			 * threads keep trying to grab the turnstile locks out
			 * of order and keep losing the race to another thread,
			 * these "dueling losers" can livelock the system.
			 * Therefore, once we get into this rare situation,
			 * we serialize all the losers.
			 */
			if (loser == 0) {
				loser = 1;
				lock_set(&turnstile_loser_lock);
			}
			t = curthread;
			thread_lock_high(t);
			continue;
		}

5580Sstevel@tonic-gate /*
5590Sstevel@tonic-gate * We now have the owner's thread lock. If we are traversing
5600Sstevel@tonic-gate * from non-SOBJ_USER_PI ops to SOBJ_USER_PI ops, then we know
5610Sstevel@tonic-gate * that we have caught the thread while in the TS_SLEEP state,
5620Sstevel@tonic-gate * but holding mp. We know that this situation is transient
5630Sstevel@tonic-gate * (mp will be dropped before the holder actually sleeps on
5640Sstevel@tonic-gate * the SOBJ_USER_PI sobj), so we will spin waiting for mp to
5650Sstevel@tonic-gate * be dropped. Then, as in the turnstile_interlock() failure
5660Sstevel@tonic-gate * case, we will restart the priority inheritance dance.
5670Sstevel@tonic-gate */
		if (SOBJ_TYPE(t->t_sobj_ops) != SOBJ_USER_PI &&
		    owner->t_sobj_ops != NULL &&
		    SOBJ_TYPE(owner->t_sobj_ops) == SOBJ_USER_PI) {
			kmutex_t *upi_lock = (kmutex_t *)t->t_wchan;

			ASSERT(IS_UPI(upi_lock));
			ASSERT(SOBJ_TYPE(t->t_sobj_ops) == SOBJ_MUTEX);

			if (t->t_lockp != owner->t_lockp)
				thread_unlock_high(owner);
			thread_unlock_high(t);
			if (loser)
				lock_clear(&turnstile_loser_lock);

			while (mutex_owner(upi_lock) == owner) {
				SMT_PAUSE();
				continue;
			}

			if (loser)
				lock_set(&turnstile_loser_lock);
			t = curthread;
			thread_lock_high(t);
			continue;
		}

		turnstile_pi_inherit(t->t_ts, owner, DISP_PRIO(t));
		if (t->t_lockp != owner->t_lockp)
			thread_unlock_high(t);
		t = owner;
	}

	if (loser)
		lock_clear(&turnstile_loser_lock);

	/*
	 * Note: 't' and 'curthread' were synonymous before the loop above,
	 * but now they may be different. ('t' is now the last thread in
	 * the blocking chain.)
	 */
	if (SOBJ_TYPE(sobj_ops) == SOBJ_USER_PI) {
		ushort_t s = curthread->t_oldspl;
		int timedwait = 0;
		uint_t imm_timeout = 0;
		clock_t tim = -1;

		thread_unlock_high(t);
		if (lwptp->lwpt_id != 0) {
			/*
			 * We enqueued a timeout. If it has already fired,
			 * lwptp->lwpt_imm_timeout has been set with cas,
			 * so fetch it with cas.
			 */
			timedwait = 1;
			imm_timeout =
			    atomic_cas_uint(&lwptp->lwpt_imm_timeout, 0, 0);
		}
		mutex_exit(mp);
		splx(s);

		if (ISSIG(curthread, JUSTLOOKING) ||
		    MUSTRETURN(p, curthread) || imm_timeout)
			setrun(curthread);
		swtch();
		curthread->t_flag &= ~T_WAKEABLE;
		if (timedwait)
			tim = lwp_timer_dequeue(lwptp);
		setallwatch();
		if (ISSIG(curthread, FORREAL) || lwp->lwp_sysabort ||
		    MUSTRETURN(p, curthread))
			error = EINTR;
		else if (imm_timeout || (timedwait && tim == -1))
			error = ETIME;
		lwp->lwp_sysabort = 0;
		lwp->lwp_asleep = 0;
	} else {
		thread_unlock_nopreempt(t);
		swtch();
	}

	return (error);
}

/*
 * Remove thread from specified turnstile sleep queue; retrieve its
 * free turnstile; if it is the last waiter, delete the turnstile
 * from the turnstile chain and if there is an inheritor, delete it
 * from the inheritor's t_prioinv chain.
 */
static void
turnstile_dequeue(kthread_t *t)
{
	turnstile_t *ts = t->t_ts;
	turnstile_chain_t *tc = &TURNSTILE_CHAIN(ts->ts_sobj);
	turnstile_t *tsfree, **tspp;

	ASSERT(DISP_LOCK_HELD(&tc->tc_lock));
	ASSERT(t->t_lockp == &tc->tc_lock);

	if ((tsfree = ts->ts_free) != NULL) {
		ASSERT(ts->ts_waiters > 1);
		ASSERT(tsfree->ts_waiters == 0);
		t->t_ts = tsfree;
		ts->ts_free = tsfree->ts_free;
		tsfree->ts_free = NULL;
	} else {
		/*
		 * The active turnstile's freelist is empty, so this
		 * must be the last waiter. Remove the turnstile
		 * from the hash chain and leave the now-inactive
		 * turnstile attached to the thread we're waking.
		 * Note that the ts_inheritor for the turnstile
		 * may be NULL. If one exists, its t_prioinv
		 * chain has to be updated.
		 */
		ASSERT(ts->ts_waiters == 1);
		if (ts->ts_inheritor != NULL) {
			(void) turnstile_pi_tsdelete(ts, ts->ts_inheritor);
			/*
			 * If we ever do a "disinherit" or "unboost", we need
			 * to do it only if "t" is a thread at the head of the
			 * sleep queue. Since the sleep queue is prioritized,
			 * the disinherit is necessary only if the interrupted
			 * thread is the highest priority thread.
			 * Otherwise, there is a higher priority thread blocked
			 * on the turnstile, whose inheritance cannot be
			 * disinherited. However, disinheriting is explicitly
			 * not done here, since it would require holding the
			 * inheritor's thread lock (see turnstile_unsleep()).
			 */
			ts->ts_inheritor = NULL;
		}
		tspp = &tc->tc_first;
		while (*tspp != ts)
			tspp = &(*tspp)->ts_next;
		*tspp = ts->ts_next;
		ASSERT(t->t_ts == ts);
	}
	ts->ts_waiters--;
	sleepq_dequeue(t);
	t->t_sobj_ops = NULL;
	t->t_wchan = NULL;
	t->t_wchan0 = NULL;
	ASSERT(t->t_state == TS_SLEEP);
}

/*
 * Wake threads that are blocked in a turnstile.
 */
void
turnstile_wakeup(turnstile_t *ts, int qnum, int nthreads, kthread_t *owner)
{
	turnstile_chain_t *tc = &TURNSTILE_CHAIN(ts->ts_sobj);
	sleepq_t *sqp = &ts->ts_sleepq[qnum];

	ASSERT(DISP_LOCK_HELD(&tc->tc_lock));

	/*
	 * Waive any priority we may have inherited from this turnstile.
	 */
	if (ts->ts_inheritor != NULL) {
		turnstile_pi_waive(ts);
	}
	while (nthreads-- > 0) {
		kthread_t *t = sqp->sq_first;
		ASSERT(t->t_ts == ts);
		ASSERT(ts->ts_waiters > 1 || ts->ts_inheritor == NULL);
		DTRACE_SCHED1(wakeup, kthread_t *, t);
		turnstile_dequeue(t);
		CL_WAKEUP(t); /* previous thread lock, tc_lock, not dropped */
		/*
		 * If the caller did direct handoff of ownership,
		 * make the new owner inherit from this turnstile.
		 */
		if (t == owner) {
			kthread_t *wp = ts->ts_sleepq[TS_WRITER_Q].sq_first;
			kthread_t *rp = ts->ts_sleepq[TS_READER_Q].sq_first;
			pri_t wpri = wp ? DISP_PRIO(wp) : 0;
			pri_t rpri = rp ? DISP_PRIO(rp) : 0;
			turnstile_pi_inherit(ts, t, MAX(wpri, rpri));
			owner = NULL;
		}
		thread_unlock_high(t);		/* drop run queue lock */
	}
	if (owner != NULL)
		panic("turnstile_wakeup: owner %p not woken", (void *)owner);
	disp_lock_exit(&tc->tc_lock);
}

/*
 * Change priority of a thread sleeping in a turnstile.
 */
void
turnstile_change_pri(kthread_t *t, pri_t pri, pri_t *t_prip)
{
	sleepq_t *sqp = t->t_sleepq;

	sleepq_dequeue(t);
	*t_prip = pri;
	sleepq_insert(sqp, t);
}

/*
 * We don't allow spurious wakeups of threads blocked in turnstiles
 * for synch objects whose sobj_ops vector is initialized with the
 * following routine (e.g. kernel synchronization objects).
 * This is vital to the correctness of direct-handoff logic in some
 * synchronization primitives, and it also simplifies the PI logic.
 */
/* ARGSUSED */
void
turnstile_stay_asleep(kthread_t *t)
{
}

/*
 * Wake up a thread blocked in a turnstile. Used to enable interruptibility
 * of threads blocked on a SOBJ_USER_PI sobj.
 *
 * The implications of this interface are:
 *
 * 1. turnstile_block() may return with an EINTR.
 * 2. When the owner of an sobj releases it, but no turnstile is found (i.e.
 *    no waiters), the (prior) owner must call turnstile_pi_recalc() to
 *    waive any priority inherited from interrupted waiters.
 *
 * When a waiter is interrupted, disinheriting its willed priority from the
 * inheritor would require holding the inheritor's thread lock, while also
 * holding the waiter's thread lock which is a turnstile lock. If the
 * inheritor's thread lock is not free, and is also a turnstile lock that
 * is out of lock order, the waiter's thread lock would have to be dropped.
 * This leads to complications for the caller of turnstile_unsleep(), since
 * the caller holds the waiter's thread lock. So, instead of disinheriting
 * on waiter interruption, the owner is required to follow rule 2 above.
 *
 * Avoiding disinherit on waiter interruption seems acceptable because
 * the owner runs at an unnecessarily high priority only while sobj is held,
 * which it would have done in any case, if the waiter had not been interrupted.
 */
void
turnstile_unsleep(kthread_t *t)
{
	turnstile_dequeue(t);
	THREAD_TRANSITION(t);
	CL_SETRUN(t);
}