/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
 * or http://www.opensolaris.org/os/licensing.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 *
 * CDDL HEADER END
 */
/*
 * Copyright 2008 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.
 */

#pragma ident	"%Z%%M%	%I%	%E% SMI"

/*
 * Big Theory Statement for turnstiles.
 *
 * Turnstiles provide blocking and wakeup support, including priority
 * inheritance, for synchronization primitives (e.g. mutexes and rwlocks).
 * Typical usage is as follows:
 *
 * To block on lock 'lp' for read access in foo_enter():
 *
 *	ts = turnstile_lookup(lp);
 *	[ If the lock is still held, set the waiters bit
 *	turnstile_block(ts, TS_READER_Q, lp, &foo_sobj_ops);
 *
 * To wake threads waiting for write access to lock 'lp' in foo_exit():
 *
 *	ts = turnstile_lookup(lp);
 *	[ Either drop the lock (change owner to NULL) or perform a direct
 *	[ handoff (change owner to one of the threads we're about to wake).
 *	[ If we're going to wake the last waiter, clear the waiters bit.
 *	turnstile_wakeup(ts, TS_WRITER_Q, nwaiters, new_owner or NULL);
 *
 * turnstile_lookup() returns holding the turnstile hash chain lock for lp.
 * Both turnstile_block() and turnstile_wakeup() drop the turnstile lock.
 * To abort a turnstile operation, the client must call turnstile_exit().
 *
 * Requirements of the client:
 *
 * (1)	The lock's waiters indicator may be manipulated *only* while
 *	holding the turnstile hash chain lock (i.e. under turnstile_lookup()).
 *
 * (2)	Once the lock is marked as having waiters, the owner may be
 *	changed *only* while holding the turnstile hash chain lock.
 *
 * (3)	The caller must never block on an unheld lock.
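As an illustrative aside, the client protocol above can be modeled in
user-space C.  This sketch packs an owner pointer and a waiters bit into
one word; the toy_* names and that packed layout are assumptions of the
sketch, not the kernel's actual lock representation.

```c
// User-space model of the waiters-bit protocol; toy_* names and the
// packed owner/waiters lock word are inventions of this sketch.
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_WAITERS 0x1UL               // low bit of the lock word

typedef struct { uintptr_t word; } toy_lock_t;

static void *
toy_owner(toy_lock_t *lp)
{
        return ((void *)(lp->word & ~TOY_WAITERS));
}

static int
toy_has_waiters(toy_lock_t *lp)
{
        return ((lp->word & TOY_WAITERS) != 0);
}

// Requirements (1) and (3): done only under the chain lock, and only
// on a lock that is actually held.
static void
toy_set_waiters(toy_lock_t *lp)
{
        assert(toy_owner(lp) != NULL);
        lp->word |= TOY_WAITERS;
}

// Drop the lock or hand it off directly; waking the last waiter clears
// the waiters bit, so an unheld lock never has waiters.
static void
toy_release(toy_lock_t *lp, void *new_owner, int last_waiter)
{
        uintptr_t waiters = last_waiter ? 0 : (lp->word & TOY_WAITERS);

        lp->word = (uintptr_t)new_owner | waiters;
        if (toy_owner(lp) == NULL)
                assert(!toy_has_waiters(lp));
}
```

A release with a non-NULL new_owner models direct handoff; a release with
NULL models simply dropping the lock.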
 *
 * Consequences of these assumptions include the following:
 *
 *	(a) It is impossible for a lock to be unheld but have waiters.
 *
 *	(b) The priority inheritance code can safely assume that an active
 *	    turnstile's ts_inheritor never changes until the inheritor calls
 *	    turnstile_pi_waive().
 *
 * These assumptions simplify the implementation of both turnstiles and
 * their clients.
 *
 * Background on priority inheritance:
 *
 * Priority inheritance allows a thread to "will" its dispatch priority
 * to all the threads blocking it, directly or indirectly.  This prevents
 * situations called priority inversions in which a high-priority thread
 * needs a lock held by a low-priority thread, which cannot run because
 * of medium-priority threads.  Without PI, the medium-priority threads
 * can starve out the high-priority thread indefinitely.  With PI, the
 * low-priority thread becomes high-priority until it releases whatever
 * synchronization object the real high-priority thread is waiting for.
 *
 * How turnstiles work:
 *
 * All active turnstiles reside in a global hash table, turnstile_table[].
 * The address of a synchronization object determines its hash index.
 * Each hash chain is protected by its own dispatcher lock, acquired
 * by turnstile_lookup().
 * This lock protects the hash chain linkage, the
 * contents of all turnstiles on the hash chain, and the waiters bits of
 * every synchronization object in the system that hashes to the same chain.
 * Giving the lock such broad scope simplifies the interactions between
 * the turnstile code and its clients considerably.  The blocking path
 * is rare enough that this has no impact on scalability.  (If it ever
 * does, it's almost surely a second-order effect -- the real problem
 * is that some synchronization object is *very* heavily contended.)
 *
 * Each thread has an attached turnstile in case it needs to block.
 * A thread cannot block on more than one lock at a time, so one
 * turnstile per thread is the most we ever need.  The first thread
 * to block on a lock donates its attached turnstile and adds it to
 * the appropriate hash chain in turnstile_table[].  This becomes the
 * "active turnstile" for the lock.  Each subsequent thread that blocks
 * on the same lock discovers that the lock already has an active
 * turnstile, so it stashes its own turnstile on the active turnstile's
 * freelist.  As threads wake up, the process is reversed.
 *
 * turnstile_block() puts the current thread to sleep on the active
 * turnstile for the desired lock, walks the blocking chain to apply
 * priority inheritance to everyone in its way, and yields the CPU.
 *
 * turnstile_wakeup() waives any priority the owner may have inherited
 * and wakes the specified number of waiting threads.  If the caller is
 * doing direct handoff of ownership (rather than just dropping the lock),
 * the new owner automatically inherits priority from any existing waiters.
 */

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/thread.h>
#include <sys/proc.h>
#include <sys/debug.h>
#include <sys/cpuvar.h>
#include <sys/turnstile.h>
#include <sys/t_lock.h>
#include <sys/disp.h>
#include <sys/sobject.h>
#include <sys/cmn_err.h>
#include <sys/sysmacros.h>
#include <sys/lockstat.h>
#include <sys/lwp_upimutex_impl.h>
#include <sys/schedctl.h>
#include <sys/cpu.h>
#include <sys/sdt.h>
#include <sys/cpupart.h>

extern upib_t upimutextab[UPIMUTEX_TABSIZE];

#define	IS_UPI(sobj)	\
	((uintptr_t)(sobj) - (uintptr_t)upimutextab < sizeof (upimutextab))

/*
 * The turnstile hash table is partitioned into two halves: the lower half
 * is used for upimutextab[] locks, the upper half for everything else.
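The hashing scheme can be exercised in isolation.  This user-space replica
copies the shift amounts and table size from the TURNSTILE_SOBJ_HASH and
TURNSTILE_SOBJ_BUCKET macros defined in this file; the ts_* names are
inventions of the sketch.

```c
// User-space replica of the turnstile hashing scheme; the ts_* names
// are assumptions of this sketch, but the arithmetic mirrors the
// TURNSTILE_SOBJ_HASH / TURNSTILE_SOBJ_BUCKET macros.
#include <assert.h>

#define TS_HASH_SIZE    128
#define TS_HASH_MASK    (TS_HASH_SIZE - 1)

// Mix two shifted copies of the object's address into a half-table index.
static unsigned int
ts_sobj_hash(unsigned long addr)
{
        return (((addr >> 2) + (addr >> 9)) & TS_HASH_MASK);
}

// upimutextab[] locks take the lower half of the table, everything else
// the upper half, so a UPI bucket always sorts below every non-UPI bucket.
static unsigned int
ts_sobj_bucket(unsigned long addr, int is_upi)
{
        return ((is_upi ? 0 : TS_HASH_SIZE) + ts_sobj_hash(addr));
}
```

Note that the same address always lands on the same chain, and the UPI bit
of the bucket computation, not the address itself, selects the table half.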
 * The reason for the distinction is that SOBJ_USER_PI locks present a
 * unique problem: the upimutextab[] lock passed to turnstile_block()
 * cannot be dropped until the calling thread has blocked on its
 * SOBJ_USER_PI lock and willed its priority down the blocking chain.
 * At that point, the caller's t_lockp will be one of the turnstile locks.
 * If mutex_exit() discovers that the upimutextab[] lock has waiters, it
 * must wake them, which forces a lock ordering on us: the turnstile lock
 * for the upimutextab[] lock will be acquired in mutex_vector_exit(),
 * which will eventually call into turnstile_pi_waive(), which will then
 * acquire the caller's thread lock, which in this case is the turnstile
 * lock for the SOBJ_USER_PI lock.  In general, when two turnstile locks
 * must be held at the same time, the lock order must be the address order.
 * Therefore, to prevent deadlock in turnstile_pi_waive(), we must ensure
 * that upimutextab[] locks *always* hash to lower addresses than any
 * other locks.  You think this is cheesy?  Let's see you do better.
 */
#define	TURNSTILE_HASH_SIZE	128		/* must be power of 2 */
#define	TURNSTILE_HASH_MASK	(TURNSTILE_HASH_SIZE - 1)
#define	TURNSTILE_SOBJ_HASH(sobj)	\
	((((ulong_t)sobj >> 2) + ((ulong_t)sobj >> 9)) & TURNSTILE_HASH_MASK)
#define	TURNSTILE_SOBJ_BUCKET(sobj)	\
	((IS_UPI(sobj) ? 0 : TURNSTILE_HASH_SIZE) + TURNSTILE_SOBJ_HASH(sobj))
#define	TURNSTILE_CHAIN(sobj)	turnstile_table[TURNSTILE_SOBJ_BUCKET(sobj)]

typedef struct turnstile_chain {
	turnstile_t	*tc_first;	/* first turnstile on hash chain */
	disp_lock_t	tc_lock;	/* lock for this hash chain */
} turnstile_chain_t;

turnstile_chain_t	turnstile_table[2 * TURNSTILE_HASH_SIZE];

static lock_t	turnstile_loser_lock;

/*
 * Make 'inheritor' inherit priority from this turnstile.
 */
static void
turnstile_pi_inherit(turnstile_t *ts, kthread_t *inheritor, pri_t epri)
{
	ASSERT(THREAD_LOCK_HELD(inheritor));
	ASSERT(DISP_LOCK_HELD(&TURNSTILE_CHAIN(ts->ts_sobj).tc_lock));

	if (epri <= inheritor->t_pri)
		return;

	if (ts->ts_inheritor == NULL) {
		ts->ts_inheritor = inheritor;
		ts->ts_epri = epri;
		disp_lock_enter_high(&inheritor->t_pi_lock);
		ts->ts_prioinv = inheritor->t_prioinv;
		inheritor->t_prioinv = ts;
		disp_lock_exit_high(&inheritor->t_pi_lock);
	} else {
		/*
		 * 'inheritor' is already inheriting from this turnstile,
		 * so just adjust its priority.
		 */
		ASSERT(ts->ts_inheritor == inheritor);
		if (ts->ts_epri < epri)
			ts->ts_epri = epri;
	}

	if (epri > DISP_PRIO(inheritor))
		thread_change_epri(inheritor, epri);
}

/*
 * If turnstile is non-NULL, remove it from inheritor's t_prioinv list.
 * Compute new inherited priority, and return it.
 */
static pri_t
turnstile_pi_tsdelete(turnstile_t *ts, kthread_t *inheritor)
{
	turnstile_t **tspp, *tsp;
	pri_t new_epri = 0;

	disp_lock_enter_high(&inheritor->t_pi_lock);
	tspp = &inheritor->t_prioinv;
	while ((tsp = *tspp) != NULL) {
		if (tsp == ts)
			*tspp = tsp->ts_prioinv;
		else
			new_epri = MAX(new_epri, tsp->ts_epri);
		tspp = &tsp->ts_prioinv;
	}
	disp_lock_exit_high(&inheritor->t_pi_lock);
	return (new_epri);
}

/*
 * Remove turnstile from inheritor's t_prioinv list, compute
 * new priority, and change the inheritor's effective priority if
 * necessary.  Keep in synch with turnstile_pi_recalc().
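The unlink-and-recompute walk in turnstile_pi_tsdelete() can be exercised
in isolation.  This user-space sketch keeps the same single pass (the
removed node's next pointer still leads into the list, so traversal can
continue through it); the toy_* names are assumptions of the sketch and
all locking is omitted.

```c
// User-space sketch of the turnstile_pi_tsdelete() list walk: unlink one
// turnstile (if any) and recompute the max inherited priority of the rest.
// toy_* names are inventions of this sketch; locking is omitted.
#include <assert.h>
#include <stddef.h>

typedef struct toy_ts {
        struct toy_ts   *prioinv;       // next turnstile willing us priority
        int             epri;           // priority inherited via this one
} toy_ts_t;

static int
toy_pi_tsdelete(toy_ts_t **list, toy_ts_t *ts)
{
        toy_ts_t **tspp = list, *tsp;
        int new_epri = 0;

        while ((tsp = *tspp) != NULL) {
                if (tsp == ts)
                        *tspp = tsp->prioinv;   // unlink, keep walking
                else if (tsp->epri > new_epri)
                        new_epri = tsp->epri;
                tspp = &tsp->prioinv;
        }
        return (new_epri);
}
```

Passing a NULL turnstile removes nothing and simply recomputes the maximum,
which is how turnstile_pi_recalc() uses the real function.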
 */
static void
turnstile_pi_waive(turnstile_t *ts)
{
	kthread_t *inheritor = ts->ts_inheritor;
	pri_t new_epri;

	ASSERT(inheritor == curthread);

	thread_lock_high(inheritor);
	new_epri = turnstile_pi_tsdelete(ts, inheritor);
	if (new_epri != DISP_PRIO(inheritor))
		thread_change_epri(inheritor, new_epri);
	ts->ts_inheritor = NULL;
	if (DISP_MUST_SURRENDER(inheritor))
		cpu_surrender(inheritor);
	thread_unlock_high(inheritor);
}

/*
 * Compute caller's new inherited priority, and change its effective
 * priority if necessary.  Necessary only for SOBJ_USER_PI, because of
 * its interruptibility characteristic.
 */
void
turnstile_pi_recalc(void)
{
	kthread_t *inheritor = curthread;
	pri_t new_epri;

	thread_lock(inheritor);
	new_epri = turnstile_pi_tsdelete(NULL, inheritor);
	if (new_epri != DISP_PRIO(inheritor))
		thread_change_epri(inheritor, new_epri);
	if (DISP_MUST_SURRENDER(inheritor))
		cpu_surrender(inheritor);
	thread_unlock(inheritor);
}

/*
 * Grab the lock protecting the hash chain for sobj
 * and return the active turnstile for sobj, if any.
 */
turnstile_t *
turnstile_lookup(void *sobj)
{
	turnstile_t *ts;
	turnstile_chain_t *tc = &TURNSTILE_CHAIN(sobj);

	disp_lock_enter(&tc->tc_lock);

	for (ts = tc->tc_first; ts != NULL; ts = ts->ts_next)
		if (ts->ts_sobj == sobj)
			break;

	return (ts);
}

/*
 * Drop the lock protecting the hash chain for sobj.
 */
void
turnstile_exit(void *sobj)
{
	disp_lock_exit(&TURNSTILE_CHAIN(sobj).tc_lock);
}

/*
 * When we apply priority inheritance, we must grab the owner's thread lock
 * while already holding the waiter's thread lock.  If both thread locks are
 * turnstile locks, this can lead to deadlock: while we hold L1 and try to
 * grab L2, some unrelated thread may be applying priority inheritance to
 * some other blocking chain, holding L2 and trying to grab L1.  The most
 * obvious solution -- do a lock_try() for the owner lock -- isn't quite
 * sufficient because it can cause livelock: each thread may hold one lock,
 * try to grab the other, fail, bail out, and try again, looping forever.
 * To prevent livelock we must define a winner, i.e. define an arbitrary
 * lock ordering on the turnstile locks.
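The winner/loser decision itself reduces to a pure address comparison.
This user-space sketch (toy_* names assumed) encodes just that rule, not
the spinning and unwinding the real interlock performs around it.

```c
// Encode only the address-order winner/loser rule of the interlock:
// lower address is the correct first acquisition, so the holder of the
// lower-addressed lock may spin for the higher one, while the holder of
// the higher-addressed lock must back off.  toy_* names are assumptions
// of this sketch.
#include <assert.h>

enum toy_ilock { TOY_SAME, TOY_SPIN, TOY_BACKOFF };

static enum toy_ilock
toy_interlock_rule(const void *held, const void *wanted)
{
        if (held == wanted)
                return (TOY_SAME);      // identical locks: nothing to do
        return ((const char *)held < (const char *)wanted ?
            TOY_SPIN : TOY_BACKOFF);
}
```

In the real code the back-off path additionally grabs and releases the
owner's lock so the loser cannot retry before the winner has had its turn.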
 * For simplicity we declare that
 * virtual address order defines lock order, i.e. if L1 < L2, then the
 * correct lock ordering is L1, L2.  Thus the thread that holds L1 and
 * wants L2 should spin until L2 is available, but the thread that holds
 * L2 and can't get L1 on the first try must drop L2 and return failure.
 * Moreover, the losing thread must not reacquire L2 until the winning
 * thread has had a chance to grab it; to ensure this, the losing thread
 * must grab L1 after dropping L2, thus spinning until the winner is done.
 * Complicating matters further, note that the owner's thread lock pointer
 * can change (i.e. be pointed at a different lock) while we're trying to
 * grab it.  If that happens, we must unwind our state and try again.
 *
 * On success, returns 1 with both locks held.
 * On failure, returns 0 with neither lock held.
 */
static int
turnstile_interlock(lock_t *wlp, lock_t *volatile *olpp)
{
	ASSERT(LOCK_HELD(wlp));

	for (;;) {
		volatile lock_t *olp = *olpp;

		/*
		 * If the locks are identical, there's nothing to do.
		 */
		if (olp == wlp)
			return (1);
		if (lock_try((lock_t *)olp)) {
			/*
			 * If 'olp' is still the right lock, return success.
			 * Otherwise, drop 'olp' and try the dance again.
			 */
			if (olp == *olpp)
				return (1);
			lock_clear((lock_t *)olp);
		} else {
			hrtime_t spin_time = 0;

			/*
			 * If we're grabbing the locks out of order, we lose.
			 * Drop the waiter's lock, and then grab and release
			 * the owner's lock to ensure that we won't retry
			 * until the winner is done (as described above).
			 */
			if (olp >= (lock_t *)turnstile_table && olp < wlp) {
				lock_clear(wlp);
				lock_set((lock_t *)olp);
				lock_clear((lock_t *)olp);
				return (0);
			}
			/*
			 * We're grabbing the locks in the right order,
			 * so spin until the owner's lock either becomes
			 * available or spontaneously changes.
			 */
			spin_time =
			    LOCKSTAT_START_TIME(LS_TURNSTILE_INTERLOCK_SPIN);
			while (olp == *olpp && LOCK_HELD(olp)) {
				if (panicstr)
					return (1);
				SMT_PAUSE();
			}
			LOCKSTAT_RECORD_TIME(LS_TURNSTILE_INTERLOCK_SPIN,
			    olp, spin_time);
		}
	}
}

/*
 * Block the current thread on a synchronization object.
 *
 * Turnstiles implement both kernel and user-level priority inheritance.
 * To avoid missed wakeups in the user-level case, lwp_upimutex_lock() calls
 * turnstile_block() holding the appropriate lock in the upimutextab (see
 * the block comment in lwp_upimutex_lock() for details).  The held lock is
 * passed to turnstile_block() as the "mp" parameter, and will be dropped
 * after priority has been willed, but before the thread actually sleeps
 * (this locking behavior leads to some subtle ordering issues; see the
 * block comment on turnstile hashing for details).  This _must_ be the only
 * lock held when calling turnstile_block() with a SOBJ_USER_PI sobj; holding
 * other locks can result in panics due to cycles in the blocking chain.
 *
 * turnstile_block() always succeeds for kernel synchronization objects.
 * For SOBJ_USER_PI locks the possible errors are EINTR for signals, and
 * EDEADLK for cycles in the blocking chain.  A return code of zero indicates
 * *either* that the lock is now held, or that this is a spurious wake-up, or
 * that the lock can never be held due to an ENOTRECOVERABLE error.
 * It is up to lwp_upimutex_lock() to sort this all out.
 */

int
turnstile_block(turnstile_t *ts, int qnum, void *sobj, sobj_ops_t *sobj_ops,
    kmutex_t *mp, lwp_timer_t *lwptp)
{
	kthread_t *owner;
	kthread_t *t = curthread;
	proc_t *p = ttoproc(t);
	klwp_t *lwp = ttolwp(t);
	turnstile_chain_t *tc = &TURNSTILE_CHAIN(sobj);
	int error = 0;
	int loser = 0;

	ASSERT(DISP_LOCK_HELD(&tc->tc_lock));
	ASSERT(mp == NULL || IS_UPI(mp));
	ASSERT((SOBJ_TYPE(sobj_ops) == SOBJ_USER_PI) ^ (mp == NULL));

	thread_lock_high(t);

	if (ts == NULL) {
		/*
		 * This is the first thread to block on this sobj.
		 * Take its attached turnstile and add it to the hash chain.
		 */
		ts = t->t_ts;
		ts->ts_sobj = sobj;
		ts->ts_next = tc->tc_first;
		tc->tc_first = ts;
		ASSERT(ts->ts_waiters == 0);
	} else {
		/*
		 * Another thread has already donated its turnstile
		 * to block on this sobj, so ours isn't needed.
		 * Stash it on the active turnstile's freelist.
		 */
		turnstile_t *myts = t->t_ts;
		myts->ts_free = ts->ts_free;
		ts->ts_free = myts;
		t->t_ts = ts;
		ASSERT(ts->ts_sobj == sobj);
		ASSERT(ts->ts_waiters > 0);
	}

	/*
	 * Put the thread to sleep.
	 */
	ASSERT(t != CPU->cpu_idle_thread);
	ASSERT(CPU_ON_INTR(CPU) == 0);
	ASSERT(t->t_wchan0 == NULL && t->t_wchan == NULL);
	ASSERT(t->t_state == TS_ONPROC);

	if (SOBJ_TYPE(sobj_ops) == SOBJ_USER_PI) {
		curthread->t_flag |= T_WAKEABLE;
	}
	CL_SLEEP(t);		/* assign kernel priority */
	THREAD_SLEEP(t, &tc->tc_lock);
	t->t_wchan = sobj;
	t->t_sobj_ops = sobj_ops;
	DTRACE_SCHED(sleep);

	if (lwp != NULL) {
		lwp->lwp_ru.nvcsw++;
		(void) new_mstate(t, LMS_SLEEP);
		if (SOBJ_TYPE(sobj_ops) == SOBJ_USER_PI) {
			lwp->lwp_asleep = 1;
			lwp->lwp_sysabort = 0;
			/*
			 * make wchan0 non-zero to conform to the rule that
			 * threads blocking for user-level objects have a
			 * non-zero wchan0: this prevents spurious wake-ups
			 * by, for example, /proc.
			 */
			t->t_wchan0 = (caddr_t)1;
		}
	}
	ts->ts_waiters++;
	sleepq_insert(&ts->ts_sleepq[qnum], t);

	if (SOBJ_TYPE(sobj_ops) == SOBJ_MUTEX &&
	    SOBJ_OWNER(sobj_ops, sobj) == NULL)
		panic("turnstile_block(%p): unowned mutex", (void *)ts);

	/*
	 * Follow the blocking chain to its end, willing our priority to
	 * everyone who's in our way.
	 */
	while (t->t_sobj_ops != NULL &&
	    (owner = SOBJ_OWNER(t->t_sobj_ops, t->t_wchan)) != NULL) {
		if (owner == curthread) {
			if (SOBJ_TYPE(sobj_ops) != SOBJ_USER_PI) {
				panic("Deadlock: cycle in blocking chain");
			}
			/*
			 * If the cycle we've encountered ends in mp,
			 * then we know it isn't a 'real' cycle because
			 * we're going to drop mp before we go to sleep.
			 * Moreover, since we've come full circle we know
			 * that we must have willed priority to everyone
			 * in our way.  Therefore, we can break out now.
			 */
			if (t->t_wchan == (void *)mp)
				break;

			if (loser)
				lock_clear(&turnstile_loser_lock);
			/*
			 * For SOBJ_USER_PI, a cycle is an application
			 * deadlock which needs to be communicated
			 * back to the application.
			 */
			thread_unlock_nopreempt(t);
			if (lwptp->lwpt_id != 0) {
				/*
				 * We enqueued a timeout, we are
				 * holding curthread->t_delay_lock.
				 * Drop it and dequeue the timeout.
				 */
				mutex_exit(&curthread->t_delay_lock);
				(void) lwp_timer_dequeue(lwptp);
			}
			mutex_exit(mp);
			setrun(curthread);
			swtch();	/* necessary to transition state */
			curthread->t_flag &= ~T_WAKEABLE;
			setallwatch();
			lwp->lwp_asleep = 0;
			lwp->lwp_sysabort = 0;
			return (EDEADLK);
		}
		if (!turnstile_interlock(t->t_lockp, &owner->t_lockp)) {
			/*
			 * If we failed to grab the owner's thread lock,
			 * turnstile_interlock() will have dropped t's
			 * thread lock, so at this point we don't even know
			 * that 't' exists anymore.  The simplest solution
			 * is to restart the entire priority inheritance dance
			 * from the beginning of the blocking chain, since
			 * we *do* know that 'curthread' still exists.
			 * Application of priority inheritance is idempotent,
			 * so it's OK that we're doing it more than once.
			 * Note also that since we've dropped our thread lock,
			 * we may already have been woken up; if so, our
			 * t_sobj_ops will be NULL, the loop will terminate,
			 * and the call to swtch() will be a no-op.  Phew.
			 *
			 * There is one further complication: if two (or more)
			 * threads keep trying to grab the turnstile locks out
			 * of order and keep losing the race to another thread,
			 * these "dueling losers" can livelock the system.
			 * Therefore, once we get into this rare situation,
			 * we serialize all the losers.
			 */
			if (loser == 0) {
				loser = 1;
				lock_set(&turnstile_loser_lock);
			}
			t = curthread;
			thread_lock_high(t);
			continue;
		}

		/*
		 * We now have the owner's thread lock.  If we are traversing
		 * from non-SOBJ_USER_PI ops to SOBJ_USER_PI ops, then we know
		 * that we have caught the thread while in the TS_SLEEP state,
		 * but holding mp.  We know that this situation is transient
		 * (mp will be dropped before the holder actually sleeps on
		 * the SOBJ_USER_PI sobj), so we will spin waiting for mp to
		 * be dropped.  Then, as in the turnstile_interlock() failure
		 * case, we will restart the priority inheritance dance.
		 */
		if (SOBJ_TYPE(t->t_sobj_ops) != SOBJ_USER_PI &&
		    owner->t_sobj_ops != NULL &&
		    SOBJ_TYPE(owner->t_sobj_ops) == SOBJ_USER_PI) {
			kmutex_t *upi_lock = (kmutex_t *)t->t_wchan;

			ASSERT(IS_UPI(upi_lock));
			ASSERT(SOBJ_TYPE(t->t_sobj_ops) == SOBJ_MUTEX);

			if (t->t_lockp != owner->t_lockp)
				thread_unlock_high(owner);
			thread_unlock_high(t);
			if (loser)
				lock_clear(&turnstile_loser_lock);

			while (mutex_owner(upi_lock) == owner) {
				SMT_PAUSE();
				continue;
			}

			if (loser)
				lock_set(&turnstile_loser_lock);
			t = curthread;
			thread_lock_high(t);
			continue;
		}

		turnstile_pi_inherit(t->t_ts, owner, DISP_PRIO(t));
		if (t->t_lockp != owner->t_lockp)
			thread_unlock_high(t);
		t = owner;
	}

	if (loser)
		lock_clear(&turnstile_loser_lock);

	/*
	 * Note: 't' and 'curthread' were synonymous before the loop above,
	 * but now they may be different.  ('t' is now the last thread in
	 * the blocking chain.)
	 */
	if (SOBJ_TYPE(sobj_ops) == SOBJ_USER_PI) {
		ushort_t s = curthread->t_oldspl;
		int timedwait = 0;
		clock_t tim = -1;

		thread_unlock_high(t);
		if (lwptp->lwpt_id != 0) {
			/*
			 * We enqueued a timeout and we are
			 * holding curthread->t_delay_lock.
			 */
			mutex_exit(&curthread->t_delay_lock);
			timedwait = 1;
		}
		mutex_exit(mp);
		splx(s);

		if (ISSIG(curthread, JUSTLOOKING) ||
		    MUSTRETURN(p, curthread) || lwptp->lwpt_imm_timeout)
			setrun(curthread);
		swtch();
		curthread->t_flag &= ~T_WAKEABLE;
		if (timedwait)
			tim = lwp_timer_dequeue(lwptp);
		setallwatch();
		if (ISSIG(curthread, FORREAL) || lwp->lwp_sysabort ||
		    MUSTRETURN(p, curthread))
			error = EINTR;
		else if (lwptp->lwpt_imm_timeout || (timedwait && tim == -1))
			error = ETIME;
		lwp->lwp_sysabort = 0;
		lwp->lwp_asleep = 0;
	} else {
		thread_unlock_nopreempt(t);
		swtch();
	}

	return (error);
}

/*
 * Remove thread from specified turnstile sleep queue; retrieve its
 * free turnstile; if it is the last waiter, delete the turnstile
 * from the
 * turnstile chain and if there is an inheritor, delete it
 * from the inheritor's t_prioinv chain.
 */
static void
turnstile_dequeue(kthread_t *t)
{
	turnstile_t *ts = t->t_ts;
	turnstile_chain_t *tc = &TURNSTILE_CHAIN(ts->ts_sobj);
	turnstile_t *tsfree, **tspp;

	ASSERT(DISP_LOCK_HELD(&tc->tc_lock));
	ASSERT(t->t_lockp == &tc->tc_lock);

	if ((tsfree = ts->ts_free) != NULL) {
		ASSERT(ts->ts_waiters > 1);
		ASSERT(tsfree->ts_waiters == 0);
		t->t_ts = tsfree;
		ts->ts_free = tsfree->ts_free;
		tsfree->ts_free = NULL;
	} else {
		/*
		 * The active turnstile's freelist is empty, so this
		 * must be the last waiter.  Remove the turnstile
		 * from the hash chain and leave the now-inactive
		 * turnstile attached to the thread we're waking.
		 * Note that the ts_inheritor for the turnstile
		 * may be NULL.  If one exists, its t_prioinv
		 * chain has to be updated.
		 */
		ASSERT(ts->ts_waiters == 1);
		if (ts->ts_inheritor != NULL) {
			(void) turnstile_pi_tsdelete(ts, ts->ts_inheritor);
			/*
			 * If we ever do a "disinherit" or "unboost", we need
			 * to do it only if "t" is a thread at the head of the
			 * sleep queue.
			 * Since the sleep queue is prioritized,
			 * the disinherit is necessary only if the interrupted
			 * thread is the highest priority thread.
			 * Otherwise, there is a higher priority thread blocked
			 * on the turnstile, whose inheritance cannot be
			 * disinherited.  However, disinheriting is explicitly
			 * not done here, since it would require holding the
			 * inheritor's thread lock (see turnstile_unsleep()).
			 */
			ts->ts_inheritor = NULL;
		}
		tspp = &tc->tc_first;
		while (*tspp != ts)
			tspp = &(*tspp)->ts_next;
		*tspp = ts->ts_next;
		ASSERT(t->t_ts == ts);
	}
	ts->ts_waiters--;
	sleepq_dequeue(t);
	t->t_sobj_ops = NULL;
	t->t_wchan = NULL;
	t->t_wchan0 = NULL;
	ASSERT(t->t_state == TS_SLEEP);
}

/*
 * Wake threads that are blocked in a turnstile.
 */
void
turnstile_wakeup(turnstile_t *ts, int qnum, int nthreads, kthread_t *owner)
{
	turnstile_chain_t *tc = &TURNSTILE_CHAIN(ts->ts_sobj);
	sleepq_t *sqp = &ts->ts_sleepq[qnum];

	ASSERT(DISP_LOCK_HELD(&tc->tc_lock));

	/*
	 * Waive any priority we may have inherited from this turnstile.
	 */
	if (ts->ts_inheritor != NULL) {
		turnstile_pi_waive(ts);
	}
	while (nthreads-- > 0) {
		kthread_t *t = sqp->sq_first;
		ASSERT(t->t_ts == ts);
		ASSERT(ts->ts_waiters > 1 || ts->ts_inheritor == NULL);
		DTRACE_SCHED1(wakeup, kthread_t *, t);
		turnstile_dequeue(t);
		CL_WAKEUP(t);	/* previous thread lock, tc_lock, not dropped */
		/*
		 * If the caller did direct handoff of ownership,
		 * make the new owner inherit from this turnstile.
		 */
		if (t == owner) {
			kthread_t *wp = ts->ts_sleepq[TS_WRITER_Q].sq_first;
			kthread_t *rp = ts->ts_sleepq[TS_READER_Q].sq_first;
			pri_t wpri = wp ? DISP_PRIO(wp) : 0;
			pri_t rpri = rp ? DISP_PRIO(rp) : 0;
			turnstile_pi_inherit(ts, t, MAX(wpri, rpri));
			owner = NULL;
		}
		thread_unlock_high(t);		/* drop run queue lock */
	}
	if (owner != NULL)
		panic("turnstile_wakeup: owner %p not woken", owner);
	disp_lock_exit(&tc->tc_lock);
}

/*
 * Change priority of a thread sleeping in a turnstile.
 */
void
turnstile_change_pri(kthread_t *t, pri_t pri, pri_t *t_prip)
{
	sleepq_t *sqp = t->t_sleepq;

	sleepq_dequeue(t);
	*t_prip = pri;
	sleepq_insert(sqp, t);
}

/*
 * We don't allow spurious wakeups of threads blocked in turnstiles
 * for synch objects whose sobj_ops vector is initialized with the
 * following routine (e.g. kernel synchronization objects).
 * This is vital to the correctness of direct-handoff logic in some
 * synchronization primitives, and it also simplifies the PI logic.
 */
/* ARGSUSED */
void
turnstile_stay_asleep(kthread_t *t)
{
}

/*
 * Wake up a thread blocked in a turnstile.  Used to enable interruptibility
 * of threads blocked on a SOBJ_USER_PI sobj.
 *
 * The implications of this interface are:
 *
 * 1. turnstile_block() may return with an EINTR.
 * 2. When the owner of an sobj releases it, but no turnstile is found (i.e.
 *    no waiters), the (prior) owner must call turnstile_pi_recalc() to
 *    waive any priority inherited from interrupted waiters.
 *
 * When a waiter is interrupted, disinheriting its willed priority from the
 * inheritor would require holding the inheritor's thread lock, while also
 * holding the waiter's thread lock, which is a turnstile lock.  If the
 * inheritor's thread lock is not free, and is also a turnstile lock that
 * is out of lock order, the waiter's thread lock would have to be dropped.
 * This leads to complications for the caller of turnstile_unsleep(), since
 * the caller holds the waiter's thread lock.  So, instead of disinheriting
 * on waiter interruption, the owner is required to follow rule 2 above.
 *
 * Avoiding disinherit on waiter interruption seems acceptable because
 * the owner runs at an unnecessarily high priority only while the sobj is
 * held, which it would have done in any case if the waiter had not been
 * interrupted.
 */
void
turnstile_unsleep(kthread_t *t)
{
	turnstile_dequeue(t);
	THREAD_TRANSITION(t);
	CL_SETRUN(t);
}
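The priority-willing walk in turnstile_block() above can be sketched as a small userland model. This is an illustrative sketch only, not kernel code: toy_thread_t, its fields, and toy_pi_will() are hypothetical stand-ins for kthread_t, DISP_PRIO(), and turnstile_pi_inherit(), and all locking is omitted.

```c
/*
 * Hypothetical userland model of willing priority down a blocking
 * chain: boost each owner to at least the blocked thread's priority.
 * The boost is a max, hence idempotent -- which is why the kernel can
 * safely restart the walk from curthread after dropping thread locks.
 */
#include <assert.h>
#include <stddef.h>

typedef struct toy_thread {
	int pri;			/* dispatch priority (stand-in) */
	struct toy_thread *blocked_on;	/* owner of the sobj we want */
} toy_thread_t;

static void
toy_pi_will(toy_thread_t *t)
{
	int pri = t->pri;
	toy_thread_t *owner;

	/* Follow the blocking chain to its end, boosting everyone. */
	for (owner = t->blocked_on; owner != NULL;
	    owner = owner->blocked_on) {
		if (owner->pri < pri)
			owner->pri = pri;	/* inherit */
	}
}
```

Because the boost only ever raises a priority to a maximum, running the walk twice from the same starting thread changes nothing, mirroring the "priority inheritance is idempotent" observation in the comment above.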