/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
 * or http://www.opensolaris.org/os/licensing.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 *
 * CDDL HEADER END
 */
/*
 * Copyright 2008 Sun Microsystems, Inc. All rights reserved.
 * Use is subject to license terms.
 */

#pragma ident	"%Z%%M%	%I%	%E% SMI"

/*
 * The Cyclic Subsystem
 * --------------------
 *
 * Prehistory
 *
 * Historically, most computer architectures have specified interval-based
 * timer parts (e.g. SPARCstation's counter/timer; Intel's i8254). While
 * these parts deal in relative (i.e. not absolute) time values, they are
 * typically used by the operating system to implement the abstraction of
 * absolute time. As a result, these parts cannot typically be reprogrammed
 * without introducing error in the system's notion of time.
 *
 * Starting in about 1994, chip architectures began specifying high resolution
 * timestamp registers. As of this writing (1999), all major chip families
 * (UltraSPARC, PentiumPro, MIPS, PowerPC, Alpha) have high resolution
 * timestamp registers, and two (UltraSPARC and MIPS) have added the capacity
 * to interrupt based on timestamp values. These timestamp-compare registers
 * present a time-based interrupt source which can be reprogrammed arbitrarily
 * often without introducing error. Given the low cost of implementing such a
 * timestamp-compare register (and the tangible benefit of eliminating
 * discrete timer parts), it is reasonable to expect that future chip
 * architectures will adopt this feature.
 *
 * The cyclic subsystem has been designed to take advantage of chip
 * architectures with the capacity to interrupt based on absolute, high
 * resolution values of time.
 *
 * Subsystem Overview
 *
 * The cyclic subsystem is a low-level kernel subsystem designed to provide
 * arbitrarily high resolution, per-CPU interval timers (to avoid colliding
 * with existing terms, we dub such an interval timer a "cyclic"). Cyclics
 * can be specified to fire at high, lock or low interrupt level, and may be
 * optionally bound to a CPU or a CPU partition. A cyclic's CPU or CPU
 * partition binding may be changed dynamically; the cyclic will be "juggled"
 * to a CPU which satisfies the new binding. Alternatively, a cyclic may
 * be specified to be "omnipresent", denoting firing on all online CPUs.
 *
 * Cyclic Subsystem Interface Overview
 * -----------------------------------
 *
 * The cyclic subsystem has interfaces with the kernel at-large, with other
 * kernel subsystems (e.g. the processor management subsystem, the checkpoint
 * resume subsystem) and with the platform (the cyclic backend). Each
 * of these interfaces is given a brief synopsis here, and is described
 * in full above the interface's implementation.
 *
 * The following diagram displays the cyclic subsystem's interfaces to
 * other kernel components. The arrows denote a "calls" relationship, with
 * the large arrow indicating the cyclic subsystem's consumer interface.
 * Each arrow is labeled with the section in which the corresponding
 * interface is described.
 *
 *                          Kernel at-large consumers
 *                          -----------++------------
 *                                     ||
 *                                     ||
 *                                    _||_
 *                                    \  /
 *                                     \/
 *                           +---------------------+
 *                           |                     |
 *                           |  Cyclic subsystem   |<-----------  Other kernel subsystems
 *                           |                     |
 *                           +---------------------+
 *                                  ^       |
 *                                  |       |
 *                                  |       |
 *                                  |       v
 *                           +---------------------+
 *                           |                     |
 *                           |    Cyclic backend   |
 *                           | (platform specific) |
 *                           |                     |
 *                           +---------------------+
 *
 *
 *  Kernel At-Large Interfaces
 *
 *    cyclic_add()         <-- Creates a cyclic
 *    cyclic_add_omni()    <-- Creates an omnipresent cyclic
 *    cyclic_remove()      <-- Removes a cyclic
 *    cyclic_bind()        <-- Change a cyclic's CPU or partition binding
 *
 *  Inter-subsystem Interfaces
 *
 *    cyclic_juggle()      <-- Juggles cyclics away from a CPU
 *    cyclic_offline()     <-- Offlines cyclic operation on a CPU
 *    cyclic_online()      <-- Reenables operation on an offlined CPU
 *    cyclic_move_in()     <-- Notifies subsystem of change in CPU partition
 *    cyclic_move_out()    <-- Notifies subsystem of change in CPU partition
 *    cyclic_suspend()     <-- Suspends the cyclic subsystem on all CPUs
 *    cyclic_resume()      <-- Resumes the cyclic subsystem on all CPUs
 *
 *  Backend Interfaces
 *
 *    cyclic_init()        <-- Initializes the cyclic subsystem
 *    cyclic_fire()        <-- CY_HIGH_LEVEL interrupt entry point
 *    cyclic_softint()     <-- CY_LOCK/LOW_LEVEL soft interrupt entry point
 *
 * The backend-supplied interfaces (through the cyc_backend structure) are
 * documented in detail in <sys/cyclic_impl.h>
 *
 *
 * Cyclic Subsystem Implementation Overview
 * ----------------------------------------
 *
 * The cyclic subsystem is designed to minimize interference between cyclics
 * on different CPUs. Thus, all of the cyclic subsystem's data structures
 * hang off of a per-CPU structure, cyc_cpu.
 *
 * Each cyc_cpu has a power-of-two sized array of cyclic structures (the
 * cyp_cyclics member of the cyc_cpu structure). If cyclic_add() is called
 * and there does not exist a free slot in the cyp_cyclics array, the size of
 * the array will be doubled. The array will never shrink. Cyclics are
 * referred to by their index in the cyp_cyclics array, which is of type
 * cyc_index_t.
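 *
 * As a point of reference for the consumer interface summarized above, a
 * kernel-at-large consumer might register a cyclic roughly as follows. This
 * is only a sketch (the handler, argument, start time and interval are
 * hypothetical); the authoritative usage rules are in the full interface
 * descriptions above each implementation, as noted earlier. cyclic_add()
 * must be called with cpu_lock held:
 *
 *      cyc_handler_t hdlr;
 *      cyc_time_t when;
 *      cyclic_id_t id;
 *
 *      hdlr.cyh_func = my_handler;             (hypothetical handler)
 *      hdlr.cyh_arg = my_arg;                  (hypothetical argument)
 *      hdlr.cyh_level = CY_LOW_LEVEL;
 *
 *      when.cyt_when = 0;                      (hypothetical start time)
 *      when.cyt_interval = NANOSEC / 100;      (hypothetical 10ms interval)
 *
 *      mutex_enter(&cpu_lock);
 *      id = cyclic_add(&hdlr, &when);
 *      mutex_exit(&cpu_lock);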
 *
 * The cyclics are kept sorted by expiration time in the cyc_cpu's heap. The
 * heap is keyed by cyclic expiration time, with parents expiring earlier
 * than their children.
 *
 * Heap Management
 *
 * The heap is managed primarily by cyclic_fire(). Upon entry, cyclic_fire()
 * compares the root cyclic's expiration time to the current time. If the
 * expiration time is in the past, cyclic_expire() is called on the root
 * cyclic. Upon return from cyclic_expire(), the cyclic's new expiration time
 * is derived by adding its interval to its old expiration time, and a
 * downheap operation is performed. After the downheap, cyclic_fire()
 * examines the (potentially changed) root cyclic, repeating the
 * cyclic_expire()/add interval/cyclic_downheap() sequence until the root
 * cyclic has an expiration time in the future. This expiration time
 * (guaranteed to be the earliest in the heap) is then communicated to the
 * backend via cyb_reprogram. Optimal backends will next call cyclic_fire()
 * shortly after the root cyclic's expiration time.
 *
 * To allow efficient, deterministic downheap operations, we implement the
 * heap as an array (the cyp_heap member of the cyc_cpu structure), with each
 * element containing an index into the CPU's cyp_cyclics array.
 *
 * The heap is laid out in the array according to the following:
 *
 *   1. The root of the heap is always in the 0th element of the heap array
 *   2. The left and right children of the nth element are element
 *      (((n + 1) << 1) - 1) and element ((n + 1) << 1), respectively.
 *
 * This layout is standard (see, e.g., Cormen's "Algorithms"); the proof
 * that these constraints correctly lay out a heap (or indeed, any binary
 * tree) is trivial and left to the reader.
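 *
 * In code, this layout reduces to simple index arithmetic. The
 * CYC_HEAP_PARENT(), CYC_HEAP_LEFT() and CYC_HEAP_RIGHT() macros used by
 * cyclic_upheap() and cyclic_downheap() below are presumably along these
 * lines (a sketch; the authoritative definitions are in <sys/cyclic_impl.h>):
 *
 *      parent(n) = (n - 1) >> 1
 *      left(n)   = ((n + 1) << 1) - 1
 *      right(n)  = (n + 1) << 1
 *
 * e.g. the children of heap element 1 are heap elements 3 and 4, and the
 * parent of both 3 and 4 is, by the first expression, again element 1.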
 *
 * To see the heap by example, assume our cyclics array has the following
 * members (at time t):
 *
 *       cy_handler            cy_level      cy_expire
 *       ---------------------------------------------
 *  [ 0] clock()               LOCK          t+10000000
 *  [ 1] deadman()             HIGH          t+1000000000
 *  [ 2] clock_highres_fire()  LOW           t+100
 *  [ 3] clock_highres_fire()  LOW           t+1000
 *  [ 4] clock_highres_fire()  LOW           t+500
 *  [ 5] (free)                --            --
 *  [ 6] (free)                --            --
 *  [ 7] (free)                --            --
 *
 * The heap array could be:
 *
 *                [0]   [1]   [2]   [3]   [4]   [5]   [6]   [7]
 *              +-----+-----+-----+-----+-----+-----+-----+-----+
 *              |     |     |     |     |     |     |     |     |
 *              |  2  |  3  |  4  |  0  |  1  |  x  |  x  |  x  |
 *              |     |     |     |     |     |     |     |     |
 *              +-----+-----+-----+-----+-----+-----+-----+-----+
 *
 * Graphically, this array corresponds to the following (excuse the ASCII art):
 *
 *                                       2
 *                                       |
 *                    +------------------+------------------+
 *                    3                                      4
 *                    |
 *          +---------+--------+
 *          0                  1
 *
 * Note that the heap is laid out by layer: all nodes at a given depth are
 * stored in consecutive elements of the array. Moreover, layers of
 * consecutive depths are in adjacent element ranges. This property
 * guarantees high locality of reference during downheap operations.
 * Specifically, we are guaranteed that we can downheap to a depth of
 *
 *      lg (cache_line_size / sizeof (cyc_index_t))
 *
 * nodes with at most one cache miss. On UltraSPARC (64 byte e-cache line
 * size), this corresponds to a depth of four nodes. Thus, if there are
 * fewer than sixteen cyclics in the heap, downheaps on UltraSPARC miss at
 * most once in the e-cache.
 *
 * Downheaps are required to compare siblings as they proceed down the
 * heap. For downheaps proceeding beyond the one-cache-miss depth, every
 * access to a left child could potentially miss in the cache. However,
 * if we assume
 *
 *      (cache_line_size / sizeof (cyc_index_t)) > 2,
 *
 * then all siblings are guaranteed to be on the same cache line. Thus, the
 * miss on the left child will guarantee a hit on the right child; downheaps
 * will incur at most one cache miss per layer beyond the one-cache-miss
 * depth.
 * The total number of cache misses for heap management during a downheap
 * operation is thus bounded by
 *
 *      lg (n) - lg (cache_line_size / sizeof (cyc_index_t))
 *
 * Traditional pointer-based heaps are implemented without regard to
 * locality. Downheaps can thus incur two cache misses per layer (one for
 * each child), but at most one cache miss at the root. This yields a bound
 * of
 *
 *      2 * lg (n) - 1
 *
 * on the total cache misses.
 *
 * This difference may seem theoretically trivial (the difference is, after
 * all, constant), but can become substantial in practice -- especially for
 * caches with very large cache lines and high miss penalties (e.g. TLBs).
 *
 * Heaps must always be full, balanced trees. Heap management must therefore
 * track the next point-of-insertion into the heap. In pointer-based heaps,
 * recomputing this point takes O(lg (n)). Given the layout of the
 * array-based implementation, however, the next point-of-insertion is
 * always:
 *
 *      heap[number_of_elements]
 *
 * We exploit this property by implementing the free-list in the unused
 * heap elements. Heap insertion, therefore, consists only of filling in
 * the cyclic at cyp_cyclics[cyp_heap[number_of_elements]], incrementing
 * the number of elements, and performing an upheap. Heap deletion consists
 * of decrementing the number of elements, swapping the to-be-deleted element
 * with the element at cyp_heap[number_of_elements], and downheaping.
 *
 * Filling in more details in our earlier example:
 *
 *                                               +--- free list head
 *                                               |
 *                                               V
 *
 *                [0]   [1]   [2]   [3]   [4]   [5]   [6]   [7]
 *              +-----+-----+-----+-----+-----+-----+-----+-----+
 *              |     |     |     |     |     |     |     |     |
 *              |  2  |  3  |  4  |  0  |  1  |  5  |  6  |  7  |
 *              |     |     |     |     |     |     |     |     |
 *              +-----+-----+-----+-----+-----+-----+-----+-----+
 *
 * To insert into this heap, we would just need to fill in the cyclic at
 * cyp_cyclics[5], bump the number of elements (from 5 to 6) and perform
 * an upheap.
 *
 * If we wanted to remove, say, cyp_cyclics[3], we would first scan for it
 * in the cyp_heap, and discover it at cyp_heap[1]. We would then decrement
 * the number of elements (from 5 to 4), swap cyp_heap[1] with cyp_heap[4],
 * and perform a downheap from cyp_heap[1]. The linear scan is required
 * because the cyclic does not keep a backpointer into the heap. This makes
 * heap manipulation (e.g. downheaps) faster at the expense of removal
 * operations.
 *
 * Expiry processing
 *
 * As alluded to above, cyclic_expire() is called by cyclic_fire() at
 * CY_HIGH_LEVEL to expire a cyclic. Cyclic subsystem consumers are
 * guaranteed that for an arbitrary time t in the future, their cyclic
 * handler will have been called (t - cyt_when) / cyt_interval times. Thus,
 * there must be a one-to-one mapping between a cyclic's expiration at
 * CY_HIGH_LEVEL and its execution at the desired level (either CY_HIGH_LEVEL,
 * CY_LOCK_LEVEL or CY_LOW_LEVEL).
 *
 * For CY_HIGH_LEVEL cyclics, this is trivial; cyclic_expire() simply needs
 * to call the handler.
 *
 * For CY_LOCK_LEVEL and CY_LOW_LEVEL cyclics, however, there exists a
 * potential disconnect: if the CPU is at an interrupt level less than
 * CY_HIGH_LEVEL but greater than the level of a cyclic for a period of
 * time longer than twice the cyclic's interval, the cyclic will be expired
 * twice before it can be handled.
 *
 * To maintain the one-to-one mapping, we track the difference between the
 * number of times a cyclic has been expired and the number of times it's
 * been handled in a "pending count" (the cy_pend field of the cyclic
 * structure). cyclic_expire() thus increments the cy_pend count for the
 * expired cyclic and posts a soft interrupt at the desired level. In the
 * cyclic subsystem's soft interrupt handler, cyclic_softint(), we repeatedly
 * call the cyclic handler and decrement cy_pend until we have decremented
 * cy_pend to zero.
 *
 * The Producer/Consumer Buffer
 *
 * If we wish to avoid a linear scan of the cyclics array at soft interrupt
 * level, cyclic_softint() must be able to quickly determine which cyclics
 * have a non-zero cy_pend count. We thus introduce a per-soft interrupt
 * level producer/consumer buffer shared with CY_HIGH_LEVEL. These buffers
 * are encapsulated in the cyc_pcbuffer structure, and, like cyp_heap, are
 * implemented as cyc_index_t arrays (the cypc_buf member of the cyc_pcbuffer
 * structure).
 *
 * The producer (cyclic_expire() running at CY_HIGH_LEVEL) enqueues a cyclic
 * by storing the cyclic's index to cypc_buf[cypc_prodndx] and incrementing
 * cypc_prodndx. The consumer (cyclic_softint() running at either
 * CY_LOCK_LEVEL or CY_LOW_LEVEL) dequeues a cyclic by loading from
 * cypc_buf[cypc_consndx] and bumping cypc_consndx. The buffer is empty when
 * cypc_prodndx == cypc_consndx.
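 *
 * In outline, the two operations are just the following (a sketch, not the
 * literal code; the real producer and consumer in cyclic_expire() and
 * cyclic_softint() below also fold in the cy_pend bookkeeping described
 * next):
 *
 *      produce:  cypc_buf[cypc_prodndx++ & cypc_sizemask] = ndx;
 *      consume:  ndx = cypc_buf[cypc_consndx++ & cypc_sizemask];
 *
 * The buffer is a power of two in size (hence the cypc_sizemask masking);
 * the indices increase monotonically and are masked into the buffer on
 * each access.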
 *
 * To bound the size of the producer/consumer buffer, cyclic_expire() only
 * enqueues a cyclic if its cy_pend was zero (if the cyclic's cy_pend is
 * non-zero, cyclic_expire() only bumps cy_pend). Symmetrically,
 * cyclic_softint() only consumes a cyclic after it has decremented the
 * cy_pend count to zero.
 *
 * Returning to our example, here is what the CY_LOW_LEVEL producer/consumer
 * buffer might look like:
 *
 *     cypc_consndx ---+                 +--- cypc_prodndx
 *                     |                 |
 *                     V                 V
 *
 *        [0]   [1]   [2]   [3]   [4]   [5]   [6]   [7]
 *      +-----+-----+-----+-----+-----+-----+-----+-----+
 *      |     |     |     |     |     |     |     |     |
 *      |  x  |  x  |  3  |  2  |  4  |  x  |  x  |  x  |   <== cypc_buf
 *      |     |     |  .  |  .  |  .  |     |     |     |
 *      +-----+-----+- | -+- | -+- | -+-----+-----+-----+
 *                     |     |     |
 *                     |     |     |      cy_pend  cy_handler
 *                     |     |     |      -------------------------
 *                     |     |     |      [ 0]  1    clock()
 *                     |     |     |      [ 1]  0    deadman()
 *                     |     +---- | -->  [ 2]  3    clock_highres_fire()
 *                     +---------- | -->  [ 3]  1    clock_highres_fire()
 *                                 +--->  [ 4]  1    clock_highres_fire()
 *                                        [ 5]  -    (free)
 *                                        [ 6]  -    (free)
 *                                        [ 7]  -    (free)
 *
 * In particular, note that clock()'s cy_pend is 1 but that it is _not_ in
 * this producer/consumer buffer; it would be enqueued in the CY_LOCK_LEVEL
 * producer/consumer buffer.
 *
 * Locking
 *
 * Traditionally, access to per-CPU data structures shared between
 * interrupt levels is serialized by manipulating programmable interrupt
 * level: readers and writers are required to raise their interrupt level
 * to that of the highest level writer.
 *
 * For the producer/consumer buffers (shared between cyclic_fire()/
 * cyclic_expire() executing at CY_HIGH_LEVEL and cyclic_softint() executing
 * at one of CY_LOCK_LEVEL or CY_LOW_LEVEL), forcing cyclic_softint() to raise
 * programmable interrupt level is undesirable: aside from the additional
 * latency incurred by manipulating interrupt level in the hot cy_pend
 * processing path, this would create the potential for soft level cy_pend
 * processing to delay CY_HIGH_LEVEL firing and expiry processing.
 * CY_LOCK/LOW_LEVEL cyclics could thereby induce jitter in CY_HIGH_LEVEL
 * cyclics.
 *
 * To minimize jitter, then, we would like the cyclic_fire()/cyclic_expire()
 * and cyclic_softint() code paths to be lock-free.
 *
 * For cyclic_fire()/cyclic_expire(), lock-free execution is straightforward:
 * because these routines execute at a higher interrupt level than
 * cyclic_softint(), their actions on the producer/consumer buffer appear
 * atomic. In particular, the increment of cy_pend appears to occur
 * atomically with the increment of cypc_prodndx.
 *
 * For cyclic_softint(), however, lock-free execution requires more delicacy.
 * When cyclic_softint() discovers a cyclic in the producer/consumer buffer,
 * it calls the cyclic's handler and attempts to atomically decrement the
 * cy_pend count with a compare&swap operation.
 *
 * If the compare&swap operation succeeds, cyclic_softint() behaves
 * conditionally based on the value it atomically wrote to cy_pend:
 *
 *   - If the cy_pend was decremented to 0, the cyclic has been consumed;
 *     cyclic_softint() increments the cypc_consndx and checks for more
 *     enqueued work.
 *
 *   - If the count was decremented to a non-zero value, there is more work
 *     to be done on the cyclic; cyclic_softint() calls the cyclic handler
 *     and repeats the atomic decrement process.
 *
 * If the compare&swap operation fails, cyclic_softint() knows that
 * cyclic_expire() has intervened and bumped the cy_pend count (resizes
 * and removals complicate this, however -- see the sections on their
 * operation, below). cyclic_softint() thus reloads cy_pend, and re-attempts
 * the atomic decrement.
 *
 * Recall that we bound the size of the producer/consumer buffer by
 * having cyclic_expire() only enqueue the specified cyclic if its
 * cy_pend count is zero; this assures that each cyclic is enqueued at
 * most once. This leads to a critical constraint on cyclic_softint(),
 * however: after the compare&swap operation which successfully decrements
 * cy_pend to zero, cyclic_softint() must _not_ re-examine the consumed
 * cyclic. In part to obey this constraint, cyclic_softint() calls the
 * cyclic handler before decrementing cy_pend.
 *
 * Resizing
 *
 * All of the discussion thus far has assumed a static number of cyclics.
 * Obviously, static limitations are not practical; we need the capacity
 * to resize our data structures dynamically.
 *
 * We resize our data structures lazily, and only on a per-CPU basis.
 * The size of the data structures always doubles and never shrinks. We
 * serialize adds (and thus resizes) on cpu_lock; we never need to deal
 * with concurrent resizes.
 * Resizes should be rare; they may induce jitter on the CPU being resized,
 * but should not affect cyclic operation on other CPUs. Pending cyclics
 * may not be dropped during a resize operation.
 *
 * Three key cyc_cpu data structures need to be resized: the cyclics array,
 * the heap array and the producer/consumer buffers. Resizing the first two
 * is relatively straightforward:
 *
 *   1. The new, larger arrays are allocated in cyclic_expand() (called
 *      from cyclic_add()).
 *   2. cyclic_expand() cross calls cyclic_expand_xcall() on the CPU
 *      undergoing the resize.
 *   3. cyclic_expand_xcall() raises interrupt level to CY_HIGH_LEVEL
 *   4. The contents of the old arrays are copied into the new arrays.
 *   5. The old cyclics array is bzero()'d
 *   6. The pointers are updated.
 *
 * The producer/consumer buffer is dicier: cyclic_expand_xcall() may have
 * interrupted cyclic_softint() in the middle of consumption. To resize the
 * producer/consumer buffer, we implement up to two buffers per soft interrupt
 * level: a hard buffer (the buffer being produced into by cyclic_expire())
 * and a soft buffer (the buffer from which cyclic_softint() is consuming).
 * During normal operation, the hard buffer and soft buffer point to the
 * same underlying producer/consumer buffer.
 *
 * During a resize, however, cyclic_expand_xcall() changes the hard buffer
 * to point to the new, larger producer/consumer buffer; all future
 * cyclic_expire()'s will produce into the new buffer. cyclic_expand_xcall()
 * then posts a CY_LOCK_LEVEL soft interrupt, landing in cyclic_softint().
 *
 * As under normal operation, cyclic_softint() will consume cyclics from
 * its soft buffer. After the soft buffer is drained, however,
 * cyclic_softint() will see that the hard buffer has changed. At that time,
 * cyclic_softint() will change its soft buffer to point to the hard buffer,
 * and repeat the producer/consumer buffer draining procedure.
 *
 * After the new buffer is drained, cyclic_softint() will determine if both
 * soft levels have seen their new producer/consumer buffer. If both have,
 * cyclic_softint() will post on the semaphore cyp_modify_wait. If not, a
 * soft interrupt will be generated for the remaining level.
 *
 * cyclic_expand() blocks on the cyp_modify_wait semaphore (a semaphore is
 * used instead of a condition variable because of the race between the
 * sema_p() in cyclic_expand() and the sema_v() in cyclic_softint()).
 * This allows cyclic_expand() to know when the resize operation is complete;
 * all of the old buffers (the heap, the cyclics array and the producer/
 * consumer buffers) can be freed.
 *
 * A final caveat on resizing: we described step (5) in the
 * cyclic_expand_xcall() procedure without providing any motivation. This
 * step addresses the problem of a cyclic_softint() attempting to decrement
 * a cy_pend count while interrupted by a cyclic_expand_xcall(). Because
 * cyclic_softint() has already called the handler by the time cy_pend is
 * decremented, we want to assure that it doesn't decrement a cy_pend
 * count in the old cyclics array. By zeroing the old cyclics array in
 * cyclic_expand_xcall(), we are zeroing out every cy_pend count; when
 * cyclic_softint() attempts to compare&swap on the cy_pend count, it will
 * fail and recognize that the count has been zeroed. cyclic_softint() will
 * update its stale copy of the cyp_cyclics pointer, re-read the cy_pend
 * count from the new cyclics array, and re-attempt the compare&swap.
 *
 * Removals
 *
 * Cyclic removals should be rare. To simplify the implementation (and to
 * allow optimization for the cyclic_fire()/cyclic_expire()/cyclic_softint()
 * path), we force removals and adds to serialize on cpu_lock.
 *
 * Cyclic removal is complicated by a guarantee made to the consumer of
 * the cyclic subsystem: after cyclic_remove() returns, the cyclic handler
 * has returned and will never again be called.
 *
 * Here is the procedure for cyclic removal:
 *
 *   1. cyclic_remove() calls cyclic_remove_xcall() on the CPU undergoing
 *      the removal.
 *   2. cyclic_remove_xcall() raises interrupt level to CY_HIGH_LEVEL
 *   3. The current expiration time for the removed cyclic is recorded.
 *   4. If the cy_pend count on the removed cyclic is non-zero, it
 *      is copied into cyp_rpend and subsequently zeroed.
 *   5. The cyclic is removed from the heap
 *   6. If the root of the heap has changed, the backend is reprogrammed.
 *   7. If the cy_pend count was non-zero, cyclic_remove() blocks on the
 *      cyp_modify_wait semaphore.
 *
 * The motivation for step (3) is explained in "Juggling", below.
 *
 * The cy_pend count is decremented in cyclic_softint() after the cyclic
 * handler returns. Thus, if we find a cy_pend count of zero in step
 * (4), we know that cyclic_remove() doesn't need to block.
 *
 * If the cy_pend count is non-zero, however, we must block in cyclic_remove()
 * until cyclic_softint() has finished calling the cyclic handler. To let
 * cyclic_softint() know that this cyclic has been removed, we zero the
 * cy_pend count. This will cause cyclic_softint()'s compare&swap to fail.
 * When cyclic_softint() sees the zero cy_pend count, it knows that it's been
 * caught during a resize (see "Resizing", above) or that the cyclic has been
 * removed. In the latter case, it calls cyclic_remove_pend() to call the
 * cyclic handler cyp_rpend - 1 times, and posts on cyp_modify_wait.
 *
 * Juggling
 *
 * At first glance, cyclic juggling seems to be a difficult problem. The
 * subsystem must guarantee that a cyclic doesn't execute simultaneously on
 * different CPUs, while also assuring that a cyclic fires exactly once
 * per interval. We solve this problem by leveraging a property of the
 * platform: gethrtime() is required to increase in lock-step across
 * multiple CPUs. Therefore, to juggle a cyclic, we remove it from its
 * CPU, recording its expiration time in the remove cross call (step (3)
 * in "Removals", above). We then add the cyclic to the new CPU, explicitly
 * setting its expiration time to the time recorded in the removal. This
 * leverages the existing cyclic expiry processing, which will compensate
 * for any time lost while juggling.
 *
 */
#include <sys/cyclic_impl.h>
#include <sys/sysmacros.h>
#include <sys/systm.h>
#include <sys/atomic.h>
#include <sys/kmem.h>
#include <sys/cmn_err.h>
#include <sys/ddi.h>
#include <sys/sdt.h>

#ifdef CYCLIC_TRACE

/*
 * cyc_trace_enabled is for the benefit of kernel debuggers.
 */
int cyc_trace_enabled = 1;
static cyc_tracebuf_t cyc_ptrace;
static cyc_coverage_t cyc_coverage[CY_NCOVERAGE];

/*
 * Seen this anywhere?
 * (It's the classic PJW/ELF string hash.)
 */
static uint_t
cyclic_coverage_hash(char *p)
{
	unsigned int g;
	uint_t hval;

	hval = 0;
	while (*p) {
		hval = (hval << 4) + *p++;
		if ((g = (hval & 0xf0000000)) != 0)
			hval ^= g >> 24;
		hval &= ~g;
	}
	return (hval);
}

static void
cyclic_coverage(char *why, int level, uint64_t arg0, uint64_t arg1)
{
	uint_t ndx, orig;

	for (ndx = orig = cyclic_coverage_hash(why) % CY_NCOVERAGE; ; ) {
		if (cyc_coverage[ndx].cyv_why == why)
			break;

		if (cyc_coverage[ndx].cyv_why != NULL ||
		    casptr(&cyc_coverage[ndx].cyv_why, NULL, why) != NULL) {

			if (++ndx == CY_NCOVERAGE)
				ndx = 0;

			if (ndx == orig)
				panic("too many cyclic coverage points");
			continue;
		}

		/*
		 * If we're here, we have successfully swung our guy into
		 * the position at "ndx".
		 */
		break;
	}

	if (level == CY_PASSIVE_LEVEL)
		cyc_coverage[ndx].cyv_passive_count++;
	else
		cyc_coverage[ndx].cyv_count[level]++;

	cyc_coverage[ndx].cyv_arg0 = arg0;
	cyc_coverage[ndx].cyv_arg1 = arg1;
}

#define	CYC_TRACE(cpu, level, why, arg0, arg1) \
	CYC_TRACE_IMPL(&cpu->cyp_trace[level], level, why, arg0, arg1)

#define	CYC_PTRACE(why, arg0, arg1) \
	CYC_TRACE_IMPL(&cyc_ptrace, CY_PASSIVE_LEVEL, why, arg0, arg1)

#define	CYC_TRACE_IMPL(buf, level, why, a0, a1) { \
	if (panicstr == NULL) { \
		int _ndx = (buf)->cyt_ndx; \
		cyc_tracerec_t *_rec = &(buf)->cyt_buf[_ndx]; \
		(buf)->cyt_ndx = (++_ndx == CY_NTRACEREC) ? 0 : _ndx; \
		_rec->cyt_tstamp = gethrtime_unscaled(); \
		_rec->cyt_why = (why); \
		_rec->cyt_arg0 = (uint64_t)(uintptr_t)(a0); \
		_rec->cyt_arg1 = (uint64_t)(uintptr_t)(a1); \
		cyclic_coverage(why, level, \
		    (uint64_t)(uintptr_t)(a0), (uint64_t)(uintptr_t)(a1)); \
	} \
}

#else

static int cyc_trace_enabled = 0;

#define	CYC_TRACE(cpu, level, why, arg0, arg1)
#define	CYC_PTRACE(why, arg0, arg1)

#endif

#define	CYC_TRACE0(cpu, level, why) CYC_TRACE(cpu, level, why, 0, 0)
#define	CYC_TRACE1(cpu, level, why, arg0) CYC_TRACE(cpu, level, why, arg0, 0)

#define	CYC_PTRACE0(why) CYC_PTRACE(why, 0, 0)
#define	CYC_PTRACE1(why, arg0) CYC_PTRACE(why, arg0, 0)

static kmem_cache_t *cyclic_id_cache;
static cyc_id_t *cyclic_id_head;
static hrtime_t cyclic_resolution;
static cyc_backend_t cyclic_backend;

/*
 * Returns 1 if the upheap propagated to the root, 0 if it did not. This
 * allows the caller to reprogram the backend only when the root has been
 * modified.
 */
static int
cyclic_upheap(cyc_cpu_t *cpu, cyc_index_t ndx)
{
	cyclic_t *cyclics;
	cyc_index_t *heap;
	cyc_index_t heap_parent, heap_current = ndx;
	cyc_index_t parent, current;

	if (heap_current == 0)
		return (1);

	heap = cpu->cyp_heap;
	cyclics = cpu->cyp_cyclics;
	heap_parent = CYC_HEAP_PARENT(heap_current);

	for (;;) {
		current = heap[heap_current];
		parent = heap[heap_parent];

		/*
		 * We have an expiration time later than our parent; we're
		 * done.
		 */
		if (cyclics[current].cy_expire >= cyclics[parent].cy_expire)
			return (0);

		/*
		 * We need to swap with our parent, and continue up the heap.
		 */
		heap[heap_parent] = current;
		heap[heap_current] = parent;

		/*
		 * If we just reached the root, we're done.
		 */
		if (heap_parent == 0)
			return (1);

		heap_current = heap_parent;
		heap_parent = CYC_HEAP_PARENT(heap_current);
	}
}

static void
cyclic_downheap(cyc_cpu_t *cpu, cyc_index_t ndx)
{
	cyclic_t *cyclics = cpu->cyp_cyclics;
	cyc_index_t *heap = cpu->cyp_heap;

	cyc_index_t heap_left, heap_right, heap_me = ndx;
	cyc_index_t left, right, me;
	cyc_index_t nelems = cpu->cyp_nelems;

	for (;;) {
		/*
		 * If we don't have a left child (i.e., we're a leaf), we're
		 * done.
		 */
		if ((heap_left = CYC_HEAP_LEFT(heap_me)) >= nelems)
			return;

		left = heap[heap_left];
		me = heap[heap_me];

		heap_right = CYC_HEAP_RIGHT(heap_me);

		/*
		 * Even if we don't have a right child, we still need to
		 * compare our expiration time against that of our left child.
		 */
		if (heap_right >= nelems)
			goto comp_left;

		right = heap[heap_right];

		/*
		 * We have both a left and a right child. We need to compare
		 * the expiration times of the children to determine which
		 * expires earlier.
		 */
		if (cyclics[right].cy_expire < cyclics[left].cy_expire) {
			/*
			 * Our right child is the earlier of our children.
			 * We'll now compare our expiration time to its; if
			 * ours is the earlier, we're done.
			 */
			if (cyclics[me].cy_expire <= cyclics[right].cy_expire)
				return;

			/*
			 * Our right child expires earlier than we do; swap
			 * with our right child, and descend right.
			 */
			heap[heap_right] = me;
			heap[heap_me] = right;
			heap_me = heap_right;
			continue;
		}

comp_left:
		/*
		 * Our left child is the earlier of our children (or we have
		 * no right child). We'll now compare our expiration time
		 * to its; if ours is the earlier, we're done.
		 */
		if (cyclics[me].cy_expire <= cyclics[left].cy_expire)
			return;

		/*
		 * Our left child expires earlier than we do; swap with our
		 * left child, and descend left.
		 */
		heap[heap_left] = me;
		heap[heap_me] = left;
		heap_me = heap_left;
	}
}

static void
cyclic_expire(cyc_cpu_t *cpu, cyc_index_t ndx, cyclic_t *cyclic)
{
	cyc_backend_t *be = cpu->cyp_backend;
	cyc_level_t level = cyclic->cy_level;

	/*
	 * If this is a CY_HIGH_LEVEL cyclic, just call the handler; we don't
	 * need to worry about the pend count for CY_HIGH_LEVEL cyclics.
	 */
	if (level == CY_HIGH_LEVEL) {
		cyc_func_t handler = cyclic->cy_handler;
		void *arg = cyclic->cy_arg;

		CYC_TRACE(cpu, CY_HIGH_LEVEL, "handler-in", handler, arg);
		DTRACE_PROBE1(cyclic__start, cyclic_t *, cyclic);

		(*handler)(arg);

		DTRACE_PROBE1(cyclic__end, cyclic_t *, cyclic);
		CYC_TRACE(cpu, CY_HIGH_LEVEL, "handler-out", handler, arg);

		return;
	}

	/*
	 * We're at CY_HIGH_LEVEL; this modification to cy_pend need not
	 * be atomic (the high interrupt level assures that it will appear
	 * atomic to any softint currently running).
	 */
	if (cyclic->cy_pend++ == 0) {
		cyc_softbuf_t *softbuf = &cpu->cyp_softbuf[level];
		cyc_pcbuffer_t *pc = &softbuf->cys_buf[softbuf->cys_hard];

		/*
		 * We need to enqueue this cyclic in the soft buffer.
		 */
		CYC_TRACE(cpu, CY_HIGH_LEVEL, "expire-enq", cyclic,
		    pc->cypc_prodndx);
		pc->cypc_buf[pc->cypc_prodndx++ & pc->cypc_sizemask] = ndx;

		ASSERT(pc->cypc_prodndx != pc->cypc_consndx);
	} else {
		/*
		 * If the pend count is zero after we incremented it, then
		 * we've wrapped (i.e. we had a cy_pend count of over four
		 * billion). In this case, we clamp the pend count at
		 * UINT32_MAX. Yes, cyclics can be lost in this case.
		 */
		if (cyclic->cy_pend == 0) {
			CYC_TRACE1(cpu, CY_HIGH_LEVEL, "expire-wrap", cyclic);
			cyclic->cy_pend = UINT32_MAX;
		}

		CYC_TRACE(cpu, CY_HIGH_LEVEL, "expire-bump", cyclic, 0);
	}

	be->cyb_softint(be->cyb_arg, cyclic->cy_level);
}

/*
 *  cyclic_fire(cpu_t *)
 *
 *  Overview
 *
 *    cyclic_fire() is the cyclic subsystem's CY_HIGH_LEVEL interrupt handler.
 *    Called by the cyclic backend.
 *
 *  Arguments and notes
 *
 *    The only argument is the CPU on which the interrupt is executing;
 *    backends must call into cyclic_fire() on the specified CPU.
 *
 *    cyclic_fire() may be called spuriously without ill effect. Optimal
 *    backends will call into cyclic_fire() at or shortly after the time
 *    requested via cyb_reprogram(). However, calling cyclic_fire()
 *    arbitrarily late will only manifest latency bubbles; the correctness
 *    of the cyclic subsystem does not rely on the timeliness of the backend.
 *
 *    cyclic_fire() is wait-free; it will not block or spin.
 *
 *  Return values
 *
 *    None.
 *
 *  Caller's context
 *
 *    cyclic_fire() must be called from CY_HIGH_LEVEL interrupt context.
 */
void
cyclic_fire(cpu_t *c)
{
	cyc_cpu_t *cpu = c->cpu_cyclic;
	cyc_backend_t *be = cpu->cyp_backend;
	cyc_index_t *heap = cpu->cyp_heap;
	cyclic_t *cyclic, *cyclics = cpu->cyp_cyclics;
	void *arg = be->cyb_arg;
	hrtime_t now = gethrtime();
	hrtime_t exp;

	CYC_TRACE(cpu, CY_HIGH_LEVEL, "fire", now, 0);

	if (cpu->cyp_nelems == 0) {
		/*
		 * This is a spurious fire. Count it as such, and blow
		 * out of here.
		 */
		CYC_TRACE0(cpu, CY_HIGH_LEVEL, "fire-spurious");
		return;
	}

	for (;;) {
		cyc_index_t ndx = heap[0];

		cyclic = &cyclics[ndx];

		ASSERT(!(cyclic->cy_flags & CYF_FREE));

		CYC_TRACE(cpu, CY_HIGH_LEVEL, "fire-check", cyclic,
		    cyclic->cy_expire);

		if ((exp = cyclic->cy_expire) > now)
			break;

		cyclic_expire(cpu, ndx, cyclic);

		/*
		 * If this cyclic will be set to next expire in the distant
		 * past, we have one of two situations:
		 *
		 *   a) This is the first firing of a cyclic which had
		 *      cy_expire set to 0.
		 *
		 *   b) We are tragically late for a cyclic -- most likely
		 *      due to being in the debugger.
		 *
		 * In either case, we set the new expiration time to be the
		 * next interval boundary. This assures that the expiration
		 * time modulo the interval is invariant.
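		 * (A worked example, with hypothetical numbers: if the
		 * interval is 10ms and we are 2.5 seconds late, the code
		 * below first adds one interval and then swings exp forward
		 * by a further 250 intervals, leaving it at most one
		 * interval in the future while keeping it an exact multiple
		 * of the interval beyond the original expiration time.)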
		 *
		 * We arbitrarily define "distant" to be one second (one second
		 * is chosen because it's shorter than any foray to the
		 * debugger while still being longer than any legitimate
		 * stretch at CY_HIGH_LEVEL).
		 */
		exp += cyclic->cy_interval;

		if (now - exp > NANOSEC) {
			hrtime_t interval = cyclic->cy_interval;

			CYC_TRACE(cpu, CY_HIGH_LEVEL, exp == interval ?
			    "fire-first" : "fire-swing", now, exp);

			exp += ((now - exp) / interval + 1) * interval;
		}

		cyclic->cy_expire = exp;
		cyclic_downheap(cpu, 0);
	}

	/*
	 * Now we have a cyclic in the root slot which isn't in the past;
	 * reprogram the interrupt source.
	 */
	be->cyb_reprogram(arg, exp);
}

static void
cyclic_remove_pend(cyc_cpu_t *cpu, cyc_level_t level, cyclic_t *cyclic)
{
	cyc_func_t handler = cyclic->cy_handler;
	void *arg = cyclic->cy_arg;
	uint32_t i, rpend = cpu->cyp_rpend - 1;

	ASSERT(cyclic->cy_flags & CYF_FREE);
	ASSERT(cyclic->cy_pend == 0);
	ASSERT(cpu->cyp_state == CYS_REMOVING);
	ASSERT(cpu->cyp_rpend > 0);

	CYC_TRACE(cpu, level, "remove-rpend", cyclic, cpu->cyp_rpend);

	/*
	 * Note that we only call the handler cyp_rpend - 1 times; this is
	 * to account for the handler call in cyclic_softint().
	 */
	for (i = 0; i < rpend; i++) {
		CYC_TRACE(cpu, level, "rpend-in", handler, arg);
		DTRACE_PROBE1(cyclic__start, cyclic_t *, cyclic);

		(*handler)(arg);

		DTRACE_PROBE1(cyclic__end, cyclic_t *, cyclic);
		CYC_TRACE(cpu, level, "rpend-out", handler, arg);
	}

	/*
	 * We can now let the remove operation complete.
	 */
	sema_v(&cpu->cyp_modify_wait);
}

/*
 *  cyclic_softint(cpu_t *cpu, cyc_level_t level)
 *
 *  Overview
 *
 *    cyclic_softint() is the cyclic subsystem's CY_LOCK_LEVEL and CY_LOW_LEVEL
 *    soft interrupt handler. Called by the cyclic backend.
9980Sstevel@tonic-gate * 9990Sstevel@tonic-gate * Arguments and notes 10000Sstevel@tonic-gate * 10010Sstevel@tonic-gate * The first argument to cyclic_softint() is the CPU on which the interrupt 10020Sstevel@tonic-gate * is executing; backends must call into cyclic_softint() on the specified 10030Sstevel@tonic-gate * CPU. The second argument is the level of the soft interrupt; it must 10040Sstevel@tonic-gate * be one of CY_LOCK_LEVEL or CY_LOW_LEVEL. 10050Sstevel@tonic-gate * 10060Sstevel@tonic-gate * cyclic_softint() will call the handlers for cyclics pending at the 10070Sstevel@tonic-gate * specified level. cyclic_softint() will not return until all pending 10080Sstevel@tonic-gate * cyclics at the specified level have been dealt with; intervening 10090Sstevel@tonic-gate * CY_HIGH_LEVEL interrupts which enqueue cyclics at the specified level 10100Sstevel@tonic-gate * may therefore prolong cyclic_softint(). 10110Sstevel@tonic-gate * 10120Sstevel@tonic-gate * cyclic_softint() never disables interrupts, and, if neither a 10130Sstevel@tonic-gate * cyclic_add() nor a cyclic_remove() is pending on the specified CPU, is 10140Sstevel@tonic-gate * lock-free. This assures that in the common case, cyclic_softint() 10150Sstevel@tonic-gate * completes without blocking, and never starves cyclic_fire(). If either 10160Sstevel@tonic-gate * cyclic_add() or cyclic_remove() is pending, cyclic_softint() may grab 10170Sstevel@tonic-gate * a dispatcher lock. 10180Sstevel@tonic-gate * 10190Sstevel@tonic-gate * While cyclic_softint() is designed for bounded latency, it is obviously 10200Sstevel@tonic-gate * at the mercy of its cyclic handlers. Because cyclic handlers may block 10210Sstevel@tonic-gate * arbitrarily, callers of cyclic_softint() should not rely upon 10220Sstevel@tonic-gate * deterministic completion. 10230Sstevel@tonic-gate * 10240Sstevel@tonic-gate * cyclic_softint() may be called spuriously without ill effect. 10250Sstevel@tonic-gate * 10260Sstevel@tonic-gate * Return value 10270Sstevel@tonic-gate * 10280Sstevel@tonic-gate * None. 10290Sstevel@tonic-gate * 10300Sstevel@tonic-gate * Caller's context 10310Sstevel@tonic-gate * 10320Sstevel@tonic-gate * The caller must be executing in soft interrupt context at either 10330Sstevel@tonic-gate * CY_LOCK_LEVEL or CY_LOW_LEVEL. The level passed to cyclic_softint() 10340Sstevel@tonic-gate * must match the level at which it is executing. On optimal backends, 10350Sstevel@tonic-gate * the caller will hold no locks. In any case, the caller may not hold 10360Sstevel@tonic-gate * cpu_lock or any lock acquired by any cyclic handler or held across 10370Sstevel@tonic-gate * any of cyclic_add(), cyclic_remove(), cyclic_bind() or cyclic_juggle(). 
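 *
 * As an illustration only -- the function name here is hypothetical,
 * not part of any backend -- a backend's low-level soft interrupt
 * vector reduces to little more than:
 *
 *      static uint_t
 *      xx_softint_low(caddr_t arg)
 *      {
 *              cyclic_softint(CPU, CY_LOW_LEVEL);
 *              return (DDI_INTR_CLAIMED);
 *      }
 *
 * with the level passed matching the level at which the vector runs,
 * per the context rules above.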
10380Sstevel@tonic-gate */ 10390Sstevel@tonic-gate void 10400Sstevel@tonic-gate cyclic_softint(cpu_t *c, cyc_level_t level) 10410Sstevel@tonic-gate { 10420Sstevel@tonic-gate cyc_cpu_t *cpu = c->cpu_cyclic; 10430Sstevel@tonic-gate cyc_softbuf_t *softbuf; 10440Sstevel@tonic-gate int soft, *buf, consndx, resized = 0, intr_resized = 0; 10450Sstevel@tonic-gate cyc_pcbuffer_t *pc; 10460Sstevel@tonic-gate cyclic_t *cyclics = cpu->cyp_cyclics; 10470Sstevel@tonic-gate int sizemask; 10480Sstevel@tonic-gate 10490Sstevel@tonic-gate CYC_TRACE(cpu, level, "softint", cyclics, 0); 10500Sstevel@tonic-gate 10510Sstevel@tonic-gate ASSERT(level < CY_LOW_LEVEL + CY_SOFT_LEVELS); 10520Sstevel@tonic-gate 10530Sstevel@tonic-gate softbuf = &cpu->cyp_softbuf[level]; 10540Sstevel@tonic-gate top: 10550Sstevel@tonic-gate soft = softbuf->cys_soft; 10560Sstevel@tonic-gate ASSERT(soft == 0 || soft == 1); 10570Sstevel@tonic-gate 10580Sstevel@tonic-gate pc = &softbuf->cys_buf[soft]; 10590Sstevel@tonic-gate buf = pc->cypc_buf; 10600Sstevel@tonic-gate consndx = pc->cypc_consndx; 10610Sstevel@tonic-gate sizemask = pc->cypc_sizemask; 10620Sstevel@tonic-gate 10630Sstevel@tonic-gate CYC_TRACE(cpu, level, "softint-top", cyclics, pc); 10640Sstevel@tonic-gate 10650Sstevel@tonic-gate while (consndx != pc->cypc_prodndx) { 10660Sstevel@tonic-gate int pend, npend, opend; 10670Sstevel@tonic-gate int consmasked = consndx & sizemask; 10680Sstevel@tonic-gate cyclic_t *cyclic = &cyclics[buf[consmasked]]; 10690Sstevel@tonic-gate cyc_func_t handler = cyclic->cy_handler; 10700Sstevel@tonic-gate void *arg = cyclic->cy_arg; 10710Sstevel@tonic-gate 10720Sstevel@tonic-gate ASSERT(buf[consmasked] < cpu->cyp_size); 10730Sstevel@tonic-gate CYC_TRACE(cpu, level, "consuming", consndx, cyclic); 10740Sstevel@tonic-gate 10750Sstevel@tonic-gate /* 10760Sstevel@tonic-gate * We have found this cyclic in the pcbuffer. We know that 10770Sstevel@tonic-gate * one of the following is true: 10780Sstevel@tonic-gate * 10790Sstevel@tonic-gate * (a) The pend is non-zero. We need to execute the handler 10800Sstevel@tonic-gate * at least once. 10810Sstevel@tonic-gate * 10820Sstevel@tonic-gate * (b) The pend _was_ non-zero, but it's now zero due to a 10830Sstevel@tonic-gate * resize. We will call the handler once, see that we 10840Sstevel@tonic-gate * are in this case, and read the new cyclics buffer 10850Sstevel@tonic-gate * (and hence the old non-zero pend). 10860Sstevel@tonic-gate * 10870Sstevel@tonic-gate * (c) The pend _was_ non-zero, but it's now zero due to a 10880Sstevel@tonic-gate * removal. We will call the handler once, see that we 10890Sstevel@tonic-gate * are in this case, and call into cyclic_remove_pend() 10900Sstevel@tonic-gate * to call the cyclic rpend times. We will take into 10910Sstevel@tonic-gate * account that we have already called the handler once. 10920Sstevel@tonic-gate * 10930Sstevel@tonic-gate * Point is: it's safe to call the handler without first 10940Sstevel@tonic-gate * checking the pend. 
10950Sstevel@tonic-gate */ 10960Sstevel@tonic-gate do { 10970Sstevel@tonic-gate CYC_TRACE(cpu, level, "handler-in", handler, arg); 1098*5864Sesaxe DTRACE_PROBE1(cyclic__start, cyclic_t *, cyclic); 1099*5864Sesaxe 11000Sstevel@tonic-gate (*handler)(arg); 1101*5864Sesaxe 1102*5864Sesaxe DTRACE_PROBE1(cyclic__end, cyclic_t *, cyclic); 11030Sstevel@tonic-gate CYC_TRACE(cpu, level, "handler-out", handler, arg); 11040Sstevel@tonic-gate reread: 11050Sstevel@tonic-gate pend = cyclic->cy_pend; 11060Sstevel@tonic-gate npend = pend - 1; 11070Sstevel@tonic-gate 11080Sstevel@tonic-gate if (pend == 0) { 11090Sstevel@tonic-gate if (cpu->cyp_state == CYS_REMOVING) { 11100Sstevel@tonic-gate /* 11110Sstevel@tonic-gate * This cyclic has been removed while 11120Sstevel@tonic-gate * it had a non-zero pend count (we 11130Sstevel@tonic-gate * know it was non-zero because we 11140Sstevel@tonic-gate * found this cyclic in the pcbuffer). 11150Sstevel@tonic-gate * There must be a non-zero rpend for 11160Sstevel@tonic-gate * this CPU, and there must be a remove 11170Sstevel@tonic-gate * operation blocking; we'll call into 11180Sstevel@tonic-gate * cyclic_remove_pend() to clean this 11190Sstevel@tonic-gate * up, and break out of the pend loop. 11200Sstevel@tonic-gate */ 11210Sstevel@tonic-gate cyclic_remove_pend(cpu, level, cyclic); 11220Sstevel@tonic-gate break; 11230Sstevel@tonic-gate } 11240Sstevel@tonic-gate 11250Sstevel@tonic-gate /* 11260Sstevel@tonic-gate * We must have had a resize interrupt us. 11270Sstevel@tonic-gate */ 11280Sstevel@tonic-gate CYC_TRACE(cpu, level, "resize-int", cyclics, 0); 11290Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_EXPANDING); 11300Sstevel@tonic-gate ASSERT(cyclics != cpu->cyp_cyclics); 11310Sstevel@tonic-gate ASSERT(resized == 0); 11320Sstevel@tonic-gate ASSERT(intr_resized == 0); 11330Sstevel@tonic-gate intr_resized = 1; 11340Sstevel@tonic-gate cyclics = cpu->cyp_cyclics; 11350Sstevel@tonic-gate cyclic = &cyclics[buf[consmasked]]; 11360Sstevel@tonic-gate ASSERT(cyclic->cy_handler == handler); 11370Sstevel@tonic-gate ASSERT(cyclic->cy_arg == arg); 11380Sstevel@tonic-gate goto reread; 11390Sstevel@tonic-gate } 11400Sstevel@tonic-gate 11410Sstevel@tonic-gate if ((opend = 11420Sstevel@tonic-gate cas32(&cyclic->cy_pend, pend, npend)) != pend) { 11430Sstevel@tonic-gate /* 11440Sstevel@tonic-gate * Our cas32 can fail for one of several 11450Sstevel@tonic-gate * reasons: 11460Sstevel@tonic-gate * 11470Sstevel@tonic-gate * (a) An intervening high level bumped up the 11480Sstevel@tonic-gate * pend count on this cyclic. In this 11490Sstevel@tonic-gate * case, we will see a higher pend. 11500Sstevel@tonic-gate * 11510Sstevel@tonic-gate * (b) The cyclics array has been yanked out 11520Sstevel@tonic-gate * from underneath us by a resize 11530Sstevel@tonic-gate * operation. In this case, pend is 0 and 11540Sstevel@tonic-gate * cyp_state is CYS_EXPANDING. 11550Sstevel@tonic-gate * 11560Sstevel@tonic-gate * (c) The cyclic has been removed by an 11570Sstevel@tonic-gate * intervening remove-xcall. In this case, 11580Sstevel@tonic-gate * pend will be 0, the cyp_state will be 11590Sstevel@tonic-gate * CYS_REMOVING, and the cyclic will be 11600Sstevel@tonic-gate * marked CYF_FREE. 11610Sstevel@tonic-gate * 11620Sstevel@tonic-gate * The assertion below checks that we are 11630Sstevel@tonic-gate * in one of the above situations. The 11640Sstevel@tonic-gate * action under all three is to return to 11650Sstevel@tonic-gate * the top of the loop. 
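 *
 * As a concrete instance of case (a): if we read a cy_pend of 2
 * and a CY_HIGH_LEVEL expiration slips in before our cas32, the
 * cas32 fails with an opend of 3; we simply reread and retry the
 * decrement against the higher value.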
11660Sstevel@tonic-gate */ 11670Sstevel@tonic-gate CYC_TRACE(cpu, level, "cas-fail", opend, pend); 11680Sstevel@tonic-gate ASSERT(opend > pend || (opend == 0 && 11690Sstevel@tonic-gate ((cyclics != cpu->cyp_cyclics && 11700Sstevel@tonic-gate cpu->cyp_state == CYS_EXPANDING) || 11710Sstevel@tonic-gate (cpu->cyp_state == CYS_REMOVING && 11720Sstevel@tonic-gate (cyclic->cy_flags & CYF_FREE))))); 11730Sstevel@tonic-gate goto reread; 11740Sstevel@tonic-gate } 11750Sstevel@tonic-gate 11760Sstevel@tonic-gate /* 11770Sstevel@tonic-gate * Okay, so we've managed to successfully decrement 11780Sstevel@tonic-gate * pend. If we just decremented the pend to 0, we're 11790Sstevel@tonic-gate * done. 11800Sstevel@tonic-gate */ 11810Sstevel@tonic-gate } while (npend > 0); 11820Sstevel@tonic-gate 11830Sstevel@tonic-gate pc->cypc_consndx = ++consndx; 11840Sstevel@tonic-gate } 11850Sstevel@tonic-gate 11860Sstevel@tonic-gate /* 11870Sstevel@tonic-gate * If the high level handler is no longer writing to the same 11880Sstevel@tonic-gate * buffer, then we've had a resize. We need to switch our soft 11890Sstevel@tonic-gate * index, and goto top. 11900Sstevel@tonic-gate */ 11910Sstevel@tonic-gate if (soft != softbuf->cys_hard) { 11920Sstevel@tonic-gate /* 11930Sstevel@tonic-gate * We can assert that the other buffer has grown by exactly 11940Sstevel@tonic-gate * one factor of two. 11950Sstevel@tonic-gate */ 11960Sstevel@tonic-gate CYC_TRACE(cpu, level, "buffer-grow", 0, 0); 11970Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_EXPANDING); 11980Sstevel@tonic-gate ASSERT(softbuf->cys_buf[softbuf->cys_hard].cypc_sizemask == 11990Sstevel@tonic-gate (softbuf->cys_buf[soft].cypc_sizemask << 1) + 1 || 12000Sstevel@tonic-gate softbuf->cys_buf[soft].cypc_sizemask == 0); 12010Sstevel@tonic-gate ASSERT(softbuf->cys_hard == (softbuf->cys_soft ^ 1)); 12020Sstevel@tonic-gate 12030Sstevel@tonic-gate /* 12040Sstevel@tonic-gate * If our cached cyclics pointer doesn't match cyp_cyclics, 12050Sstevel@tonic-gate * then we took a resize between our last iteration of the 12060Sstevel@tonic-gate * pend loop and the check against softbuf->cys_hard. 12070Sstevel@tonic-gate */ 12080Sstevel@tonic-gate if (cpu->cyp_cyclics != cyclics) { 12090Sstevel@tonic-gate CYC_TRACE1(cpu, level, "resize-int-int", consndx); 12100Sstevel@tonic-gate cyclics = cpu->cyp_cyclics; 12110Sstevel@tonic-gate } 12120Sstevel@tonic-gate 12130Sstevel@tonic-gate softbuf->cys_soft = softbuf->cys_hard; 12140Sstevel@tonic-gate 12150Sstevel@tonic-gate ASSERT(resized == 0); 12160Sstevel@tonic-gate resized = 1; 12170Sstevel@tonic-gate goto top; 12180Sstevel@tonic-gate } 12190Sstevel@tonic-gate 12200Sstevel@tonic-gate /* 12210Sstevel@tonic-gate * If we were interrupted by a resize operation, then we must have 12220Sstevel@tonic-gate * seen the hard index change. 12230Sstevel@tonic-gate */ 12240Sstevel@tonic-gate ASSERT(!(intr_resized == 1 && resized == 0)); 12250Sstevel@tonic-gate 12260Sstevel@tonic-gate if (resized) { 12270Sstevel@tonic-gate uint32_t lev, nlev; 12280Sstevel@tonic-gate 12290Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_EXPANDING); 12300Sstevel@tonic-gate 12310Sstevel@tonic-gate do { 12320Sstevel@tonic-gate lev = cpu->cyp_modify_levels; 12330Sstevel@tonic-gate nlev = lev + 1; 12340Sstevel@tonic-gate } while (cas32(&cpu->cyp_modify_levels, lev, nlev) != lev); 12350Sstevel@tonic-gate 12360Sstevel@tonic-gate /* 12370Sstevel@tonic-gate * If we are the last soft level to see the modification, 12380Sstevel@tonic-gate * post on cyp_modify_wait. 
Otherwise, (if we're not 12390Sstevel@tonic-gate * already at low level), post down to the next soft level. 12400Sstevel@tonic-gate */ 12410Sstevel@tonic-gate if (nlev == CY_SOFT_LEVELS) { 12420Sstevel@tonic-gate CYC_TRACE0(cpu, level, "resize-kick"); 12430Sstevel@tonic-gate sema_v(&cpu->cyp_modify_wait); 12440Sstevel@tonic-gate } else { 12450Sstevel@tonic-gate ASSERT(nlev < CY_SOFT_LEVELS); 12460Sstevel@tonic-gate if (level != CY_LOW_LEVEL) { 12470Sstevel@tonic-gate cyc_backend_t *be = cpu->cyp_backend; 12480Sstevel@tonic-gate 12490Sstevel@tonic-gate CYC_TRACE0(cpu, level, "resize-post"); 12500Sstevel@tonic-gate be->cyb_softint(be->cyb_arg, level - 1); 12510Sstevel@tonic-gate } 12520Sstevel@tonic-gate } 12530Sstevel@tonic-gate } 12540Sstevel@tonic-gate } 12550Sstevel@tonic-gate 12560Sstevel@tonic-gate static void 12570Sstevel@tonic-gate cyclic_expand_xcall(cyc_xcallarg_t *arg) 12580Sstevel@tonic-gate { 12590Sstevel@tonic-gate cyc_cpu_t *cpu = arg->cyx_cpu; 12600Sstevel@tonic-gate cyc_backend_t *be = cpu->cyp_backend; 12610Sstevel@tonic-gate cyb_arg_t bar = be->cyb_arg; 12620Sstevel@tonic-gate cyc_cookie_t cookie; 12630Sstevel@tonic-gate cyc_index_t new_size = arg->cyx_size, size = cpu->cyp_size, i; 12640Sstevel@tonic-gate cyc_index_t *new_heap = arg->cyx_heap; 12650Sstevel@tonic-gate cyclic_t *cyclics = cpu->cyp_cyclics, *new_cyclics = arg->cyx_cyclics; 12660Sstevel@tonic-gate 12670Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_EXPANDING); 12680Sstevel@tonic-gate 12690Sstevel@tonic-gate /* 12700Sstevel@tonic-gate * This is a little dicey. First, we'll raise our interrupt level 12710Sstevel@tonic-gate * to CY_HIGH_LEVEL. This CPU already has a new heap, cyclic array, 12720Sstevel@tonic-gate * etc.; we just need to bcopy them across. As for the softint 12730Sstevel@tonic-gate * buffers, we'll switch the active buffers. The actual softints will 12740Sstevel@tonic-gate * take care of consuming any pending cyclics in the old buffer. 12750Sstevel@tonic-gate */ 12760Sstevel@tonic-gate cookie = be->cyb_set_level(bar, CY_HIGH_LEVEL); 12770Sstevel@tonic-gate 12780Sstevel@tonic-gate CYC_TRACE(cpu, CY_HIGH_LEVEL, "expand", new_size, 0); 12790Sstevel@tonic-gate 12800Sstevel@tonic-gate /* 12810Sstevel@tonic-gate * Assert that the new size is a power of 2. 12820Sstevel@tonic-gate */ 12830Sstevel@tonic-gate ASSERT((new_size & new_size - 1) == 0); 12840Sstevel@tonic-gate ASSERT(new_size == (size << 1)); 12850Sstevel@tonic-gate ASSERT(cpu->cyp_heap != NULL && cpu->cyp_cyclics != NULL); 12860Sstevel@tonic-gate 12870Sstevel@tonic-gate bcopy(cpu->cyp_heap, new_heap, sizeof (cyc_index_t) * size); 12880Sstevel@tonic-gate bcopy(cyclics, new_cyclics, sizeof (cyclic_t) * size); 12890Sstevel@tonic-gate 12900Sstevel@tonic-gate /* 12910Sstevel@tonic-gate * Now run through the old cyclics array, setting pend to 0. To 12920Sstevel@tonic-gate * softints (which are executing at a lower priority level), the 12930Sstevel@tonic-gate * pends dropping to 0 will appear atomic with the cyp_cyclics 12940Sstevel@tonic-gate * pointer changing. 12950Sstevel@tonic-gate */ 12960Sstevel@tonic-gate for (i = 0; i < size; i++) 12970Sstevel@tonic-gate cyclics[i].cy_pend = 0; 12980Sstevel@tonic-gate 12990Sstevel@tonic-gate /* 13000Sstevel@tonic-gate * Set up the free list, and set all of the new cyclics to be CYF_FREE. 
13010Sstevel@tonic-gate */ 13020Sstevel@tonic-gate for (i = size; i < new_size; i++) { 13030Sstevel@tonic-gate new_heap[i] = i; 13040Sstevel@tonic-gate new_cyclics[i].cy_flags = CYF_FREE; 13050Sstevel@tonic-gate } 13060Sstevel@tonic-gate 13070Sstevel@tonic-gate /* 13080Sstevel@tonic-gate * We can go ahead and plow the value of cyp_heap and cyp_cyclics; 13090Sstevel@tonic-gate * cyclic_expand() has kept a copy. 13100Sstevel@tonic-gate */ 13110Sstevel@tonic-gate cpu->cyp_heap = new_heap; 13120Sstevel@tonic-gate cpu->cyp_cyclics = new_cyclics; 13130Sstevel@tonic-gate cpu->cyp_size = new_size; 13140Sstevel@tonic-gate 13150Sstevel@tonic-gate /* 13160Sstevel@tonic-gate * We've switched over the heap and the cyclics array. Now we need 13170Sstevel@tonic-gate * to switch over our active softint buffer pointers. 13180Sstevel@tonic-gate */ 13190Sstevel@tonic-gate for (i = CY_LOW_LEVEL; i < CY_LOW_LEVEL + CY_SOFT_LEVELS; i++) { 13200Sstevel@tonic-gate cyc_softbuf_t *softbuf = &cpu->cyp_softbuf[i]; 13210Sstevel@tonic-gate uchar_t hard = softbuf->cys_hard; 13220Sstevel@tonic-gate 13230Sstevel@tonic-gate /* 13240Sstevel@tonic-gate * Assert that we're not in the middle of a resize operation. 13250Sstevel@tonic-gate */ 13260Sstevel@tonic-gate ASSERT(hard == softbuf->cys_soft); 13270Sstevel@tonic-gate ASSERT(hard == 0 || hard == 1); 13280Sstevel@tonic-gate ASSERT(softbuf->cys_buf[hard].cypc_buf != NULL); 13290Sstevel@tonic-gate 13300Sstevel@tonic-gate softbuf->cys_hard = hard ^ 1; 13310Sstevel@tonic-gate 13320Sstevel@tonic-gate /* 13330Sstevel@tonic-gate * The caller (cyclic_expand()) is responsible for setting 13340Sstevel@tonic-gate * up the new producer-consumer buffer; assert that it's 13350Sstevel@tonic-gate * been done correctly. 13360Sstevel@tonic-gate */ 13370Sstevel@tonic-gate ASSERT(softbuf->cys_buf[hard ^ 1].cypc_buf != NULL); 13380Sstevel@tonic-gate ASSERT(softbuf->cys_buf[hard ^ 1].cypc_prodndx == 0); 13390Sstevel@tonic-gate ASSERT(softbuf->cys_buf[hard ^ 1].cypc_consndx == 0); 13400Sstevel@tonic-gate } 13410Sstevel@tonic-gate 13420Sstevel@tonic-gate /* 13430Sstevel@tonic-gate * That's all there is to it; now we just need to postdown to 13440Sstevel@tonic-gate * get the softint chain going. 13450Sstevel@tonic-gate */ 13460Sstevel@tonic-gate be->cyb_softint(bar, CY_HIGH_LEVEL - 1); 13470Sstevel@tonic-gate be->cyb_restore_level(bar, cookie); 13480Sstevel@tonic-gate } 13490Sstevel@tonic-gate 13500Sstevel@tonic-gate /* 13510Sstevel@tonic-gate * cyclic_expand() will cross call onto the CPU to perform the actual 13520Sstevel@tonic-gate * expand operation. 
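 *
 * Each expansion doubles cyp_size, so sizes remain powers of two; this
 * is what allows the softint consumer to mask its consumer index with
 * cypc_sizemask (always the new size minus one) rather than taking a
 * modulus.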
13530Sstevel@tonic-gate */ 13540Sstevel@tonic-gate static void 13550Sstevel@tonic-gate cyclic_expand(cyc_cpu_t *cpu) 13560Sstevel@tonic-gate { 13570Sstevel@tonic-gate cyc_index_t new_size, old_size; 13580Sstevel@tonic-gate cyc_index_t *new_heap, *old_heap; 13590Sstevel@tonic-gate cyclic_t *new_cyclics, *old_cyclics; 13600Sstevel@tonic-gate cyc_xcallarg_t arg; 13610Sstevel@tonic-gate cyc_backend_t *be = cpu->cyp_backend; 13620Sstevel@tonic-gate char old_hard; 13630Sstevel@tonic-gate int i; 13640Sstevel@tonic-gate 13650Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 13660Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_ONLINE); 13670Sstevel@tonic-gate 13680Sstevel@tonic-gate cpu->cyp_state = CYS_EXPANDING; 13690Sstevel@tonic-gate 13700Sstevel@tonic-gate old_heap = cpu->cyp_heap; 13710Sstevel@tonic-gate old_cyclics = cpu->cyp_cyclics; 13720Sstevel@tonic-gate 13730Sstevel@tonic-gate if ((new_size = ((old_size = cpu->cyp_size) << 1)) == 0) { 13740Sstevel@tonic-gate new_size = CY_DEFAULT_PERCPU; 13750Sstevel@tonic-gate ASSERT(old_heap == NULL && old_cyclics == NULL); 13760Sstevel@tonic-gate } 13770Sstevel@tonic-gate 13780Sstevel@tonic-gate /* 13790Sstevel@tonic-gate * Check that the new_size is a power of 2. 13800Sstevel@tonic-gate */ 13810Sstevel@tonic-gate ASSERT((new_size - 1 & new_size) == 0); 13820Sstevel@tonic-gate 13830Sstevel@tonic-gate new_heap = kmem_alloc(sizeof (cyc_index_t) * new_size, KM_SLEEP); 13840Sstevel@tonic-gate new_cyclics = kmem_zalloc(sizeof (cyclic_t) * new_size, KM_SLEEP); 13850Sstevel@tonic-gate 13860Sstevel@tonic-gate /* 13870Sstevel@tonic-gate * We know that no other expansions are in progress (they serialize 13880Sstevel@tonic-gate * on cpu_lock), so we can safely read the softbuf metadata. 13890Sstevel@tonic-gate */ 13900Sstevel@tonic-gate old_hard = cpu->cyp_softbuf[0].cys_hard; 13910Sstevel@tonic-gate 13920Sstevel@tonic-gate for (i = CY_LOW_LEVEL; i < CY_LOW_LEVEL + CY_SOFT_LEVELS; i++) { 13930Sstevel@tonic-gate cyc_softbuf_t *softbuf = &cpu->cyp_softbuf[i]; 13940Sstevel@tonic-gate char hard = softbuf->cys_hard; 13950Sstevel@tonic-gate cyc_pcbuffer_t *pc = &softbuf->cys_buf[hard ^ 1]; 13960Sstevel@tonic-gate 13970Sstevel@tonic-gate ASSERT(hard == old_hard); 13980Sstevel@tonic-gate ASSERT(hard == softbuf->cys_soft); 13990Sstevel@tonic-gate ASSERT(pc->cypc_buf == NULL); 14000Sstevel@tonic-gate 14010Sstevel@tonic-gate pc->cypc_buf = 14020Sstevel@tonic-gate kmem_alloc(sizeof (cyc_index_t) * new_size, KM_SLEEP); 14030Sstevel@tonic-gate pc->cypc_prodndx = pc->cypc_consndx = 0; 14040Sstevel@tonic-gate pc->cypc_sizemask = new_size - 1; 14050Sstevel@tonic-gate } 14060Sstevel@tonic-gate 14070Sstevel@tonic-gate arg.cyx_cpu = cpu; 14080Sstevel@tonic-gate arg.cyx_heap = new_heap; 14090Sstevel@tonic-gate arg.cyx_cyclics = new_cyclics; 14100Sstevel@tonic-gate arg.cyx_size = new_size; 14110Sstevel@tonic-gate 14120Sstevel@tonic-gate cpu->cyp_modify_levels = 0; 14130Sstevel@tonic-gate 14140Sstevel@tonic-gate be->cyb_xcall(be->cyb_arg, cpu->cyp_cpu, 14150Sstevel@tonic-gate (cyc_func_t)cyclic_expand_xcall, &arg); 14160Sstevel@tonic-gate 14170Sstevel@tonic-gate /* 14180Sstevel@tonic-gate * Now block, waiting for the resize operation to complete. 14190Sstevel@tonic-gate */ 14200Sstevel@tonic-gate sema_p(&cpu->cyp_modify_wait); 14210Sstevel@tonic-gate ASSERT(cpu->cyp_modify_levels == CY_SOFT_LEVELS); 14220Sstevel@tonic-gate 14230Sstevel@tonic-gate /* 14240Sstevel@tonic-gate * The operation is complete; we can now free the old buffers. 
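 * (The wakeup we just received came from the last soft level to notice
 * the switch: each level drains its old pcbuffer, sets cys_soft to
 * match cys_hard, increments cyp_modify_levels, and posts the level
 * below it; whichever level brings cyp_modify_levels to CY_SOFT_LEVELS
 * posts cyp_modify_wait -- hence the assertion above.)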
14250Sstevel@tonic-gate */ 14260Sstevel@tonic-gate for (i = CY_LOW_LEVEL; i < CY_LOW_LEVEL + CY_SOFT_LEVELS; i++) { 14270Sstevel@tonic-gate cyc_softbuf_t *softbuf = &cpu->cyp_softbuf[i]; 14280Sstevel@tonic-gate char hard = softbuf->cys_hard; 14290Sstevel@tonic-gate cyc_pcbuffer_t *pc = &softbuf->cys_buf[hard ^ 1]; 14300Sstevel@tonic-gate 14310Sstevel@tonic-gate ASSERT(hard == (old_hard ^ 1)); 14320Sstevel@tonic-gate ASSERT(hard == softbuf->cys_soft); 14330Sstevel@tonic-gate 14340Sstevel@tonic-gate if (pc->cypc_buf == NULL) 14350Sstevel@tonic-gate continue; 14360Sstevel@tonic-gate 14370Sstevel@tonic-gate ASSERT(pc->cypc_sizemask == ((new_size - 1) >> 1)); 14380Sstevel@tonic-gate 14390Sstevel@tonic-gate kmem_free(pc->cypc_buf, 14400Sstevel@tonic-gate sizeof (cyc_index_t) * (pc->cypc_sizemask + 1)); 14410Sstevel@tonic-gate pc->cypc_buf = NULL; 14420Sstevel@tonic-gate } 14430Sstevel@tonic-gate 14440Sstevel@tonic-gate if (old_cyclics != NULL) { 14450Sstevel@tonic-gate ASSERT(old_heap != NULL); 14460Sstevel@tonic-gate ASSERT(old_size != 0); 14470Sstevel@tonic-gate kmem_free(old_cyclics, sizeof (cyclic_t) * old_size); 14480Sstevel@tonic-gate kmem_free(old_heap, sizeof (cyc_index_t) * old_size); 14490Sstevel@tonic-gate } 14500Sstevel@tonic-gate 14510Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_EXPANDING); 14520Sstevel@tonic-gate cpu->cyp_state = CYS_ONLINE; 14530Sstevel@tonic-gate } 14540Sstevel@tonic-gate 14550Sstevel@tonic-gate /* 14560Sstevel@tonic-gate * cyclic_pick_cpu will attempt to pick a CPU according to the constraints 14570Sstevel@tonic-gate * specified by the partition, bound CPU, and flags. Additionally, 14580Sstevel@tonic-gate * cyclic_pick_cpu() will not pick the avoid CPU; it will return NULL if 14590Sstevel@tonic-gate * the avoid CPU is the only CPU which satisfies the constraints. 14600Sstevel@tonic-gate * 14610Sstevel@tonic-gate * If CYF_CPU_BOUND is set in flags, the specified CPU must be non-NULL. 14620Sstevel@tonic-gate * If CYF_PART_BOUND is set in flags, the specified partition must be non-NULL. 14630Sstevel@tonic-gate * If both CYF_CPU_BOUND and CYF_PART_BOUND are set, the specified CPU must 14640Sstevel@tonic-gate * be in the specified partition. 14650Sstevel@tonic-gate */ 14660Sstevel@tonic-gate static cyc_cpu_t * 14670Sstevel@tonic-gate cyclic_pick_cpu(cpupart_t *part, cpu_t *bound, cpu_t *avoid, uint16_t flags) 14680Sstevel@tonic-gate { 14690Sstevel@tonic-gate cpu_t *c, *start = (part != NULL) ? part->cp_cpulist : CPU; 14700Sstevel@tonic-gate cpu_t *online = NULL; 14710Sstevel@tonic-gate uintptr_t offset; 14720Sstevel@tonic-gate 14730Sstevel@tonic-gate CYC_PTRACE("pick-cpu", part, bound); 14740Sstevel@tonic-gate 14750Sstevel@tonic-gate ASSERT(!(flags & CYF_CPU_BOUND) || bound != NULL); 14760Sstevel@tonic-gate ASSERT(!(flags & CYF_PART_BOUND) || part != NULL); 14770Sstevel@tonic-gate 14780Sstevel@tonic-gate /* 14790Sstevel@tonic-gate * If we're bound to our CPU, there isn't much choice involved. We 14800Sstevel@tonic-gate * need to check that the CPU passed as bound is in the cpupart, and 14810Sstevel@tonic-gate * that the CPU that we're binding to has been configured. 
14820Sstevel@tonic-gate */ 14830Sstevel@tonic-gate if (flags & CYF_CPU_BOUND) { 14840Sstevel@tonic-gate CYC_PTRACE("pick-cpu-bound", bound, avoid); 14850Sstevel@tonic-gate 14860Sstevel@tonic-gate if ((flags & CYF_PART_BOUND) && bound->cpu_part != part) 14870Sstevel@tonic-gate panic("cyclic_pick_cpu: " 14880Sstevel@tonic-gate "CPU binding contradicts partition binding"); 14890Sstevel@tonic-gate 14900Sstevel@tonic-gate if (bound == avoid) 14910Sstevel@tonic-gate return (NULL); 14920Sstevel@tonic-gate 14930Sstevel@tonic-gate if (bound->cpu_cyclic == NULL) 14940Sstevel@tonic-gate panic("cyclic_pick_cpu: " 14950Sstevel@tonic-gate "attempt to bind to non-configured CPU"); 14960Sstevel@tonic-gate 14970Sstevel@tonic-gate return (bound->cpu_cyclic); 14980Sstevel@tonic-gate } 14990Sstevel@tonic-gate 15000Sstevel@tonic-gate if (flags & CYF_PART_BOUND) { 15010Sstevel@tonic-gate CYC_PTRACE("pick-part-bound", bound, avoid); 15020Sstevel@tonic-gate offset = offsetof(cpu_t, cpu_next_part); 15030Sstevel@tonic-gate } else { 15040Sstevel@tonic-gate offset = offsetof(cpu_t, cpu_next_onln); 15050Sstevel@tonic-gate } 15060Sstevel@tonic-gate 15070Sstevel@tonic-gate c = start; 15080Sstevel@tonic-gate do { 15090Sstevel@tonic-gate if (c->cpu_cyclic == NULL) 15100Sstevel@tonic-gate continue; 15110Sstevel@tonic-gate 15120Sstevel@tonic-gate if (c->cpu_cyclic->cyp_state == CYS_OFFLINE) 15130Sstevel@tonic-gate continue; 15140Sstevel@tonic-gate 15150Sstevel@tonic-gate if (c == avoid) 15160Sstevel@tonic-gate continue; 15170Sstevel@tonic-gate 15180Sstevel@tonic-gate if (c->cpu_flags & CPU_ENABLE) 15190Sstevel@tonic-gate goto found; 15200Sstevel@tonic-gate 15210Sstevel@tonic-gate if (online == NULL) 15220Sstevel@tonic-gate online = c; 15230Sstevel@tonic-gate } while ((c = *(cpu_t **)((uintptr_t)c + offset)) != start); 15240Sstevel@tonic-gate 15250Sstevel@tonic-gate /* 15260Sstevel@tonic-gate * If we're here, we're in one of two situations: 15270Sstevel@tonic-gate * 15280Sstevel@tonic-gate * (a) We have a partition-bound cyclic, and there is no CPU in 15290Sstevel@tonic-gate * our partition which is CPU_ENABLE'd. If we saw another 15300Sstevel@tonic-gate * non-CYS_OFFLINE CPU in our partition, we'll go with it. 15310Sstevel@tonic-gate * If not, the avoid CPU must be the only non-CYS_OFFLINE 15320Sstevel@tonic-gate * CPU in the partition; we're forced to return NULL. 15330Sstevel@tonic-gate * 15340Sstevel@tonic-gate * (b) We have a partition-unbound cyclic, in which case there 15350Sstevel@tonic-gate * must only be one CPU CPU_ENABLE'd, and it must be the one 15360Sstevel@tonic-gate * we're trying to avoid. If cyclic_juggle()/cyclic_offline() 15370Sstevel@tonic-gate * are called appropriately, this generally shouldn't happen 15380Sstevel@tonic-gate * (the offline should fail before getting to this code). 15390Sstevel@tonic-gate * At any rate: we can't avoid the avoid CPU, so we return 15400Sstevel@tonic-gate * NULL. 
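 *
 * To make (a) concrete: if the only CPU_ENABLE'd CPU in the partition
 * is the one being avoided (say, it is being offlined), but some other
 * CPU in the partition is online with interrupts disabled, that CPU
 * was remembered in 'online' above and will be chosen below.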
15410Sstevel@tonic-gate */ 15420Sstevel@tonic-gate if (!(flags & CYF_PART_BOUND)) { 15430Sstevel@tonic-gate ASSERT(avoid->cpu_flags & CPU_ENABLE); 15440Sstevel@tonic-gate return (NULL); 15450Sstevel@tonic-gate } 15460Sstevel@tonic-gate 15470Sstevel@tonic-gate CYC_PTRACE("pick-no-intr", part, avoid); 15480Sstevel@tonic-gate 15490Sstevel@tonic-gate if ((c = online) != NULL) 15500Sstevel@tonic-gate goto found; 15510Sstevel@tonic-gate 15520Sstevel@tonic-gate CYC_PTRACE("pick-fail", part, avoid); 15530Sstevel@tonic-gate ASSERT(avoid->cpu_part == start->cpu_part); 15540Sstevel@tonic-gate return (NULL); 15550Sstevel@tonic-gate 15560Sstevel@tonic-gate found: 15570Sstevel@tonic-gate CYC_PTRACE("pick-cpu-found", c, avoid); 15580Sstevel@tonic-gate ASSERT(c != avoid); 15590Sstevel@tonic-gate ASSERT(c->cpu_cyclic != NULL); 15600Sstevel@tonic-gate 15610Sstevel@tonic-gate return (c->cpu_cyclic); 15620Sstevel@tonic-gate } 15630Sstevel@tonic-gate 15640Sstevel@tonic-gate static void 15650Sstevel@tonic-gate cyclic_add_xcall(cyc_xcallarg_t *arg) 15660Sstevel@tonic-gate { 15670Sstevel@tonic-gate cyc_cpu_t *cpu = arg->cyx_cpu; 15680Sstevel@tonic-gate cyc_handler_t *hdlr = arg->cyx_hdlr; 15690Sstevel@tonic-gate cyc_time_t *when = arg->cyx_when; 15700Sstevel@tonic-gate cyc_backend_t *be = cpu->cyp_backend; 15710Sstevel@tonic-gate cyc_index_t ndx, nelems; 15720Sstevel@tonic-gate cyc_cookie_t cookie; 15730Sstevel@tonic-gate cyb_arg_t bar = be->cyb_arg; 15740Sstevel@tonic-gate cyclic_t *cyclic; 15750Sstevel@tonic-gate 15760Sstevel@tonic-gate ASSERT(cpu->cyp_nelems < cpu->cyp_size); 15770Sstevel@tonic-gate 15780Sstevel@tonic-gate cookie = be->cyb_set_level(bar, CY_HIGH_LEVEL); 15790Sstevel@tonic-gate 15800Sstevel@tonic-gate CYC_TRACE(cpu, CY_HIGH_LEVEL, 15810Sstevel@tonic-gate "add-xcall", when->cyt_when, when->cyt_interval); 15820Sstevel@tonic-gate 15830Sstevel@tonic-gate nelems = cpu->cyp_nelems++; 15840Sstevel@tonic-gate 15850Sstevel@tonic-gate if (nelems == 0) { 15860Sstevel@tonic-gate /* 15870Sstevel@tonic-gate * If this is the first element, we need to enable the 15880Sstevel@tonic-gate * backend on this CPU. 15890Sstevel@tonic-gate */ 15900Sstevel@tonic-gate CYC_TRACE0(cpu, CY_HIGH_LEVEL, "enabled"); 15910Sstevel@tonic-gate be->cyb_enable(bar); 15920Sstevel@tonic-gate } 15930Sstevel@tonic-gate 15940Sstevel@tonic-gate ndx = cpu->cyp_heap[nelems]; 15950Sstevel@tonic-gate cyclic = &cpu->cyp_cyclics[ndx]; 15960Sstevel@tonic-gate 15970Sstevel@tonic-gate ASSERT(cyclic->cy_flags == CYF_FREE); 15980Sstevel@tonic-gate cyclic->cy_interval = when->cyt_interval; 15990Sstevel@tonic-gate 16000Sstevel@tonic-gate if (when->cyt_when == 0) { 16010Sstevel@tonic-gate /* 16020Sstevel@tonic-gate * If a start time hasn't been explicitly specified, we'll 16030Sstevel@tonic-gate * start on the next interval boundary. 
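 *
 * For example (illustrative numbers only): with a 10ms interval and
 * gethrtime() returning 123,456,789ns, the computation below yields
 * (123456789 / 10000000 + 1) * 10000000 = 130,000,000ns -- the first
 * 10ms boundary strictly after the current time.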
16040Sstevel@tonic-gate */ 16050Sstevel@tonic-gate cyclic->cy_expire = (gethrtime() / cyclic->cy_interval + 1) * 16060Sstevel@tonic-gate cyclic->cy_interval; 16070Sstevel@tonic-gate } else { 16080Sstevel@tonic-gate cyclic->cy_expire = when->cyt_when; 16090Sstevel@tonic-gate } 16100Sstevel@tonic-gate 16110Sstevel@tonic-gate cyclic->cy_handler = hdlr->cyh_func; 16120Sstevel@tonic-gate cyclic->cy_arg = hdlr->cyh_arg; 16130Sstevel@tonic-gate cyclic->cy_level = hdlr->cyh_level; 16140Sstevel@tonic-gate cyclic->cy_flags = arg->cyx_flags; 16150Sstevel@tonic-gate 16160Sstevel@tonic-gate if (cyclic_upheap(cpu, nelems)) { 16170Sstevel@tonic-gate hrtime_t exp = cyclic->cy_expire; 16180Sstevel@tonic-gate 16190Sstevel@tonic-gate CYC_TRACE(cpu, CY_HIGH_LEVEL, "add-reprog", cyclic, exp); 16200Sstevel@tonic-gate 16210Sstevel@tonic-gate /* 16220Sstevel@tonic-gate * If our upheap propagated to the root, we need to 16230Sstevel@tonic-gate * reprogram the interrupt source. 16240Sstevel@tonic-gate */ 16250Sstevel@tonic-gate be->cyb_reprogram(bar, exp); 16260Sstevel@tonic-gate } 16270Sstevel@tonic-gate be->cyb_restore_level(bar, cookie); 16280Sstevel@tonic-gate 16290Sstevel@tonic-gate arg->cyx_ndx = ndx; 16300Sstevel@tonic-gate } 16310Sstevel@tonic-gate 16320Sstevel@tonic-gate static cyc_index_t 16330Sstevel@tonic-gate cyclic_add_here(cyc_cpu_t *cpu, cyc_handler_t *hdlr, 16340Sstevel@tonic-gate cyc_time_t *when, uint16_t flags) 16350Sstevel@tonic-gate { 16360Sstevel@tonic-gate cyc_backend_t *be = cpu->cyp_backend; 16370Sstevel@tonic-gate cyb_arg_t bar = be->cyb_arg; 16380Sstevel@tonic-gate cyc_xcallarg_t arg; 16390Sstevel@tonic-gate 16400Sstevel@tonic-gate CYC_PTRACE("add-cpu", cpu, hdlr->cyh_func); 16410Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 16420Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_ONLINE); 16430Sstevel@tonic-gate ASSERT(!(cpu->cyp_cpu->cpu_flags & CPU_OFFLINE)); 16440Sstevel@tonic-gate ASSERT(when->cyt_when >= 0 && when->cyt_interval > 0); 16450Sstevel@tonic-gate 16460Sstevel@tonic-gate if (cpu->cyp_nelems == cpu->cyp_size) { 16470Sstevel@tonic-gate /* 16480Sstevel@tonic-gate * This is expensive; it will cross call onto the other 16490Sstevel@tonic-gate * CPU to perform the expansion. 16500Sstevel@tonic-gate */ 16510Sstevel@tonic-gate cyclic_expand(cpu); 16520Sstevel@tonic-gate ASSERT(cpu->cyp_nelems < cpu->cyp_size); 16530Sstevel@tonic-gate } 16540Sstevel@tonic-gate 16550Sstevel@tonic-gate /* 16560Sstevel@tonic-gate * By now, we know that we're going to be able to successfully 16570Sstevel@tonic-gate * perform the add. Now cross call over to the CPU of interest to 16580Sstevel@tonic-gate * actually add our cyclic. 
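 *
 * (For reference, a consumer several layers above us typically filled
 * in hdlr and when along the following lines -- a sketch only, with a
 * hypothetical handler and interval, hdlr a cyc_handler_t, when a
 * cyc_time_t, id a cyclic_id_t, and omni cyclics and error handling
 * elided:
 *
 *      hdlr.cyh_func = my_handler;
 *      hdlr.cyh_arg = my_arg;
 *      hdlr.cyh_level = CY_LOW_LEVEL;
 *      when.cyt_when = 0;                      (next interval boundary)
 *      when.cyt_interval = NANOSEC / 100;      (every 10 milliseconds)
 *      mutex_enter(&cpu_lock);
 *      id = cyclic_add(&hdlr, &when);
 *      mutex_exit(&cpu_lock);
 *
 * cyclic_add() ultimately lands here with those values.)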
16590Sstevel@tonic-gate */ 16600Sstevel@tonic-gate arg.cyx_cpu = cpu; 16610Sstevel@tonic-gate arg.cyx_hdlr = hdlr; 16620Sstevel@tonic-gate arg.cyx_when = when; 16630Sstevel@tonic-gate arg.cyx_flags = flags; 16640Sstevel@tonic-gate 16650Sstevel@tonic-gate be->cyb_xcall(bar, cpu->cyp_cpu, (cyc_func_t)cyclic_add_xcall, &arg); 16660Sstevel@tonic-gate 16670Sstevel@tonic-gate CYC_PTRACE("add-cpu-done", cpu, arg.cyx_ndx); 16680Sstevel@tonic-gate 16690Sstevel@tonic-gate return (arg.cyx_ndx); 16700Sstevel@tonic-gate } 16710Sstevel@tonic-gate 16720Sstevel@tonic-gate static void 16730Sstevel@tonic-gate cyclic_remove_xcall(cyc_xcallarg_t *arg) 16740Sstevel@tonic-gate { 16750Sstevel@tonic-gate cyc_cpu_t *cpu = arg->cyx_cpu; 16760Sstevel@tonic-gate cyc_backend_t *be = cpu->cyp_backend; 16770Sstevel@tonic-gate cyb_arg_t bar = be->cyb_arg; 16780Sstevel@tonic-gate cyc_cookie_t cookie; 16790Sstevel@tonic-gate cyc_index_t ndx = arg->cyx_ndx, nelems = cpu->cyp_nelems, i; 16800Sstevel@tonic-gate cyc_index_t *heap = cpu->cyp_heap, last; 16810Sstevel@tonic-gate cyclic_t *cyclic; 16820Sstevel@tonic-gate #ifdef DEBUG 16830Sstevel@tonic-gate cyc_index_t root; 16840Sstevel@tonic-gate #endif 16850Sstevel@tonic-gate 16860Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_REMOVING); 16870Sstevel@tonic-gate ASSERT(nelems > 0); 16880Sstevel@tonic-gate 16890Sstevel@tonic-gate cookie = be->cyb_set_level(bar, CY_HIGH_LEVEL); 16900Sstevel@tonic-gate 16910Sstevel@tonic-gate CYC_TRACE1(cpu, CY_HIGH_LEVEL, "remove-xcall", ndx); 16920Sstevel@tonic-gate 16930Sstevel@tonic-gate cyclic = &cpu->cyp_cyclics[ndx]; 16940Sstevel@tonic-gate 16950Sstevel@tonic-gate /* 16960Sstevel@tonic-gate * Grab the current expiration time. If this cyclic is being 16970Sstevel@tonic-gate * removed as part of a juggling operation, the expiration time 16980Sstevel@tonic-gate * will be used when the cyclic is added to the new CPU. 16990Sstevel@tonic-gate */ 17000Sstevel@tonic-gate if (arg->cyx_when != NULL) { 17010Sstevel@tonic-gate arg->cyx_when->cyt_when = cyclic->cy_expire; 17020Sstevel@tonic-gate arg->cyx_when->cyt_interval = cyclic->cy_interval; 17030Sstevel@tonic-gate } 17040Sstevel@tonic-gate 17050Sstevel@tonic-gate if (cyclic->cy_pend != 0) { 17060Sstevel@tonic-gate /* 17070Sstevel@tonic-gate * The pend is non-zero; this cyclic is currently being 17080Sstevel@tonic-gate * executed (or will be executed shortly). If the caller 17090Sstevel@tonic-gate * refuses to wait, we must return (doing nothing). Otherwise, 17100Sstevel@tonic-gate * we will stash the pend value in this CPU's rpend, and 17110Sstevel@tonic-gate * then zero it out. The softint in the pend loop will see 17120Sstevel@tonic-gate * that we have zeroed out pend, and will call the cyclic 17130Sstevel@tonic-gate * handler rpend times. The caller will wait until the 17140Sstevel@tonic-gate * softint has completed calling the cyclic handler. 17150Sstevel@tonic-gate */ 17160Sstevel@tonic-gate if (arg->cyx_wait == CY_NOWAIT) { 17170Sstevel@tonic-gate arg->cyx_wait = CY_WAIT; 17180Sstevel@tonic-gate goto out; 17190Sstevel@tonic-gate } 17200Sstevel@tonic-gate 17210Sstevel@tonic-gate ASSERT(cyclic->cy_level != CY_HIGH_LEVEL); 17220Sstevel@tonic-gate CYC_TRACE1(cpu, CY_HIGH_LEVEL, "remove-pend", cyclic->cy_pend); 17230Sstevel@tonic-gate cpu->cyp_rpend = cyclic->cy_pend; 17240Sstevel@tonic-gate cyclic->cy_pend = 0; 17250Sstevel@tonic-gate } 17260Sstevel@tonic-gate 17270Sstevel@tonic-gate /* 17280Sstevel@tonic-gate * Now set the flags to CYF_FREE.
We don't need a membar_enter() 17290Sstevel@tonic-gate * between zeroing pend and setting the flags because we're at 17300Sstevel@tonic-gate * CY_HIGH_LEVEL (that is, the zeroing of pend and the setting 17310Sstevel@tonic-gate * of cy_flags appear atomic to softints). 17320Sstevel@tonic-gate */ 17330Sstevel@tonic-gate cyclic->cy_flags = CYF_FREE; 17340Sstevel@tonic-gate 17350Sstevel@tonic-gate for (i = 0; i < nelems; i++) { 17360Sstevel@tonic-gate if (heap[i] == ndx) 17370Sstevel@tonic-gate break; 17380Sstevel@tonic-gate } 17390Sstevel@tonic-gate 17400Sstevel@tonic-gate if (i == nelems) 17410Sstevel@tonic-gate panic("attempt to remove non-existent cyclic"); 17420Sstevel@tonic-gate 17430Sstevel@tonic-gate cpu->cyp_nelems = --nelems; 17440Sstevel@tonic-gate 17450Sstevel@tonic-gate if (nelems == 0) { 17460Sstevel@tonic-gate /* 17470Sstevel@tonic-gate * If we just removed the last element, then we need to 17480Sstevel@tonic-gate * disable the backend on this CPU. 17490Sstevel@tonic-gate */ 17500Sstevel@tonic-gate CYC_TRACE0(cpu, CY_HIGH_LEVEL, "disabled"); 17510Sstevel@tonic-gate be->cyb_disable(bar); 17520Sstevel@tonic-gate } 17530Sstevel@tonic-gate 17540Sstevel@tonic-gate if (i == nelems) { 17550Sstevel@tonic-gate /* 17560Sstevel@tonic-gate * If we just removed the last element of the heap, then 17570Sstevel@tonic-gate * we don't have to downheap. 17580Sstevel@tonic-gate */ 17590Sstevel@tonic-gate CYC_TRACE0(cpu, CY_HIGH_LEVEL, "remove-bottom"); 17600Sstevel@tonic-gate goto out; 17610Sstevel@tonic-gate } 17620Sstevel@tonic-gate 17630Sstevel@tonic-gate #ifdef DEBUG 17640Sstevel@tonic-gate root = heap[0]; 17650Sstevel@tonic-gate #endif 17660Sstevel@tonic-gate 17670Sstevel@tonic-gate /* 17680Sstevel@tonic-gate * Swap the last element of the heap with the one we want to 17690Sstevel@tonic-gate * remove, and downheap (this has the implicit effect of putting 17700Sstevel@tonic-gate * the newly freed element on the free list). 17710Sstevel@tonic-gate */ 17720Sstevel@tonic-gate heap[i] = (last = heap[nelems]); 17730Sstevel@tonic-gate heap[nelems] = ndx; 17740Sstevel@tonic-gate 17750Sstevel@tonic-gate if (i == 0) { 17760Sstevel@tonic-gate CYC_TRACE0(cpu, CY_HIGH_LEVEL, "remove-root"); 17770Sstevel@tonic-gate cyclic_downheap(cpu, 0); 17780Sstevel@tonic-gate } else { 17790Sstevel@tonic-gate if (cyclic_upheap(cpu, i) == 0) { 17800Sstevel@tonic-gate /* 17810Sstevel@tonic-gate * The upheap didn't propagate to the root; if it 17820Sstevel@tonic-gate * didn't propagate at all, we need to downheap. 17830Sstevel@tonic-gate */ 17840Sstevel@tonic-gate CYC_TRACE0(cpu, CY_HIGH_LEVEL, "remove-no-root"); 17850Sstevel@tonic-gate if (heap[i] == last) { 17860Sstevel@tonic-gate CYC_TRACE0(cpu, CY_HIGH_LEVEL, "remove-no-up"); 17870Sstevel@tonic-gate cyclic_downheap(cpu, i); 17880Sstevel@tonic-gate } 17890Sstevel@tonic-gate ASSERT(heap[0] == root); 17900Sstevel@tonic-gate goto out; 17910Sstevel@tonic-gate } 17920Sstevel@tonic-gate } 17930Sstevel@tonic-gate 17940Sstevel@tonic-gate /* 17950Sstevel@tonic-gate * We're here because we changed the root; we need to reprogram 17960Sstevel@tonic-gate * the clock source. 
17970Sstevel@tonic-gate */ 17980Sstevel@tonic-gate cyclic = &cpu->cyp_cyclics[heap[0]]; 17990Sstevel@tonic-gate 18000Sstevel@tonic-gate CYC_TRACE0(cpu, CY_HIGH_LEVEL, "remove-reprog"); 18010Sstevel@tonic-gate 18020Sstevel@tonic-gate ASSERT(nelems != 0); 18030Sstevel@tonic-gate be->cyb_reprogram(bar, cyclic->cy_expire); 18040Sstevel@tonic-gate out: 18050Sstevel@tonic-gate be->cyb_restore_level(bar, cookie); 18060Sstevel@tonic-gate } 18070Sstevel@tonic-gate 18080Sstevel@tonic-gate static int 18090Sstevel@tonic-gate cyclic_remove_here(cyc_cpu_t *cpu, cyc_index_t ndx, cyc_time_t *when, int wait) 18100Sstevel@tonic-gate { 18110Sstevel@tonic-gate cyc_backend_t *be = cpu->cyp_backend; 18120Sstevel@tonic-gate cyc_xcallarg_t arg; 18130Sstevel@tonic-gate cyclic_t *cyclic = &cpu->cyp_cyclics[ndx]; 18140Sstevel@tonic-gate cyc_level_t level = cyclic->cy_level; 18150Sstevel@tonic-gate 18160Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 18170Sstevel@tonic-gate ASSERT(cpu->cyp_rpend == 0); 18180Sstevel@tonic-gate ASSERT(wait == CY_WAIT || wait == CY_NOWAIT); 18190Sstevel@tonic-gate 18200Sstevel@tonic-gate arg.cyx_ndx = ndx; 18210Sstevel@tonic-gate arg.cyx_cpu = cpu; 18220Sstevel@tonic-gate arg.cyx_when = when; 18230Sstevel@tonic-gate arg.cyx_wait = wait; 18240Sstevel@tonic-gate 18250Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_ONLINE); 18260Sstevel@tonic-gate cpu->cyp_state = CYS_REMOVING; 18270Sstevel@tonic-gate 18280Sstevel@tonic-gate be->cyb_xcall(be->cyb_arg, cpu->cyp_cpu, 18290Sstevel@tonic-gate (cyc_func_t)cyclic_remove_xcall, &arg); 18300Sstevel@tonic-gate 18310Sstevel@tonic-gate /* 18320Sstevel@tonic-gate * If the cyclic we removed wasn't at CY_HIGH_LEVEL, then we need to 18330Sstevel@tonic-gate * check the cyp_rpend. If it's non-zero, then we need to wait here 18340Sstevel@tonic-gate * for all pending cyclic handlers to run. 18350Sstevel@tonic-gate */ 18360Sstevel@tonic-gate ASSERT(!(level == CY_HIGH_LEVEL && cpu->cyp_rpend != 0)); 18370Sstevel@tonic-gate ASSERT(!(wait == CY_NOWAIT && cpu->cyp_rpend != 0)); 18380Sstevel@tonic-gate ASSERT(!(arg.cyx_wait == CY_NOWAIT && cpu->cyp_rpend != 0)); 18390Sstevel@tonic-gate 18400Sstevel@tonic-gate if (wait != arg.cyx_wait) { 18410Sstevel@tonic-gate /* 18420Sstevel@tonic-gate * We are being told that we must wait if we want to 18430Sstevel@tonic-gate * remove this cyclic; put the CPU back in the CYS_ONLINE 18440Sstevel@tonic-gate * state and return failure. 18450Sstevel@tonic-gate */ 18460Sstevel@tonic-gate ASSERT(wait == CY_NOWAIT && arg.cyx_wait == CY_WAIT); 18470Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_REMOVING); 18480Sstevel@tonic-gate cpu->cyp_state = CYS_ONLINE; 18490Sstevel@tonic-gate 18500Sstevel@tonic-gate return (0); 18510Sstevel@tonic-gate } 18520Sstevel@tonic-gate 18530Sstevel@tonic-gate if (cpu->cyp_rpend != 0) 18540Sstevel@tonic-gate sema_p(&cpu->cyp_modify_wait); 18550Sstevel@tonic-gate 18560Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_REMOVING); 18570Sstevel@tonic-gate 18580Sstevel@tonic-gate cpu->cyp_rpend = 0; 18590Sstevel@tonic-gate cpu->cyp_state = CYS_ONLINE; 18600Sstevel@tonic-gate 18610Sstevel@tonic-gate return (1); 18620Sstevel@tonic-gate } 18630Sstevel@tonic-gate 18640Sstevel@tonic-gate /* 18650Sstevel@tonic-gate * cyclic_juggle_one_to() should only be called when the source cyclic 18660Sstevel@tonic-gate * can be juggled and the destination CPU is known to be able to accept 18670Sstevel@tonic-gate * it. 
18680Sstevel@tonic-gate */ 18690Sstevel@tonic-gate static void 18700Sstevel@tonic-gate cyclic_juggle_one_to(cyc_id_t *idp, cyc_cpu_t *dest) 18710Sstevel@tonic-gate { 18720Sstevel@tonic-gate cyc_cpu_t *src = idp->cyi_cpu; 18730Sstevel@tonic-gate cyc_index_t ndx = idp->cyi_ndx; 18740Sstevel@tonic-gate cyc_time_t when; 18750Sstevel@tonic-gate cyc_handler_t hdlr; 18760Sstevel@tonic-gate cyclic_t *cyclic; 18770Sstevel@tonic-gate uint16_t flags; 18780Sstevel@tonic-gate hrtime_t delay; 18790Sstevel@tonic-gate 18800Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 18810Sstevel@tonic-gate ASSERT(src != NULL && idp->cyi_omni_list == NULL); 18820Sstevel@tonic-gate ASSERT(!(dest->cyp_cpu->cpu_flags & (CPU_QUIESCED | CPU_OFFLINE))); 18830Sstevel@tonic-gate CYC_PTRACE("juggle-one-to", idp, dest); 18840Sstevel@tonic-gate 18850Sstevel@tonic-gate cyclic = &src->cyp_cyclics[ndx]; 18860Sstevel@tonic-gate 18870Sstevel@tonic-gate flags = cyclic->cy_flags; 18880Sstevel@tonic-gate ASSERT(!(flags & CYF_CPU_BOUND) && !(flags & CYF_FREE)); 18890Sstevel@tonic-gate 18900Sstevel@tonic-gate hdlr.cyh_func = cyclic->cy_handler; 18910Sstevel@tonic-gate hdlr.cyh_level = cyclic->cy_level; 18920Sstevel@tonic-gate hdlr.cyh_arg = cyclic->cy_arg; 18930Sstevel@tonic-gate 18940Sstevel@tonic-gate /* 18950Sstevel@tonic-gate * Before we begin the juggling process, see if the destination 18960Sstevel@tonic-gate * CPU requires an expansion. If it does, we'll perform the 18970Sstevel@tonic-gate * expansion before removing the cyclic. This is to prevent us 18980Sstevel@tonic-gate * from blocking while a system-critical cyclic (notably, the clock 18990Sstevel@tonic-gate * cyclic) isn't on a CPU. 19000Sstevel@tonic-gate */ 19010Sstevel@tonic-gate if (dest->cyp_nelems == dest->cyp_size) { 19020Sstevel@tonic-gate CYC_PTRACE("remove-expand", idp, dest); 19030Sstevel@tonic-gate cyclic_expand(dest); 19040Sstevel@tonic-gate ASSERT(dest->cyp_nelems < dest->cyp_size); 19050Sstevel@tonic-gate } 19060Sstevel@tonic-gate 19070Sstevel@tonic-gate /* 19080Sstevel@tonic-gate * Remove the cyclic from the source. As mentioned above, we cannot 19090Sstevel@tonic-gate * block during this operation; if we cannot remove the cyclic 19100Sstevel@tonic-gate * without waiting, we spin for a time shorter than the interval, and 19110Sstevel@tonic-gate * reattempt the (non-blocking) removal. If we continue to fail, 19120Sstevel@tonic-gate * we will exponentially back off (up to half of the interval). 19130Sstevel@tonic-gate * Note that the removal will ultimately succeed -- even if the 19140Sstevel@tonic-gate * cyclic handler is blocked on a resource held by a thread which we 19150Sstevel@tonic-gate * have preempted, priority inheritance assures that the preempted 19160Sstevel@tonic-gate * thread will preempt us and continue to progress. 19170Sstevel@tonic-gate */ 19180Sstevel@tonic-gate for (delay = NANOSEC / MICROSEC; ; delay <<= 1) { 19190Sstevel@tonic-gate /* 19200Sstevel@tonic-gate * Before we begin this operation, disable kernel preemption. 19210Sstevel@tonic-gate */ 19220Sstevel@tonic-gate kpreempt_disable(); 19230Sstevel@tonic-gate if (cyclic_remove_here(src, ndx, &when, CY_NOWAIT)) 19240Sstevel@tonic-gate break; 19250Sstevel@tonic-gate 19260Sstevel@tonic-gate /* 19270Sstevel@tonic-gate * The operation failed; enable kernel preemption while 19280Sstevel@tonic-gate * spinning. 
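 * (The wait below starts at one microsecond -- delay begins at
 * NANOSEC / MICROSEC nanoseconds -- and doubles on each failed
 * attempt, clamped to half the cyclic's interval; drv_usecwait()
 * expects microseconds, hence the division.)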
19290Sstevel@tonic-gate */ 19300Sstevel@tonic-gate kpreempt_enable(); 19310Sstevel@tonic-gate 19320Sstevel@tonic-gate CYC_PTRACE("remove-retry", idp, src); 19330Sstevel@tonic-gate 19340Sstevel@tonic-gate if (delay > (cyclic->cy_interval >> 1)) 19350Sstevel@tonic-gate delay = cyclic->cy_interval >> 1; 19360Sstevel@tonic-gate 19370Sstevel@tonic-gate drv_usecwait((clock_t)(delay / (NANOSEC / MICROSEC))); 19380Sstevel@tonic-gate } 19390Sstevel@tonic-gate 19400Sstevel@tonic-gate /* 19410Sstevel@tonic-gate * Now add the cyclic to the destination. This won't block; we 19420Sstevel@tonic-gate * performed any necessary (blocking) expansion of the destination 19430Sstevel@tonic-gate * CPU before removing the cyclic from the source CPU. 19440Sstevel@tonic-gate */ 19450Sstevel@tonic-gate idp->cyi_ndx = cyclic_add_here(dest, &hdlr, &when, flags); 19460Sstevel@tonic-gate idp->cyi_cpu = dest; 19470Sstevel@tonic-gate kpreempt_enable(); 19480Sstevel@tonic-gate } 19490Sstevel@tonic-gate 19500Sstevel@tonic-gate static int 19510Sstevel@tonic-gate cyclic_juggle_one(cyc_id_t *idp) 19520Sstevel@tonic-gate { 19530Sstevel@tonic-gate cyc_index_t ndx = idp->cyi_ndx; 19540Sstevel@tonic-gate cyc_cpu_t *cpu = idp->cyi_cpu, *dest; 19550Sstevel@tonic-gate cyclic_t *cyclic = &cpu->cyp_cyclics[ndx]; 19560Sstevel@tonic-gate cpu_t *c = cpu->cyp_cpu; 19570Sstevel@tonic-gate cpupart_t *part = c->cpu_part; 19580Sstevel@tonic-gate 19590Sstevel@tonic-gate CYC_PTRACE("juggle-one", idp, cpu); 19600Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 19610Sstevel@tonic-gate ASSERT(!(c->cpu_flags & CPU_OFFLINE)); 19620Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_ONLINE); 19630Sstevel@tonic-gate ASSERT(!(cyclic->cy_flags & CYF_FREE)); 19640Sstevel@tonic-gate 19650Sstevel@tonic-gate if ((dest = cyclic_pick_cpu(part, c, c, cyclic->cy_flags)) == NULL) { 19660Sstevel@tonic-gate /* 19670Sstevel@tonic-gate * Bad news: this cyclic can't be juggled. 19680Sstevel@tonic-gate */ 19690Sstevel@tonic-gate CYC_PTRACE("juggle-fail", idp, cpu) 19700Sstevel@tonic-gate return (0); 19710Sstevel@tonic-gate } 19720Sstevel@tonic-gate 19730Sstevel@tonic-gate cyclic_juggle_one_to(idp, dest); 19740Sstevel@tonic-gate 19750Sstevel@tonic-gate return (1); 19760Sstevel@tonic-gate } 19770Sstevel@tonic-gate 19780Sstevel@tonic-gate static void 19790Sstevel@tonic-gate cyclic_unbind_cpu(cyclic_id_t id) 19800Sstevel@tonic-gate { 19810Sstevel@tonic-gate cyc_id_t *idp = (cyc_id_t *)id; 19820Sstevel@tonic-gate cyc_cpu_t *cpu = idp->cyi_cpu; 19830Sstevel@tonic-gate cpu_t *c = cpu->cyp_cpu; 19840Sstevel@tonic-gate cyclic_t *cyclic = &cpu->cyp_cyclics[idp->cyi_ndx]; 19850Sstevel@tonic-gate 19860Sstevel@tonic-gate CYC_PTRACE("unbind-cpu", id, cpu); 19870Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 19880Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_ONLINE); 19890Sstevel@tonic-gate ASSERT(!(cyclic->cy_flags & CYF_FREE)); 19900Sstevel@tonic-gate ASSERT(cyclic->cy_flags & CYF_CPU_BOUND); 19910Sstevel@tonic-gate 19920Sstevel@tonic-gate cyclic->cy_flags &= ~CYF_CPU_BOUND; 19930Sstevel@tonic-gate 19940Sstevel@tonic-gate /* 19950Sstevel@tonic-gate * If we were bound to CPU which has interrupts disabled, we need 19960Sstevel@tonic-gate * to juggle away. This can only fail if we are bound to a 19970Sstevel@tonic-gate * processor set, and if every CPU in the processor set has 19980Sstevel@tonic-gate * interrupts disabled. 
19990Sstevel@tonic-gate */ 20000Sstevel@tonic-gate if (!(c->cpu_flags & CPU_ENABLE)) { 20010Sstevel@tonic-gate int res = cyclic_juggle_one(idp); 20020Sstevel@tonic-gate 20030Sstevel@tonic-gate ASSERT((res && idp->cyi_cpu != cpu) || 20040Sstevel@tonic-gate (!res && (cyclic->cy_flags & CYF_PART_BOUND))); 20050Sstevel@tonic-gate } 20060Sstevel@tonic-gate } 20070Sstevel@tonic-gate 20080Sstevel@tonic-gate static void 20090Sstevel@tonic-gate cyclic_bind_cpu(cyclic_id_t id, cpu_t *d) 20100Sstevel@tonic-gate { 20110Sstevel@tonic-gate cyc_id_t *idp = (cyc_id_t *)id; 20120Sstevel@tonic-gate cyc_cpu_t *dest = d->cpu_cyclic, *cpu = idp->cyi_cpu; 20130Sstevel@tonic-gate cpu_t *c = cpu->cyp_cpu; 20140Sstevel@tonic-gate cyclic_t *cyclic = &cpu->cyp_cyclics[idp->cyi_ndx]; 20150Sstevel@tonic-gate cpupart_t *part = c->cpu_part; 20160Sstevel@tonic-gate 20170Sstevel@tonic-gate CYC_PTRACE("bind-cpu", id, dest); 20180Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 20190Sstevel@tonic-gate ASSERT(!(d->cpu_flags & CPU_OFFLINE)); 20200Sstevel@tonic-gate ASSERT(!(c->cpu_flags & CPU_OFFLINE)); 20210Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_ONLINE); 20220Sstevel@tonic-gate ASSERT(dest != NULL); 20230Sstevel@tonic-gate ASSERT(dest->cyp_state == CYS_ONLINE); 20240Sstevel@tonic-gate ASSERT(!(cyclic->cy_flags & CYF_FREE)); 20250Sstevel@tonic-gate ASSERT(!(cyclic->cy_flags & CYF_CPU_BOUND)); 20260Sstevel@tonic-gate 20270Sstevel@tonic-gate dest = cyclic_pick_cpu(part, d, NULL, cyclic->cy_flags | CYF_CPU_BOUND); 20280Sstevel@tonic-gate 20290Sstevel@tonic-gate if (dest != cpu) { 20300Sstevel@tonic-gate cyclic_juggle_one_to(idp, dest); 20310Sstevel@tonic-gate cyclic = &dest->cyp_cyclics[idp->cyi_ndx]; 20320Sstevel@tonic-gate } 20330Sstevel@tonic-gate 20340Sstevel@tonic-gate cyclic->cy_flags |= CYF_CPU_BOUND; 20350Sstevel@tonic-gate } 20360Sstevel@tonic-gate 20370Sstevel@tonic-gate static void 20380Sstevel@tonic-gate cyclic_unbind_cpupart(cyclic_id_t id) 20390Sstevel@tonic-gate { 20400Sstevel@tonic-gate cyc_id_t *idp = (cyc_id_t *)id; 20410Sstevel@tonic-gate cyc_cpu_t *cpu = idp->cyi_cpu; 20420Sstevel@tonic-gate cpu_t *c = cpu->cyp_cpu; 20430Sstevel@tonic-gate cyclic_t *cyc = &cpu->cyp_cyclics[idp->cyi_ndx]; 20440Sstevel@tonic-gate 20450Sstevel@tonic-gate CYC_PTRACE("unbind-part", idp, c->cpu_part); 20460Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 20470Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_ONLINE); 20480Sstevel@tonic-gate ASSERT(!(cyc->cy_flags & CYF_FREE)); 20490Sstevel@tonic-gate ASSERT(cyc->cy_flags & CYF_PART_BOUND); 20500Sstevel@tonic-gate 20510Sstevel@tonic-gate cyc->cy_flags &= ~CYF_PART_BOUND; 20520Sstevel@tonic-gate 20530Sstevel@tonic-gate /* 20540Sstevel@tonic-gate * If we're on a CPU which has interrupts disabled (and if this cyclic 20550Sstevel@tonic-gate * isn't bound to the CPU), we need to juggle away. 
20560Sstevel@tonic-gate */ 20570Sstevel@tonic-gate if (!(c->cpu_flags & CPU_ENABLE) && !(cyc->cy_flags & CYF_CPU_BOUND)) { 20580Sstevel@tonic-gate int res = cyclic_juggle_one(idp); 20590Sstevel@tonic-gate 20600Sstevel@tonic-gate ASSERT(res && idp->cyi_cpu != cpu); 20610Sstevel@tonic-gate } 20620Sstevel@tonic-gate } 20630Sstevel@tonic-gate 20640Sstevel@tonic-gate static void 20650Sstevel@tonic-gate cyclic_bind_cpupart(cyclic_id_t id, cpupart_t *part) 20660Sstevel@tonic-gate { 20670Sstevel@tonic-gate cyc_id_t *idp = (cyc_id_t *)id; 20680Sstevel@tonic-gate cyc_cpu_t *cpu = idp->cyi_cpu, *dest; 20690Sstevel@tonic-gate cpu_t *c = cpu->cyp_cpu; 20700Sstevel@tonic-gate cyclic_t *cyc = &cpu->cyp_cyclics[idp->cyi_ndx]; 20710Sstevel@tonic-gate 20720Sstevel@tonic-gate CYC_PTRACE("bind-part", idp, part); 20730Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 20740Sstevel@tonic-gate ASSERT(!(c->cpu_flags & CPU_OFFLINE)); 20750Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_ONLINE); 20760Sstevel@tonic-gate ASSERT(!(cyc->cy_flags & CYF_FREE)); 20770Sstevel@tonic-gate ASSERT(!(cyc->cy_flags & CYF_PART_BOUND)); 20780Sstevel@tonic-gate ASSERT(part->cp_ncpus > 0); 20790Sstevel@tonic-gate 20800Sstevel@tonic-gate dest = cyclic_pick_cpu(part, c, NULL, cyc->cy_flags | CYF_PART_BOUND); 20810Sstevel@tonic-gate 20820Sstevel@tonic-gate if (dest != cpu) { 20830Sstevel@tonic-gate cyclic_juggle_one_to(idp, dest); 20840Sstevel@tonic-gate cyc = &dest->cyp_cyclics[idp->cyi_ndx]; 20850Sstevel@tonic-gate } 20860Sstevel@tonic-gate 20870Sstevel@tonic-gate cyc->cy_flags |= CYF_PART_BOUND; 20880Sstevel@tonic-gate } 20890Sstevel@tonic-gate 20900Sstevel@tonic-gate static void 20910Sstevel@tonic-gate cyclic_configure(cpu_t *c) 20920Sstevel@tonic-gate { 20930Sstevel@tonic-gate cyc_cpu_t *cpu = kmem_zalloc(sizeof (cyc_cpu_t), KM_SLEEP); 20940Sstevel@tonic-gate cyc_backend_t *nbe = kmem_zalloc(sizeof (cyc_backend_t), KM_SLEEP); 20950Sstevel@tonic-gate int i; 20960Sstevel@tonic-gate 20970Sstevel@tonic-gate CYC_PTRACE1("configure", cpu); 20980Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 20990Sstevel@tonic-gate 21000Sstevel@tonic-gate if (cyclic_id_cache == NULL) 21010Sstevel@tonic-gate cyclic_id_cache = kmem_cache_create("cyclic_id_cache", 21020Sstevel@tonic-gate sizeof (cyc_id_t), 0, NULL, NULL, NULL, NULL, NULL, 0); 21030Sstevel@tonic-gate 21040Sstevel@tonic-gate cpu->cyp_cpu = c; 21050Sstevel@tonic-gate 21060Sstevel@tonic-gate sema_init(&cpu->cyp_modify_wait, 0, NULL, SEMA_DEFAULT, NULL); 21070Sstevel@tonic-gate 21080Sstevel@tonic-gate cpu->cyp_size = 1; 21090Sstevel@tonic-gate cpu->cyp_heap = kmem_zalloc(sizeof (cyc_index_t), KM_SLEEP); 21100Sstevel@tonic-gate cpu->cyp_cyclics = kmem_zalloc(sizeof (cyclic_t), KM_SLEEP); 21110Sstevel@tonic-gate cpu->cyp_cyclics->cy_flags = CYF_FREE; 21120Sstevel@tonic-gate 21130Sstevel@tonic-gate for (i = CY_LOW_LEVEL; i < CY_LOW_LEVEL + CY_SOFT_LEVELS; i++) { 21140Sstevel@tonic-gate /* 21150Sstevel@tonic-gate * We don't need to set the sizemask; it's already zero 21160Sstevel@tonic-gate * (which is the appropriate sizemask for a size of 1). 21170Sstevel@tonic-gate */ 21180Sstevel@tonic-gate cpu->cyp_softbuf[i].cys_buf[0].cypc_buf = 21190Sstevel@tonic-gate kmem_alloc(sizeof (cyc_index_t), KM_SLEEP); 21200Sstevel@tonic-gate } 21210Sstevel@tonic-gate 21220Sstevel@tonic-gate cpu->cyp_state = CYS_OFFLINE; 21230Sstevel@tonic-gate 21240Sstevel@tonic-gate /* 21250Sstevel@tonic-gate * Setup the backend for this CPU. 
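 * Each CPU gets its own copy of the global cyclic_backend ops vector
 * so that cyb_arg can carry the per-CPU state handed back by the
 * backend's cyb_configure() below.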
21260Sstevel@tonic-gate */ 21270Sstevel@tonic-gate bcopy(&cyclic_backend, nbe, sizeof (cyc_backend_t)); 21280Sstevel@tonic-gate nbe->cyb_arg = nbe->cyb_configure(c); 21290Sstevel@tonic-gate cpu->cyp_backend = nbe; 21300Sstevel@tonic-gate 21310Sstevel@tonic-gate /* 21320Sstevel@tonic-gate * On platforms where stray interrupts may be taken during startup, 21330Sstevel@tonic-gate * the CPU's cpu_cyclic pointer serves as an indicator that the 21340Sstevel@tonic-gate * cyclic subsystem for this CPU is prepared to field interrupts. 21350Sstevel@tonic-gate */ 21360Sstevel@tonic-gate membar_producer(); 21370Sstevel@tonic-gate 21380Sstevel@tonic-gate c->cpu_cyclic = cpu; 21390Sstevel@tonic-gate } 21400Sstevel@tonic-gate 21410Sstevel@tonic-gate static void 21420Sstevel@tonic-gate cyclic_unconfigure(cpu_t *c) 21430Sstevel@tonic-gate { 21440Sstevel@tonic-gate cyc_cpu_t *cpu = c->cpu_cyclic; 21450Sstevel@tonic-gate cyc_backend_t *be = cpu->cyp_backend; 21460Sstevel@tonic-gate cyb_arg_t bar = be->cyb_arg; 21470Sstevel@tonic-gate int i; 21480Sstevel@tonic-gate 21490Sstevel@tonic-gate CYC_PTRACE1("unconfigure", cpu); 21500Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 21510Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_OFFLINE); 21520Sstevel@tonic-gate ASSERT(cpu->cyp_nelems == 0); 21530Sstevel@tonic-gate 21540Sstevel@tonic-gate /* 21550Sstevel@tonic-gate * Let the backend know that the CPU is being yanked, and free up 21560Sstevel@tonic-gate * the backend structure. 21570Sstevel@tonic-gate */ 21580Sstevel@tonic-gate be->cyb_unconfigure(bar); 21590Sstevel@tonic-gate kmem_free(be, sizeof (cyc_backend_t)); 21600Sstevel@tonic-gate cpu->cyp_backend = NULL; 21610Sstevel@tonic-gate 21620Sstevel@tonic-gate /* 21630Sstevel@tonic-gate * Free up the producer/consumer buffers at each of the soft levels. 21640Sstevel@tonic-gate */ 21650Sstevel@tonic-gate for (i = CY_LOW_LEVEL; i < CY_LOW_LEVEL + CY_SOFT_LEVELS; i++) { 21660Sstevel@tonic-gate cyc_softbuf_t *softbuf = &cpu->cyp_softbuf[i]; 21670Sstevel@tonic-gate uchar_t hard = softbuf->cys_hard; 21680Sstevel@tonic-gate cyc_pcbuffer_t *pc = &softbuf->cys_buf[hard]; 21690Sstevel@tonic-gate size_t bufsize = sizeof (cyc_index_t) * (pc->cypc_sizemask + 1); 21700Sstevel@tonic-gate 21710Sstevel@tonic-gate /* 21720Sstevel@tonic-gate * Assert that we're not in the middle of a resize operation. 21730Sstevel@tonic-gate */ 21740Sstevel@tonic-gate ASSERT(hard == softbuf->cys_soft); 21750Sstevel@tonic-gate ASSERT(hard == 0 || hard == 1); 21760Sstevel@tonic-gate ASSERT(pc->cypc_buf != NULL); 21770Sstevel@tonic-gate ASSERT(softbuf->cys_buf[hard ^ 1].cypc_buf == NULL); 21780Sstevel@tonic-gate 21790Sstevel@tonic-gate kmem_free(pc->cypc_buf, bufsize); 21800Sstevel@tonic-gate pc->cypc_buf = NULL; 21810Sstevel@tonic-gate } 21820Sstevel@tonic-gate 21830Sstevel@tonic-gate /* 21840Sstevel@tonic-gate * Finally, clean up our remaining dynamic structures and NULL out 21850Sstevel@tonic-gate * the cpu_cyclic pointer. 
21860Sstevel@tonic-gate */ 21870Sstevel@tonic-gate kmem_free(cpu->cyp_cyclics, cpu->cyp_size * sizeof (cyclic_t)); 21880Sstevel@tonic-gate kmem_free(cpu->cyp_heap, cpu->cyp_size * sizeof (cyc_index_t)); 21890Sstevel@tonic-gate kmem_free(cpu, sizeof (cyc_cpu_t)); 21900Sstevel@tonic-gate 21910Sstevel@tonic-gate c->cpu_cyclic = NULL; 21920Sstevel@tonic-gate } 21930Sstevel@tonic-gate 21940Sstevel@tonic-gate static int 21950Sstevel@tonic-gate cyclic_cpu_setup(cpu_setup_t what, int id) 21960Sstevel@tonic-gate { 21970Sstevel@tonic-gate /* 21980Sstevel@tonic-gate * We are guaranteed that there is still/already an entry in the 21990Sstevel@tonic-gate * cpu array for this CPU. 22000Sstevel@tonic-gate */ 22010Sstevel@tonic-gate cpu_t *c = cpu[id]; 22020Sstevel@tonic-gate cyc_cpu_t *cyp = c->cpu_cyclic; 22030Sstevel@tonic-gate 22040Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 22050Sstevel@tonic-gate 22060Sstevel@tonic-gate switch (what) { 22070Sstevel@tonic-gate case CPU_CONFIG: 22080Sstevel@tonic-gate ASSERT(cyp == NULL); 22090Sstevel@tonic-gate cyclic_configure(c); 22100Sstevel@tonic-gate break; 22110Sstevel@tonic-gate 22120Sstevel@tonic-gate case CPU_UNCONFIG: 22130Sstevel@tonic-gate ASSERT(cyp != NULL && cyp->cyp_state == CYS_OFFLINE); 22140Sstevel@tonic-gate cyclic_unconfigure(c); 22150Sstevel@tonic-gate break; 22160Sstevel@tonic-gate 22170Sstevel@tonic-gate default: 22180Sstevel@tonic-gate break; 22190Sstevel@tonic-gate } 22200Sstevel@tonic-gate 22210Sstevel@tonic-gate return (0); 22220Sstevel@tonic-gate } 22230Sstevel@tonic-gate 22240Sstevel@tonic-gate static void 22250Sstevel@tonic-gate cyclic_suspend_xcall(cyc_xcallarg_t *arg) 22260Sstevel@tonic-gate { 22270Sstevel@tonic-gate cyc_cpu_t *cpu = arg->cyx_cpu; 22280Sstevel@tonic-gate cyc_backend_t *be = cpu->cyp_backend; 22290Sstevel@tonic-gate cyc_cookie_t cookie; 22300Sstevel@tonic-gate cyb_arg_t bar = be->cyb_arg; 22310Sstevel@tonic-gate 22320Sstevel@tonic-gate cookie = be->cyb_set_level(bar, CY_HIGH_LEVEL); 22330Sstevel@tonic-gate 22340Sstevel@tonic-gate CYC_TRACE1(cpu, CY_HIGH_LEVEL, "suspend-xcall", cpu->cyp_nelems); 22350Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_ONLINE || cpu->cyp_state == CYS_OFFLINE); 22360Sstevel@tonic-gate 22370Sstevel@tonic-gate /* 22380Sstevel@tonic-gate * We won't disable this CPU unless it has a non-zero number of 22390Sstevel@tonic-gate * elements (cpu_lock assures that no one else may be attempting 22400Sstevel@tonic-gate * to disable this CPU). 
22410Sstevel@tonic-gate */ 22420Sstevel@tonic-gate if (cpu->cyp_nelems > 0) { 22430Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_ONLINE); 22440Sstevel@tonic-gate be->cyb_disable(bar); 22450Sstevel@tonic-gate } 22460Sstevel@tonic-gate 22470Sstevel@tonic-gate if (cpu->cyp_state == CYS_ONLINE) 22480Sstevel@tonic-gate cpu->cyp_state = CYS_SUSPENDED; 22490Sstevel@tonic-gate 22500Sstevel@tonic-gate be->cyb_suspend(bar); 22510Sstevel@tonic-gate be->cyb_restore_level(bar, cookie); 22520Sstevel@tonic-gate } 22530Sstevel@tonic-gate 22540Sstevel@tonic-gate static void 22550Sstevel@tonic-gate cyclic_resume_xcall(cyc_xcallarg_t *arg) 22560Sstevel@tonic-gate { 22570Sstevel@tonic-gate cyc_cpu_t *cpu = arg->cyx_cpu; 22580Sstevel@tonic-gate cyc_backend_t *be = cpu->cyp_backend; 22590Sstevel@tonic-gate cyc_cookie_t cookie; 22600Sstevel@tonic-gate cyb_arg_t bar = be->cyb_arg; 22610Sstevel@tonic-gate cyc_state_t state = cpu->cyp_state; 22620Sstevel@tonic-gate 22630Sstevel@tonic-gate cookie = be->cyb_set_level(bar, CY_HIGH_LEVEL); 22640Sstevel@tonic-gate 22650Sstevel@tonic-gate CYC_TRACE1(cpu, CY_HIGH_LEVEL, "resume-xcall", cpu->cyp_nelems); 22660Sstevel@tonic-gate ASSERT(state == CYS_SUSPENDED || state == CYS_OFFLINE); 22670Sstevel@tonic-gate 22680Sstevel@tonic-gate be->cyb_resume(bar); 22690Sstevel@tonic-gate 22700Sstevel@tonic-gate /* 22710Sstevel@tonic-gate * We won't enable this CPU unless it has a non-zero number of 22720Sstevel@tonic-gate * elements. 22730Sstevel@tonic-gate */ 22740Sstevel@tonic-gate if (cpu->cyp_nelems > 0) { 22750Sstevel@tonic-gate cyclic_t *cyclic = &cpu->cyp_cyclics[cpu->cyp_heap[0]]; 22760Sstevel@tonic-gate hrtime_t exp = cyclic->cy_expire; 22770Sstevel@tonic-gate 22780Sstevel@tonic-gate CYC_TRACE(cpu, CY_HIGH_LEVEL, "resume-reprog", cyclic, exp); 22790Sstevel@tonic-gate ASSERT(state == CYS_SUSPENDED); 22800Sstevel@tonic-gate be->cyb_enable(bar); 22810Sstevel@tonic-gate be->cyb_reprogram(bar, exp); 22820Sstevel@tonic-gate } 22830Sstevel@tonic-gate 22840Sstevel@tonic-gate if (state == CYS_SUSPENDED) 22850Sstevel@tonic-gate cpu->cyp_state = CYS_ONLINE; 22860Sstevel@tonic-gate 22870Sstevel@tonic-gate CYC_TRACE1(cpu, CY_HIGH_LEVEL, "resume-done", cpu->cyp_nelems); 22880Sstevel@tonic-gate be->cyb_restore_level(bar, cookie); 22890Sstevel@tonic-gate } 22900Sstevel@tonic-gate 22910Sstevel@tonic-gate static void 22920Sstevel@tonic-gate cyclic_omni_start(cyc_id_t *idp, cyc_cpu_t *cpu) 22930Sstevel@tonic-gate { 22940Sstevel@tonic-gate cyc_omni_handler_t *omni = &idp->cyi_omni_hdlr; 22950Sstevel@tonic-gate cyc_omni_cpu_t *ocpu = kmem_alloc(sizeof (cyc_omni_cpu_t), KM_SLEEP); 22960Sstevel@tonic-gate cyc_handler_t hdlr; 22970Sstevel@tonic-gate cyc_time_t when; 22980Sstevel@tonic-gate 22990Sstevel@tonic-gate CYC_PTRACE("omni-start", cpu, idp); 23000Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 23010Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_ONLINE); 23020Sstevel@tonic-gate ASSERT(idp->cyi_cpu == NULL); 23030Sstevel@tonic-gate 23040Sstevel@tonic-gate hdlr.cyh_func = NULL; 23050Sstevel@tonic-gate hdlr.cyh_arg = NULL; 23060Sstevel@tonic-gate hdlr.cyh_level = CY_LEVELS; 23070Sstevel@tonic-gate 23080Sstevel@tonic-gate when.cyt_when = 0; 23090Sstevel@tonic-gate when.cyt_interval = 0; 23100Sstevel@tonic-gate 23110Sstevel@tonic-gate omni->cyo_online(omni->cyo_arg, cpu->cyp_cpu, &hdlr, &when); 23120Sstevel@tonic-gate 23130Sstevel@tonic-gate ASSERT(hdlr.cyh_func != NULL); 23140Sstevel@tonic-gate ASSERT(hdlr.cyh_level < CY_LEVELS); 23150Sstevel@tonic-gate ASSERT(when.cyt_when >= 0 && 
	    when.cyt_interval > 0);

	ocpu->cyo_cpu = cpu;
	ocpu->cyo_arg = hdlr.cyh_arg;
	ocpu->cyo_ndx = cyclic_add_here(cpu, &hdlr, &when, 0);
	ocpu->cyo_next = idp->cyi_omni_list;
	idp->cyi_omni_list = ocpu;
}

static void
cyclic_omni_stop(cyc_id_t *idp, cyc_cpu_t *cpu)
{
	cyc_omni_handler_t *omni = &idp->cyi_omni_hdlr;
	cyc_omni_cpu_t *ocpu = idp->cyi_omni_list, *prev = NULL;

	CYC_PTRACE("omni-stop", cpu, idp);
	ASSERT(MUTEX_HELD(&cpu_lock));
	ASSERT(cpu->cyp_state == CYS_ONLINE);
	ASSERT(idp->cyi_cpu == NULL);
	ASSERT(ocpu != NULL);

	while (ocpu != NULL && ocpu->cyo_cpu != cpu) {
		prev = ocpu;
		ocpu = ocpu->cyo_next;
	}

	/*
	 * We _must_ have found a cyc_omni_cpu which corresponds to this
	 * CPU -- the definition of an omnipresent cyclic is that it runs
	 * on all online CPUs.
	 */
	ASSERT(ocpu != NULL);

	if (prev == NULL) {
		idp->cyi_omni_list = ocpu->cyo_next;
	} else {
		prev->cyo_next = ocpu->cyo_next;
	}

	(void) cyclic_remove_here(ocpu->cyo_cpu, ocpu->cyo_ndx, NULL, CY_WAIT);

	/*
	 * The cyclic has been removed from this CPU; time to call the
	 * omnipresent offline handler.
	 */
	if (omni->cyo_offline != NULL)
		omni->cyo_offline(omni->cyo_arg, cpu->cyp_cpu, ocpu->cyo_arg);

	kmem_free(ocpu, sizeof (cyc_omni_cpu_t));
}

static cyc_id_t *
cyclic_new_id()
{
	cyc_id_t *idp;

	ASSERT(MUTEX_HELD(&cpu_lock));

	idp = kmem_cache_alloc(cyclic_id_cache, KM_SLEEP);

	/*
	 * The cyi_cpu field of the cyc_id_t structure tracks the CPU
	 * associated with the cyclic.  If and only if this field is NULL, the
	 * cyc_id_t is an omnipresent cyclic.  Note that cyi_omni_list may be
	 * NULL for an omnipresent cyclic while the cyclic is being created
	 * or destroyed.
	 */
	idp->cyi_cpu = NULL;
	idp->cyi_ndx = 0;

	idp->cyi_next = cyclic_id_head;
	idp->cyi_prev = NULL;
	idp->cyi_omni_list = NULL;

	if (cyclic_id_head != NULL) {
		ASSERT(cyclic_id_head->cyi_prev == NULL);
		cyclic_id_head->cyi_prev = idp;
	}

	cyclic_id_head = idp;

	return (idp);
}

/*
 * cyclic_id_t cyclic_add(cyc_handler_t *, cyc_time_t *)
 *
 * Overview
 *
 *   cyclic_add() will create an unbound cyclic with the specified handler and
 *   interval.  The cyclic will run on a CPU which both has interrupts enabled
 *   and is in the system CPU partition.
 *
 * Arguments and notes
 *
 *   As its first argument, cyclic_add() takes a cyc_handler, which has the
 *   following members:
 *
 *     cyc_func_t cyh_func    <-- Cyclic handler
 *     void *cyh_arg          <-- Argument to cyclic handler
 *     cyc_level_t cyh_level  <-- Level at which to fire; must be one of
 *                                CY_LOW_LEVEL, CY_LOCK_LEVEL or CY_HIGH_LEVEL
 *
 *   Note that cyh_level is _not_ an ipl or spl; it must be one of the
 *   CY_*_LEVELs.  This layer of abstraction allows the platform to define
 *   the precise interrupt priority levels, within the following constraints:
 *
 *     CY_LOCK_LEVEL must map to LOCK_LEVEL
 *     CY_HIGH_LEVEL must map to an ipl greater than LOCK_LEVEL
 *     CY_LOW_LEVEL must map to an ipl below LOCK_LEVEL
 *
 *   In addition to a cyc_handler, cyclic_add() takes a cyc_time, which
 *   has the following members:
 *
 *     hrtime_t cyt_when      <-- Absolute time, in nanoseconds since boot, at
 *                                which to start firing
 *     hrtime_t cyt_interval  <-- Length of interval, in nanoseconds
 *
 *   gethrtime() is the time source for nanoseconds since boot.  If cyt_when
 *   is set to 0, the cyclic will start to fire when cyt_interval next
 *   divides the number of nanoseconds since boot.
 *
 *   The cyt_interval field _must_ be filled in by the caller; one-shots are
 *   _not_ explicitly supported by the cyclic subsystem (cyclic_add() will
 *   assert that cyt_interval is non-zero).  The maximum value for either
 *   field is INT64_MAX; the caller is responsible for assuring that
 *   cyt_when + cyt_interval <= INT64_MAX.  Neither field may be negative.
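 *
 *   For example (a worked illustration only, not additional interface
 *   behavior): with cyt_when set to 0 and cyt_interval set to 5000000000
 *   (five seconds), a cyclic added when gethrtime() reads 12345678901 will
 *   fire for the first time when gethrtime() next reaches a multiple of
 *   the interval (that is, at 15000000000), and every five seconds
 *   thereafter.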
24420Sstevel@tonic-gate * 24430Sstevel@tonic-gate * For an arbitrary time t in the future, the cyclic handler is guaranteed 24440Sstevel@tonic-gate * to have been called (t - cyt_when) / cyt_interval times. This will 24450Sstevel@tonic-gate * be true even if interrupts have been disabled for periods greater than 24460Sstevel@tonic-gate * cyt_interval nanoseconds. In order to compensate for such periods, 24470Sstevel@tonic-gate * the cyclic handler may be called a finite number of times with an 24480Sstevel@tonic-gate * arbitrarily small interval. 24490Sstevel@tonic-gate * 24500Sstevel@tonic-gate * The cyclic subsystem will not enforce any lower bound on the interval; 24510Sstevel@tonic-gate * if the interval is less than the time required to process an interrupt, 24520Sstevel@tonic-gate * the CPU will wedge. It's the responsibility of the caller to assure that 24530Sstevel@tonic-gate * either the value of the interval is sane, or that its caller has 24540Sstevel@tonic-gate * sufficient privilege to deny service (i.e. its caller is root). 24550Sstevel@tonic-gate * 24560Sstevel@tonic-gate * The cyclic handler is guaranteed to be single threaded, even while the 24570Sstevel@tonic-gate * cyclic is being juggled between CPUs (see cyclic_juggle(), below). 24580Sstevel@tonic-gate * That is, a given cyclic handler will never be executed simultaneously 24590Sstevel@tonic-gate * on different CPUs. 24600Sstevel@tonic-gate * 24610Sstevel@tonic-gate * Return value 24620Sstevel@tonic-gate * 24630Sstevel@tonic-gate * cyclic_add() returns a cyclic_id_t, which is guaranteed to be a value 24640Sstevel@tonic-gate * other than CYCLIC_NONE. cyclic_add() cannot fail. 24650Sstevel@tonic-gate * 24660Sstevel@tonic-gate * Caller's context 24670Sstevel@tonic-gate * 24680Sstevel@tonic-gate * cpu_lock must be held by the caller, and the caller must not be in 24690Sstevel@tonic-gate * interrupt context. cyclic_add() will perform a KM_SLEEP kernel 24700Sstevel@tonic-gate * memory allocation, so the usual rules (e.g. p_lock cannot be held) 24710Sstevel@tonic-gate * apply. A cyclic may be added even in the presence of CPUs that have 24720Sstevel@tonic-gate * not been configured with respect to the cyclic subsystem, but only 24730Sstevel@tonic-gate * configured CPUs will be eligible to run the new cyclic. 24740Sstevel@tonic-gate * 24750Sstevel@tonic-gate * Cyclic handler's context 24760Sstevel@tonic-gate * 24770Sstevel@tonic-gate * Cyclic handlers will be executed in the interrupt context corresponding 24780Sstevel@tonic-gate * to the specified level (i.e. either high, lock or low level). The 24790Sstevel@tonic-gate * usual context rules apply. 24800Sstevel@tonic-gate * 24810Sstevel@tonic-gate * A cyclic handler may not grab ANY locks held by the caller of any of 24820Sstevel@tonic-gate * cyclic_add(), cyclic_remove() or cyclic_bind(); the implementation of 24830Sstevel@tonic-gate * these functions may require blocking on cyclic handler completion. 24840Sstevel@tonic-gate * Moreover, cyclic handlers may not make any call back into the cyclic 24850Sstevel@tonic-gate * subsystem. 
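 *
 * Example
 *
 *   The following fragment is purely illustrative; my_handler and my_arg
 *   are hypothetical names, not part of the cyclic subsystem.  It sketches
 *   a caller creating an unbound, low-level cyclic which fires every ten
 *   milliseconds:
 *
 *	cyc_handler_t hdlr;
 *	cyc_time_t when;
 *	cyclic_id_t id;
 *
 *	hdlr.cyh_func = my_handler;
 *	hdlr.cyh_arg = my_arg;
 *	hdlr.cyh_level = CY_LOW_LEVEL;
 *
 *	when.cyt_when = 0;
 *	when.cyt_interval = 10 * (NANOSEC / MILLISEC);
 *
 *	mutex_enter(&cpu_lock);
 *	id = cyclic_add(&hdlr, &when);
 *	mutex_exit(&cpu_lock);
 *
 *   The returned id may later be passed to cyclic_bind() or cyclic_remove().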
24860Sstevel@tonic-gate */ 24870Sstevel@tonic-gate cyclic_id_t 24880Sstevel@tonic-gate cyclic_add(cyc_handler_t *hdlr, cyc_time_t *when) 24890Sstevel@tonic-gate { 24900Sstevel@tonic-gate cyc_id_t *idp = cyclic_new_id(); 24910Sstevel@tonic-gate 24920Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 24930Sstevel@tonic-gate ASSERT(when->cyt_when >= 0 && when->cyt_interval > 0); 24940Sstevel@tonic-gate 24950Sstevel@tonic-gate idp->cyi_cpu = cyclic_pick_cpu(NULL, NULL, NULL, 0); 24960Sstevel@tonic-gate idp->cyi_ndx = cyclic_add_here(idp->cyi_cpu, hdlr, when, 0); 24970Sstevel@tonic-gate 24980Sstevel@tonic-gate return ((uintptr_t)idp); 24990Sstevel@tonic-gate } 25000Sstevel@tonic-gate 25010Sstevel@tonic-gate /* 25020Sstevel@tonic-gate * cyclic_id_t cyclic_add_omni(cyc_omni_handler_t *) 25030Sstevel@tonic-gate * 25040Sstevel@tonic-gate * Overview 25050Sstevel@tonic-gate * 25060Sstevel@tonic-gate * cyclic_add_omni() will create an omnipresent cyclic with the specified 25070Sstevel@tonic-gate * online and offline handlers. Omnipresent cyclics run on all online 25080Sstevel@tonic-gate * CPUs, including CPUs which have unbound interrupts disabled. 25090Sstevel@tonic-gate * 25100Sstevel@tonic-gate * Arguments 25110Sstevel@tonic-gate * 25120Sstevel@tonic-gate * As its only argument, cyclic_add_omni() takes a cyc_omni_handler, which 25130Sstevel@tonic-gate * has the following members: 25140Sstevel@tonic-gate * 25150Sstevel@tonic-gate * void (*cyo_online)() <-- Online handler 25160Sstevel@tonic-gate * void (*cyo_offline)() <-- Offline handler 25170Sstevel@tonic-gate * void *cyo_arg <-- Argument to be passed to on/offline handlers 25180Sstevel@tonic-gate * 25190Sstevel@tonic-gate * Online handler 25200Sstevel@tonic-gate * 25210Sstevel@tonic-gate * The cyo_online member is a pointer to a function which has the following 25220Sstevel@tonic-gate * four arguments: 25230Sstevel@tonic-gate * 25240Sstevel@tonic-gate * void * <-- Argument (cyo_arg) 25250Sstevel@tonic-gate * cpu_t * <-- Pointer to CPU about to be onlined 25260Sstevel@tonic-gate * cyc_handler_t * <-- Pointer to cyc_handler_t; must be filled in 25270Sstevel@tonic-gate * by omni online handler 25280Sstevel@tonic-gate * cyc_time_t * <-- Pointer to cyc_time_t; must be filled in by 25290Sstevel@tonic-gate * omni online handler 25300Sstevel@tonic-gate * 25310Sstevel@tonic-gate * The omni cyclic online handler is always called _before_ the omni 25320Sstevel@tonic-gate * cyclic begins to fire on the specified CPU. As the above argument 25330Sstevel@tonic-gate * description implies, the online handler must fill in the two structures 25340Sstevel@tonic-gate * passed to it: the cyc_handler_t and the cyc_time_t. These are the 25350Sstevel@tonic-gate * same two structures passed to cyclic_add(), outlined above. This 25360Sstevel@tonic-gate * allows the omni cyclic to have maximum flexibility; different CPUs may 25370Sstevel@tonic-gate * optionally 25380Sstevel@tonic-gate * 25390Sstevel@tonic-gate * (a) have different intervals 25400Sstevel@tonic-gate * (b) be explicitly in or out of phase with one another 25410Sstevel@tonic-gate * (c) have different handlers 25420Sstevel@tonic-gate * (d) have different handler arguments 25430Sstevel@tonic-gate * (e) fire at different levels 25440Sstevel@tonic-gate * 25450Sstevel@tonic-gate * Of these, (e) seems somewhat dubious, but is nonetheless allowed. 
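 *
 *   As an illustrative sketch (my_omni_online, my_count and my_stats_t are
 *   hypothetical names, not part of the cyclic subsystem), an online
 *   handler which arranges for a per-CPU, low-level cyclic firing once per
 *   second might look like:
 *
 *	static void
 *	my_omni_online(void *arg, cpu_t *c, cyc_handler_t *hdlr,
 *	    cyc_time_t *when)
 *	{
 *		my_stats_t *stats = arg;
 *
 *		hdlr->cyh_func = my_count;
 *		hdlr->cyh_arg = &stats->st_percpu[c->cpu_id];
 *		hdlr->cyh_level = CY_LOW_LEVEL;
 *
 *		when->cyt_when = 0;
 *		when->cyt_interval = NANOSEC;
 *	}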
25460Sstevel@tonic-gate * 25470Sstevel@tonic-gate * The omni online handler is called in the same context as cyclic_add(), 25480Sstevel@tonic-gate * and has the same liberties: omni online handlers may perform KM_SLEEP 25490Sstevel@tonic-gate * kernel memory allocations, and may grab locks which are also acquired 25500Sstevel@tonic-gate * by cyclic handlers. However, omni cyclic online handlers may _not_ 25510Sstevel@tonic-gate * call back into the cyclic subsystem, and should be generally careful 25520Sstevel@tonic-gate * about calling into arbitrary kernel subsystems. 25530Sstevel@tonic-gate * 25540Sstevel@tonic-gate * Offline handler 25550Sstevel@tonic-gate * 25560Sstevel@tonic-gate * The cyo_offline member is a pointer to a function which has the following 25570Sstevel@tonic-gate * three arguments: 25580Sstevel@tonic-gate * 25590Sstevel@tonic-gate * void * <-- Argument (cyo_arg) 25600Sstevel@tonic-gate * cpu_t * <-- Pointer to CPU about to be offlined 25610Sstevel@tonic-gate * void * <-- CPU's cyclic argument (that is, value 25620Sstevel@tonic-gate * to which cyh_arg member of the cyc_handler_t 25630Sstevel@tonic-gate * was set in the omni online handler) 25640Sstevel@tonic-gate * 25650Sstevel@tonic-gate * The omni cyclic offline handler is always called _after_ the omni 25660Sstevel@tonic-gate * cyclic has ceased firing on the specified CPU. Its purpose is to 25670Sstevel@tonic-gate * allow cleanup of any resources dynamically allocated in the omni cyclic 25680Sstevel@tonic-gate * online handler. The context of the offline handler is identical to 25690Sstevel@tonic-gate * that of the online handler; the same constraints and liberties apply. 25700Sstevel@tonic-gate * 25710Sstevel@tonic-gate * The offline handler is optional; it may be NULL. 25720Sstevel@tonic-gate * 25730Sstevel@tonic-gate * Return value 25740Sstevel@tonic-gate * 25750Sstevel@tonic-gate * cyclic_add_omni() returns a cyclic_id_t, which is guaranteed to be a 25760Sstevel@tonic-gate * value other than CYCLIC_NONE. cyclic_add_omni() cannot fail. 25770Sstevel@tonic-gate * 25780Sstevel@tonic-gate * Caller's context 25790Sstevel@tonic-gate * 25800Sstevel@tonic-gate * The caller's context is identical to that of cyclic_add(), specified 25810Sstevel@tonic-gate * above. 25820Sstevel@tonic-gate */ 25830Sstevel@tonic-gate cyclic_id_t 25840Sstevel@tonic-gate cyclic_add_omni(cyc_omni_handler_t *omni) 25850Sstevel@tonic-gate { 25860Sstevel@tonic-gate cyc_id_t *idp = cyclic_new_id(); 25870Sstevel@tonic-gate cyc_cpu_t *cpu; 25880Sstevel@tonic-gate cpu_t *c; 25890Sstevel@tonic-gate 25900Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 25910Sstevel@tonic-gate ASSERT(omni != NULL && omni->cyo_online != NULL); 25920Sstevel@tonic-gate 25930Sstevel@tonic-gate idp->cyi_omni_hdlr = *omni; 25940Sstevel@tonic-gate 25950Sstevel@tonic-gate c = cpu_list; 25960Sstevel@tonic-gate do { 25970Sstevel@tonic-gate if ((cpu = c->cpu_cyclic) == NULL) 25980Sstevel@tonic-gate continue; 25990Sstevel@tonic-gate 26000Sstevel@tonic-gate if (cpu->cyp_state != CYS_ONLINE) { 26010Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_OFFLINE); 26020Sstevel@tonic-gate continue; 26030Sstevel@tonic-gate } 26040Sstevel@tonic-gate 26050Sstevel@tonic-gate cyclic_omni_start(idp, cpu); 26060Sstevel@tonic-gate } while ((c = c->cpu_next) != cpu_list); 26070Sstevel@tonic-gate 26080Sstevel@tonic-gate /* 26090Sstevel@tonic-gate * We must have found at least one online CPU on which to run 26100Sstevel@tonic-gate * this cyclic. 
26110Sstevel@tonic-gate */ 26120Sstevel@tonic-gate ASSERT(idp->cyi_omni_list != NULL); 26130Sstevel@tonic-gate ASSERT(idp->cyi_cpu == NULL); 26140Sstevel@tonic-gate 26150Sstevel@tonic-gate return ((uintptr_t)idp); 26160Sstevel@tonic-gate } 26170Sstevel@tonic-gate 26180Sstevel@tonic-gate /* 26190Sstevel@tonic-gate * void cyclic_remove(cyclic_id_t) 26200Sstevel@tonic-gate * 26210Sstevel@tonic-gate * Overview 26220Sstevel@tonic-gate * 26230Sstevel@tonic-gate * cyclic_remove() will remove the specified cyclic from the system. 26240Sstevel@tonic-gate * 26250Sstevel@tonic-gate * Arguments and notes 26260Sstevel@tonic-gate * 26270Sstevel@tonic-gate * The only argument is a cyclic_id returned from either cyclic_add() or 26280Sstevel@tonic-gate * cyclic_add_omni(). 26290Sstevel@tonic-gate * 26300Sstevel@tonic-gate * By the time cyclic_remove() returns, the caller is guaranteed that the 26310Sstevel@tonic-gate * removed cyclic handler has completed execution (this is the same 26320Sstevel@tonic-gate * semantic that untimeout() provides). As a result, cyclic_remove() may 26330Sstevel@tonic-gate * need to block, waiting for the removed cyclic to complete execution. 26340Sstevel@tonic-gate * This leads to an important constraint on the caller: no lock may be 26350Sstevel@tonic-gate * held across cyclic_remove() that also may be acquired by a cyclic 26360Sstevel@tonic-gate * handler. 26370Sstevel@tonic-gate * 26380Sstevel@tonic-gate * Return value 26390Sstevel@tonic-gate * 26400Sstevel@tonic-gate * None; cyclic_remove() always succeeds. 26410Sstevel@tonic-gate * 26420Sstevel@tonic-gate * Caller's context 26430Sstevel@tonic-gate * 26440Sstevel@tonic-gate * cpu_lock must be held by the caller, and the caller must not be in 26450Sstevel@tonic-gate * interrupt context. The caller may not hold any locks which are also 26460Sstevel@tonic-gate * grabbed by any cyclic handler. See "Arguments and notes", above. 
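 *
 *   As a brief, illustrative sketch (id denotes a cyclic_id_t previously
 *   returned by cyclic_add() or cyclic_add_omni()):
 *
 *	mutex_enter(&cpu_lock);
 *	cyclic_remove(id);
 *	mutex_exit(&cpu_lock);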
 */
void
cyclic_remove(cyclic_id_t id)
{
	cyc_id_t *idp = (cyc_id_t *)id;
	cyc_id_t *prev = idp->cyi_prev, *next = idp->cyi_next;
	cyc_cpu_t *cpu = idp->cyi_cpu;

	CYC_PTRACE("remove", idp, idp->cyi_cpu);
	ASSERT(MUTEX_HELD(&cpu_lock));

	if (cpu != NULL) {
		(void) cyclic_remove_here(cpu, idp->cyi_ndx, NULL, CY_WAIT);
	} else {
		ASSERT(idp->cyi_omni_list != NULL);
		while (idp->cyi_omni_list != NULL)
			cyclic_omni_stop(idp, idp->cyi_omni_list->cyo_cpu);
	}

	if (prev != NULL) {
		ASSERT(cyclic_id_head != idp);
		prev->cyi_next = next;
	} else {
		ASSERT(cyclic_id_head == idp);
		cyclic_id_head = next;
	}

	if (next != NULL)
		next->cyi_prev = prev;

	kmem_cache_free(cyclic_id_cache, idp);
}

/*
 * void cyclic_bind(cyclic_id_t, cpu_t *, cpupart_t *)
 *
 * Overview
 *
 *   cyclic_bind() atomically changes the CPU and CPU partition bindings
 *   of a cyclic.
 *
 * Arguments and notes
 *
 *   The first argument is a cyclic_id returned from cyclic_add().
 *   cyclic_bind() may _not_ be called on a cyclic_id returned from
 *   cyclic_add_omni().
 *
 *   The second argument specifies the CPU to which to bind the specified
 *   cyclic.  If the specified cyclic is bound to a CPU other than the one
 *   specified, it will be unbound from its bound CPU.  Unbinding the cyclic
 *   from its CPU may cause it to be juggled to another CPU.  If the specified
 *   CPU is non-NULL, the cyclic will be subsequently rebound to the specified
 *   CPU.
 *
 *   If a CPU with bound cyclics is transitioned into the P_NOINTR state,
 *   only cyclics not bound to the CPU can be juggled away; CPU-bound cyclics
 *   will continue to fire on the P_NOINTR CPU.  A CPU with bound cyclics
 *   cannot be offlined (attempts to offline the CPU will return EBUSY).
 *   Likewise, cyclics may not be bound to an offline CPU; if the caller
 *   attempts to bind a cyclic to an offline CPU, the cyclic subsystem will
 *   panic.
 *
 *   The third argument specifies the CPU partition to which to bind the
 *   specified cyclic.
If the specified cyclic is bound to a CPU partition 27110Sstevel@tonic-gate * other than the one specified, it will be unbound from its bound 27120Sstevel@tonic-gate * partition. Unbinding the cyclic from its CPU partition may cause it 27130Sstevel@tonic-gate * to be juggled to another CPU. If the specified CPU partition is 27140Sstevel@tonic-gate * non-NULL, the cyclic will be subsequently rebound to the specified CPU 27150Sstevel@tonic-gate * partition. 27160Sstevel@tonic-gate * 27170Sstevel@tonic-gate * It is the caller's responsibility to assure that the specified CPU 27180Sstevel@tonic-gate * partition contains a CPU. If it does not, the cyclic subsystem will 27190Sstevel@tonic-gate * panic. A CPU partition with bound cyclics cannot be destroyed (attempts 27200Sstevel@tonic-gate * to destroy the partition will return EBUSY). If a CPU with 27210Sstevel@tonic-gate * partition-bound cyclics is transitioned into the P_NOINTR state, cyclics 27220Sstevel@tonic-gate * bound to the CPU's partition (but not bound to the CPU) will be juggled 27230Sstevel@tonic-gate * away only if there exists another CPU in the partition in the P_ONLINE 27240Sstevel@tonic-gate * state. 27250Sstevel@tonic-gate * 27260Sstevel@tonic-gate * It is the caller's responsibility to assure that the specified CPU and 27270Sstevel@tonic-gate * CPU partition are self-consistent. If both parameters are non-NULL, 27280Sstevel@tonic-gate * and the specified CPU partition does not contain the specified CPU, the 27290Sstevel@tonic-gate * cyclic subsystem will panic. 27300Sstevel@tonic-gate * 27310Sstevel@tonic-gate * It is the caller's responsibility to assure that the specified CPU has 27320Sstevel@tonic-gate * been configured with respect to the cyclic subsystem. Generally, this 27330Sstevel@tonic-gate * is always true for valid, on-line CPUs. The only periods of time during 27340Sstevel@tonic-gate * which this may not be true are during MP boot (i.e. after cyclic_init() 27350Sstevel@tonic-gate * is called but before cyclic_mp_init() is called) or during dynamic 27360Sstevel@tonic-gate * reconfiguration; cyclic_bind() should only be called with great care 27370Sstevel@tonic-gate * from these contexts. 27380Sstevel@tonic-gate * 27390Sstevel@tonic-gate * Return value 27400Sstevel@tonic-gate * 27410Sstevel@tonic-gate * None; cyclic_bind() always succeeds. 27420Sstevel@tonic-gate * 27430Sstevel@tonic-gate * Caller's context 27440Sstevel@tonic-gate * 27450Sstevel@tonic-gate * cpu_lock must be held by the caller, and the caller must not be in 27460Sstevel@tonic-gate * interrupt context. The caller may not hold any locks which are also 27470Sstevel@tonic-gate * grabbed by any cyclic handler. 
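 *
 * Example
 *
 *   As an illustrative sketch (id and my_part are hypothetical: id is a
 *   cyclic_id_t returned by cyclic_add(), and my_part is a cpupart_t
 *   pointer known to contain at least one CPU), a caller might bind a
 *   cyclic to a CPU partition, and later remove that binding by passing
 *   a NULL partition:
 *
 *	mutex_enter(&cpu_lock);
 *	cyclic_bind(id, NULL, my_part);
 *	mutex_exit(&cpu_lock);
 *
 *	mutex_enter(&cpu_lock);
 *	cyclic_bind(id, NULL, NULL);
 *	mutex_exit(&cpu_lock);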
27480Sstevel@tonic-gate */ 27490Sstevel@tonic-gate void 27500Sstevel@tonic-gate cyclic_bind(cyclic_id_t id, cpu_t *d, cpupart_t *part) 27510Sstevel@tonic-gate { 27520Sstevel@tonic-gate cyc_id_t *idp = (cyc_id_t *)id; 27530Sstevel@tonic-gate cyc_cpu_t *cpu = idp->cyi_cpu; 27540Sstevel@tonic-gate cpu_t *c; 27550Sstevel@tonic-gate uint16_t flags; 27560Sstevel@tonic-gate 27570Sstevel@tonic-gate CYC_PTRACE("bind", d, part); 27580Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 27590Sstevel@tonic-gate ASSERT(part == NULL || d == NULL || d->cpu_part == part); 27600Sstevel@tonic-gate 27610Sstevel@tonic-gate if (cpu == NULL) { 27620Sstevel@tonic-gate ASSERT(idp->cyi_omni_list != NULL); 27630Sstevel@tonic-gate panic("attempt to change binding of omnipresent cyclic"); 27640Sstevel@tonic-gate } 27650Sstevel@tonic-gate 27660Sstevel@tonic-gate c = cpu->cyp_cpu; 27670Sstevel@tonic-gate flags = cpu->cyp_cyclics[idp->cyi_ndx].cy_flags; 27680Sstevel@tonic-gate 27690Sstevel@tonic-gate if (c != d && (flags & CYF_CPU_BOUND)) 27700Sstevel@tonic-gate cyclic_unbind_cpu(id); 27710Sstevel@tonic-gate 27720Sstevel@tonic-gate /* 27730Sstevel@tonic-gate * Reload our cpu (we may have migrated). We don't have to reload 27740Sstevel@tonic-gate * the flags field here; if we were CYF_PART_BOUND on entry, we are 27750Sstevel@tonic-gate * CYF_PART_BOUND now. 27760Sstevel@tonic-gate */ 27770Sstevel@tonic-gate cpu = idp->cyi_cpu; 27780Sstevel@tonic-gate c = cpu->cyp_cpu; 27790Sstevel@tonic-gate 27800Sstevel@tonic-gate if (part != c->cpu_part && (flags & CYF_PART_BOUND)) 27810Sstevel@tonic-gate cyclic_unbind_cpupart(id); 27820Sstevel@tonic-gate 27830Sstevel@tonic-gate /* 27840Sstevel@tonic-gate * Now reload the flags field, asserting that if we are CPU bound, 27850Sstevel@tonic-gate * the CPU was specified (and likewise, if we are partition bound, 27860Sstevel@tonic-gate * the partition was specified). 27870Sstevel@tonic-gate */ 27880Sstevel@tonic-gate cpu = idp->cyi_cpu; 27890Sstevel@tonic-gate c = cpu->cyp_cpu; 27900Sstevel@tonic-gate flags = cpu->cyp_cyclics[idp->cyi_ndx].cy_flags; 27910Sstevel@tonic-gate ASSERT(!(flags & CYF_CPU_BOUND) || c == d); 27920Sstevel@tonic-gate ASSERT(!(flags & CYF_PART_BOUND) || c->cpu_part == part); 27930Sstevel@tonic-gate 27940Sstevel@tonic-gate if (!(flags & CYF_CPU_BOUND) && d != NULL) 27950Sstevel@tonic-gate cyclic_bind_cpu(id, d); 27960Sstevel@tonic-gate 27970Sstevel@tonic-gate if (!(flags & CYF_PART_BOUND) && part != NULL) 27980Sstevel@tonic-gate cyclic_bind_cpupart(id, part); 27990Sstevel@tonic-gate } 28000Sstevel@tonic-gate 28010Sstevel@tonic-gate hrtime_t 28020Sstevel@tonic-gate cyclic_getres() 28030Sstevel@tonic-gate { 28040Sstevel@tonic-gate return (cyclic_resolution); 28050Sstevel@tonic-gate } 28060Sstevel@tonic-gate 28070Sstevel@tonic-gate void 28080Sstevel@tonic-gate cyclic_init(cyc_backend_t *be, hrtime_t resolution) 28090Sstevel@tonic-gate { 28100Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 28110Sstevel@tonic-gate 28120Sstevel@tonic-gate CYC_PTRACE("init", be, resolution); 28130Sstevel@tonic-gate cyclic_resolution = resolution; 28140Sstevel@tonic-gate 28150Sstevel@tonic-gate /* 28160Sstevel@tonic-gate * Copy the passed cyc_backend into the backend template. This must 28170Sstevel@tonic-gate * be done before the CPU can be configured. 
28180Sstevel@tonic-gate */ 28190Sstevel@tonic-gate bcopy(be, &cyclic_backend, sizeof (cyc_backend_t)); 28200Sstevel@tonic-gate 28210Sstevel@tonic-gate /* 28220Sstevel@tonic-gate * It's safe to look at the "CPU" pointer without disabling kernel 28230Sstevel@tonic-gate * preemption; cyclic_init() is called only during startup by the 28240Sstevel@tonic-gate * cyclic backend. 28250Sstevel@tonic-gate */ 28260Sstevel@tonic-gate cyclic_configure(CPU); 28270Sstevel@tonic-gate cyclic_online(CPU); 28280Sstevel@tonic-gate } 28290Sstevel@tonic-gate 28300Sstevel@tonic-gate /* 28310Sstevel@tonic-gate * It is assumed that cyclic_mp_init() is called some time after cyclic 28320Sstevel@tonic-gate * init (and therefore, after cpu0 has been initialized). We grab cpu_lock, 28330Sstevel@tonic-gate * find the already initialized CPU, and initialize every other CPU with the 28340Sstevel@tonic-gate * same backend. Finally, we register a cpu_setup function. 28350Sstevel@tonic-gate */ 28360Sstevel@tonic-gate void 28370Sstevel@tonic-gate cyclic_mp_init() 28380Sstevel@tonic-gate { 28390Sstevel@tonic-gate cpu_t *c; 28400Sstevel@tonic-gate 28410Sstevel@tonic-gate mutex_enter(&cpu_lock); 28420Sstevel@tonic-gate 28430Sstevel@tonic-gate c = cpu_list; 28440Sstevel@tonic-gate do { 28450Sstevel@tonic-gate if (c->cpu_cyclic == NULL) { 28460Sstevel@tonic-gate cyclic_configure(c); 28470Sstevel@tonic-gate cyclic_online(c); 28480Sstevel@tonic-gate } 28490Sstevel@tonic-gate } while ((c = c->cpu_next) != cpu_list); 28500Sstevel@tonic-gate 28510Sstevel@tonic-gate register_cpu_setup_func((cpu_setup_func_t *)cyclic_cpu_setup, NULL); 28520Sstevel@tonic-gate mutex_exit(&cpu_lock); 28530Sstevel@tonic-gate } 28540Sstevel@tonic-gate 28550Sstevel@tonic-gate /* 28560Sstevel@tonic-gate * int cyclic_juggle(cpu_t *) 28570Sstevel@tonic-gate * 28580Sstevel@tonic-gate * Overview 28590Sstevel@tonic-gate * 28600Sstevel@tonic-gate * cyclic_juggle() juggles as many cyclics as possible away from the 28610Sstevel@tonic-gate * specified CPU; all remaining cyclics on the CPU will either be CPU- 28620Sstevel@tonic-gate * or partition-bound. 28630Sstevel@tonic-gate * 28640Sstevel@tonic-gate * Arguments and notes 28650Sstevel@tonic-gate * 28660Sstevel@tonic-gate * The only argument to cyclic_juggle() is the CPU from which cyclics 28670Sstevel@tonic-gate * should be juggled. CPU-bound cyclics are never juggled; partition-bound 28680Sstevel@tonic-gate * cyclics are only juggled if the specified CPU is in the P_NOINTR state 28690Sstevel@tonic-gate * and there exists a P_ONLINE CPU in the partition. The cyclic subsystem 28700Sstevel@tonic-gate * assures that a cyclic will never fire late or spuriously, even while 28710Sstevel@tonic-gate * being juggled. 28720Sstevel@tonic-gate * 28730Sstevel@tonic-gate * Return value 28740Sstevel@tonic-gate * 28750Sstevel@tonic-gate * cyclic_juggle() returns a non-zero value if all cyclics were able to 28760Sstevel@tonic-gate * be juggled away from the CPU, and zero if one or more cyclics could 28770Sstevel@tonic-gate * not be juggled away. 28780Sstevel@tonic-gate * 28790Sstevel@tonic-gate * Caller's context 28800Sstevel@tonic-gate * 28810Sstevel@tonic-gate * cpu_lock must be held by the caller, and the caller must not be in 28820Sstevel@tonic-gate * interrupt context. The caller may not hold any locks which are also 28830Sstevel@tonic-gate * grabbed by any cyclic handler. 
 * While cyclic_juggle() _may_ be called in any context satisfying these
 * constraints, it _must_ be called immediately after clearing CPU_ENABLE
 * (i.e. before dropping cpu_lock).  Failure to do so could result in an
 * assertion failure in the cyclic subsystem.
 */
int
cyclic_juggle(cpu_t *c)
{
	cyc_cpu_t *cpu = c->cpu_cyclic;
	cyc_id_t *idp;
	int all_juggled = 1;

	CYC_PTRACE1("juggle", c);
	ASSERT(MUTEX_HELD(&cpu_lock));

	/*
	 * We'll go through each cyclic on the CPU, attempting to juggle
	 * each one elsewhere.
	 */
	for (idp = cyclic_id_head; idp != NULL; idp = idp->cyi_next) {
		if (idp->cyi_cpu != cpu)
			continue;

		if (cyclic_juggle_one(idp) == 0) {
			all_juggled = 0;
			continue;
		}

		ASSERT(idp->cyi_cpu != cpu);
	}

	return (all_juggled);
}

/*
 * int cyclic_offline(cpu_t *)
 *
 * Overview
 *
 *   cyclic_offline() offlines the cyclic subsystem on the specified CPU.
 *
 * Arguments and notes
 *
 *   The only argument to cyclic_offline() is a CPU to offline.
 *   cyclic_offline() will attempt to juggle cyclics away from the specified
 *   CPU.
 *
 * Return value
 *
 *   cyclic_offline() returns 1 if all cyclics on the CPU were juggled away
 *   and the cyclic subsystem on the CPU was successfully offlined.
 *   cyclic_offline() returns 0 if some cyclics remain, blocking the cyclic
 *   offline operation.  All remaining cyclics on the CPU will either be
 *   CPU- or partition-bound.
 *
 *   See the "Arguments and notes" of cyclic_juggle(), above, for more detail
 *   on cyclic juggling.
 *
 * Caller's context
 *
 *   The only caller of cyclic_offline() should be the processor management
 *   subsystem.  It is expected that the caller of cyclic_offline() will
 *   offline the CPU immediately after cyclic_offline() returns success (i.e.
 *   before dropping cpu_lock).  Moreover, it is expected that the caller will
 *   fail the CPU offline operation if cyclic_offline() returns failure.
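 *
 *   As a hedged sketch of the expected call pattern (the surrounding
 *   processor management logic is elided), the caller would do something
 *   like the following before proceeding with the offline operation:
 *
 *	ASSERT(MUTEX_HELD(&cpu_lock));
 *
 *	if (!cyclic_offline(c))
 *		return (EBUSY);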
29490Sstevel@tonic-gate */ 29500Sstevel@tonic-gate int 29510Sstevel@tonic-gate cyclic_offline(cpu_t *c) 29520Sstevel@tonic-gate { 29530Sstevel@tonic-gate cyc_cpu_t *cpu = c->cpu_cyclic; 29540Sstevel@tonic-gate cyc_id_t *idp; 29550Sstevel@tonic-gate 29560Sstevel@tonic-gate CYC_PTRACE1("offline", cpu); 29570Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 29580Sstevel@tonic-gate 29590Sstevel@tonic-gate if (!cyclic_juggle(c)) 29600Sstevel@tonic-gate return (0); 29610Sstevel@tonic-gate 29620Sstevel@tonic-gate /* 29630Sstevel@tonic-gate * This CPU is headed offline; we need to now stop omnipresent 29640Sstevel@tonic-gate * cyclic firing on this CPU. 29650Sstevel@tonic-gate */ 29660Sstevel@tonic-gate for (idp = cyclic_id_head; idp != NULL; idp = idp->cyi_next) { 29670Sstevel@tonic-gate if (idp->cyi_cpu != NULL) 29680Sstevel@tonic-gate continue; 29690Sstevel@tonic-gate 29700Sstevel@tonic-gate /* 29710Sstevel@tonic-gate * We cannot possibly be offlining the last CPU; cyi_omni_list 29720Sstevel@tonic-gate * must be non-NULL. 29730Sstevel@tonic-gate */ 29740Sstevel@tonic-gate ASSERT(idp->cyi_omni_list != NULL); 29750Sstevel@tonic-gate cyclic_omni_stop(idp, cpu); 29760Sstevel@tonic-gate } 29770Sstevel@tonic-gate 29780Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_ONLINE); 29790Sstevel@tonic-gate cpu->cyp_state = CYS_OFFLINE; 29800Sstevel@tonic-gate 29810Sstevel@tonic-gate return (1); 29820Sstevel@tonic-gate } 29830Sstevel@tonic-gate 29840Sstevel@tonic-gate /* 29850Sstevel@tonic-gate * void cyclic_online(cpu_t *) 29860Sstevel@tonic-gate * 29870Sstevel@tonic-gate * Overview 29880Sstevel@tonic-gate * 29890Sstevel@tonic-gate * cyclic_online() onlines a CPU previously offlined with cyclic_offline(). 29900Sstevel@tonic-gate * 29910Sstevel@tonic-gate * Arguments and notes 29920Sstevel@tonic-gate * 29930Sstevel@tonic-gate * cyclic_online()'s only argument is a CPU to online. The specified 29940Sstevel@tonic-gate * CPU must have been previously offlined with cyclic_offline(). After 29950Sstevel@tonic-gate * cyclic_online() returns, the specified CPU will be eligible to execute 29960Sstevel@tonic-gate * cyclics. 29970Sstevel@tonic-gate * 29980Sstevel@tonic-gate * Return value 29990Sstevel@tonic-gate * 30000Sstevel@tonic-gate * None; cyclic_online() always succeeds. 30010Sstevel@tonic-gate * 30020Sstevel@tonic-gate * Caller's context 30030Sstevel@tonic-gate * 30040Sstevel@tonic-gate * cyclic_online() should only be called by the processor management 30050Sstevel@tonic-gate * subsystem; cpu_lock must be held. 30060Sstevel@tonic-gate */ 30070Sstevel@tonic-gate void 30080Sstevel@tonic-gate cyclic_online(cpu_t *c) 30090Sstevel@tonic-gate { 30100Sstevel@tonic-gate cyc_cpu_t *cpu = c->cpu_cyclic; 30110Sstevel@tonic-gate cyc_id_t *idp; 30120Sstevel@tonic-gate 30130Sstevel@tonic-gate CYC_PTRACE1("online", cpu); 30140Sstevel@tonic-gate ASSERT(c->cpu_flags & CPU_ENABLE); 30150Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 30160Sstevel@tonic-gate ASSERT(cpu->cyp_state == CYS_OFFLINE); 30170Sstevel@tonic-gate 30180Sstevel@tonic-gate cpu->cyp_state = CYS_ONLINE; 30190Sstevel@tonic-gate 30200Sstevel@tonic-gate /* 30210Sstevel@tonic-gate * Now that this CPU is open for business, we need to start firing 30220Sstevel@tonic-gate * all omnipresent cyclics on it. 
30230Sstevel@tonic-gate */ 30240Sstevel@tonic-gate for (idp = cyclic_id_head; idp != NULL; idp = idp->cyi_next) { 30250Sstevel@tonic-gate if (idp->cyi_cpu != NULL) 30260Sstevel@tonic-gate continue; 30270Sstevel@tonic-gate 30280Sstevel@tonic-gate cyclic_omni_start(idp, cpu); 30290Sstevel@tonic-gate } 30300Sstevel@tonic-gate } 30310Sstevel@tonic-gate 30320Sstevel@tonic-gate /* 30330Sstevel@tonic-gate * void cyclic_move_in(cpu_t *) 30340Sstevel@tonic-gate * 30350Sstevel@tonic-gate * Overview 30360Sstevel@tonic-gate * 30370Sstevel@tonic-gate * cyclic_move_in() is called by the CPU partition code immediately after 30380Sstevel@tonic-gate * the specified CPU has moved into a new partition. 30390Sstevel@tonic-gate * 30400Sstevel@tonic-gate * Arguments and notes 30410Sstevel@tonic-gate * 30420Sstevel@tonic-gate * The only argument to cyclic_move_in() is a CPU which has moved into a 30430Sstevel@tonic-gate * new partition. If the specified CPU is P_ONLINE, and every other 30440Sstevel@tonic-gate * CPU in the specified CPU's new partition is P_NOINTR, cyclic_move_in() 30450Sstevel@tonic-gate * will juggle all partition-bound, CPU-unbound cyclics to the specified 30460Sstevel@tonic-gate * CPU. 30470Sstevel@tonic-gate * 30480Sstevel@tonic-gate * Return value 30490Sstevel@tonic-gate * 30500Sstevel@tonic-gate * None; cyclic_move_in() always succeeds. 30510Sstevel@tonic-gate * 30520Sstevel@tonic-gate * Caller's context 30530Sstevel@tonic-gate * 30540Sstevel@tonic-gate * cyclic_move_in() should _only_ be called immediately after a CPU has 30550Sstevel@tonic-gate * moved into a new partition, with cpu_lock held. As with other calls 30560Sstevel@tonic-gate * into the cyclic subsystem, no lock may be held which is also grabbed 30570Sstevel@tonic-gate * by any cyclic handler. 30580Sstevel@tonic-gate */ 30590Sstevel@tonic-gate void 30600Sstevel@tonic-gate cyclic_move_in(cpu_t *d) 30610Sstevel@tonic-gate { 30620Sstevel@tonic-gate cyc_id_t *idp; 30630Sstevel@tonic-gate cyc_cpu_t *dest = d->cpu_cyclic; 30640Sstevel@tonic-gate cyclic_t *cyclic; 30650Sstevel@tonic-gate cpupart_t *part = d->cpu_part; 30660Sstevel@tonic-gate 30670Sstevel@tonic-gate CYC_PTRACE("move-in", dest, part); 30680Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 30690Sstevel@tonic-gate 30700Sstevel@tonic-gate /* 30710Sstevel@tonic-gate * Look for CYF_PART_BOUND cyclics in the new partition. If 30720Sstevel@tonic-gate * we find one, check to see if it is currently on a CPU which has 30730Sstevel@tonic-gate * interrupts disabled. If it is (and if this CPU currently has 30740Sstevel@tonic-gate * interrupts enabled), we'll juggle those cyclics over here. 30750Sstevel@tonic-gate */ 30760Sstevel@tonic-gate if (!(d->cpu_flags & CPU_ENABLE)) { 30770Sstevel@tonic-gate CYC_PTRACE1("move-in-none", dest); 30780Sstevel@tonic-gate return; 30790Sstevel@tonic-gate } 30800Sstevel@tonic-gate 30810Sstevel@tonic-gate for (idp = cyclic_id_head; idp != NULL; idp = idp->cyi_next) { 30820Sstevel@tonic-gate cyc_cpu_t *cpu = idp->cyi_cpu; 30830Sstevel@tonic-gate cpu_t *c; 30840Sstevel@tonic-gate 30850Sstevel@tonic-gate /* 30860Sstevel@tonic-gate * Omnipresent cyclics are exempt from juggling. 
30870Sstevel@tonic-gate */ 30880Sstevel@tonic-gate if (cpu == NULL) 30890Sstevel@tonic-gate continue; 30900Sstevel@tonic-gate 30910Sstevel@tonic-gate c = cpu->cyp_cpu; 30920Sstevel@tonic-gate 30930Sstevel@tonic-gate if (c->cpu_part != part || (c->cpu_flags & CPU_ENABLE)) 30940Sstevel@tonic-gate continue; 30950Sstevel@tonic-gate 30960Sstevel@tonic-gate cyclic = &cpu->cyp_cyclics[idp->cyi_ndx]; 30970Sstevel@tonic-gate 30980Sstevel@tonic-gate if (cyclic->cy_flags & CYF_CPU_BOUND) 30990Sstevel@tonic-gate continue; 31000Sstevel@tonic-gate 31010Sstevel@tonic-gate /* 31020Sstevel@tonic-gate * We know that this cyclic is bound to its processor set 31030Sstevel@tonic-gate * (otherwise, it would not be on a CPU with interrupts 31040Sstevel@tonic-gate * disabled); juggle it to our CPU. 31050Sstevel@tonic-gate */ 31060Sstevel@tonic-gate ASSERT(cyclic->cy_flags & CYF_PART_BOUND); 31070Sstevel@tonic-gate cyclic_juggle_one_to(idp, dest); 31080Sstevel@tonic-gate } 31090Sstevel@tonic-gate 31100Sstevel@tonic-gate CYC_PTRACE1("move-in-done", dest); 31110Sstevel@tonic-gate } 31120Sstevel@tonic-gate 31130Sstevel@tonic-gate /* 31140Sstevel@tonic-gate * int cyclic_move_out(cpu_t *) 31150Sstevel@tonic-gate * 31160Sstevel@tonic-gate * Overview 31170Sstevel@tonic-gate * 31180Sstevel@tonic-gate * cyclic_move_out() is called by the CPU partition code immediately before 31190Sstevel@tonic-gate * the specified CPU is to move out of its partition. 31200Sstevel@tonic-gate * 31210Sstevel@tonic-gate * Arguments and notes 31220Sstevel@tonic-gate * 31230Sstevel@tonic-gate * The only argument to cyclic_move_out() is a CPU which is to move out of 31240Sstevel@tonic-gate * its partition. 31250Sstevel@tonic-gate * 31260Sstevel@tonic-gate * cyclic_move_out() will attempt to juggle away all partition-bound 31270Sstevel@tonic-gate * cyclics. If the specified CPU is the last CPU in a partition with 31280Sstevel@tonic-gate * partition-bound cyclics, cyclic_move_out() will fail. If there exists 31290Sstevel@tonic-gate * a partition-bound cyclic which is CPU-bound to the specified CPU, 31300Sstevel@tonic-gate * cyclic_move_out() will fail. 31310Sstevel@tonic-gate * 31320Sstevel@tonic-gate * Note that cyclic_move_out() will _only_ attempt to juggle away 31330Sstevel@tonic-gate * partition-bound cyclics; CPU-bound cyclics which are not partition-bound 31340Sstevel@tonic-gate * and unbound cyclics are not affected by changing the partition 31350Sstevel@tonic-gate * affiliation of the CPU. 31360Sstevel@tonic-gate * 31370Sstevel@tonic-gate * Return value 31380Sstevel@tonic-gate * 31390Sstevel@tonic-gate * cyclic_move_out() returns 1 if all partition-bound cyclics on the CPU 31400Sstevel@tonic-gate * were juggled away; 0 if some cyclics remain. 31410Sstevel@tonic-gate * 31420Sstevel@tonic-gate * Caller's context 31430Sstevel@tonic-gate * 31440Sstevel@tonic-gate * cyclic_move_out() should _only_ be called immediately before a CPU has 31450Sstevel@tonic-gate * moved out of its partition, with cpu_lock held. It is expected that 31460Sstevel@tonic-gate * the caller of cyclic_move_out() will change the processor set affiliation 31470Sstevel@tonic-gate * of the specified CPU immediately after cyclic_move_out() returns 31480Sstevel@tonic-gate * success (i.e. before dropping cpu_lock). Moreover, it is expected that 31490Sstevel@tonic-gate * the caller will fail the CPU repartitioning operation if cyclic_move_out() 31500Sstevel@tonic-gate * returns failure. 
As with other calls into the cyclic subsystem, no lock 31510Sstevel@tonic-gate * may be held which is also grabbed by any cyclic handler. 31520Sstevel@tonic-gate */ 31530Sstevel@tonic-gate int 31540Sstevel@tonic-gate cyclic_move_out(cpu_t *c) 31550Sstevel@tonic-gate { 31560Sstevel@tonic-gate cyc_id_t *idp; 31570Sstevel@tonic-gate cyc_cpu_t *cpu = c->cpu_cyclic, *dest; 31580Sstevel@tonic-gate cyclic_t *cyclic, *cyclics = cpu->cyp_cyclics; 31590Sstevel@tonic-gate cpupart_t *part = c->cpu_part; 31600Sstevel@tonic-gate 31610Sstevel@tonic-gate CYC_PTRACE1("move-out", cpu); 31620Sstevel@tonic-gate ASSERT(MUTEX_HELD(&cpu_lock)); 31630Sstevel@tonic-gate 31640Sstevel@tonic-gate /* 31650Sstevel@tonic-gate * If there are any CYF_PART_BOUND cyclics on this CPU, we need 31660Sstevel@tonic-gate * to try to juggle them away. 31670Sstevel@tonic-gate */ 31680Sstevel@tonic-gate for (idp = cyclic_id_head; idp != NULL; idp = idp->cyi_next) { 31690Sstevel@tonic-gate 31700Sstevel@tonic-gate if (idp->cyi_cpu != cpu) 31710Sstevel@tonic-gate continue; 31720Sstevel@tonic-gate 31730Sstevel@tonic-gate cyclic = &cyclics[idp->cyi_ndx]; 31740Sstevel@tonic-gate 31750Sstevel@tonic-gate if (!(cyclic->cy_flags & CYF_PART_BOUND)) 31760Sstevel@tonic-gate continue; 31770Sstevel@tonic-gate 31780Sstevel@tonic-gate dest = cyclic_pick_cpu(part, c, c, cyclic->cy_flags); 31790Sstevel@tonic-gate 31800Sstevel@tonic-gate if (dest == NULL) { 31810Sstevel@tonic-gate /* 31820Sstevel@tonic-gate * We can't juggle this cyclic; we need to return 31830Sstevel@tonic-gate * failure (we won't bother trying to juggle away 31840Sstevel@tonic-gate * other cyclics). 31850Sstevel@tonic-gate */ 31860Sstevel@tonic-gate CYC_PTRACE("move-out-fail", cpu, idp); 31870Sstevel@tonic-gate return (0); 31880Sstevel@tonic-gate } 31890Sstevel@tonic-gate cyclic_juggle_one_to(idp, dest); 31900Sstevel@tonic-gate } 31910Sstevel@tonic-gate 31920Sstevel@tonic-gate CYC_PTRACE1("move-out-done", cpu); 31930Sstevel@tonic-gate return (1); 31940Sstevel@tonic-gate } 31950Sstevel@tonic-gate 31960Sstevel@tonic-gate /* 31970Sstevel@tonic-gate * void cyclic_suspend() 31980Sstevel@tonic-gate * 31990Sstevel@tonic-gate * Overview 32000Sstevel@tonic-gate * 32010Sstevel@tonic-gate * cyclic_suspend() suspends all cyclic activity throughout the cyclic 32020Sstevel@tonic-gate * subsystem. It should be called only by subsystems which are attempting 32030Sstevel@tonic-gate * to suspend the entire system (e.g. checkpoint/resume, dynamic 32040Sstevel@tonic-gate * reconfiguration). 32050Sstevel@tonic-gate * 32060Sstevel@tonic-gate * Arguments and notes 32070Sstevel@tonic-gate * 32080Sstevel@tonic-gate * cyclic_suspend() takes no arguments. Each CPU with an active cyclic 32090Sstevel@tonic-gate * disables its backend (offline CPUs disable their backends as part of 32100Sstevel@tonic-gate * the cyclic_offline() operation), thereby disabling future CY_HIGH_LEVEL 32110Sstevel@tonic-gate * interrupts. 32120Sstevel@tonic-gate * 32130Sstevel@tonic-gate * Note that disabling CY_HIGH_LEVEL interrupts does not completely preclude 32140Sstevel@tonic-gate * cyclic handlers from being called after cyclic_suspend() returns: if a 32150Sstevel@tonic-gate * CY_LOCK_LEVEL or CY_LOW_LEVEL interrupt thread was blocked at the time 32160Sstevel@tonic-gate * of cyclic_suspend(), cyclic handlers at its level may continue to be 32170Sstevel@tonic-gate * called after the interrupt thread becomes unblocked. 
/*
 * void cyclic_suspend()
 *
 * Overview
 *
 *   cyclic_suspend() suspends all cyclic activity throughout the cyclic
 *   subsystem.  It should be called only by subsystems which are attempting
 *   to suspend the entire system (e.g. checkpoint/resume, dynamic
 *   reconfiguration).
 *
 * Arguments and notes
 *
 *   cyclic_suspend() takes no arguments.  Each CPU with an active cyclic
 *   disables its backend (offline CPUs disable their backends as part of
 *   the cyclic_offline() operation), thereby disabling future CY_HIGH_LEVEL
 *   interrupts.
 *
 *   Note that disabling CY_HIGH_LEVEL interrupts does not completely preclude
 *   cyclic handlers from being called after cyclic_suspend() returns: if a
 *   CY_LOCK_LEVEL or CY_LOW_LEVEL interrupt thread was blocked at the time
 *   of cyclic_suspend(), cyclic handlers at its level may continue to be
 *   called after the interrupt thread becomes unblocked.  The
 *   post-cyclic_suspend() activity is bounded by the pend count on all
 *   cyclics at the time of cyclic_suspend().  Callers concerned with more
 *   than simply disabling future CY_HIGH_LEVEL interrupts must check for
 *   this condition.
 *
 *   On most platforms, timestamps from gethrtime() and gethrestime() are not
 *   guaranteed to monotonically increase between cyclic_suspend() and
 *   cyclic_resume().  However, timestamps are guaranteed to monotonically
 *   increase across the entire cyclic_suspend()/cyclic_resume() operation.
 *   That is, every timestamp obtained before cyclic_suspend() will be less
 *   than every timestamp obtained after cyclic_resume().
 *
 * Return value
 *
 *   None; cyclic_suspend() always succeeds.
 *
 * Caller's context
 *
 *   The cyclic subsystem must be configured on every valid CPU;
 *   cyclic_suspend() may not be called during boot or during dynamic
 *   reconfiguration.  Additionally, cpu_lock must be held, and the caller
 *   cannot be in high-level interrupt context.  However, unlike most other
 *   cyclic entry points, cyclic_suspend() may be called with locks held
 *   which are also acquired by CY_LOCK_LEVEL or CY_LOW_LEVEL cyclic
 *   handlers.
 */
void
cyclic_suspend()
{
	cpu_t *c;
	cyc_cpu_t *cpu;
	cyc_xcallarg_t arg;
	cyc_backend_t *be;

	CYC_PTRACE0("suspend");
	ASSERT(MUTEX_HELD(&cpu_lock));
	c = cpu_list;

	do {
		cpu = c->cpu_cyclic;
		be = cpu->cyp_backend;
		arg.cyx_cpu = cpu;

		be->cyb_xcall(be->cyb_arg, c,
		    (cyc_func_t)cyclic_suspend_xcall, &arg);
	} while ((c = c->cpu_next) != cpu_list);
}
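/*
 * Purely illustrative and not part of the original source: a minimal sketch
 * of the suspend side of a hypothetical system-suspend path, assuming it
 * runs in kernel context below high-level interrupt level.  The function
 * name example_system_suspend() is an assumption for illustration only.
 */
#ifdef	CYCLIC_USAGE_EXAMPLES
static void
example_system_suspend(void)
{
	/*
	 * cyclic_suspend() requires cpu_lock; it may be called with locks
	 * held that are also acquired by CY_LOCK_LEVEL or CY_LOW_LEVEL
	 * handlers, but never from high-level interrupt context.
	 */
	mutex_enter(&cpu_lock);
	cyclic_suspend();
	mutex_exit(&cpu_lock);

	/*
	 * Future CY_HIGH_LEVEL interrupts are now disabled, but handlers at
	 * CY_LOCK_LEVEL and CY_LOW_LEVEL may still run, bounded by the pend
	 * counts accumulated before the suspend; a caller that needs full
	 * quiescence must account for that residual activity here.
	 */
}
#endif	/* CYCLIC_USAGE_EXAMPLES */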
/*
 * void cyclic_resume()
 *
 * Overview
 *
 *   cyclic_resume() resumes all cyclic activity throughout the cyclic
 *   subsystem.  It should be called only by system-suspending subsystems.
 *
 * Arguments and notes
 *
 *   cyclic_resume() takes no arguments.  Each CPU with an active cyclic
 *   reenables and reprograms its backend (offline CPUs are not reenabled).
 *
 *   On most platforms, timestamps from gethrtime() and gethrestime() are not
 *   guaranteed to monotonically increase between cyclic_suspend() and
 *   cyclic_resume().  However, timestamps are guaranteed to monotonically
 *   increase across the entire cyclic_suspend()/cyclic_resume() operation.
 *   That is, every timestamp obtained before cyclic_suspend() will be less
 *   than every timestamp obtained after cyclic_resume().
 *
 * Return value
 *
 *   None; cyclic_resume() always succeeds.
 *
 * Caller's context
 *
 *   The cyclic subsystem must be configured on every valid CPU;
 *   cyclic_resume() may not be called during boot or during dynamic
 *   reconfiguration.  Additionally, cpu_lock must be held, and the caller
 *   cannot be in high-level interrupt context.  However, unlike most other
 *   cyclic entry points, cyclic_resume() may be called with locks held which
 *   are also acquired by CY_LOCK_LEVEL or CY_LOW_LEVEL cyclic handlers.
 */
void
cyclic_resume()
{
	cpu_t *c;
	cyc_cpu_t *cpu;
	cyc_xcallarg_t arg;
	cyc_backend_t *be;

	CYC_PTRACE0("resume");
	ASSERT(MUTEX_HELD(&cpu_lock));

	c = cpu_list;

	do {
		cpu = c->cpu_cyclic;
		be = cpu->cyp_backend;
		arg.cyx_cpu = cpu;

		be->cyb_xcall(be->cyb_arg, c,
		    (cyc_func_t)cyclic_resume_xcall, &arg);
	} while ((c = c->cpu_next) != cpu_list);
}
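/*
 * Purely illustrative and not part of the original source: a minimal sketch
 * of a complete suspend/resume bracket, showing the timestamp guarantee
 * described above.  Timestamps taken between cyclic_suspend() and
 * cyclic_resume() may not be monotonic on all platforms, but every timestamp
 * taken before the bracket is less than every timestamp taken after it.  The
 * function name example_suspend_resume_bracket(), and the choice to hold
 * cpu_lock across the entire window, are assumptions for illustration only.
 */
#ifdef	CYCLIC_USAGE_EXAMPLES
static void
example_suspend_resume_bracket(void)
{
	hrtime_t before, after;

	before = gethrtime();		/* taken before cyclic_suspend() */

	mutex_enter(&cpu_lock);
	cyclic_suspend();

	/*
	 * System-wide suspend/resume work would happen here; gethrtime()
	 * values observed inside this window are not guaranteed to be
	 * monotonic on all platforms.
	 */

	cyclic_resume();
	mutex_exit(&cpu_lock);

	after = gethrtime();		/* taken after cyclic_resume() */

	ASSERT(before < after);		/* guaranteed across the operation */
}
#endif	/* CYCLIC_USAGE_EXAMPLES */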