1*11389SAlexander.Kolbasov@Sun.COM /* 2*11389SAlexander.Kolbasov@Sun.COM * CDDL HEADER START 3*11389SAlexander.Kolbasov@Sun.COM * 4*11389SAlexander.Kolbasov@Sun.COM * The contents of this file are subject to the terms of the 5*11389SAlexander.Kolbasov@Sun.COM * Common Development and Distribution License (the "License"). 6*11389SAlexander.Kolbasov@Sun.COM * You may not use this file except in compliance with the License. 7*11389SAlexander.Kolbasov@Sun.COM * 8*11389SAlexander.Kolbasov@Sun.COM * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE 9*11389SAlexander.Kolbasov@Sun.COM * or http://www.opensolaris.org/os/licensing. 10*11389SAlexander.Kolbasov@Sun.COM * See the License for the specific language governing permissions 11*11389SAlexander.Kolbasov@Sun.COM * and limitations under the License. 12*11389SAlexander.Kolbasov@Sun.COM * 13*11389SAlexander.Kolbasov@Sun.COM * When distributing Covered Code, include this CDDL HEADER in each 14*11389SAlexander.Kolbasov@Sun.COM * file and include the License file at usr/src/OPENSOLARIS.LICENSE. 15*11389SAlexander.Kolbasov@Sun.COM * If applicable, add the following below this CDDL HEADER, with the 16*11389SAlexander.Kolbasov@Sun.COM * fields enclosed by brackets "[]" replaced with your own identifying 17*11389SAlexander.Kolbasov@Sun.COM * information: Portions Copyright [yyyy] [name of copyright owner] 18*11389SAlexander.Kolbasov@Sun.COM * 19*11389SAlexander.Kolbasov@Sun.COM * CDDL HEADER END 20*11389SAlexander.Kolbasov@Sun.COM */ 21*11389SAlexander.Kolbasov@Sun.COM 22*11389SAlexander.Kolbasov@Sun.COM /* 23*11389SAlexander.Kolbasov@Sun.COM * Copyright 2009 Sun Microsystems, Inc. All rights reserved. 24*11389SAlexander.Kolbasov@Sun.COM * Use is subject to license terms. 25*11389SAlexander.Kolbasov@Sun.COM */ 26*11389SAlexander.Kolbasov@Sun.COM 27*11389SAlexander.Kolbasov@Sun.COM /* 28*11389SAlexander.Kolbasov@Sun.COM * Support for determining capacity and utilization of performance relevant 29*11389SAlexander.Kolbasov@Sun.COM * hardware components in a computer 30*11389SAlexander.Kolbasov@Sun.COM * 31*11389SAlexander.Kolbasov@Sun.COM * THEORY 32*11389SAlexander.Kolbasov@Sun.COM * ------ 33*11389SAlexander.Kolbasov@Sun.COM * The capacity and utilization of the performance relevant hardware components 34*11389SAlexander.Kolbasov@Sun.COM * is needed to be able to optimize performance while minimizing the amount of 35*11389SAlexander.Kolbasov@Sun.COM * power used on a system. The idea is to use hardware performance counters 36*11389SAlexander.Kolbasov@Sun.COM * and potentially other means to determine the capacity and utilization of 37*11389SAlexander.Kolbasov@Sun.COM * performance relevant hardware components (eg. execution pipeline, cache, 38*11389SAlexander.Kolbasov@Sun.COM * memory, etc.) and attribute the utilization to the responsible CPU and the 39*11389SAlexander.Kolbasov@Sun.COM * thread running there. 40*11389SAlexander.Kolbasov@Sun.COM * 41*11389SAlexander.Kolbasov@Sun.COM * This will help characterize the utilization of performance relevant 42*11389SAlexander.Kolbasov@Sun.COM * components and how much is used by each CPU and each thread. With 43*11389SAlexander.Kolbasov@Sun.COM * that data, the utilization can be aggregated to all the CPUs sharing each 44*11389SAlexander.Kolbasov@Sun.COM * performance relevant hardware component to calculate the total utilization 45*11389SAlexander.Kolbasov@Sun.COM * of each component and compare that with the component's capacity to 46*11389SAlexander.Kolbasov@Sun.COM * essentially determine the actual hardware load of the component. The 47*11389SAlexander.Kolbasov@Sun.COM * hardware utilization attributed to each running thread can also be 48*11389SAlexander.Kolbasov@Sun.COM * aggregated to determine the total hardware utilization of each component to 49*11389SAlexander.Kolbasov@Sun.COM * a workload. 50*11389SAlexander.Kolbasov@Sun.COM * 51*11389SAlexander.Kolbasov@Sun.COM * Once that is done, one can determine how much of each performance relevant 52*11389SAlexander.Kolbasov@Sun.COM * hardware component is needed by a given thread or set of threads (eg. a 53*11389SAlexander.Kolbasov@Sun.COM * workload) and size up exactly what hardware is needed by the threads and how 54*11389SAlexander.Kolbasov@Sun.COM * much. With this info, we can better place threads among CPUs to match their 55*11389SAlexander.Kolbasov@Sun.COM * exact hardware resource needs and potentially lower or raise the power based 56*11389SAlexander.Kolbasov@Sun.COM * on their utilization or pack threads onto the fewest hardware components 57*11389SAlexander.Kolbasov@Sun.COM * needed and power off any remaining unused components to minimize power 58*11389SAlexander.Kolbasov@Sun.COM * without sacrificing performance. 59*11389SAlexander.Kolbasov@Sun.COM * 60*11389SAlexander.Kolbasov@Sun.COM * IMPLEMENTATION 61*11389SAlexander.Kolbasov@Sun.COM * -------------- 62*11389SAlexander.Kolbasov@Sun.COM * The code has been designed and implemented to make (un)programming and 63*11389SAlexander.Kolbasov@Sun.COM * reading the counters for a given CPU as lightweight and fast as possible. 64*11389SAlexander.Kolbasov@Sun.COM * This is very important because we need to read and potentially (un)program 65*11389SAlexander.Kolbasov@Sun.COM * the counters very often and in performance sensitive code. Specifically, 66*11389SAlexander.Kolbasov@Sun.COM * the counters may need to be (un)programmed during context switch and/or a 67*11389SAlexander.Kolbasov@Sun.COM * cyclic handler when there are more counter events to count than existing 68*11389SAlexander.Kolbasov@Sun.COM * counters. 69*11389SAlexander.Kolbasov@Sun.COM * 70*11389SAlexander.Kolbasov@Sun.COM * Consequently, the code has been split up to allow allocating and 71*11389SAlexander.Kolbasov@Sun.COM * initializing everything needed to program and read the counters on a given 72*11389SAlexander.Kolbasov@Sun.COM * CPU once and make (un)programming and reading the counters for a given CPU 73*11389SAlexander.Kolbasov@Sun.COM * not have to allocate/free memory or grab any locks. To do this, all the 74*11389SAlexander.Kolbasov@Sun.COM * state needed to (un)program and read the counters on a CPU is kept per CPU 75*11389SAlexander.Kolbasov@Sun.COM * and is made lock free by forcing any code that reads or manipulates the 76*11389SAlexander.Kolbasov@Sun.COM * counters or the state needed to (un)program or read the counters to run on 77*11389SAlexander.Kolbasov@Sun.COM * the target CPU and disable preemption while running on the target CPU to 78*11389SAlexander.Kolbasov@Sun.COM * protect any critical sections. All counter manipulation on the target CPU is 79*11389SAlexander.Kolbasov@Sun.COM * happening either from a cross-call to the target CPU or at the same PIL as 80*11389SAlexander.Kolbasov@Sun.COM * used by the cross-call subsystem. This guarantees that counter manipulation 81*11389SAlexander.Kolbasov@Sun.COM * is not interrupted by cross-calls from other CPUs. 82*11389SAlexander.Kolbasov@Sun.COM * 83*11389SAlexander.Kolbasov@Sun.COM * The synchronization has been made lock free or as simple as possible for 84*11389SAlexander.Kolbasov@Sun.COM * performance and to avoid getting the locking all tangled up when we interpose 85*11389SAlexander.Kolbasov@Sun.COM * on the CPC routines that (un)program the counters to manage the counters 86*11389SAlexander.Kolbasov@Sun.COM * between the kernel and user on each CPU. When the user starts using the 87*11389SAlexander.Kolbasov@Sun.COM * counters on a given CPU, the kernel will unprogram the counters that it is 88*11389SAlexander.Kolbasov@Sun.COM * using on that CPU just before they are programmed for the user. Then the 89*11389SAlexander.Kolbasov@Sun.COM * kernel will program the counters on a given CPU for its own use when the user 90*11389SAlexander.Kolbasov@Sun.COM * stops using them. 91*11389SAlexander.Kolbasov@Sun.COM * 92*11389SAlexander.Kolbasov@Sun.COM * There is a special interaction with DTrace cpc provider (dcpc). Before dcpc 93*11389SAlexander.Kolbasov@Sun.COM * enables any probe, it requests to disable and unprogram all counters used for 94*11389SAlexander.Kolbasov@Sun.COM * capacity and utilizations. These counters are never re-programmed back until 95*11389SAlexander.Kolbasov@Sun.COM * dcpc completes. When all DTrace cpc probes are removed, dcpc notifies CU 96*11389SAlexander.Kolbasov@Sun.COM * framework and it re-programs the counters. 97*11389SAlexander.Kolbasov@Sun.COM * 98*11389SAlexander.Kolbasov@Sun.COM * When a CPU is going offline, its CU counters are unprogrammed and disabled, 99*11389SAlexander.Kolbasov@Sun.COM * so that they would not be re-programmed again by some other activity on the 100*11389SAlexander.Kolbasov@Sun.COM * CPU that is going offline. 101*11389SAlexander.Kolbasov@Sun.COM * 102*11389SAlexander.Kolbasov@Sun.COM * The counters are programmed during boot. However, a flag is available to 103*11389SAlexander.Kolbasov@Sun.COM * disable this if necessary (see cu_flag below). A handler is provided to 104*11389SAlexander.Kolbasov@Sun.COM * (un)program the counters during CPU on/offline. Basic routines are provided 105*11389SAlexander.Kolbasov@Sun.COM * to initialize and tear down this module, initialize and tear down any state 106*11389SAlexander.Kolbasov@Sun.COM * needed for a given CPU, and (un)program the counters for a given CPU. 107*11389SAlexander.Kolbasov@Sun.COM * Lastly, a handler is provided to read the counters and attribute the 108*11389SAlexander.Kolbasov@Sun.COM * utilization to the responsible CPU. 109*11389SAlexander.Kolbasov@Sun.COM */ 110*11389SAlexander.Kolbasov@Sun.COM #include <sys/types.h> 111*11389SAlexander.Kolbasov@Sun.COM #include <sys/cmn_err.h> 112*11389SAlexander.Kolbasov@Sun.COM #include <sys/cpuvar.h> 113*11389SAlexander.Kolbasov@Sun.COM #include <sys/ddi.h> 114*11389SAlexander.Kolbasov@Sun.COM #include <sys/disp.h> 115*11389SAlexander.Kolbasov@Sun.COM #include <sys/sdt.h> 116*11389SAlexander.Kolbasov@Sun.COM #include <sys/sunddi.h> 117*11389SAlexander.Kolbasov@Sun.COM #include <sys/thread.h> 118*11389SAlexander.Kolbasov@Sun.COM #include <sys/pghw.h> 119*11389SAlexander.Kolbasov@Sun.COM #include <sys/cmt.h> 120*11389SAlexander.Kolbasov@Sun.COM #include <sys/x_call.h> 121*11389SAlexander.Kolbasov@Sun.COM #include <sys/cap_util.h> 122*11389SAlexander.Kolbasov@Sun.COM 123*11389SAlexander.Kolbasov@Sun.COM #include <sys/archsystm.h> 124*11389SAlexander.Kolbasov@Sun.COM #include <sys/promif.h> 125*11389SAlexander.Kolbasov@Sun.COM 126*11389SAlexander.Kolbasov@Sun.COM #if defined(__x86) 127*11389SAlexander.Kolbasov@Sun.COM #include <sys/xc_levels.h> 128*11389SAlexander.Kolbasov@Sun.COM #endif 129*11389SAlexander.Kolbasov@Sun.COM 130*11389SAlexander.Kolbasov@Sun.COM 131*11389SAlexander.Kolbasov@Sun.COM /* 132*11389SAlexander.Kolbasov@Sun.COM * Default CPU hardware performance counter flags to use for measuring capacity 133*11389SAlexander.Kolbasov@Sun.COM * and utilization 134*11389SAlexander.Kolbasov@Sun.COM */ 135*11389SAlexander.Kolbasov@Sun.COM #define CU_CPC_FLAGS_DEFAULT \ 136*11389SAlexander.Kolbasov@Sun.COM (CPC_COUNT_USER|CPC_COUNT_SYSTEM|CPC_OVF_NOTIFY_EMT) 137*11389SAlexander.Kolbasov@Sun.COM 138*11389SAlexander.Kolbasov@Sun.COM /* 139*11389SAlexander.Kolbasov@Sun.COM * Possible Flags for controlling this module. 140*11389SAlexander.Kolbasov@Sun.COM */ 141*11389SAlexander.Kolbasov@Sun.COM #define CU_FLAG_ENABLE 1 /* Enable module */ 142*11389SAlexander.Kolbasov@Sun.COM #define CU_FLAG_READY 2 /* Ready to setup module */ 143*11389SAlexander.Kolbasov@Sun.COM #define CU_FLAG_ON 4 /* Module is on */ 144*11389SAlexander.Kolbasov@Sun.COM 145*11389SAlexander.Kolbasov@Sun.COM /* 146*11389SAlexander.Kolbasov@Sun.COM * pg_cpu kstats calculate utilization rate and maximum utilization rate for 147*11389SAlexander.Kolbasov@Sun.COM * some CPUs. The rate is calculated based on data from two subsequent 148*11389SAlexander.Kolbasov@Sun.COM * snapshots. When the time between such two snapshots is too small, the 149*11389SAlexander.Kolbasov@Sun.COM * resulting rate may have low accuracy, so we only consider snapshots which 150*11389SAlexander.Kolbasov@Sun.COM * are separated by SAMPLE_INTERVAL nanoseconds from one another. We do not 151*11389SAlexander.Kolbasov@Sun.COM * update the rate if the interval is smaller than that. 152*11389SAlexander.Kolbasov@Sun.COM * 153*11389SAlexander.Kolbasov@Sun.COM * Use one tenth of a second as the minimum interval for utilization rate 154*11389SAlexander.Kolbasov@Sun.COM * calculation. 155*11389SAlexander.Kolbasov@Sun.COM * 156*11389SAlexander.Kolbasov@Sun.COM * NOTE: The CU_SAMPLE_INTERVAL_MIN should be higher than the scaling factor in 157*11389SAlexander.Kolbasov@Sun.COM * the CU_RATE() macro below to guarantee that we never divide by zero. 158*11389SAlexander.Kolbasov@Sun.COM * 159*11389SAlexander.Kolbasov@Sun.COM * Rate is the number of events per second. The rate is the number of events 160*11389SAlexander.Kolbasov@Sun.COM * divided by time and multiplied by the number of nanoseconds in a second. We 161*11389SAlexander.Kolbasov@Sun.COM * do not want time to be too small since it will cause large errors in 162*11389SAlexander.Kolbasov@Sun.COM * division. 163*11389SAlexander.Kolbasov@Sun.COM * 164*11389SAlexander.Kolbasov@Sun.COM * We do not want to multiply two large numbers (the instruction count and 165*11389SAlexander.Kolbasov@Sun.COM * NANOSEC) either since it may cause integer overflow. So we divide both the 166*11389SAlexander.Kolbasov@Sun.COM * numerator and the denominator by the same value. 167*11389SAlexander.Kolbasov@Sun.COM * 168*11389SAlexander.Kolbasov@Sun.COM * NOTE: The scaling factor below should be less than CU_SAMPLE_INTERVAL_MIN 169*11389SAlexander.Kolbasov@Sun.COM * above to guarantee that time divided by this value is always non-zero. 170*11389SAlexander.Kolbasov@Sun.COM */ 171*11389SAlexander.Kolbasov@Sun.COM #define CU_RATE(val, time) \ 172*11389SAlexander.Kolbasov@Sun.COM (((val) * (NANOSEC / CU_SCALE)) / ((time) / CU_SCALE)) 173*11389SAlexander.Kolbasov@Sun.COM 174*11389SAlexander.Kolbasov@Sun.COM #define CU_SAMPLE_INTERVAL_MIN (NANOSEC / 10) 175*11389SAlexander.Kolbasov@Sun.COM 176*11389SAlexander.Kolbasov@Sun.COM #define CU_SCALE (CU_SAMPLE_INTERVAL_MIN / 10000) 177*11389SAlexander.Kolbasov@Sun.COM 178*11389SAlexander.Kolbasov@Sun.COM /* 179*11389SAlexander.Kolbasov@Sun.COM * When the time between two kstat reads for the same CPU is less than 180*11389SAlexander.Kolbasov@Sun.COM * CU_UPDATE_THRESHOLD use the old counter data and skip updating counter values 181*11389SAlexander.Kolbasov@Sun.COM * for the CPU. This helps reduce cross-calls when kstat consumers read data 182*11389SAlexander.Kolbasov@Sun.COM * very often or when they read PG utilization data and then CPU utilization 183*11389SAlexander.Kolbasov@Sun.COM * data quickly after that. 184*11389SAlexander.Kolbasov@Sun.COM */ 185*11389SAlexander.Kolbasov@Sun.COM #define CU_UPDATE_THRESHOLD (NANOSEC / 10) 186*11389SAlexander.Kolbasov@Sun.COM 187*11389SAlexander.Kolbasov@Sun.COM /* 188*11389SAlexander.Kolbasov@Sun.COM * The IS_HIPIL() macro verifies that the code is executed either from a 189*11389SAlexander.Kolbasov@Sun.COM * cross-call or from high-PIL interrupt 190*11389SAlexander.Kolbasov@Sun.COM */ 191*11389SAlexander.Kolbasov@Sun.COM #ifdef DEBUG 192*11389SAlexander.Kolbasov@Sun.COM #define IS_HIPIL() (getpil() >= XCALL_PIL) 193*11389SAlexander.Kolbasov@Sun.COM #else 194*11389SAlexander.Kolbasov@Sun.COM #define IS_HIPIL() 195*11389SAlexander.Kolbasov@Sun.COM #endif /* DEBUG */ 196*11389SAlexander.Kolbasov@Sun.COM 197*11389SAlexander.Kolbasov@Sun.COM 198*11389SAlexander.Kolbasov@Sun.COM typedef void (*cu_cpu_func_t)(uintptr_t, int *); 199*11389SAlexander.Kolbasov@Sun.COM 200*11389SAlexander.Kolbasov@Sun.COM 201*11389SAlexander.Kolbasov@Sun.COM /* 202*11389SAlexander.Kolbasov@Sun.COM * Flags to use for programming CPU hardware performance counters to measure 203*11389SAlexander.Kolbasov@Sun.COM * capacity and utilization 204*11389SAlexander.Kolbasov@Sun.COM */ 205*11389SAlexander.Kolbasov@Sun.COM int cu_cpc_flags = CU_CPC_FLAGS_DEFAULT; 206*11389SAlexander.Kolbasov@Sun.COM 207*11389SAlexander.Kolbasov@Sun.COM /* 208*11389SAlexander.Kolbasov@Sun.COM * Initial value used for programming hardware counters 209*11389SAlexander.Kolbasov@Sun.COM */ 210*11389SAlexander.Kolbasov@Sun.COM uint64_t cu_cpc_preset_value = 0; 211*11389SAlexander.Kolbasov@Sun.COM 212*11389SAlexander.Kolbasov@Sun.COM /* 213*11389SAlexander.Kolbasov@Sun.COM * List of CPC event requests for capacity and utilization. 214*11389SAlexander.Kolbasov@Sun.COM */ 215*11389SAlexander.Kolbasov@Sun.COM static kcpc_request_list_t *cu_cpc_reqs = NULL; 216*11389SAlexander.Kolbasov@Sun.COM 217*11389SAlexander.Kolbasov@Sun.COM /* 218*11389SAlexander.Kolbasov@Sun.COM * When a CPU is a member of PG with a sharing relationship that is supported 219*11389SAlexander.Kolbasov@Sun.COM * by the capacity/utilization framework, a kstat is created for that CPU and 220*11389SAlexander.Kolbasov@Sun.COM * sharing relationship. 221*11389SAlexander.Kolbasov@Sun.COM * 222*11389SAlexander.Kolbasov@Sun.COM * These kstats are updated one at a time, so we can have a single scratch 223*11389SAlexander.Kolbasov@Sun.COM * space to fill the data. 224*11389SAlexander.Kolbasov@Sun.COM * 225*11389SAlexander.Kolbasov@Sun.COM * CPU counter kstats fields: 226*11389SAlexander.Kolbasov@Sun.COM * 227*11389SAlexander.Kolbasov@Sun.COM * cu_cpu_id CPU ID for this kstat 228*11389SAlexander.Kolbasov@Sun.COM * 229*11389SAlexander.Kolbasov@Sun.COM * cu_generation Generation value that increases whenever any CPU goes 230*11389SAlexander.Kolbasov@Sun.COM * offline or online. Two kstat snapshots for the same 231*11389SAlexander.Kolbasov@Sun.COM * CPU may only be compared if they have the same 232*11389SAlexander.Kolbasov@Sun.COM * generation. 233*11389SAlexander.Kolbasov@Sun.COM * 234*11389SAlexander.Kolbasov@Sun.COM * cu_pg_id PG ID for the relationship described by this kstat 235*11389SAlexander.Kolbasov@Sun.COM * 236*11389SAlexander.Kolbasov@Sun.COM * cu_cpu_util Running value of CPU utilization for the sharing 237*11389SAlexander.Kolbasov@Sun.COM * relationship 238*11389SAlexander.Kolbasov@Sun.COM * 239*11389SAlexander.Kolbasov@Sun.COM * cu_cpu_time_running Total time spent collecting CU data. The time may be 240*11389SAlexander.Kolbasov@Sun.COM * less than wall time if CU counters were stopped for 241*11389SAlexander.Kolbasov@Sun.COM * some time. 242*11389SAlexander.Kolbasov@Sun.COM * 243*11389SAlexander.Kolbasov@Sun.COM * cu_cpu_time_stopped Total time the CU counters were stopped. 244*11389SAlexander.Kolbasov@Sun.COM * 245*11389SAlexander.Kolbasov@Sun.COM * cu_cpu_rate Utilization rate, expressed in operations per second. 246*11389SAlexander.Kolbasov@Sun.COM * 247*11389SAlexander.Kolbasov@Sun.COM * cu_cpu_rate_max Maximum observed value of utilization rate. 248*11389SAlexander.Kolbasov@Sun.COM */ 249*11389SAlexander.Kolbasov@Sun.COM struct cu_cpu_kstat { 250*11389SAlexander.Kolbasov@Sun.COM kstat_named_t cu_cpu_id; 251*11389SAlexander.Kolbasov@Sun.COM kstat_named_t cu_generation; 252*11389SAlexander.Kolbasov@Sun.COM kstat_named_t cu_pg_id; 253*11389SAlexander.Kolbasov@Sun.COM kstat_named_t cu_cpu_util; 254*11389SAlexander.Kolbasov@Sun.COM kstat_named_t cu_cpu_time_running; 255*11389SAlexander.Kolbasov@Sun.COM kstat_named_t cu_cpu_time_stopped; 256*11389SAlexander.Kolbasov@Sun.COM kstat_named_t cu_cpu_rate; 257*11389SAlexander.Kolbasov@Sun.COM kstat_named_t cu_cpu_rate_max; 258*11389SAlexander.Kolbasov@Sun.COM } cu_cpu_kstat = { 259*11389SAlexander.Kolbasov@Sun.COM { "id", KSTAT_DATA_UINT32 }, 260*11389SAlexander.Kolbasov@Sun.COM { "generation", KSTAT_DATA_UINT32 }, 261*11389SAlexander.Kolbasov@Sun.COM { "pg_id", KSTAT_DATA_LONG }, 262*11389SAlexander.Kolbasov@Sun.COM { "hw_util", KSTAT_DATA_UINT64 }, 263*11389SAlexander.Kolbasov@Sun.COM { "hw_util_time_running", KSTAT_DATA_UINT64 }, 264*11389SAlexander.Kolbasov@Sun.COM { "hw_util_time_stopped", KSTAT_DATA_UINT64 }, 265*11389SAlexander.Kolbasov@Sun.COM { "hw_util_rate", KSTAT_DATA_UINT64 }, 266*11389SAlexander.Kolbasov@Sun.COM { "hw_util_rate_max", KSTAT_DATA_UINT64 }, 267*11389SAlexander.Kolbasov@Sun.COM }; 268*11389SAlexander.Kolbasov@Sun.COM 269*11389SAlexander.Kolbasov@Sun.COM /* 270*11389SAlexander.Kolbasov@Sun.COM * Flags for controlling this module 271*11389SAlexander.Kolbasov@Sun.COM */ 272*11389SAlexander.Kolbasov@Sun.COM uint_t cu_flags = CU_FLAG_ENABLE; 273*11389SAlexander.Kolbasov@Sun.COM 274*11389SAlexander.Kolbasov@Sun.COM /* 275*11389SAlexander.Kolbasov@Sun.COM * Error return value for cu_init() since it can't return anything to be called 276*11389SAlexander.Kolbasov@Sun.COM * from mp_init_tbl[] (:-( 277*11389SAlexander.Kolbasov@Sun.COM */ 278*11389SAlexander.Kolbasov@Sun.COM static int cu_init_error = 0; 279*11389SAlexander.Kolbasov@Sun.COM 280*11389SAlexander.Kolbasov@Sun.COM hrtime_t cu_sample_interval_min = CU_SAMPLE_INTERVAL_MIN; 281*11389SAlexander.Kolbasov@Sun.COM 282*11389SAlexander.Kolbasov@Sun.COM hrtime_t cu_update_threshold = CU_UPDATE_THRESHOLD; 283*11389SAlexander.Kolbasov@Sun.COM 284*11389SAlexander.Kolbasov@Sun.COM static kmutex_t pg_cpu_kstat_lock; 285*11389SAlexander.Kolbasov@Sun.COM 286*11389SAlexander.Kolbasov@Sun.COM 287*11389SAlexander.Kolbasov@Sun.COM /* 288*11389SAlexander.Kolbasov@Sun.COM * Forward declaration of interface routines 289*11389SAlexander.Kolbasov@Sun.COM */ 290*11389SAlexander.Kolbasov@Sun.COM void cu_disable(void); 291*11389SAlexander.Kolbasov@Sun.COM void cu_enable(void); 292*11389SAlexander.Kolbasov@Sun.COM void cu_init(void); 293*11389SAlexander.Kolbasov@Sun.COM void cu_cpc_program(cpu_t *cp, int *err); 294*11389SAlexander.Kolbasov@Sun.COM void cu_cpc_unprogram(cpu_t *cp, int *err); 295*11389SAlexander.Kolbasov@Sun.COM int cu_cpu_update(struct cpu *cp, boolean_t move_to); 296*11389SAlexander.Kolbasov@Sun.COM void cu_pg_update(pghw_t *pg); 297*11389SAlexander.Kolbasov@Sun.COM 298*11389SAlexander.Kolbasov@Sun.COM 299*11389SAlexander.Kolbasov@Sun.COM /* 300*11389SAlexander.Kolbasov@Sun.COM * Forward declaration of private routines 301*11389SAlexander.Kolbasov@Sun.COM */ 302*11389SAlexander.Kolbasov@Sun.COM static int cu_cpc_init(cpu_t *cp, kcpc_request_list_t *reqs, int nreqs); 303*11389SAlexander.Kolbasov@Sun.COM static void cu_cpc_program_xcall(uintptr_t arg, int *err); 304*11389SAlexander.Kolbasov@Sun.COM static int cu_cpc_req_add(char *event, kcpc_request_list_t *reqs, 305*11389SAlexander.Kolbasov@Sun.COM int nreqs, cu_cntr_stats_t *stats, int kmem_flags, int *nevents); 306*11389SAlexander.Kolbasov@Sun.COM static int cu_cpu_callback(cpu_setup_t what, int id, void *arg); 307*11389SAlexander.Kolbasov@Sun.COM static void cu_cpu_disable(cpu_t *cp); 308*11389SAlexander.Kolbasov@Sun.COM static void cu_cpu_enable(cpu_t *cp); 309*11389SAlexander.Kolbasov@Sun.COM static int cu_cpu_init(cpu_t *cp, kcpc_request_list_t *reqs); 310*11389SAlexander.Kolbasov@Sun.COM static int cu_cpu_fini(cpu_t *cp); 311*11389SAlexander.Kolbasov@Sun.COM static void cu_cpu_kstat_create(pghw_t *pg, cu_cntr_info_t *cntr_info); 312*11389SAlexander.Kolbasov@Sun.COM static int cu_cpu_kstat_update(kstat_t *ksp, int rw); 313*11389SAlexander.Kolbasov@Sun.COM static int cu_cpu_run(cpu_t *cp, cu_cpu_func_t func, uintptr_t arg); 314*11389SAlexander.Kolbasov@Sun.COM static int cu_cpu_update_stats(cu_cntr_stats_t *stats, 315*11389SAlexander.Kolbasov@Sun.COM uint64_t cntr_value); 316*11389SAlexander.Kolbasov@Sun.COM static void cu_cpu_info_detach_xcall(void); 317*11389SAlexander.Kolbasov@Sun.COM 318*11389SAlexander.Kolbasov@Sun.COM /* 319*11389SAlexander.Kolbasov@Sun.COM * Disable or enable Capacity Utilization counters on all CPUs. 320*11389SAlexander.Kolbasov@Sun.COM */ 321*11389SAlexander.Kolbasov@Sun.COM void 322*11389SAlexander.Kolbasov@Sun.COM cu_disable(void) 323*11389SAlexander.Kolbasov@Sun.COM { 324*11389SAlexander.Kolbasov@Sun.COM cpu_t *cp; 325*11389SAlexander.Kolbasov@Sun.COM 326*11389SAlexander.Kolbasov@Sun.COM ASSERT(MUTEX_HELD(&cpu_lock)); 327*11389SAlexander.Kolbasov@Sun.COM 328*11389SAlexander.Kolbasov@Sun.COM cp = cpu_active; 329*11389SAlexander.Kolbasov@Sun.COM do { 330*11389SAlexander.Kolbasov@Sun.COM if (!(cp->cpu_flags & CPU_OFFLINE)) 331*11389SAlexander.Kolbasov@Sun.COM cu_cpu_disable(cp); 332*11389SAlexander.Kolbasov@Sun.COM } while ((cp = cp->cpu_next_onln) != cpu_active); 333*11389SAlexander.Kolbasov@Sun.COM } 334*11389SAlexander.Kolbasov@Sun.COM 335*11389SAlexander.Kolbasov@Sun.COM 336*11389SAlexander.Kolbasov@Sun.COM void 337*11389SAlexander.Kolbasov@Sun.COM cu_enable(void) 338*11389SAlexander.Kolbasov@Sun.COM { 339*11389SAlexander.Kolbasov@Sun.COM cpu_t *cp; 340*11389SAlexander.Kolbasov@Sun.COM 341*11389SAlexander.Kolbasov@Sun.COM ASSERT(MUTEX_HELD(&cpu_lock)); 342*11389SAlexander.Kolbasov@Sun.COM 343*11389SAlexander.Kolbasov@Sun.COM cp = cpu_active; 344*11389SAlexander.Kolbasov@Sun.COM do { 345*11389SAlexander.Kolbasov@Sun.COM if (!(cp->cpu_flags & CPU_OFFLINE)) 346*11389SAlexander.Kolbasov@Sun.COM cu_cpu_enable(cp); 347*11389SAlexander.Kolbasov@Sun.COM } while ((cp = cp->cpu_next_onln) != cpu_active); 348*11389SAlexander.Kolbasov@Sun.COM } 349*11389SAlexander.Kolbasov@Sun.COM 350*11389SAlexander.Kolbasov@Sun.COM 351*11389SAlexander.Kolbasov@Sun.COM /* 352*11389SAlexander.Kolbasov@Sun.COM * Setup capacity and utilization support 353*11389SAlexander.Kolbasov@Sun.COM */ 354*11389SAlexander.Kolbasov@Sun.COM void 355*11389SAlexander.Kolbasov@Sun.COM cu_init(void) 356*11389SAlexander.Kolbasov@Sun.COM { 357*11389SAlexander.Kolbasov@Sun.COM cpu_t *cp; 358*11389SAlexander.Kolbasov@Sun.COM 359*11389SAlexander.Kolbasov@Sun.COM cu_init_error = 0; 360*11389SAlexander.Kolbasov@Sun.COM if (!(cu_flags & CU_FLAG_ENABLE) || (cu_flags & CU_FLAG_ON)) { 361*11389SAlexander.Kolbasov@Sun.COM cu_init_error = -1; 362*11389SAlexander.Kolbasov@Sun.COM return; 363*11389SAlexander.Kolbasov@Sun.COM } 364*11389SAlexander.Kolbasov@Sun.COM 365*11389SAlexander.Kolbasov@Sun.COM if (kcpc_init() != 0) { 366*11389SAlexander.Kolbasov@Sun.COM cu_init_error = -2; 367*11389SAlexander.Kolbasov@Sun.COM return; 368*11389SAlexander.Kolbasov@Sun.COM } 369*11389SAlexander.Kolbasov@Sun.COM 370*11389SAlexander.Kolbasov@Sun.COM /* 371*11389SAlexander.Kolbasov@Sun.COM * Can't measure hardware capacity and utilization without CPU 372*11389SAlexander.Kolbasov@Sun.COM * hardware performance counters 373*11389SAlexander.Kolbasov@Sun.COM */ 374*11389SAlexander.Kolbasov@Sun.COM if (cpc_ncounters <= 0) { 375*11389SAlexander.Kolbasov@Sun.COM cu_init_error = -3; 376*11389SAlexander.Kolbasov@Sun.COM return; 377*11389SAlexander.Kolbasov@Sun.COM } 378*11389SAlexander.Kolbasov@Sun.COM 379*11389SAlexander.Kolbasov@Sun.COM /* 380*11389SAlexander.Kolbasov@Sun.COM * Setup CPC event request queue 381*11389SAlexander.Kolbasov@Sun.COM */ 382*11389SAlexander.Kolbasov@Sun.COM cu_cpc_reqs = kcpc_reqs_init(cpc_ncounters, KM_SLEEP); 383*11389SAlexander.Kolbasov@Sun.COM 384*11389SAlexander.Kolbasov@Sun.COM mutex_enter(&cpu_lock); 385*11389SAlexander.Kolbasov@Sun.COM 386*11389SAlexander.Kolbasov@Sun.COM /* 387*11389SAlexander.Kolbasov@Sun.COM * Mark flags to say that module is ready to be setup 388*11389SAlexander.Kolbasov@Sun.COM */ 389*11389SAlexander.Kolbasov@Sun.COM cu_flags |= CU_FLAG_READY; 390*11389SAlexander.Kolbasov@Sun.COM 391*11389SAlexander.Kolbasov@Sun.COM cp = cpu_active; 392*11389SAlexander.Kolbasov@Sun.COM do { 393*11389SAlexander.Kolbasov@Sun.COM /* 394*11389SAlexander.Kolbasov@Sun.COM * Allocate and setup state needed to measure capacity and 395*11389SAlexander.Kolbasov@Sun.COM * utilization 396*11389SAlexander.Kolbasov@Sun.COM */ 397*11389SAlexander.Kolbasov@Sun.COM if (cu_cpu_init(cp, cu_cpc_reqs) != 0) 398*11389SAlexander.Kolbasov@Sun.COM cu_init_error = -5; 399*11389SAlexander.Kolbasov@Sun.COM 400*11389SAlexander.Kolbasov@Sun.COM /* 401*11389SAlexander.Kolbasov@Sun.COM * Reset list of counter event requests so its space can be 402*11389SAlexander.Kolbasov@Sun.COM * reused for a different set of requests for next CPU 403*11389SAlexander.Kolbasov@Sun.COM */ 404*11389SAlexander.Kolbasov@Sun.COM (void) kcpc_reqs_reset(cu_cpc_reqs); 405*11389SAlexander.Kolbasov@Sun.COM 406*11389SAlexander.Kolbasov@Sun.COM cp = cp->cpu_next_onln; 407*11389SAlexander.Kolbasov@Sun.COM } while (cp != cpu_active); 408*11389SAlexander.Kolbasov@Sun.COM 409*11389SAlexander.Kolbasov@Sun.COM /* 410*11389SAlexander.Kolbasov@Sun.COM * Mark flags to say that module is on now and counters are ready to be 411*11389SAlexander.Kolbasov@Sun.COM * programmed on all active CPUs 412*11389SAlexander.Kolbasov@Sun.COM */ 413*11389SAlexander.Kolbasov@Sun.COM cu_flags |= CU_FLAG_ON; 414*11389SAlexander.Kolbasov@Sun.COM 415*11389SAlexander.Kolbasov@Sun.COM /* 416*11389SAlexander.Kolbasov@Sun.COM * Program counters on currently active CPUs 417*11389SAlexander.Kolbasov@Sun.COM */ 418*11389SAlexander.Kolbasov@Sun.COM cp = cpu_active; 419*11389SAlexander.Kolbasov@Sun.COM do { 420*11389SAlexander.Kolbasov@Sun.COM if (cu_cpu_run(cp, cu_cpc_program_xcall, 421*11389SAlexander.Kolbasov@Sun.COM (uintptr_t)B_FALSE) != 0) 422*11389SAlexander.Kolbasov@Sun.COM cu_init_error = -6; 423*11389SAlexander.Kolbasov@Sun.COM 424*11389SAlexander.Kolbasov@Sun.COM cp = cp->cpu_next_onln; 425*11389SAlexander.Kolbasov@Sun.COM } while (cp != cpu_active); 426*11389SAlexander.Kolbasov@Sun.COM 427*11389SAlexander.Kolbasov@Sun.COM /* 428*11389SAlexander.Kolbasov@Sun.COM * Register callback for CPU state changes to enable and disable 429*11389SAlexander.Kolbasov@Sun.COM * CPC counters as CPUs come on and offline 430*11389SAlexander.Kolbasov@Sun.COM */ 431*11389SAlexander.Kolbasov@Sun.COM register_cpu_setup_func(cu_cpu_callback, NULL); 432*11389SAlexander.Kolbasov@Sun.COM 433*11389SAlexander.Kolbasov@Sun.COM mutex_exit(&cpu_lock); 434*11389SAlexander.Kolbasov@Sun.COM } 435*11389SAlexander.Kolbasov@Sun.COM 436*11389SAlexander.Kolbasov@Sun.COM 437*11389SAlexander.Kolbasov@Sun.COM /* 438*11389SAlexander.Kolbasov@Sun.COM * Return number of counter events needed to measure capacity and utilization 439*11389SAlexander.Kolbasov@Sun.COM * for specified CPU and fill in list of CPC requests with each counter event 440*11389SAlexander.Kolbasov@Sun.COM * needed if list where to add CPC requests is given 441*11389SAlexander.Kolbasov@Sun.COM * 442*11389SAlexander.Kolbasov@Sun.COM * NOTE: Use KM_NOSLEEP for kmem_{,z}alloc() since cpu_lock is held and free 443*11389SAlexander.Kolbasov@Sun.COM * everything that has been successfully allocated if any memory 444*11389SAlexander.Kolbasov@Sun.COM * allocation fails 445*11389SAlexander.Kolbasov@Sun.COM */ 446*11389SAlexander.Kolbasov@Sun.COM static int 447*11389SAlexander.Kolbasov@Sun.COM cu_cpc_init(cpu_t *cp, kcpc_request_list_t *reqs, int nreqs) 448*11389SAlexander.Kolbasov@Sun.COM { 449*11389SAlexander.Kolbasov@Sun.COM group_t *cmt_pgs; 450*11389SAlexander.Kolbasov@Sun.COM cu_cntr_info_t **cntr_info_array; 451*11389SAlexander.Kolbasov@Sun.COM cpu_pg_t *cpu_pgs; 452*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info_t *cu_cpu_info; 453*11389SAlexander.Kolbasov@Sun.COM pg_cmt_t *pg_cmt; 454*11389SAlexander.Kolbasov@Sun.COM pghw_t *pg_hw; 455*11389SAlexander.Kolbasov@Sun.COM cu_cntr_stats_t *stats; 456*11389SAlexander.Kolbasov@Sun.COM int nevents; 457*11389SAlexander.Kolbasov@Sun.COM pghw_type_t pg_hw_type; 458*11389SAlexander.Kolbasov@Sun.COM group_iter_t iter; 459*11389SAlexander.Kolbasov@Sun.COM 460*11389SAlexander.Kolbasov@Sun.COM ASSERT(MUTEX_HELD(&cpu_lock)); 461*11389SAlexander.Kolbasov@Sun.COM 462*11389SAlexander.Kolbasov@Sun.COM /* 463*11389SAlexander.Kolbasov@Sun.COM * There has to be a target CPU for this 464*11389SAlexander.Kolbasov@Sun.COM */ 465*11389SAlexander.Kolbasov@Sun.COM if (cp == NULL) 466*11389SAlexander.Kolbasov@Sun.COM return (-1); 467*11389SAlexander.Kolbasov@Sun.COM 468*11389SAlexander.Kolbasov@Sun.COM /* 469*11389SAlexander.Kolbasov@Sun.COM * Return 0 when CPU doesn't belong to any group 470*11389SAlexander.Kolbasov@Sun.COM */ 471*11389SAlexander.Kolbasov@Sun.COM cpu_pgs = cp->cpu_pg; 472*11389SAlexander.Kolbasov@Sun.COM if (cpu_pgs == NULL || GROUP_SIZE(&cpu_pgs->cmt_pgs) < 1) 473*11389SAlexander.Kolbasov@Sun.COM return (0); 474*11389SAlexander.Kolbasov@Sun.COM 475*11389SAlexander.Kolbasov@Sun.COM cmt_pgs = &cpu_pgs->cmt_pgs; 476*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info = cp->cpu_cu_info; 477*11389SAlexander.Kolbasov@Sun.COM 478*11389SAlexander.Kolbasov@Sun.COM /* 479*11389SAlexander.Kolbasov@Sun.COM * Grab counter statistics and info 480*11389SAlexander.Kolbasov@Sun.COM */ 481*11389SAlexander.Kolbasov@Sun.COM if (reqs == NULL) { 482*11389SAlexander.Kolbasov@Sun.COM stats = NULL; 483*11389SAlexander.Kolbasov@Sun.COM cntr_info_array = NULL; 484*11389SAlexander.Kolbasov@Sun.COM } else { 485*11389SAlexander.Kolbasov@Sun.COM if (cu_cpu_info == NULL || cu_cpu_info->cu_cntr_stats == NULL) 486*11389SAlexander.Kolbasov@Sun.COM return (-2); 487*11389SAlexander.Kolbasov@Sun.COM 488*11389SAlexander.Kolbasov@Sun.COM stats = cu_cpu_info->cu_cntr_stats; 489*11389SAlexander.Kolbasov@Sun.COM cntr_info_array = cu_cpu_info->cu_cntr_info; 490*11389SAlexander.Kolbasov@Sun.COM } 491*11389SAlexander.Kolbasov@Sun.COM 492*11389SAlexander.Kolbasov@Sun.COM /* 493*11389SAlexander.Kolbasov@Sun.COM * See whether platform (or processor) specific code knows which CPC 494*11389SAlexander.Kolbasov@Sun.COM * events to request, etc. are needed to measure hardware capacity and 495*11389SAlexander.Kolbasov@Sun.COM * utilization on this machine 496*11389SAlexander.Kolbasov@Sun.COM */ 497*11389SAlexander.Kolbasov@Sun.COM nevents = cu_plat_cpc_init(cp, reqs, nreqs); 498*11389SAlexander.Kolbasov@Sun.COM if (nevents >= 0) 499*11389SAlexander.Kolbasov@Sun.COM return (nevents); 500*11389SAlexander.Kolbasov@Sun.COM 501*11389SAlexander.Kolbasov@Sun.COM /* 502*11389SAlexander.Kolbasov@Sun.COM * Let common code decide which CPC events to request, etc. to measure 503*11389SAlexander.Kolbasov@Sun.COM * capacity and utilization since platform (or processor) specific does 504*11389SAlexander.Kolbasov@Sun.COM * not know.... 505*11389SAlexander.Kolbasov@Sun.COM * 506*11389SAlexander.Kolbasov@Sun.COM * Walk CPU's PG lineage and do following: 507*11389SAlexander.Kolbasov@Sun.COM * 508*11389SAlexander.Kolbasov@Sun.COM * - Setup CPC request, counter info, and stats needed for each counter 509*11389SAlexander.Kolbasov@Sun.COM * event to measure capacity and and utilization for each of CPU's PG 510*11389SAlexander.Kolbasov@Sun.COM * hardware sharing relationships 511*11389SAlexander.Kolbasov@Sun.COM * 512*11389SAlexander.Kolbasov@Sun.COM * - Create PG CPU kstats to export capacity and utilization for each PG 513*11389SAlexander.Kolbasov@Sun.COM */ 514*11389SAlexander.Kolbasov@Sun.COM nevents = 0; 515*11389SAlexander.Kolbasov@Sun.COM group_iter_init(&iter); 516*11389SAlexander.Kolbasov@Sun.COM while ((pg_cmt = group_iterate(cmt_pgs, &iter)) != NULL) { 517*11389SAlexander.Kolbasov@Sun.COM cu_cntr_info_t *cntr_info; 518*11389SAlexander.Kolbasov@Sun.COM int nevents_save; 519*11389SAlexander.Kolbasov@Sun.COM int nstats; 520*11389SAlexander.Kolbasov@Sun.COM 521*11389SAlexander.Kolbasov@Sun.COM pg_hw = (pghw_t *)pg_cmt; 522*11389SAlexander.Kolbasov@Sun.COM pg_hw_type = pg_hw->pghw_hw; 523*11389SAlexander.Kolbasov@Sun.COM nevents_save = nevents; 524*11389SAlexander.Kolbasov@Sun.COM nstats = 0; 525*11389SAlexander.Kolbasov@Sun.COM 526*11389SAlexander.Kolbasov@Sun.COM switch (pg_hw_type) { 527*11389SAlexander.Kolbasov@Sun.COM case PGHW_IPIPE: 528*11389SAlexander.Kolbasov@Sun.COM if (cu_cpc_req_add("PAPI_tot_ins", reqs, nreqs, stats, 529*11389SAlexander.Kolbasov@Sun.COM KM_NOSLEEP, &nevents) != 0) 530*11389SAlexander.Kolbasov@Sun.COM continue; 531*11389SAlexander.Kolbasov@Sun.COM nstats = 1; 532*11389SAlexander.Kolbasov@Sun.COM break; 533*11389SAlexander.Kolbasov@Sun.COM 534*11389SAlexander.Kolbasov@Sun.COM case PGHW_FPU: 535*11389SAlexander.Kolbasov@Sun.COM if (cu_cpc_req_add("PAPI_fp_ins", reqs, nreqs, stats, 536*11389SAlexander.Kolbasov@Sun.COM KM_NOSLEEP, &nevents) != 0) 537*11389SAlexander.Kolbasov@Sun.COM continue; 538*11389SAlexander.Kolbasov@Sun.COM nstats = 1; 539*11389SAlexander.Kolbasov@Sun.COM break; 540*11389SAlexander.Kolbasov@Sun.COM 541*11389SAlexander.Kolbasov@Sun.COM default: 542*11389SAlexander.Kolbasov@Sun.COM /* 543*11389SAlexander.Kolbasov@Sun.COM * Don't measure capacity and utilization for this kind 544*11389SAlexander.Kolbasov@Sun.COM * of PG hardware relationship so skip to next PG in 545*11389SAlexander.Kolbasov@Sun.COM * CPU's PG lineage 546*11389SAlexander.Kolbasov@Sun.COM */ 547*11389SAlexander.Kolbasov@Sun.COM continue; 548*11389SAlexander.Kolbasov@Sun.COM } 549*11389SAlexander.Kolbasov@Sun.COM 550*11389SAlexander.Kolbasov@Sun.COM cntr_info = cntr_info_array[pg_hw_type]; 551*11389SAlexander.Kolbasov@Sun.COM 552*11389SAlexander.Kolbasov@Sun.COM /* 553*11389SAlexander.Kolbasov@Sun.COM * Nothing to measure for this hardware sharing relationship 554*11389SAlexander.Kolbasov@Sun.COM */ 555*11389SAlexander.Kolbasov@Sun.COM if (nevents - nevents_save == 0) { 556*11389SAlexander.Kolbasov@Sun.COM if (cntr_info != NULL) 557*11389SAlexander.Kolbasov@Sun.COM kmem_free(cntr_info, sizeof (cu_cntr_info_t)); 558*11389SAlexander.Kolbasov@Sun.COM cntr_info_array[pg_hw_type] = NULL; 559*11389SAlexander.Kolbasov@Sun.COM continue; 560*11389SAlexander.Kolbasov@Sun.COM } 561*11389SAlexander.Kolbasov@Sun.COM 562*11389SAlexander.Kolbasov@Sun.COM /* 563*11389SAlexander.Kolbasov@Sun.COM * Fill in counter info for this PG hardware relationship 564*11389SAlexander.Kolbasov@Sun.COM */ 565*11389SAlexander.Kolbasov@Sun.COM if (cntr_info == NULL) { 566*11389SAlexander.Kolbasov@Sun.COM cntr_info = kmem_zalloc(sizeof (cu_cntr_info_t), 567*11389SAlexander.Kolbasov@Sun.COM KM_NOSLEEP); 568*11389SAlexander.Kolbasov@Sun.COM if (cntr_info == NULL) 569*11389SAlexander.Kolbasov@Sun.COM continue; 570*11389SAlexander.Kolbasov@Sun.COM cntr_info_array[pg_hw_type] = cntr_info; 571*11389SAlexander.Kolbasov@Sun.COM } 572*11389SAlexander.Kolbasov@Sun.COM cntr_info->ci_cpu = cp; 573*11389SAlexander.Kolbasov@Sun.COM cntr_info->ci_pg = pg_hw; 574*11389SAlexander.Kolbasov@Sun.COM cntr_info->ci_stats = &stats[nevents_save]; 575*11389SAlexander.Kolbasov@Sun.COM cntr_info->ci_nstats = nstats; 576*11389SAlexander.Kolbasov@Sun.COM 577*11389SAlexander.Kolbasov@Sun.COM /* 578*11389SAlexander.Kolbasov@Sun.COM * Create PG CPU kstats for this hardware relationship 579*11389SAlexander.Kolbasov@Sun.COM */ 580*11389SAlexander.Kolbasov@Sun.COM cu_cpu_kstat_create(pg_hw, cntr_info); 581*11389SAlexander.Kolbasov@Sun.COM } 582*11389SAlexander.Kolbasov@Sun.COM 583*11389SAlexander.Kolbasov@Sun.COM return (nevents); 584*11389SAlexander.Kolbasov@Sun.COM } 585*11389SAlexander.Kolbasov@Sun.COM 586*11389SAlexander.Kolbasov@Sun.COM 587*11389SAlexander.Kolbasov@Sun.COM /* 588*11389SAlexander.Kolbasov@Sun.COM * Program counters for capacity and utilization on given CPU 589*11389SAlexander.Kolbasov@Sun.COM * 590*11389SAlexander.Kolbasov@Sun.COM * If any of the following conditions is true, the counters are not programmed: 591*11389SAlexander.Kolbasov@Sun.COM * 592*11389SAlexander.Kolbasov@Sun.COM * - CU framework is disabled 593*11389SAlexander.Kolbasov@Sun.COM * - The cpu_cu_info field of the cpu structure is NULL 594*11389SAlexander.Kolbasov@Sun.COM * - DTrace is active 595*11389SAlexander.Kolbasov@Sun.COM * - Counters are programmed already 596*11389SAlexander.Kolbasov@Sun.COM * - Counters are disabled (by calls to cu_cpu_disable()) 597*11389SAlexander.Kolbasov@Sun.COM */ 598*11389SAlexander.Kolbasov@Sun.COM void 599*11389SAlexander.Kolbasov@Sun.COM cu_cpc_program(cpu_t *cp, int *err) 600*11389SAlexander.Kolbasov@Sun.COM { 601*11389SAlexander.Kolbasov@Sun.COM cu_cpc_ctx_t *cpu_ctx; 602*11389SAlexander.Kolbasov@Sun.COM kcpc_ctx_t *ctx; 603*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info_t *cu_cpu_info; 604*11389SAlexander.Kolbasov@Sun.COM 605*11389SAlexander.Kolbasov@Sun.COM ASSERT(IS_HIPIL()); 606*11389SAlexander.Kolbasov@Sun.COM /* 607*11389SAlexander.Kolbasov@Sun.COM * Should be running on given CPU. We disable preemption to keep CPU 608*11389SAlexander.Kolbasov@Sun.COM * from disappearing and make sure flags and CPC context don't change 609*11389SAlexander.Kolbasov@Sun.COM * from underneath us 610*11389SAlexander.Kolbasov@Sun.COM */ 611*11389SAlexander.Kolbasov@Sun.COM kpreempt_disable(); 612*11389SAlexander.Kolbasov@Sun.COM ASSERT(cp == CPU); 613*11389SAlexander.Kolbasov@Sun.COM 614*11389SAlexander.Kolbasov@Sun.COM /* 615*11389SAlexander.Kolbasov@Sun.COM * Module not ready to program counters 616*11389SAlexander.Kolbasov@Sun.COM */ 617*11389SAlexander.Kolbasov@Sun.COM if (!(cu_flags & CU_FLAG_ON)) { 618*11389SAlexander.Kolbasov@Sun.COM *err = -1; 619*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 620*11389SAlexander.Kolbasov@Sun.COM return; 621*11389SAlexander.Kolbasov@Sun.COM } 622*11389SAlexander.Kolbasov@Sun.COM 623*11389SAlexander.Kolbasov@Sun.COM if (cp == NULL) { 624*11389SAlexander.Kolbasov@Sun.COM *err = -2; 625*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 626*11389SAlexander.Kolbasov@Sun.COM return; 627*11389SAlexander.Kolbasov@Sun.COM } 628*11389SAlexander.Kolbasov@Sun.COM 629*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info = cp->cpu_cu_info; 630*11389SAlexander.Kolbasov@Sun.COM if (cu_cpu_info == NULL) { 631*11389SAlexander.Kolbasov@Sun.COM *err = -3; 632*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 633*11389SAlexander.Kolbasov@Sun.COM return; 634*11389SAlexander.Kolbasov@Sun.COM } 635*11389SAlexander.Kolbasov@Sun.COM 636*11389SAlexander.Kolbasov@Sun.COM /* 637*11389SAlexander.Kolbasov@Sun.COM * If DTrace CPC is active or counters turned on already or are 638*11389SAlexander.Kolbasov@Sun.COM * disabled, just return. 639*11389SAlexander.Kolbasov@Sun.COM */ 640*11389SAlexander.Kolbasov@Sun.COM if (dtrace_cpc_in_use || (cu_cpu_info->cu_flag & CU_CPU_CNTRS_ON) || 641*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_disabled) { 642*11389SAlexander.Kolbasov@Sun.COM *err = 1; 643*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 644*11389SAlexander.Kolbasov@Sun.COM return; 645*11389SAlexander.Kolbasov@Sun.COM } 646*11389SAlexander.Kolbasov@Sun.COM 647*11389SAlexander.Kolbasov@Sun.COM if ((CPU->cpu_cpc_ctx != NULL) && 648*11389SAlexander.Kolbasov@Sun.COM !(CPU->cpu_cpc_ctx->kc_flags & KCPC_CTX_INVALID_STOPPED)) { 649*11389SAlexander.Kolbasov@Sun.COM *err = -4; 650*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 651*11389SAlexander.Kolbasov@Sun.COM return; 652*11389SAlexander.Kolbasov@Sun.COM } 653*11389SAlexander.Kolbasov@Sun.COM 654*11389SAlexander.Kolbasov@Sun.COM /* 655*11389SAlexander.Kolbasov@Sun.COM * Get CPU's CPC context needed for capacity and utilization 656*11389SAlexander.Kolbasov@Sun.COM */ 657*11389SAlexander.Kolbasov@Sun.COM cpu_ctx = &cu_cpu_info->cu_cpc_ctx; 658*11389SAlexander.Kolbasov@Sun.COM ASSERT(cpu_ctx != NULL); 659*11389SAlexander.Kolbasov@Sun.COM ASSERT(cpu_ctx->nctx >= 0); 660*11389SAlexander.Kolbasov@Sun.COM 661*11389SAlexander.Kolbasov@Sun.COM ASSERT(cpu_ctx->ctx_ptr_array == NULL || cpu_ctx->ctx_ptr_array_sz > 0); 662*11389SAlexander.Kolbasov@Sun.COM ASSERT(cpu_ctx->nctx <= cpu_ctx->ctx_ptr_array_sz); 663*11389SAlexander.Kolbasov@Sun.COM if (cpu_ctx->nctx <= 0 || cpu_ctx->ctx_ptr_array == NULL || 664*11389SAlexander.Kolbasov@Sun.COM cpu_ctx->ctx_ptr_array_sz <= 0) { 665*11389SAlexander.Kolbasov@Sun.COM *err = -5; 666*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 667*11389SAlexander.Kolbasov@Sun.COM return; 668*11389SAlexander.Kolbasov@Sun.COM } 669*11389SAlexander.Kolbasov@Sun.COM 670*11389SAlexander.Kolbasov@Sun.COM /* 671*11389SAlexander.Kolbasov@Sun.COM * Increment index in CPU's CPC context info to point at next context 672*11389SAlexander.Kolbasov@Sun.COM * to program 673*11389SAlexander.Kolbasov@Sun.COM * 674*11389SAlexander.Kolbasov@Sun.COM * NOTE: Do this now instead of after programming counters to ensure 675*11389SAlexander.Kolbasov@Sun.COM * that index will always point at *current* context so we will 676*11389SAlexander.Kolbasov@Sun.COM * always be able to unprogram *current* context if necessary 677*11389SAlexander.Kolbasov@Sun.COM */ 678*11389SAlexander.Kolbasov@Sun.COM cpu_ctx->cur_index = (cpu_ctx->cur_index + 1) % cpu_ctx->nctx; 679*11389SAlexander.Kolbasov@Sun.COM 680*11389SAlexander.Kolbasov@Sun.COM ctx = cpu_ctx->ctx_ptr_array[cpu_ctx->cur_index]; 681*11389SAlexander.Kolbasov@Sun.COM 682*11389SAlexander.Kolbasov@Sun.COM /* 683*11389SAlexander.Kolbasov@Sun.COM * Clear KCPC_CTX_INVALID and KCPC_CTX_INVALID_STOPPED from CPU's CPC 684*11389SAlexander.Kolbasov@Sun.COM * context before programming counters 685*11389SAlexander.Kolbasov@Sun.COM * 686*11389SAlexander.Kolbasov@Sun.COM * Context is marked with KCPC_CTX_INVALID_STOPPED when context is 687*11389SAlexander.Kolbasov@Sun.COM * unprogrammed and may be marked with KCPC_CTX_INVALID when 688*11389SAlexander.Kolbasov@Sun.COM * kcpc_invalidate_all() is called by cpustat(1M) and dtrace CPC to 689*11389SAlexander.Kolbasov@Sun.COM * invalidate all CPC contexts before they take over all the counters. 690*11389SAlexander.Kolbasov@Sun.COM * 691*11389SAlexander.Kolbasov@Sun.COM * This isn't necessary since these flags are only used for thread bound 692*11389SAlexander.Kolbasov@Sun.COM * CPC contexts not CPU bound CPC contexts like ones used for capacity 693*11389SAlexander.Kolbasov@Sun.COM * and utilization. 694*11389SAlexander.Kolbasov@Sun.COM * 695*11389SAlexander.Kolbasov@Sun.COM * There is no need to protect the flag update since no one is using 696*11389SAlexander.Kolbasov@Sun.COM * this context now. 697*11389SAlexander.Kolbasov@Sun.COM */ 698*11389SAlexander.Kolbasov@Sun.COM ctx->kc_flags &= ~(KCPC_CTX_INVALID | KCPC_CTX_INVALID_STOPPED); 699*11389SAlexander.Kolbasov@Sun.COM 700*11389SAlexander.Kolbasov@Sun.COM /* 701*11389SAlexander.Kolbasov@Sun.COM * Program counters on this CPU 702*11389SAlexander.Kolbasov@Sun.COM */ 703*11389SAlexander.Kolbasov@Sun.COM kcpc_program(ctx, B_FALSE, B_FALSE); 704*11389SAlexander.Kolbasov@Sun.COM 705*11389SAlexander.Kolbasov@Sun.COM cp->cpu_cpc_ctx = ctx; 706*11389SAlexander.Kolbasov@Sun.COM 707*11389SAlexander.Kolbasov@Sun.COM /* 708*11389SAlexander.Kolbasov@Sun.COM * Set state in CPU structure to say that CPU's counters are programmed 709*11389SAlexander.Kolbasov@Sun.COM * for capacity and utilization now and that they are transitioning from 710*11389SAlexander.Kolbasov@Sun.COM * off to on state. This will cause cu_cpu_update to update stop times 711*11389SAlexander.Kolbasov@Sun.COM * for all programmed counters. 712*11389SAlexander.Kolbasov@Sun.COM */ 713*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_flag |= CU_CPU_CNTRS_ON | CU_CPU_CNTRS_OFF_ON; 714*11389SAlexander.Kolbasov@Sun.COM 715*11389SAlexander.Kolbasov@Sun.COM /* 716*11389SAlexander.Kolbasov@Sun.COM * Update counter statistics 717*11389SAlexander.Kolbasov@Sun.COM */ 718*11389SAlexander.Kolbasov@Sun.COM (void) cu_cpu_update(cp, B_FALSE); 719*11389SAlexander.Kolbasov@Sun.COM 720*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_flag &= ~CU_CPU_CNTRS_OFF_ON; 721*11389SAlexander.Kolbasov@Sun.COM 722*11389SAlexander.Kolbasov@Sun.COM *err = 0; 723*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 724*11389SAlexander.Kolbasov@Sun.COM } 725*11389SAlexander.Kolbasov@Sun.COM 726*11389SAlexander.Kolbasov@Sun.COM 727*11389SAlexander.Kolbasov@Sun.COM /* 728*11389SAlexander.Kolbasov@Sun.COM * Cross call wrapper routine for cu_cpc_program() 729*11389SAlexander.Kolbasov@Sun.COM * 730*11389SAlexander.Kolbasov@Sun.COM * Checks to make sure that counters on CPU aren't being used by someone else 731*11389SAlexander.Kolbasov@Sun.COM * before calling cu_cpc_program() since cu_cpc_program() needs to assert that 732*11389SAlexander.Kolbasov@Sun.COM * nobody else is using the counters to catch and prevent any broken code. 733*11389SAlexander.Kolbasov@Sun.COM * Also, this check needs to happen on the target CPU since the CPU's CPC 734*11389SAlexander.Kolbasov@Sun.COM * context can only be changed while running on the CPU. 735*11389SAlexander.Kolbasov@Sun.COM * 736*11389SAlexander.Kolbasov@Sun.COM * If the first argument is TRUE, cu_cpc_program_xcall also checks that there is 737*11389SAlexander.Kolbasov@Sun.COM * no valid thread bound cpc context. This is important to check to prevent 738*11389SAlexander.Kolbasov@Sun.COM * re-programming thread counters with CU counters when CPU is coming on-line. 739*11389SAlexander.Kolbasov@Sun.COM */ 740*11389SAlexander.Kolbasov@Sun.COM static void 741*11389SAlexander.Kolbasov@Sun.COM cu_cpc_program_xcall(uintptr_t arg, int *err) 742*11389SAlexander.Kolbasov@Sun.COM { 743*11389SAlexander.Kolbasov@Sun.COM boolean_t avoid_thread_context = (boolean_t)arg; 744*11389SAlexander.Kolbasov@Sun.COM 745*11389SAlexander.Kolbasov@Sun.COM kpreempt_disable(); 746*11389SAlexander.Kolbasov@Sun.COM 747*11389SAlexander.Kolbasov@Sun.COM if (CPU->cpu_cpc_ctx != NULL && 748*11389SAlexander.Kolbasov@Sun.COM !(CPU->cpu_cpc_ctx->kc_flags & KCPC_CTX_INVALID_STOPPED)) { 749*11389SAlexander.Kolbasov@Sun.COM *err = -100; 750*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 751*11389SAlexander.Kolbasov@Sun.COM return; 752*11389SAlexander.Kolbasov@Sun.COM } 753*11389SAlexander.Kolbasov@Sun.COM 754*11389SAlexander.Kolbasov@Sun.COM if (avoid_thread_context && (curthread->t_cpc_ctx != NULL) && 755*11389SAlexander.Kolbasov@Sun.COM !(curthread->t_cpc_ctx->kc_flags & KCPC_CTX_INVALID_STOPPED)) { 756*11389SAlexander.Kolbasov@Sun.COM *err = -200; 757*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 758*11389SAlexander.Kolbasov@Sun.COM return; 759*11389SAlexander.Kolbasov@Sun.COM } 760*11389SAlexander.Kolbasov@Sun.COM 761*11389SAlexander.Kolbasov@Sun.COM cu_cpc_program(CPU, err); 762*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 763*11389SAlexander.Kolbasov@Sun.COM } 764*11389SAlexander.Kolbasov@Sun.COM 765*11389SAlexander.Kolbasov@Sun.COM 766*11389SAlexander.Kolbasov@Sun.COM /* 767*11389SAlexander.Kolbasov@Sun.COM * Unprogram counters for capacity and utilization on given CPU 768*11389SAlexander.Kolbasov@Sun.COM * This function should be always executed on the target CPU at high PIL 769*11389SAlexander.Kolbasov@Sun.COM */ 770*11389SAlexander.Kolbasov@Sun.COM void 771*11389SAlexander.Kolbasov@Sun.COM cu_cpc_unprogram(cpu_t *cp, int *err) 772*11389SAlexander.Kolbasov@Sun.COM { 773*11389SAlexander.Kolbasov@Sun.COM cu_cpc_ctx_t *cpu_ctx; 774*11389SAlexander.Kolbasov@Sun.COM kcpc_ctx_t *ctx; 775*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info_t *cu_cpu_info; 776*11389SAlexander.Kolbasov@Sun.COM 777*11389SAlexander.Kolbasov@Sun.COM ASSERT(IS_HIPIL()); 778*11389SAlexander.Kolbasov@Sun.COM /* 779*11389SAlexander.Kolbasov@Sun.COM * Should be running on given CPU with preemption disabled to keep CPU 780*11389SAlexander.Kolbasov@Sun.COM * from disappearing and make sure flags and CPC context don't change 781*11389SAlexander.Kolbasov@Sun.COM * from underneath us 782*11389SAlexander.Kolbasov@Sun.COM */ 783*11389SAlexander.Kolbasov@Sun.COM kpreempt_disable(); 784*11389SAlexander.Kolbasov@Sun.COM ASSERT(cp == CPU); 785*11389SAlexander.Kolbasov@Sun.COM 786*11389SAlexander.Kolbasov@Sun.COM /* 787*11389SAlexander.Kolbasov@Sun.COM * Module not on 788*11389SAlexander.Kolbasov@Sun.COM */ 789*11389SAlexander.Kolbasov@Sun.COM if (!(cu_flags & CU_FLAG_ON)) { 790*11389SAlexander.Kolbasov@Sun.COM *err = -1; 791*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 792*11389SAlexander.Kolbasov@Sun.COM return; 793*11389SAlexander.Kolbasov@Sun.COM } 794*11389SAlexander.Kolbasov@Sun.COM 795*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info = cp->cpu_cu_info; 796*11389SAlexander.Kolbasov@Sun.COM if (cu_cpu_info == NULL) { 797*11389SAlexander.Kolbasov@Sun.COM *err = -3; 798*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 799*11389SAlexander.Kolbasov@Sun.COM return; 800*11389SAlexander.Kolbasov@Sun.COM } 801*11389SAlexander.Kolbasov@Sun.COM 802*11389SAlexander.Kolbasov@Sun.COM /* 803*11389SAlexander.Kolbasov@Sun.COM * Counters turned off already 804*11389SAlexander.Kolbasov@Sun.COM */ 805*11389SAlexander.Kolbasov@Sun.COM if (!(cu_cpu_info->cu_flag & CU_CPU_CNTRS_ON)) { 806*11389SAlexander.Kolbasov@Sun.COM *err = 1; 807*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 808*11389SAlexander.Kolbasov@Sun.COM return; 809*11389SAlexander.Kolbasov@Sun.COM } 810*11389SAlexander.Kolbasov@Sun.COM 811*11389SAlexander.Kolbasov@Sun.COM /* 812*11389SAlexander.Kolbasov@Sun.COM * Update counter statistics 813*11389SAlexander.Kolbasov@Sun.COM */ 814*11389SAlexander.Kolbasov@Sun.COM (void) cu_cpu_update(cp, B_FALSE); 815*11389SAlexander.Kolbasov@Sun.COM 816*11389SAlexander.Kolbasov@Sun.COM /* 817*11389SAlexander.Kolbasov@Sun.COM * Get CPU's CPC context needed for capacity and utilization 818*11389SAlexander.Kolbasov@Sun.COM */ 819*11389SAlexander.Kolbasov@Sun.COM cpu_ctx = &cu_cpu_info->cu_cpc_ctx; 820*11389SAlexander.Kolbasov@Sun.COM if (cpu_ctx->nctx <= 0 || cpu_ctx->ctx_ptr_array == NULL || 821*11389SAlexander.Kolbasov@Sun.COM cpu_ctx->ctx_ptr_array_sz <= 0) { 822*11389SAlexander.Kolbasov@Sun.COM *err = -5; 823*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 824*11389SAlexander.Kolbasov@Sun.COM return; 825*11389SAlexander.Kolbasov@Sun.COM } 826*11389SAlexander.Kolbasov@Sun.COM ctx = cpu_ctx->ctx_ptr_array[cpu_ctx->cur_index]; 827*11389SAlexander.Kolbasov@Sun.COM 828*11389SAlexander.Kolbasov@Sun.COM /* 829*11389SAlexander.Kolbasov@Sun.COM * CPU's CPC context should be current capacity and utilization CPC 830*11389SAlexander.Kolbasov@Sun.COM * context 831*11389SAlexander.Kolbasov@Sun.COM */ 832*11389SAlexander.Kolbasov@Sun.COM ASSERT(cp->cpu_cpc_ctx == ctx); 833*11389SAlexander.Kolbasov@Sun.COM if (cp->cpu_cpc_ctx != ctx) { 834*11389SAlexander.Kolbasov@Sun.COM *err = -6; 835*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 836*11389SAlexander.Kolbasov@Sun.COM return; 837*11389SAlexander.Kolbasov@Sun.COM } 838*11389SAlexander.Kolbasov@Sun.COM 839*11389SAlexander.Kolbasov@Sun.COM /* 840*11389SAlexander.Kolbasov@Sun.COM * Unprogram counters on CPU. 841*11389SAlexander.Kolbasov@Sun.COM */ 842*11389SAlexander.Kolbasov@Sun.COM kcpc_unprogram(ctx, B_FALSE); 843*11389SAlexander.Kolbasov@Sun.COM 844*11389SAlexander.Kolbasov@Sun.COM ASSERT(ctx->kc_flags & KCPC_CTX_INVALID_STOPPED); 845*11389SAlexander.Kolbasov@Sun.COM 846*11389SAlexander.Kolbasov@Sun.COM /* 847*11389SAlexander.Kolbasov@Sun.COM * Unset state in CPU structure saying that CPU's counters are 848*11389SAlexander.Kolbasov@Sun.COM * programmed 849*11389SAlexander.Kolbasov@Sun.COM */ 850*11389SAlexander.Kolbasov@Sun.COM cp->cpu_cpc_ctx = NULL; 851*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_flag &= ~CU_CPU_CNTRS_ON; 852*11389SAlexander.Kolbasov@Sun.COM 853*11389SAlexander.Kolbasov@Sun.COM *err = 0; 854*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 855*11389SAlexander.Kolbasov@Sun.COM } 856*11389SAlexander.Kolbasov@Sun.COM 857*11389SAlexander.Kolbasov@Sun.COM 858*11389SAlexander.Kolbasov@Sun.COM /* 859*11389SAlexander.Kolbasov@Sun.COM * Add given counter event to list of CPC requests 860*11389SAlexander.Kolbasov@Sun.COM */ 861*11389SAlexander.Kolbasov@Sun.COM static int 862*11389SAlexander.Kolbasov@Sun.COM cu_cpc_req_add(char *event, kcpc_request_list_t *reqs, int nreqs, 863*11389SAlexander.Kolbasov@Sun.COM cu_cntr_stats_t *stats, int kmem_flags, int *nevents) 864*11389SAlexander.Kolbasov@Sun.COM { 865*11389SAlexander.Kolbasov@Sun.COM int n; 866*11389SAlexander.Kolbasov@Sun.COM int retval; 867*11389SAlexander.Kolbasov@Sun.COM uint_t flags; 868*11389SAlexander.Kolbasov@Sun.COM 869*11389SAlexander.Kolbasov@Sun.COM /* 870*11389SAlexander.Kolbasov@Sun.COM * Return error when no counter event specified, counter event not 871*11389SAlexander.Kolbasov@Sun.COM * supported by CPC's PCBE, or number of events not given 872*11389SAlexander.Kolbasov@Sun.COM */ 873*11389SAlexander.Kolbasov@Sun.COM if (event == NULL || kcpc_event_supported(event) == B_FALSE || 874*11389SAlexander.Kolbasov@Sun.COM nevents == NULL) 875*11389SAlexander.Kolbasov@Sun.COM return (-1); 876*11389SAlexander.Kolbasov@Sun.COM 877*11389SAlexander.Kolbasov@Sun.COM n = *nevents; 878*11389SAlexander.Kolbasov@Sun.COM 879*11389SAlexander.Kolbasov@Sun.COM /* 880*11389SAlexander.Kolbasov@Sun.COM * Only count number of counter events needed if list 881*11389SAlexander.Kolbasov@Sun.COM * where to add CPC requests not given 882*11389SAlexander.Kolbasov@Sun.COM */ 883*11389SAlexander.Kolbasov@Sun.COM if (reqs == NULL) { 884*11389SAlexander.Kolbasov@Sun.COM n++; 885*11389SAlexander.Kolbasov@Sun.COM *nevents = n; 886*11389SAlexander.Kolbasov@Sun.COM return (-3); 887*11389SAlexander.Kolbasov@Sun.COM } 888*11389SAlexander.Kolbasov@Sun.COM 889*11389SAlexander.Kolbasov@Sun.COM /* 890*11389SAlexander.Kolbasov@Sun.COM * Return error when stats not given or not enough room on list of CPC 891*11389SAlexander.Kolbasov@Sun.COM * requests for more counter events 892*11389SAlexander.Kolbasov@Sun.COM */ 893*11389SAlexander.Kolbasov@Sun.COM if (stats == NULL || (nreqs <= 0 && n >= nreqs)) 894*11389SAlexander.Kolbasov@Sun.COM return (-4); 895*11389SAlexander.Kolbasov@Sun.COM 896*11389SAlexander.Kolbasov@Sun.COM /* 897*11389SAlexander.Kolbasov@Sun.COM * Use flags in cu_cpc_flags to program counters and enable overflow 898*11389SAlexander.Kolbasov@Sun.COM * interrupts/traps (unless PCBE can't handle overflow interrupts) so 899*11389SAlexander.Kolbasov@Sun.COM * PCBE can catch counters before they wrap to hopefully give us an 900*11389SAlexander.Kolbasov@Sun.COM * accurate (64-bit) virtualized counter 901*11389SAlexander.Kolbasov@Sun.COM */ 902*11389SAlexander.Kolbasov@Sun.COM flags = cu_cpc_flags; 903*11389SAlexander.Kolbasov@Sun.COM if ((kcpc_pcbe_capabilities() & CPC_CAP_OVERFLOW_INTERRUPT) == 0) 904*11389SAlexander.Kolbasov@Sun.COM flags &= ~CPC_OVF_NOTIFY_EMT; 905*11389SAlexander.Kolbasov@Sun.COM 906*11389SAlexander.Kolbasov@Sun.COM /* 907*11389SAlexander.Kolbasov@Sun.COM * Add CPC request to list 908*11389SAlexander.Kolbasov@Sun.COM */ 909*11389SAlexander.Kolbasov@Sun.COM retval = kcpc_reqs_add(reqs, event, cu_cpc_preset_value, 910*11389SAlexander.Kolbasov@Sun.COM flags, 0, NULL, &stats[n], kmem_flags); 911*11389SAlexander.Kolbasov@Sun.COM 912*11389SAlexander.Kolbasov@Sun.COM if (retval != 0) 913*11389SAlexander.Kolbasov@Sun.COM return (-5); 914*11389SAlexander.Kolbasov@Sun.COM 915*11389SAlexander.Kolbasov@Sun.COM n++; 916*11389SAlexander.Kolbasov@Sun.COM *nevents = n; 917*11389SAlexander.Kolbasov@Sun.COM return (0); 918*11389SAlexander.Kolbasov@Sun.COM } 919*11389SAlexander.Kolbasov@Sun.COM 920*11389SAlexander.Kolbasov@Sun.COM static void 921*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info_detach_xcall(void) 922*11389SAlexander.Kolbasov@Sun.COM { 923*11389SAlexander.Kolbasov@Sun.COM ASSERT(IS_HIPIL()); 924*11389SAlexander.Kolbasov@Sun.COM 925*11389SAlexander.Kolbasov@Sun.COM CPU->cpu_cu_info = NULL; 926*11389SAlexander.Kolbasov@Sun.COM } 927*11389SAlexander.Kolbasov@Sun.COM 928*11389SAlexander.Kolbasov@Sun.COM 929*11389SAlexander.Kolbasov@Sun.COM /* 930*11389SAlexander.Kolbasov@Sun.COM * Enable or disable collection of capacity/utilization data for a current CPU. 931*11389SAlexander.Kolbasov@Sun.COM * Counters are enabled if 'on' argument is True and disabled if it is False. 932*11389SAlexander.Kolbasov@Sun.COM * This function should be always executed at high PIL 933*11389SAlexander.Kolbasov@Sun.COM */ 934*11389SAlexander.Kolbasov@Sun.COM static void 935*11389SAlexander.Kolbasov@Sun.COM cu_cpc_trigger(uintptr_t arg1, uintptr_t arg2) 936*11389SAlexander.Kolbasov@Sun.COM { 937*11389SAlexander.Kolbasov@Sun.COM cpu_t *cp = (cpu_t *)arg1; 938*11389SAlexander.Kolbasov@Sun.COM boolean_t on = (boolean_t)arg2; 939*11389SAlexander.Kolbasov@Sun.COM int error; 940*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info_t *cu_cpu_info; 941*11389SAlexander.Kolbasov@Sun.COM 942*11389SAlexander.Kolbasov@Sun.COM ASSERT(IS_HIPIL()); 943*11389SAlexander.Kolbasov@Sun.COM kpreempt_disable(); 944*11389SAlexander.Kolbasov@Sun.COM ASSERT(cp == CPU); 945*11389SAlexander.Kolbasov@Sun.COM 946*11389SAlexander.Kolbasov@Sun.COM if (!(cu_flags & CU_FLAG_ON)) { 947*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 948*11389SAlexander.Kolbasov@Sun.COM return; 949*11389SAlexander.Kolbasov@Sun.COM } 950*11389SAlexander.Kolbasov@Sun.COM 951*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info = cp->cpu_cu_info; 952*11389SAlexander.Kolbasov@Sun.COM if (cu_cpu_info == NULL) { 953*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 954*11389SAlexander.Kolbasov@Sun.COM return; 955*11389SAlexander.Kolbasov@Sun.COM } 956*11389SAlexander.Kolbasov@Sun.COM 957*11389SAlexander.Kolbasov@Sun.COM ASSERT(!cu_cpu_info->cu_disabled || 958*11389SAlexander.Kolbasov@Sun.COM !(cu_cpu_info->cu_flag & CU_CPU_CNTRS_ON)); 959*11389SAlexander.Kolbasov@Sun.COM 960*11389SAlexander.Kolbasov@Sun.COM if (on) { 961*11389SAlexander.Kolbasov@Sun.COM /* 962*11389SAlexander.Kolbasov@Sun.COM * Decrement the cu_disabled counter. 963*11389SAlexander.Kolbasov@Sun.COM * Once it drops to zero, call cu_cpc_program. 964*11389SAlexander.Kolbasov@Sun.COM */ 965*11389SAlexander.Kolbasov@Sun.COM if (cu_cpu_info->cu_disabled > 0) 966*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_disabled--; 967*11389SAlexander.Kolbasov@Sun.COM if (cu_cpu_info->cu_disabled == 0) 968*11389SAlexander.Kolbasov@Sun.COM cu_cpc_program(CPU, &error); 969*11389SAlexander.Kolbasov@Sun.COM } else if (cu_cpu_info->cu_disabled++ == 0) { 970*11389SAlexander.Kolbasov@Sun.COM /* 971*11389SAlexander.Kolbasov@Sun.COM * This is the first attempt to disable CU, so turn it off 972*11389SAlexander.Kolbasov@Sun.COM */ 973*11389SAlexander.Kolbasov@Sun.COM cu_cpc_unprogram(cp, &error); 974*11389SAlexander.Kolbasov@Sun.COM ASSERT(!(cu_cpu_info->cu_flag & CU_CPU_CNTRS_ON)); 975*11389SAlexander.Kolbasov@Sun.COM } 976*11389SAlexander.Kolbasov@Sun.COM 977*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 978*11389SAlexander.Kolbasov@Sun.COM } 979*11389SAlexander.Kolbasov@Sun.COM 980*11389SAlexander.Kolbasov@Sun.COM 981*11389SAlexander.Kolbasov@Sun.COM /* 982*11389SAlexander.Kolbasov@Sun.COM * Callback for changes in CPU states 983*11389SAlexander.Kolbasov@Sun.COM * Used to enable or disable hardware performance counters on CPUs that are 984*11389SAlexander.Kolbasov@Sun.COM * turned on or off 985*11389SAlexander.Kolbasov@Sun.COM * 986*11389SAlexander.Kolbasov@Sun.COM * NOTE: cpc should be programmed/unprogrammed while running on the target CPU. 987*11389SAlexander.Kolbasov@Sun.COM * We have to use thread_affinity_set to hop to the right CPU because these 988*11389SAlexander.Kolbasov@Sun.COM * routines expect cpu_lock held, so we can't cross-call other CPUs while 989*11389SAlexander.Kolbasov@Sun.COM * holding CPU lock. 990*11389SAlexander.Kolbasov@Sun.COM */ 991*11389SAlexander.Kolbasov@Sun.COM static int 992*11389SAlexander.Kolbasov@Sun.COM /* LINTED E_FUNC_ARG_UNUSED */ 993*11389SAlexander.Kolbasov@Sun.COM cu_cpu_callback(cpu_setup_t what, int id, void *arg) 994*11389SAlexander.Kolbasov@Sun.COM { 995*11389SAlexander.Kolbasov@Sun.COM cpu_t *cp; 996*11389SAlexander.Kolbasov@Sun.COM int retval = 0; 997*11389SAlexander.Kolbasov@Sun.COM 998*11389SAlexander.Kolbasov@Sun.COM ASSERT(MUTEX_HELD(&cpu_lock)); 999*11389SAlexander.Kolbasov@Sun.COM 1000*11389SAlexander.Kolbasov@Sun.COM if (!(cu_flags & CU_FLAG_ON)) 1001*11389SAlexander.Kolbasov@Sun.COM return (-1); 1002*11389SAlexander.Kolbasov@Sun.COM 1003*11389SAlexander.Kolbasov@Sun.COM cp = cpu_get(id); 1004*11389SAlexander.Kolbasov@Sun.COM if (cp == NULL) 1005*11389SAlexander.Kolbasov@Sun.COM return (-2); 1006*11389SAlexander.Kolbasov@Sun.COM 1007*11389SAlexander.Kolbasov@Sun.COM switch (what) { 1008*11389SAlexander.Kolbasov@Sun.COM case CPU_ON: 1009*11389SAlexander.Kolbasov@Sun.COM /* 1010*11389SAlexander.Kolbasov@Sun.COM * Setup counters on CPU being turned on 1011*11389SAlexander.Kolbasov@Sun.COM */ 1012*11389SAlexander.Kolbasov@Sun.COM retval = cu_cpu_init(cp, cu_cpc_reqs); 1013*11389SAlexander.Kolbasov@Sun.COM 1014*11389SAlexander.Kolbasov@Sun.COM /* 1015*11389SAlexander.Kolbasov@Sun.COM * Reset list of counter event requests so its space can be 1016*11389SAlexander.Kolbasov@Sun.COM * reused for a different set of requests for next CPU 1017*11389SAlexander.Kolbasov@Sun.COM */ 1018*11389SAlexander.Kolbasov@Sun.COM (void) kcpc_reqs_reset(cu_cpc_reqs); 1019*11389SAlexander.Kolbasov@Sun.COM break; 1020*11389SAlexander.Kolbasov@Sun.COM case CPU_INTR_ON: 1021*11389SAlexander.Kolbasov@Sun.COM /* 1022*11389SAlexander.Kolbasov@Sun.COM * Setup counters on CPU being turned on. 1023*11389SAlexander.Kolbasov@Sun.COM */ 1024*11389SAlexander.Kolbasov@Sun.COM retval = cu_cpu_run(cp, cu_cpc_program_xcall, 1025*11389SAlexander.Kolbasov@Sun.COM (uintptr_t)B_TRUE); 1026*11389SAlexander.Kolbasov@Sun.COM break; 1027*11389SAlexander.Kolbasov@Sun.COM case CPU_OFF: 1028*11389SAlexander.Kolbasov@Sun.COM /* 1029*11389SAlexander.Kolbasov@Sun.COM * Disable counters on CPU being turned off. Counters will not 1030*11389SAlexander.Kolbasov@Sun.COM * be re-enabled on this CPU until it comes back online. 1031*11389SAlexander.Kolbasov@Sun.COM */ 1032*11389SAlexander.Kolbasov@Sun.COM cu_cpu_disable(cp); 1033*11389SAlexander.Kolbasov@Sun.COM ASSERT(!CU_CPC_ON(cp)); 1034*11389SAlexander.Kolbasov@Sun.COM retval = cu_cpu_fini(cp); 1035*11389SAlexander.Kolbasov@Sun.COM break; 1036*11389SAlexander.Kolbasov@Sun.COM default: 1037*11389SAlexander.Kolbasov@Sun.COM break; 1038*11389SAlexander.Kolbasov@Sun.COM } 1039*11389SAlexander.Kolbasov@Sun.COM return (retval); 1040*11389SAlexander.Kolbasov@Sun.COM } 1041*11389SAlexander.Kolbasov@Sun.COM 1042*11389SAlexander.Kolbasov@Sun.COM 1043*11389SAlexander.Kolbasov@Sun.COM /* 1044*11389SAlexander.Kolbasov@Sun.COM * Disable or enable Capacity Utilization counters on a given CPU. This function 1045*11389SAlexander.Kolbasov@Sun.COM * can be called from any CPU to disable counters on the given CPU. 1046*11389SAlexander.Kolbasov@Sun.COM */ 1047*11389SAlexander.Kolbasov@Sun.COM static void 1048*11389SAlexander.Kolbasov@Sun.COM cu_cpu_disable(cpu_t *cp) 1049*11389SAlexander.Kolbasov@Sun.COM { 1050*11389SAlexander.Kolbasov@Sun.COM cpu_call(cp, cu_cpc_trigger, (uintptr_t)cp, (uintptr_t)B_FALSE); 1051*11389SAlexander.Kolbasov@Sun.COM } 1052*11389SAlexander.Kolbasov@Sun.COM 1053*11389SAlexander.Kolbasov@Sun.COM 1054*11389SAlexander.Kolbasov@Sun.COM static void 1055*11389SAlexander.Kolbasov@Sun.COM cu_cpu_enable(cpu_t *cp) 1056*11389SAlexander.Kolbasov@Sun.COM { 1057*11389SAlexander.Kolbasov@Sun.COM cpu_call(cp, cu_cpc_trigger, (uintptr_t)cp, (uintptr_t)B_TRUE); 1058*11389SAlexander.Kolbasov@Sun.COM } 1059*11389SAlexander.Kolbasov@Sun.COM 1060*11389SAlexander.Kolbasov@Sun.COM 1061*11389SAlexander.Kolbasov@Sun.COM /* 1062*11389SAlexander.Kolbasov@Sun.COM * Setup capacity and utilization support for given CPU 1063*11389SAlexander.Kolbasov@Sun.COM * 1064*11389SAlexander.Kolbasov@Sun.COM * NOTE: Use KM_NOSLEEP for kmem_{,z}alloc() since cpu_lock is held and free 1065*11389SAlexander.Kolbasov@Sun.COM * everything that has been successfully allocated including cpu_cu_info 1066*11389SAlexander.Kolbasov@Sun.COM * if any memory allocation fails 1067*11389SAlexander.Kolbasov@Sun.COM */ 1068*11389SAlexander.Kolbasov@Sun.COM static int 1069*11389SAlexander.Kolbasov@Sun.COM cu_cpu_init(cpu_t *cp, kcpc_request_list_t *reqs) 1070*11389SAlexander.Kolbasov@Sun.COM { 1071*11389SAlexander.Kolbasov@Sun.COM kcpc_ctx_t **ctx_ptr_array; 1072*11389SAlexander.Kolbasov@Sun.COM size_t ctx_ptr_array_sz; 1073*11389SAlexander.Kolbasov@Sun.COM cu_cpc_ctx_t *cpu_ctx; 1074*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info_t *cu_cpu_info; 1075*11389SAlexander.Kolbasov@Sun.COM int n; 1076*11389SAlexander.Kolbasov@Sun.COM 1077*11389SAlexander.Kolbasov@Sun.COM /* 1078*11389SAlexander.Kolbasov@Sun.COM * cpu_lock should be held and protect against CPU going away and races 1079*11389SAlexander.Kolbasov@Sun.COM * with cu_{init,fini,cpu_fini}() 1080*11389SAlexander.Kolbasov@Sun.COM */ 1081*11389SAlexander.Kolbasov@Sun.COM ASSERT(MUTEX_HELD(&cpu_lock)); 1082*11389SAlexander.Kolbasov@Sun.COM 1083*11389SAlexander.Kolbasov@Sun.COM /* 1084*11389SAlexander.Kolbasov@Sun.COM * Return if not ready to setup counters yet 1085*11389SAlexander.Kolbasov@Sun.COM */ 1086*11389SAlexander.Kolbasov@Sun.COM if (!(cu_flags & CU_FLAG_READY)) 1087*11389SAlexander.Kolbasov@Sun.COM return (-1); 1088*11389SAlexander.Kolbasov@Sun.COM 1089*11389SAlexander.Kolbasov@Sun.COM if (cp->cpu_cu_info == NULL) { 1090*11389SAlexander.Kolbasov@Sun.COM cp->cpu_cu_info = kmem_zalloc(sizeof (cu_cpu_info_t), 1091*11389SAlexander.Kolbasov@Sun.COM KM_NOSLEEP); 1092*11389SAlexander.Kolbasov@Sun.COM if (cp->cpu_cu_info == NULL) 1093*11389SAlexander.Kolbasov@Sun.COM return (-2); 1094*11389SAlexander.Kolbasov@Sun.COM } 1095*11389SAlexander.Kolbasov@Sun.COM 1096*11389SAlexander.Kolbasov@Sun.COM /* 1097*11389SAlexander.Kolbasov@Sun.COM * Get capacity and utilization CPC context for CPU and check to see 1098*11389SAlexander.Kolbasov@Sun.COM * whether it has been setup already 1099*11389SAlexander.Kolbasov@Sun.COM */ 1100*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info = cp->cpu_cu_info; 1101*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_cpu = cp; 1102*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_disabled = dtrace_cpc_in_use ? 1 : 0; 1103*11389SAlexander.Kolbasov@Sun.COM 1104*11389SAlexander.Kolbasov@Sun.COM cpu_ctx = &cu_cpu_info->cu_cpc_ctx; 1105*11389SAlexander.Kolbasov@Sun.COM if (cpu_ctx->nctx > 0 && cpu_ctx->ctx_ptr_array != NULL && 1106*11389SAlexander.Kolbasov@Sun.COM cpu_ctx->ctx_ptr_array_sz > 0) { 1107*11389SAlexander.Kolbasov@Sun.COM return (1); 1108*11389SAlexander.Kolbasov@Sun.COM } 1109*11389SAlexander.Kolbasov@Sun.COM 1110*11389SAlexander.Kolbasov@Sun.COM /* 1111*11389SAlexander.Kolbasov@Sun.COM * Should have no contexts since it hasn't been setup already 1112*11389SAlexander.Kolbasov@Sun.COM */ 1113*11389SAlexander.Kolbasov@Sun.COM ASSERT(cpu_ctx->nctx == 0 && cpu_ctx->ctx_ptr_array == NULL && 1114*11389SAlexander.Kolbasov@Sun.COM cpu_ctx->ctx_ptr_array_sz == 0); 1115*11389SAlexander.Kolbasov@Sun.COM 1116*11389SAlexander.Kolbasov@Sun.COM /* 1117*11389SAlexander.Kolbasov@Sun.COM * Determine how many CPC events needed to measure capacity and 1118*11389SAlexander.Kolbasov@Sun.COM * utilization for this CPU, allocate space for counter statistics for 1119*11389SAlexander.Kolbasov@Sun.COM * each event, and fill in list of CPC event requests with corresponding 1120*11389SAlexander.Kolbasov@Sun.COM * counter stats for each request to make attributing counter data 1121*11389SAlexander.Kolbasov@Sun.COM * easier later.... 1122*11389SAlexander.Kolbasov@Sun.COM */ 1123*11389SAlexander.Kolbasov@Sun.COM n = cu_cpc_init(cp, NULL, 0); 1124*11389SAlexander.Kolbasov@Sun.COM if (n <= 0) { 1125*11389SAlexander.Kolbasov@Sun.COM (void) cu_cpu_fini(cp); 1126*11389SAlexander.Kolbasov@Sun.COM return (-3); 1127*11389SAlexander.Kolbasov@Sun.COM } 1128*11389SAlexander.Kolbasov@Sun.COM 1129*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_cntr_stats = kmem_zalloc(n * sizeof (cu_cntr_stats_t), 1130*11389SAlexander.Kolbasov@Sun.COM KM_NOSLEEP); 1131*11389SAlexander.Kolbasov@Sun.COM if (cu_cpu_info->cu_cntr_stats == NULL) { 1132*11389SAlexander.Kolbasov@Sun.COM (void) cu_cpu_fini(cp); 1133*11389SAlexander.Kolbasov@Sun.COM return (-4); 1134*11389SAlexander.Kolbasov@Sun.COM } 1135*11389SAlexander.Kolbasov@Sun.COM 1136*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_ncntr_stats = n; 1137*11389SAlexander.Kolbasov@Sun.COM 1138*11389SAlexander.Kolbasov@Sun.COM n = cu_cpc_init(cp, reqs, n); 1139*11389SAlexander.Kolbasov@Sun.COM if (n <= 0) { 1140*11389SAlexander.Kolbasov@Sun.COM (void) cu_cpu_fini(cp); 1141*11389SAlexander.Kolbasov@Sun.COM return (-5); 1142*11389SAlexander.Kolbasov@Sun.COM } 1143*11389SAlexander.Kolbasov@Sun.COM 1144*11389SAlexander.Kolbasov@Sun.COM /* 1145*11389SAlexander.Kolbasov@Sun.COM * Create CPC context with given requests 1146*11389SAlexander.Kolbasov@Sun.COM */ 1147*11389SAlexander.Kolbasov@Sun.COM ctx_ptr_array = NULL; 1148*11389SAlexander.Kolbasov@Sun.COM ctx_ptr_array_sz = 0; 1149*11389SAlexander.Kolbasov@Sun.COM n = kcpc_cpu_ctx_create(cp, reqs, KM_NOSLEEP, &ctx_ptr_array, 1150*11389SAlexander.Kolbasov@Sun.COM &ctx_ptr_array_sz); 1151*11389SAlexander.Kolbasov@Sun.COM if (n <= 0) { 1152*11389SAlexander.Kolbasov@Sun.COM (void) cu_cpu_fini(cp); 1153*11389SAlexander.Kolbasov@Sun.COM return (-6); 1154*11389SAlexander.Kolbasov@Sun.COM } 1155*11389SAlexander.Kolbasov@Sun.COM 1156*11389SAlexander.Kolbasov@Sun.COM /* 1157*11389SAlexander.Kolbasov@Sun.COM * Should have contexts 1158*11389SAlexander.Kolbasov@Sun.COM */ 1159*11389SAlexander.Kolbasov@Sun.COM ASSERT(n > 0 && ctx_ptr_array != NULL && ctx_ptr_array_sz > 0); 1160*11389SAlexander.Kolbasov@Sun.COM if (ctx_ptr_array == NULL || ctx_ptr_array_sz <= 0) { 1161*11389SAlexander.Kolbasov@Sun.COM (void) cu_cpu_fini(cp); 1162*11389SAlexander.Kolbasov@Sun.COM return (-7); 1163*11389SAlexander.Kolbasov@Sun.COM } 1164*11389SAlexander.Kolbasov@Sun.COM 1165*11389SAlexander.Kolbasov@Sun.COM /* 1166*11389SAlexander.Kolbasov@Sun.COM * Fill in CPC context info for CPU needed for capacity and utilization 1167*11389SAlexander.Kolbasov@Sun.COM */ 1168*11389SAlexander.Kolbasov@Sun.COM cpu_ctx->cur_index = 0; 1169*11389SAlexander.Kolbasov@Sun.COM cpu_ctx->nctx = n; 1170*11389SAlexander.Kolbasov@Sun.COM cpu_ctx->ctx_ptr_array = ctx_ptr_array; 1171*11389SAlexander.Kolbasov@Sun.COM cpu_ctx->ctx_ptr_array_sz = ctx_ptr_array_sz; 1172*11389SAlexander.Kolbasov@Sun.COM return (0); 1173*11389SAlexander.Kolbasov@Sun.COM } 1174*11389SAlexander.Kolbasov@Sun.COM 1175*11389SAlexander.Kolbasov@Sun.COM /* 1176*11389SAlexander.Kolbasov@Sun.COM * Tear down capacity and utilization support for given CPU 1177*11389SAlexander.Kolbasov@Sun.COM */ 1178*11389SAlexander.Kolbasov@Sun.COM static int 1179*11389SAlexander.Kolbasov@Sun.COM cu_cpu_fini(cpu_t *cp) 1180*11389SAlexander.Kolbasov@Sun.COM { 1181*11389SAlexander.Kolbasov@Sun.COM kcpc_ctx_t *ctx; 1182*11389SAlexander.Kolbasov@Sun.COM cu_cpc_ctx_t *cpu_ctx; 1183*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info_t *cu_cpu_info; 1184*11389SAlexander.Kolbasov@Sun.COM int i; 1185*11389SAlexander.Kolbasov@Sun.COM pghw_type_t pg_hw_type; 1186*11389SAlexander.Kolbasov@Sun.COM 1187*11389SAlexander.Kolbasov@Sun.COM /* 1188*11389SAlexander.Kolbasov@Sun.COM * cpu_lock should be held and protect against CPU going away and races 1189*11389SAlexander.Kolbasov@Sun.COM * with cu_{init,fini,cpu_init}() 1190*11389SAlexander.Kolbasov@Sun.COM */ 1191*11389SAlexander.Kolbasov@Sun.COM ASSERT(MUTEX_HELD(&cpu_lock)); 1192*11389SAlexander.Kolbasov@Sun.COM 1193*11389SAlexander.Kolbasov@Sun.COM /* 1194*11389SAlexander.Kolbasov@Sun.COM * Have to at least be ready to setup counters to have allocated 1195*11389SAlexander.Kolbasov@Sun.COM * anything that needs to be deallocated now 1196*11389SAlexander.Kolbasov@Sun.COM */ 1197*11389SAlexander.Kolbasov@Sun.COM if (!(cu_flags & CU_FLAG_READY)) 1198*11389SAlexander.Kolbasov@Sun.COM return (-1); 1199*11389SAlexander.Kolbasov@Sun.COM 1200*11389SAlexander.Kolbasov@Sun.COM /* 1201*11389SAlexander.Kolbasov@Sun.COM * Nothing to do if CPU's capacity and utilization info doesn't exist 1202*11389SAlexander.Kolbasov@Sun.COM */ 1203*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info = cp->cpu_cu_info; 1204*11389SAlexander.Kolbasov@Sun.COM if (cu_cpu_info == NULL) 1205*11389SAlexander.Kolbasov@Sun.COM return (1); 1206*11389SAlexander.Kolbasov@Sun.COM 1207*11389SAlexander.Kolbasov@Sun.COM /* 1208*11389SAlexander.Kolbasov@Sun.COM * Tear down any existing kstats and counter info for each hardware 1209*11389SAlexander.Kolbasov@Sun.COM * sharing relationship 1210*11389SAlexander.Kolbasov@Sun.COM */ 1211*11389SAlexander.Kolbasov@Sun.COM for (pg_hw_type = PGHW_START; pg_hw_type < PGHW_NUM_COMPONENTS; 1212*11389SAlexander.Kolbasov@Sun.COM pg_hw_type++) { 1213*11389SAlexander.Kolbasov@Sun.COM cu_cntr_info_t *cntr_info; 1214*11389SAlexander.Kolbasov@Sun.COM 1215*11389SAlexander.Kolbasov@Sun.COM cntr_info = cu_cpu_info->cu_cntr_info[pg_hw_type]; 1216*11389SAlexander.Kolbasov@Sun.COM if (cntr_info == NULL) 1217*11389SAlexander.Kolbasov@Sun.COM continue; 1218*11389SAlexander.Kolbasov@Sun.COM 1219*11389SAlexander.Kolbasov@Sun.COM if (cntr_info->ci_kstat != NULL) { 1220*11389SAlexander.Kolbasov@Sun.COM kstat_delete(cntr_info->ci_kstat); 1221*11389SAlexander.Kolbasov@Sun.COM cntr_info->ci_kstat = NULL; 1222*11389SAlexander.Kolbasov@Sun.COM } 1223*11389SAlexander.Kolbasov@Sun.COM kmem_free(cntr_info, sizeof (cu_cntr_info_t)); 1224*11389SAlexander.Kolbasov@Sun.COM } 1225*11389SAlexander.Kolbasov@Sun.COM 1226*11389SAlexander.Kolbasov@Sun.COM /* 1227*11389SAlexander.Kolbasov@Sun.COM * Free counter statistics for CPU 1228*11389SAlexander.Kolbasov@Sun.COM */ 1229*11389SAlexander.Kolbasov@Sun.COM ASSERT(cu_cpu_info->cu_cntr_stats == NULL || 1230*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_ncntr_stats > 0); 1231*11389SAlexander.Kolbasov@Sun.COM if (cu_cpu_info->cu_cntr_stats != NULL && 1232*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_ncntr_stats > 0) { 1233*11389SAlexander.Kolbasov@Sun.COM kmem_free(cu_cpu_info->cu_cntr_stats, 1234*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_ncntr_stats * sizeof (cu_cntr_stats_t)); 1235*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_cntr_stats = NULL; 1236*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_ncntr_stats = 0; 1237*11389SAlexander.Kolbasov@Sun.COM } 1238*11389SAlexander.Kolbasov@Sun.COM 1239*11389SAlexander.Kolbasov@Sun.COM /* 1240*11389SAlexander.Kolbasov@Sun.COM * Get capacity and utilization CPC contexts for given CPU and check to 1241*11389SAlexander.Kolbasov@Sun.COM * see whether they have been freed already 1242*11389SAlexander.Kolbasov@Sun.COM */ 1243*11389SAlexander.Kolbasov@Sun.COM cpu_ctx = &cu_cpu_info->cu_cpc_ctx; 1244*11389SAlexander.Kolbasov@Sun.COM if (cpu_ctx != NULL && cpu_ctx->ctx_ptr_array != NULL && 1245*11389SAlexander.Kolbasov@Sun.COM cpu_ctx->ctx_ptr_array_sz > 0) { 1246*11389SAlexander.Kolbasov@Sun.COM /* 1247*11389SAlexander.Kolbasov@Sun.COM * Free CPC contexts for given CPU 1248*11389SAlexander.Kolbasov@Sun.COM */ 1249*11389SAlexander.Kolbasov@Sun.COM for (i = 0; i < cpu_ctx->nctx; i++) { 1250*11389SAlexander.Kolbasov@Sun.COM ctx = cpu_ctx->ctx_ptr_array[i]; 1251*11389SAlexander.Kolbasov@Sun.COM if (ctx == NULL) 1252*11389SAlexander.Kolbasov@Sun.COM continue; 1253*11389SAlexander.Kolbasov@Sun.COM kcpc_free(ctx, 0); 1254*11389SAlexander.Kolbasov@Sun.COM } 1255*11389SAlexander.Kolbasov@Sun.COM 1256*11389SAlexander.Kolbasov@Sun.COM /* 1257*11389SAlexander.Kolbasov@Sun.COM * Free CPC context pointer array 1258*11389SAlexander.Kolbasov@Sun.COM */ 1259*11389SAlexander.Kolbasov@Sun.COM kmem_free(cpu_ctx->ctx_ptr_array, cpu_ctx->ctx_ptr_array_sz); 1260*11389SAlexander.Kolbasov@Sun.COM 1261*11389SAlexander.Kolbasov@Sun.COM /* 1262*11389SAlexander.Kolbasov@Sun.COM * Zero CPC info for CPU 1263*11389SAlexander.Kolbasov@Sun.COM */ 1264*11389SAlexander.Kolbasov@Sun.COM bzero(cpu_ctx, sizeof (cu_cpc_ctx_t)); 1265*11389SAlexander.Kolbasov@Sun.COM } 1266*11389SAlexander.Kolbasov@Sun.COM 1267*11389SAlexander.Kolbasov@Sun.COM /* 1268*11389SAlexander.Kolbasov@Sun.COM * Set cp->cpu_cu_info pointer to NULL. Go through cross-call to ensure 1269*11389SAlexander.Kolbasov@Sun.COM * that no one is going to access the cpu_cu_info whicch we are going to 1270*11389SAlexander.Kolbasov@Sun.COM * free. 1271*11389SAlexander.Kolbasov@Sun.COM */ 1272*11389SAlexander.Kolbasov@Sun.COM if (cpu_is_online(cp)) 1273*11389SAlexander.Kolbasov@Sun.COM cpu_call(cp, (cpu_call_func_t)cu_cpu_info_detach_xcall, 0, 0); 1274*11389SAlexander.Kolbasov@Sun.COM else 1275*11389SAlexander.Kolbasov@Sun.COM cp->cpu_cu_info = NULL; 1276*11389SAlexander.Kolbasov@Sun.COM 1277*11389SAlexander.Kolbasov@Sun.COM /* 1278*11389SAlexander.Kolbasov@Sun.COM * Free CPU's capacity and utilization info 1279*11389SAlexander.Kolbasov@Sun.COM */ 1280*11389SAlexander.Kolbasov@Sun.COM kmem_free(cu_cpu_info, sizeof (cu_cpu_info_t)); 1281*11389SAlexander.Kolbasov@Sun.COM 1282*11389SAlexander.Kolbasov@Sun.COM return (0); 1283*11389SAlexander.Kolbasov@Sun.COM } 1284*11389SAlexander.Kolbasov@Sun.COM 1285*11389SAlexander.Kolbasov@Sun.COM /* 1286*11389SAlexander.Kolbasov@Sun.COM * Create capacity & utilization kstats for given PG CPU hardware sharing 1287*11389SAlexander.Kolbasov@Sun.COM * relationship 1288*11389SAlexander.Kolbasov@Sun.COM */ 1289*11389SAlexander.Kolbasov@Sun.COM static void 1290*11389SAlexander.Kolbasov@Sun.COM cu_cpu_kstat_create(pghw_t *pg, cu_cntr_info_t *cntr_info) 1291*11389SAlexander.Kolbasov@Sun.COM { 1292*11389SAlexander.Kolbasov@Sun.COM char *class, *sh_name; 1293*11389SAlexander.Kolbasov@Sun.COM kstat_t *ks; 1294*11389SAlexander.Kolbasov@Sun.COM 1295*11389SAlexander.Kolbasov@Sun.COM /* 1296*11389SAlexander.Kolbasov@Sun.COM * Just return when no counter info or CPU 1297*11389SAlexander.Kolbasov@Sun.COM */ 1298*11389SAlexander.Kolbasov@Sun.COM if (cntr_info == NULL || cntr_info->ci_cpu == NULL) 1299*11389SAlexander.Kolbasov@Sun.COM return; 1300*11389SAlexander.Kolbasov@Sun.COM 1301*11389SAlexander.Kolbasov@Sun.COM /* 1302*11389SAlexander.Kolbasov@Sun.COM * Get the class name from the leaf PG that this CPU belongs to. 1303*11389SAlexander.Kolbasov@Sun.COM * If there are no PGs, just use the default class "cpu". 1304*11389SAlexander.Kolbasov@Sun.COM */ 1305*11389SAlexander.Kolbasov@Sun.COM class = pg ? pghw_type_string(pg->pghw_hw) : "cpu"; 1306*11389SAlexander.Kolbasov@Sun.COM sh_name = pg ? pghw_type_shortstring(pg->pghw_hw) : "cpu"; 1307*11389SAlexander.Kolbasov@Sun.COM 1308*11389SAlexander.Kolbasov@Sun.COM if ((ks = kstat_create_zone("pg_cpu", cntr_info->ci_cpu->cpu_id, 1309*11389SAlexander.Kolbasov@Sun.COM sh_name, class, KSTAT_TYPE_NAMED, 1310*11389SAlexander.Kolbasov@Sun.COM sizeof (cu_cpu_kstat) / sizeof (kstat_named_t), 1311*11389SAlexander.Kolbasov@Sun.COM KSTAT_FLAG_VIRTUAL, GLOBAL_ZONEID)) == NULL) 1312*11389SAlexander.Kolbasov@Sun.COM return; 1313*11389SAlexander.Kolbasov@Sun.COM 1314*11389SAlexander.Kolbasov@Sun.COM ks->ks_lock = &pg_cpu_kstat_lock; 1315*11389SAlexander.Kolbasov@Sun.COM ks->ks_data = &cu_cpu_kstat; 1316*11389SAlexander.Kolbasov@Sun.COM ks->ks_update = cu_cpu_kstat_update; 1317*11389SAlexander.Kolbasov@Sun.COM 1318*11389SAlexander.Kolbasov@Sun.COM ks->ks_private = cntr_info; 1319*11389SAlexander.Kolbasov@Sun.COM cntr_info->ci_kstat = ks; 1320*11389SAlexander.Kolbasov@Sun.COM kstat_install(cntr_info->ci_kstat); 1321*11389SAlexander.Kolbasov@Sun.COM } 1322*11389SAlexander.Kolbasov@Sun.COM 1323*11389SAlexander.Kolbasov@Sun.COM 1324*11389SAlexander.Kolbasov@Sun.COM /* 1325*11389SAlexander.Kolbasov@Sun.COM * Propagate values from CPU capacity & utilization stats to kstats 1326*11389SAlexander.Kolbasov@Sun.COM */ 1327*11389SAlexander.Kolbasov@Sun.COM static int 1328*11389SAlexander.Kolbasov@Sun.COM cu_cpu_kstat_update(kstat_t *ksp, int rw) 1329*11389SAlexander.Kolbasov@Sun.COM { 1330*11389SAlexander.Kolbasov@Sun.COM cpu_t *cp; 1331*11389SAlexander.Kolbasov@Sun.COM cu_cntr_info_t *cntr_info = ksp->ks_private; 1332*11389SAlexander.Kolbasov@Sun.COM struct cu_cpu_kstat *kstat = &cu_cpu_kstat; 1333*11389SAlexander.Kolbasov@Sun.COM pghw_t *pg; 1334*11389SAlexander.Kolbasov@Sun.COM cu_cntr_stats_t *stats; 1335*11389SAlexander.Kolbasov@Sun.COM 1336*11389SAlexander.Kolbasov@Sun.COM if (rw == KSTAT_WRITE) 1337*11389SAlexander.Kolbasov@Sun.COM return (EACCES); 1338*11389SAlexander.Kolbasov@Sun.COM 1339*11389SAlexander.Kolbasov@Sun.COM kpreempt_disable(); 1340*11389SAlexander.Kolbasov@Sun.COM 1341*11389SAlexander.Kolbasov@Sun.COM /* 1342*11389SAlexander.Kolbasov@Sun.COM * Update capacity and utilization statistics needed for CPU's PG (CPU) 1343*11389SAlexander.Kolbasov@Sun.COM * kstats 1344*11389SAlexander.Kolbasov@Sun.COM */ 1345*11389SAlexander.Kolbasov@Sun.COM cp = cntr_info->ci_cpu; 1346*11389SAlexander.Kolbasov@Sun.COM (void) cu_cpu_update(cp, B_TRUE); 1347*11389SAlexander.Kolbasov@Sun.COM 1348*11389SAlexander.Kolbasov@Sun.COM pg = cntr_info->ci_pg; 1349*11389SAlexander.Kolbasov@Sun.COM stats = cntr_info->ci_stats; 1350*11389SAlexander.Kolbasov@Sun.COM kstat->cu_cpu_id.value.ui32 = cp->cpu_id; 1351*11389SAlexander.Kolbasov@Sun.COM kstat->cu_generation.value.ui32 = cp->cpu_generation; 1352*11389SAlexander.Kolbasov@Sun.COM if (pg == NULL) 1353*11389SAlexander.Kolbasov@Sun.COM kstat->cu_pg_id.value.l = -1; 1354*11389SAlexander.Kolbasov@Sun.COM else 1355*11389SAlexander.Kolbasov@Sun.COM kstat->cu_pg_id.value.l = pg->pghw_pg.pg_id; 1356*11389SAlexander.Kolbasov@Sun.COM 1357*11389SAlexander.Kolbasov@Sun.COM kstat->cu_cpu_util.value.ui64 = stats->cs_value_total; 1358*11389SAlexander.Kolbasov@Sun.COM kstat->cu_cpu_rate.value.ui64 = stats->cs_rate; 1359*11389SAlexander.Kolbasov@Sun.COM kstat->cu_cpu_rate_max.value.ui64 = stats->cs_rate_max; 1360*11389SAlexander.Kolbasov@Sun.COM kstat->cu_cpu_time_running.value.ui64 = stats->cs_time_running; 1361*11389SAlexander.Kolbasov@Sun.COM kstat->cu_cpu_time_stopped.value.ui64 = stats->cs_time_stopped; 1362*11389SAlexander.Kolbasov@Sun.COM /* 1363*11389SAlexander.Kolbasov@Sun.COM * Counters are stopped now, so the cs_time_stopped was last 1364*11389SAlexander.Kolbasov@Sun.COM * updated at cs_time_start time. Add the time passed since then 1365*11389SAlexander.Kolbasov@Sun.COM * to the stopped time. 1366*11389SAlexander.Kolbasov@Sun.COM */ 1367*11389SAlexander.Kolbasov@Sun.COM if (!(cp->cpu_cu_info->cu_flag & CU_CPU_CNTRS_ON)) 1368*11389SAlexander.Kolbasov@Sun.COM kstat->cu_cpu_time_stopped.value.ui64 += 1369*11389SAlexander.Kolbasov@Sun.COM gethrtime() - stats->cs_time_start; 1370*11389SAlexander.Kolbasov@Sun.COM 1371*11389SAlexander.Kolbasov@Sun.COM kpreempt_enable(); 1372*11389SAlexander.Kolbasov@Sun.COM 1373*11389SAlexander.Kolbasov@Sun.COM return (0); 1374*11389SAlexander.Kolbasov@Sun.COM } 1375*11389SAlexander.Kolbasov@Sun.COM 1376*11389SAlexander.Kolbasov@Sun.COM /* 1377*11389SAlexander.Kolbasov@Sun.COM * Run specified function with specified argument on a given CPU and return 1378*11389SAlexander.Kolbasov@Sun.COM * whatever the function returns 1379*11389SAlexander.Kolbasov@Sun.COM */ 1380*11389SAlexander.Kolbasov@Sun.COM static int 1381*11389SAlexander.Kolbasov@Sun.COM cu_cpu_run(cpu_t *cp, cu_cpu_func_t func, uintptr_t arg) 1382*11389SAlexander.Kolbasov@Sun.COM { 1383*11389SAlexander.Kolbasov@Sun.COM int error = 0; 1384*11389SAlexander.Kolbasov@Sun.COM 1385*11389SAlexander.Kolbasov@Sun.COM /* 1386*11389SAlexander.Kolbasov@Sun.COM * cpu_call() will call func on the CPU specified with given argument 1387*11389SAlexander.Kolbasov@Sun.COM * and return func's return value in last argument 1388*11389SAlexander.Kolbasov@Sun.COM */ 1389*11389SAlexander.Kolbasov@Sun.COM cpu_call(cp, (cpu_call_func_t)func, arg, (uintptr_t)&error); 1390*11389SAlexander.Kolbasov@Sun.COM return (error); 1391*11389SAlexander.Kolbasov@Sun.COM } 1392*11389SAlexander.Kolbasov@Sun.COM 1393*11389SAlexander.Kolbasov@Sun.COM 1394*11389SAlexander.Kolbasov@Sun.COM /* 1395*11389SAlexander.Kolbasov@Sun.COM * Update counter statistics on a given CPU. 1396*11389SAlexander.Kolbasov@Sun.COM * 1397*11389SAlexander.Kolbasov@Sun.COM * If move_to argument is True, execute the function on the CPU specified 1398*11389SAlexander.Kolbasov@Sun.COM * Otherwise, assume that it is already runninng on the right CPU 1399*11389SAlexander.Kolbasov@Sun.COM * 1400*11389SAlexander.Kolbasov@Sun.COM * If move_to is specified, the caller should hold cpu_lock or have preemption 1401*11389SAlexander.Kolbasov@Sun.COM * disabled. Otherwise it is up to the caller to guarantee that things do not 1402*11389SAlexander.Kolbasov@Sun.COM * change in the process. 1403*11389SAlexander.Kolbasov@Sun.COM */ 1404*11389SAlexander.Kolbasov@Sun.COM int 1405*11389SAlexander.Kolbasov@Sun.COM cu_cpu_update(struct cpu *cp, boolean_t move_to) 1406*11389SAlexander.Kolbasov@Sun.COM { 1407*11389SAlexander.Kolbasov@Sun.COM int retval; 1408*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info_t *cu_cpu_info = cp->cpu_cu_info; 1409*11389SAlexander.Kolbasov@Sun.COM hrtime_t time_snap; 1410*11389SAlexander.Kolbasov@Sun.COM 1411*11389SAlexander.Kolbasov@Sun.COM ASSERT(!move_to || MUTEX_HELD(&cpu_lock) || curthread->t_preempt > 0); 1412*11389SAlexander.Kolbasov@Sun.COM 1413*11389SAlexander.Kolbasov@Sun.COM /* 1414*11389SAlexander.Kolbasov@Sun.COM * Nothing to do if counters are not programmed 1415*11389SAlexander.Kolbasov@Sun.COM */ 1416*11389SAlexander.Kolbasov@Sun.COM if (!(cu_flags & CU_FLAG_ON) || 1417*11389SAlexander.Kolbasov@Sun.COM (cu_cpu_info == NULL) || 1418*11389SAlexander.Kolbasov@Sun.COM !(cu_cpu_info->cu_flag & CU_CPU_CNTRS_ON)) 1419*11389SAlexander.Kolbasov@Sun.COM return (0); 1420*11389SAlexander.Kolbasov@Sun.COM 1421*11389SAlexander.Kolbasov@Sun.COM /* 1422*11389SAlexander.Kolbasov@Sun.COM * Don't update CPU statistics if it was updated recently 1423*11389SAlexander.Kolbasov@Sun.COM * and provide old results instead 1424*11389SAlexander.Kolbasov@Sun.COM */ 1425*11389SAlexander.Kolbasov@Sun.COM time_snap = gethrtime(); 1426*11389SAlexander.Kolbasov@Sun.COM if ((time_snap - cu_cpu_info->cu_sample_time) < cu_update_threshold) { 1427*11389SAlexander.Kolbasov@Sun.COM DTRACE_PROBE1(cu__drop__sample, cpu_t *, cp); 1428*11389SAlexander.Kolbasov@Sun.COM return (0); 1429*11389SAlexander.Kolbasov@Sun.COM } 1430*11389SAlexander.Kolbasov@Sun.COM 1431*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info->cu_sample_time = time_snap; 1432*11389SAlexander.Kolbasov@Sun.COM 1433*11389SAlexander.Kolbasov@Sun.COM /* 1434*11389SAlexander.Kolbasov@Sun.COM * CPC counter should be read on the CPU that is running the counter. We 1435*11389SAlexander.Kolbasov@Sun.COM * either have to move ourselves to the target CPU or insure that we 1436*11389SAlexander.Kolbasov@Sun.COM * already run there. 1437*11389SAlexander.Kolbasov@Sun.COM * 1438*11389SAlexander.Kolbasov@Sun.COM * We use cross-call to the target CPU to execute kcpc_read() and 1439*11389SAlexander.Kolbasov@Sun.COM * cu_cpu_update_stats() there. 1440*11389SAlexander.Kolbasov@Sun.COM */ 1441*11389SAlexander.Kolbasov@Sun.COM retval = 0; 1442*11389SAlexander.Kolbasov@Sun.COM if (move_to) 1443*11389SAlexander.Kolbasov@Sun.COM (void) cu_cpu_run(cp, (cu_cpu_func_t)kcpc_read, 1444*11389SAlexander.Kolbasov@Sun.COM (uintptr_t)cu_cpu_update_stats); 1445*11389SAlexander.Kolbasov@Sun.COM else { 1446*11389SAlexander.Kolbasov@Sun.COM retval = kcpc_read((kcpc_update_func_t)cu_cpu_update_stats); 1447*11389SAlexander.Kolbasov@Sun.COM /* 1448*11389SAlexander.Kolbasov@Sun.COM * Offset negative return value by -10 so we can distinguish it 1449*11389SAlexander.Kolbasov@Sun.COM * from error return values of this routine vs kcpc_read() 1450*11389SAlexander.Kolbasov@Sun.COM */ 1451*11389SAlexander.Kolbasov@Sun.COM if (retval < 0) 1452*11389SAlexander.Kolbasov@Sun.COM retval -= 10; 1453*11389SAlexander.Kolbasov@Sun.COM } 1454*11389SAlexander.Kolbasov@Sun.COM 1455*11389SAlexander.Kolbasov@Sun.COM return (retval); 1456*11389SAlexander.Kolbasov@Sun.COM } 1457*11389SAlexander.Kolbasov@Sun.COM 1458*11389SAlexander.Kolbasov@Sun.COM 1459*11389SAlexander.Kolbasov@Sun.COM /* 1460*11389SAlexander.Kolbasov@Sun.COM * Update CPU counter statistics for current CPU. 1461*11389SAlexander.Kolbasov@Sun.COM * This function may be called from a cross-call 1462*11389SAlexander.Kolbasov@Sun.COM */ 1463*11389SAlexander.Kolbasov@Sun.COM static int 1464*11389SAlexander.Kolbasov@Sun.COM cu_cpu_update_stats(cu_cntr_stats_t *stats, uint64_t cntr_value) 1465*11389SAlexander.Kolbasov@Sun.COM { 1466*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info_t *cu_cpu_info = CPU->cpu_cu_info; 1467*11389SAlexander.Kolbasov@Sun.COM uint_t flags; 1468*11389SAlexander.Kolbasov@Sun.COM uint64_t delta; 1469*11389SAlexander.Kolbasov@Sun.COM hrtime_t time_delta; 1470*11389SAlexander.Kolbasov@Sun.COM hrtime_t time_snap; 1471*11389SAlexander.Kolbasov@Sun.COM 1472*11389SAlexander.Kolbasov@Sun.COM if (stats == NULL) 1473*11389SAlexander.Kolbasov@Sun.COM return (-1); 1474*11389SAlexander.Kolbasov@Sun.COM 1475*11389SAlexander.Kolbasov@Sun.COM /* 1476*11389SAlexander.Kolbasov@Sun.COM * Nothing to do if counters are not programmed. This should not happen, 1477*11389SAlexander.Kolbasov@Sun.COM * but we check just in case. 1478*11389SAlexander.Kolbasov@Sun.COM */ 1479*11389SAlexander.Kolbasov@Sun.COM ASSERT(cu_flags & CU_FLAG_ON); 1480*11389SAlexander.Kolbasov@Sun.COM ASSERT(cu_cpu_info != NULL); 1481*11389SAlexander.Kolbasov@Sun.COM if (!(cu_flags & CU_FLAG_ON) || 1482*11389SAlexander.Kolbasov@Sun.COM (cu_cpu_info == NULL)) 1483*11389SAlexander.Kolbasov@Sun.COM return (-2); 1484*11389SAlexander.Kolbasov@Sun.COM 1485*11389SAlexander.Kolbasov@Sun.COM flags = cu_cpu_info->cu_flag; 1486*11389SAlexander.Kolbasov@Sun.COM ASSERT(flags & CU_CPU_CNTRS_ON); 1487*11389SAlexander.Kolbasov@Sun.COM if (!(flags & CU_CPU_CNTRS_ON)) 1488*11389SAlexander.Kolbasov@Sun.COM return (-2); 1489*11389SAlexander.Kolbasov@Sun.COM 1490*11389SAlexander.Kolbasov@Sun.COM /* 1491*11389SAlexander.Kolbasov@Sun.COM * Take snapshot of high resolution timer 1492*11389SAlexander.Kolbasov@Sun.COM */ 1493*11389SAlexander.Kolbasov@Sun.COM time_snap = gethrtime(); 1494*11389SAlexander.Kolbasov@Sun.COM 1495*11389SAlexander.Kolbasov@Sun.COM /* 1496*11389SAlexander.Kolbasov@Sun.COM * CU counters have just been programmed. We cannot assume that the new 1497*11389SAlexander.Kolbasov@Sun.COM * cntr_value continues from where we left off, so use the cntr_value as 1498*11389SAlexander.Kolbasov@Sun.COM * the new initial value. 1499*11389SAlexander.Kolbasov@Sun.COM */ 1500*11389SAlexander.Kolbasov@Sun.COM if (flags & CU_CPU_CNTRS_OFF_ON) 1501*11389SAlexander.Kolbasov@Sun.COM stats->cs_value_start = cntr_value; 1502*11389SAlexander.Kolbasov@Sun.COM 1503*11389SAlexander.Kolbasov@Sun.COM /* 1504*11389SAlexander.Kolbasov@Sun.COM * Calculate delta in counter values between start of sampling period 1505*11389SAlexander.Kolbasov@Sun.COM * and now 1506*11389SAlexander.Kolbasov@Sun.COM */ 1507*11389SAlexander.Kolbasov@Sun.COM delta = cntr_value - stats->cs_value_start; 1508*11389SAlexander.Kolbasov@Sun.COM 1509*11389SAlexander.Kolbasov@Sun.COM /* 1510*11389SAlexander.Kolbasov@Sun.COM * Calculate time between start of sampling period and now 1511*11389SAlexander.Kolbasov@Sun.COM */ 1512*11389SAlexander.Kolbasov@Sun.COM time_delta = stats->cs_time_start ? 1513*11389SAlexander.Kolbasov@Sun.COM time_snap - stats->cs_time_start : 1514*11389SAlexander.Kolbasov@Sun.COM 0; 1515*11389SAlexander.Kolbasov@Sun.COM stats->cs_time_start = time_snap; 1516*11389SAlexander.Kolbasov@Sun.COM stats->cs_value_start = cntr_value; 1517*11389SAlexander.Kolbasov@Sun.COM 1518*11389SAlexander.Kolbasov@Sun.COM if (time_delta > 0) { /* wrap shouldn't happen */ 1519*11389SAlexander.Kolbasov@Sun.COM /* 1520*11389SAlexander.Kolbasov@Sun.COM * Update either running or stopped time based on the transition 1521*11389SAlexander.Kolbasov@Sun.COM * state 1522*11389SAlexander.Kolbasov@Sun.COM */ 1523*11389SAlexander.Kolbasov@Sun.COM if (flags & CU_CPU_CNTRS_OFF_ON) 1524*11389SAlexander.Kolbasov@Sun.COM stats->cs_time_stopped += time_delta; 1525*11389SAlexander.Kolbasov@Sun.COM else 1526*11389SAlexander.Kolbasov@Sun.COM stats->cs_time_running += time_delta; 1527*11389SAlexander.Kolbasov@Sun.COM } 1528*11389SAlexander.Kolbasov@Sun.COM 1529*11389SAlexander.Kolbasov@Sun.COM /* 1530*11389SAlexander.Kolbasov@Sun.COM * Update rest of counter statistics if counter value didn't wrap 1531*11389SAlexander.Kolbasov@Sun.COM */ 1532*11389SAlexander.Kolbasov@Sun.COM if (delta > 0) { 1533*11389SAlexander.Kolbasov@Sun.COM /* 1534*11389SAlexander.Kolbasov@Sun.COM * Update utilization rate if the interval between samples is 1535*11389SAlexander.Kolbasov@Sun.COM * sufficient. 1536*11389SAlexander.Kolbasov@Sun.COM */ 1537*11389SAlexander.Kolbasov@Sun.COM ASSERT(cu_sample_interval_min > CU_SCALE); 1538*11389SAlexander.Kolbasov@Sun.COM if (time_delta > cu_sample_interval_min) 1539*11389SAlexander.Kolbasov@Sun.COM stats->cs_rate = CU_RATE(delta, time_delta); 1540*11389SAlexander.Kolbasov@Sun.COM if (stats->cs_rate_max < stats->cs_rate) 1541*11389SAlexander.Kolbasov@Sun.COM stats->cs_rate_max = stats->cs_rate; 1542*11389SAlexander.Kolbasov@Sun.COM 1543*11389SAlexander.Kolbasov@Sun.COM stats->cs_value_last = delta; 1544*11389SAlexander.Kolbasov@Sun.COM stats->cs_value_total += delta; 1545*11389SAlexander.Kolbasov@Sun.COM } 1546*11389SAlexander.Kolbasov@Sun.COM 1547*11389SAlexander.Kolbasov@Sun.COM return (0); 1548*11389SAlexander.Kolbasov@Sun.COM } 1549*11389SAlexander.Kolbasov@Sun.COM 1550*11389SAlexander.Kolbasov@Sun.COM /* 1551*11389SAlexander.Kolbasov@Sun.COM * Update CMT PG utilization data. 1552*11389SAlexander.Kolbasov@Sun.COM * 1553*11389SAlexander.Kolbasov@Sun.COM * This routine computes the running total utilization and times for the 1554*11389SAlexander.Kolbasov@Sun.COM * specified PG by adding up the total utilization and counter running and 1555*11389SAlexander.Kolbasov@Sun.COM * stopped times of all CPUs in the PG and calculates the utilization rate and 1556*11389SAlexander.Kolbasov@Sun.COM * maximum rate for all CPUs in the PG. 1557*11389SAlexander.Kolbasov@Sun.COM */ 1558*11389SAlexander.Kolbasov@Sun.COM void 1559*11389SAlexander.Kolbasov@Sun.COM cu_pg_update(pghw_t *pg) 1560*11389SAlexander.Kolbasov@Sun.COM { 1561*11389SAlexander.Kolbasov@Sun.COM pg_cpu_itr_t cpu_iter; 1562*11389SAlexander.Kolbasov@Sun.COM pghw_type_t pg_hwtype; 1563*11389SAlexander.Kolbasov@Sun.COM cpu_t *cpu; 1564*11389SAlexander.Kolbasov@Sun.COM pghw_util_t *hw_util = &pg->pghw_stats; 1565*11389SAlexander.Kolbasov@Sun.COM uint64_t old_utilization = hw_util->pghw_util; 1566*11389SAlexander.Kolbasov@Sun.COM hrtime_t now; 1567*11389SAlexander.Kolbasov@Sun.COM hrtime_t time_delta; 1568*11389SAlexander.Kolbasov@Sun.COM uint64_t utilization_delta; 1569*11389SAlexander.Kolbasov@Sun.COM 1570*11389SAlexander.Kolbasov@Sun.COM ASSERT(MUTEX_HELD(&cpu_lock)); 1571*11389SAlexander.Kolbasov@Sun.COM 1572*11389SAlexander.Kolbasov@Sun.COM now = gethrtime(); 1573*11389SAlexander.Kolbasov@Sun.COM 1574*11389SAlexander.Kolbasov@Sun.COM pg_hwtype = pg->pghw_hw; 1575*11389SAlexander.Kolbasov@Sun.COM 1576*11389SAlexander.Kolbasov@Sun.COM /* 1577*11389SAlexander.Kolbasov@Sun.COM * Initialize running total utilization and times for PG to 0 1578*11389SAlexander.Kolbasov@Sun.COM */ 1579*11389SAlexander.Kolbasov@Sun.COM hw_util->pghw_util = 0; 1580*11389SAlexander.Kolbasov@Sun.COM hw_util->pghw_time_running = 0; 1581*11389SAlexander.Kolbasov@Sun.COM hw_util->pghw_time_stopped = 0; 1582*11389SAlexander.Kolbasov@Sun.COM 1583*11389SAlexander.Kolbasov@Sun.COM /* 1584*11389SAlexander.Kolbasov@Sun.COM * Iterate over all CPUs in the PG and aggregate utilization, running 1585*11389SAlexander.Kolbasov@Sun.COM * time and stopped time. 1586*11389SAlexander.Kolbasov@Sun.COM */ 1587*11389SAlexander.Kolbasov@Sun.COM PG_CPU_ITR_INIT(pg, cpu_iter); 1588*11389SAlexander.Kolbasov@Sun.COM while ((cpu = pg_cpu_next(&cpu_iter)) != NULL) { 1589*11389SAlexander.Kolbasov@Sun.COM cu_cpu_info_t *cu_cpu_info = cpu->cpu_cu_info; 1590*11389SAlexander.Kolbasov@Sun.COM cu_cntr_info_t *cntr_info; 1591*11389SAlexander.Kolbasov@Sun.COM cu_cntr_stats_t *stats; 1592*11389SAlexander.Kolbasov@Sun.COM 1593*11389SAlexander.Kolbasov@Sun.COM if (cu_cpu_info == NULL) 1594*11389SAlexander.Kolbasov@Sun.COM continue; 1595*11389SAlexander.Kolbasov@Sun.COM 1596*11389SAlexander.Kolbasov@Sun.COM /* 1597*11389SAlexander.Kolbasov@Sun.COM * Update utilization data for the CPU and then 1598*11389SAlexander.Kolbasov@Sun.COM * aggregate per CPU running totals for PG 1599*11389SAlexander.Kolbasov@Sun.COM */ 1600*11389SAlexander.Kolbasov@Sun.COM (void) cu_cpu_update(cpu, B_TRUE); 1601*11389SAlexander.Kolbasov@Sun.COM cntr_info = cu_cpu_info->cu_cntr_info[pg_hwtype]; 1602*11389SAlexander.Kolbasov@Sun.COM 1603*11389SAlexander.Kolbasov@Sun.COM if (cntr_info == NULL || (stats = cntr_info->ci_stats) == NULL) 1604*11389SAlexander.Kolbasov@Sun.COM continue; 1605*11389SAlexander.Kolbasov@Sun.COM 1606*11389SAlexander.Kolbasov@Sun.COM hw_util->pghw_util += stats->cs_value_total; 1607*11389SAlexander.Kolbasov@Sun.COM hw_util->pghw_time_running += stats->cs_time_running; 1608*11389SAlexander.Kolbasov@Sun.COM hw_util->pghw_time_stopped += stats->cs_time_stopped; 1609*11389SAlexander.Kolbasov@Sun.COM 1610*11389SAlexander.Kolbasov@Sun.COM /* 1611*11389SAlexander.Kolbasov@Sun.COM * If counters are stopped now, the pg_time_stopped was last 1612*11389SAlexander.Kolbasov@Sun.COM * updated at cs_time_start time. Add the time passed since then 1613*11389SAlexander.Kolbasov@Sun.COM * to the stopped time. 1614*11389SAlexander.Kolbasov@Sun.COM */ 1615*11389SAlexander.Kolbasov@Sun.COM if (!(cu_cpu_info->cu_flag & CU_CPU_CNTRS_ON)) 1616*11389SAlexander.Kolbasov@Sun.COM hw_util->pghw_time_stopped += 1617*11389SAlexander.Kolbasov@Sun.COM now - stats->cs_time_start; 1618*11389SAlexander.Kolbasov@Sun.COM } 1619*11389SAlexander.Kolbasov@Sun.COM 1620*11389SAlexander.Kolbasov@Sun.COM /* 1621*11389SAlexander.Kolbasov@Sun.COM * Compute per PG instruction rate and maximum rate 1622*11389SAlexander.Kolbasov@Sun.COM */ 1623*11389SAlexander.Kolbasov@Sun.COM time_delta = now - hw_util->pghw_time_stamp; 1624*11389SAlexander.Kolbasov@Sun.COM hw_util->pghw_time_stamp = now; 1625*11389SAlexander.Kolbasov@Sun.COM 1626*11389SAlexander.Kolbasov@Sun.COM if (old_utilization == 0) 1627*11389SAlexander.Kolbasov@Sun.COM return; 1628*11389SAlexander.Kolbasov@Sun.COM 1629*11389SAlexander.Kolbasov@Sun.COM /* 1630*11389SAlexander.Kolbasov@Sun.COM * Calculate change in utilization over sampling period and set this to 1631*11389SAlexander.Kolbasov@Sun.COM * 0 if the delta would be 0 or negative which may happen if any CPUs go 1632*11389SAlexander.Kolbasov@Sun.COM * offline during the sampling period 1633*11389SAlexander.Kolbasov@Sun.COM */ 1634*11389SAlexander.Kolbasov@Sun.COM if (hw_util->pghw_util > old_utilization) 1635*11389SAlexander.Kolbasov@Sun.COM utilization_delta = hw_util->pghw_util - old_utilization; 1636*11389SAlexander.Kolbasov@Sun.COM else 1637*11389SAlexander.Kolbasov@Sun.COM utilization_delta = 0; 1638*11389SAlexander.Kolbasov@Sun.COM 1639*11389SAlexander.Kolbasov@Sun.COM /* 1640*11389SAlexander.Kolbasov@Sun.COM * Update utilization rate if the interval between samples is 1641*11389SAlexander.Kolbasov@Sun.COM * sufficient. 1642*11389SAlexander.Kolbasov@Sun.COM */ 1643*11389SAlexander.Kolbasov@Sun.COM ASSERT(cu_sample_interval_min > CU_SCALE); 1644*11389SAlexander.Kolbasov@Sun.COM if (time_delta > CU_SAMPLE_INTERVAL_MIN) 1645*11389SAlexander.Kolbasov@Sun.COM hw_util->pghw_rate = CU_RATE(utilization_delta, time_delta); 1646*11389SAlexander.Kolbasov@Sun.COM 1647*11389SAlexander.Kolbasov@Sun.COM /* 1648*11389SAlexander.Kolbasov@Sun.COM * Update the maximum observed rate 1649*11389SAlexander.Kolbasov@Sun.COM */ 1650*11389SAlexander.Kolbasov@Sun.COM if (hw_util->pghw_rate_max < hw_util->pghw_rate) 1651*11389SAlexander.Kolbasov@Sun.COM hw_util->pghw_rate_max = hw_util->pghw_rate; 1652*11389SAlexander.Kolbasov@Sun.COM } 1653