/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
 * or http://www.opensolaris.org/os/licensing.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 *
 * CDDL HEADER END
 */

/*
 * Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.
 */

/*
 * MAC Services Module
 *
 * The GLDv3 framework locking - The MAC layer
 * --------------------------------------------
 *
 * The MAC layer is central to the GLD framework and can provide the locking
 * framework needed for itself and for the use of MAC clients. MAC end points
 * are fairly disjoint and don't share a lot of state.
 * So a coarse grained multi-threading scheme is to single thread all
 * create/modify/delete or set type of control operations on a per mac end
 * point while allowing data threads concurrently.
 *
 * Control operations (set) that modify a mac end point are always serialized on
 * a per mac end point basis. We have at most 1 such thread per mac end point
 * at a time.
 *
 * All other operations that are not serialized are essentially multi-threaded.
 * For example a control operation (get) like getting statistics which may not
 * care about reading values atomically or data threads sending or receiving
 * data. Mostly these types of operations don't modify the control state. Any
 * state these operations care about is protected using traditional locks.
 *
 * The perimeter only serializes serial operations. It does not imply there
 * aren't any other concurrent operations. However a serialized operation may
 * sometimes need to make sure it is the only thread. In this case it needs
 * to use reference counting mechanisms to cv_wait until any current data
 * threads are done.
 *
 * The mac layer itself does not hold any locks across a call to another layer.
 * The perimeter is however held across a down call to the driver to make the
 * whole control operation atomic with respect to other control operations.
 * Also the data path and get type control operations may proceed concurrently.
 * These operations synchronize with the single serial operation on a given mac
 * end point using regular locks.
 * The perimeter ensures that conflicting
 * operations like say a mac_multicast_add and a mac_multicast_remove on the
 * same mac end point don't interfere with each other and also ensures that the
 * changes in the mac layer and the call to the underlying driver to say add a
 * multicast address are done atomically without interference from a thread
 * trying to delete the same address.
 *
 * For example, consider
 * mac_multicast_add()
 * {
 *	mac_perimeter_enter();	serialize all control operations
 *
 *	grab list lock		protect against access by data threads
 *	add to list
 *	drop list lock
 *
 *	call driver's mi_multicst
 *
 *	mac_perimeter_exit();
 * }
 *
 * To lessen the number of serialization locks and simplify the lock hierarchy,
 * we serialize all the control operations on a per mac end point by using a
 * single serialization lock called the perimeter. We allow recursive entry into
 * the perimeter to facilitate use of this mechanism by both the mac client and
 * the MAC layer itself.
 *
 * MAC client means an entity that does an operation on a mac handle
 * obtained from a mac_open/mac_client_open. Similarly MAC driver means
 * an entity that does an operation on a mac handle obtained from a
 * mac_register. An entity could be both client and driver but on different
 * handles (e.g. aggr) and should only make the corresponding mac interface
 * calls, i.e.
 * mac driver interface or mac client interface as appropriate for that
 * mac handle.
 *
 * General rules.
 * -------------
 *
 * R1. The lock order of upcall threads is naturally opposite to downcall
 * threads. Hence upcalls must not hold any locks across layers for fear of
 * recursive lock enter and lock order violation. This applies to all layers.
 *
 * R2. The perimeter is just another lock. Since it is held in the down
 * direction, acquiring the perimeter in an upcall is prohibited as it would
 * cause a deadlock. This applies to all layers.
 *
 * Note that upcalls that need to grab the mac perimeter (for example
 * mac_notify upcalls) can still achieve that by posting the request to a
 * thread, which can then grab all the required perimeters and locks in the
 * right global order. Note that in the above example the mac layer itself
 * won't grab the mac perimeter in the mac_notify upcall, instead the upcall
 * to the client must do that. Please see the aggr code for an example.
 *
 * MAC client rules
 * ----------------
 *
 * R3. A MAC client may use the MAC provided perimeter facility to serialize
 * control operations on a per mac end point. It does this by acquiring
 * and holding the perimeter across a sequence of calls to the mac layer.
 * This ensures atomicity across the entire block of mac calls. In this
 * model the MAC client must not hold any client locks across the calls to
 * the mac layer. This model is the preferred solution.
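 *
 * As a rough sketch only (pseudocode in the style of the multicast add
 * example earlier in this comment; client_set_addrs is a hypothetical
 * client routine, not an existing interface), the R3 model might look like
 *
 * client_set_addrs()
 * {
 *	no client locks are held at this point
 *
 *	mac_perim_enter		serialize against other control operations
 *
 *	mac_unicast_add		the mac layer may re-enter the perimeter,
 *	mac_multicast_add	which is allowed since entry is recursive
 *
 *	mac_perim_exit
 * }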
 *
 * R4. However if a MAC client has a lot of global state across all mac end
 * points the per mac end point serialization may not be sufficient. In this
 * case the client may choose to use global locks or use its own serialization.
 * To avoid deadlocks, these client layer locks held across the mac calls
 * in the control path must never be acquired by the data path for the reason
 * mentioned below.
 *
 * (Assume that a control operation that holds a client lock blocks in the
 * mac layer waiting for upcall reference counts to drop to zero. If an upcall
 * data thread that holds this reference count, tries to acquire the same
 * client lock subsequently it will deadlock).
 *
 * A MAC client may follow either the R3 model or the R4 model, but can't
 * mix both. In the former, the hierarchy is Perim -> client locks, but in
 * the latter it is client locks -> Perim.
 *
 * R5. MAC clients must make MAC calls (excluding data calls) in a cv_wait'able
 * context since they may block while trying to acquire the perimeter.
 * In addition some calls may block waiting for upcall refcnts to come down to
 * zero.
 *
 * R6. MAC clients must make sure that they are single threaded and all threads
 * from the top (in particular data threads) have finished before calling
 * mac_client_close. The MAC framework does not track the number of client
 * threads using the mac client handle. Also mac clients must make sure
 * they have undone all the control operations before calling mac_client_close.
 * For example mac_unicast_remove/mac_multicast_remove to undo the corresponding
 * mac_unicast_add/mac_multicast_add.
 *
 * MAC framework rules
 * -------------------
 *
 * R7. The mac layer itself must not hold any mac layer locks (except the mac
 * perimeter) across a call to any other layer from the mac layer. The call to
 * any other layer could be via mi_* entry points, classifier entry points into
 * the driver or via upcall pointers into layers above. The mac perimeter may
 * be acquired or held only in the down direction, e.g. when calling into
 * a mi_* driver entry point to provide atomicity of the operation.
 *
 * R8. Since it is not guaranteed (see R14) that drivers won't hold locks across
 * mac driver interfaces, the MAC layer must provide a cut out for control
 * interfaces like upcall notifications and start them in a separate thread.
 *
 * R9. Note that locking order also implies a plumbing order. For example
 * VNICs are allowed to be created over aggrs, but not vice-versa. An attempt
 * to plumb in any other order must be failed at mac_open time, otherwise it
 * could lead to deadlocks due to inverse locking order.
 *
 * R10. MAC driver interfaces must not block since the driver could call them
 * in interrupt context.
 *
 * R11. Walkers must preferably not hold any locks while calling walker
 * callbacks. Instead these can operate on reference counts.
 * In simple
 * callbacks it may be ok to hold a lock and call the callbacks, but this is
 * harder to maintain in the general case of arbitrary callbacks.
 *
 * R12. The MAC layer must protect upcall notification callbacks using reference
 * counts rather than holding locks across the callbacks.
 *
 * R13. Given the variety of drivers, it is preferable if the MAC layer can make
 * sure that any pointers (such as mac ring pointers) it passes to the driver
 * remain valid until mac unregister time. Currently the mac layer achieves
 * this by using generation numbers for rings and freeing the mac rings only
 * at unregister time. The MAC layer must provide a layer of indirection and
 * must not expose underlying driver rings or driver data structures/pointers
 * directly to MAC clients.
 *
 * MAC driver rules
 * ----------------
 *
 * R14. It would be preferable if MAC drivers don't hold any locks across any
 * mac call. However at a minimum they must not hold any locks across data
 * upcalls. They must also make sure that all references to mac data structures
 * are cleaned up and that it is single threaded at mac_unregister time.
 *
 * R15. MAC driver interfaces don't block and so the action may be done
 * asynchronously in a separate thread as for example handling notifications.
 * The driver must not assume that the action is complete when the call
 * returns.
 *
 * R16. Drivers must maintain a generation number per Rx ring, and pass it
 * back to mac_rx_ring(). They are expected to increment the generation
 * number whenever the ring's stop routine is invoked.
 * See comments in mac_rx_ring().
 *
 * R17. Similarly mi_stop is another synchronization point and the driver must
 * ensure that all upcalls are done and there won't be any future upcall
 * before returning from mi_stop.
 *
 * R18. The driver may assume that all set/modify control operations via
 * the mi_* entry points are single threaded on a per mac end point.
 *
 * Lock and Perimeter hierarchy scenarios
 * ---------------------------------------
 *
 * i_mac_impl_lock -> mi_rw_lock -> srs_lock -> s_ring_lock[i_mac_tx_srs_notify]
 *
 * ft_lock -> fe_lock [mac_flow_lookup]
 *
 * mi_rw_lock -> fe_lock [mac_bcast_send]
 *
 * srs_lock -> mac_bw_lock [mac_rx_srs_drain_bw]
 *
 * cpu_lock -> mac_srs_g_lock -> srs_lock -> s_ring_lock [mac_walk_srs_and_bind]
 *
 * i_dls_devnet_lock -> mac layer locks [dls_devnet_rename]
 *
 * Perimeters are ordered P1 -> P2 -> P3 from top to bottom in order of mac
 * client to driver. In the case of clients that explicitly use the mac provided
 * perimeter mechanism for their serialization, the hierarchy is
 * Perimeter -> mac layer locks, since the client never holds any locks across
 * the mac calls.
 * In the case of clients that use their own locks the hierarchy
 * is Client locks -> Mac Perim -> Mac layer locks. The client never explicitly
 * calls mac_perim_enter/exit in this case.
 *
 * Subflow creation rules
 * ---------------------------
 * o In case of a user specified cpulist present on underlying link and flows,
 *   the flow's cpulist must be a subset of the underlying link.
 * o In case of a user specified fanout mode present on link and flow, the
 *   subflow fanout count has to be less than or equal to that of the
 *   underlying link. The cpu-bindings for the subflows will be a subset of
 *   the underlying link.
 * o In case no cpulist is specified on both underlying link and flow, the
 *   underlying link relies on a MAC tunable to provide out of box fanout.
 *   The subflow will have no cpulist (the subflow will be unbound).
 * o In case no cpulist is specified on the underlying link, a subflow can
 *   carry either a user-specified cpulist or fanout count. The cpu-bindings
 *   for the subflow will not adhere to the restriction that they need to be
 *   a subset of the underlying link.
 * o In case the underlying link is carrying either a user specified
 *   cpulist or fanout mode and the subflow is unspecified, the subflow will be
 *   created unbound.
 * o While creating unbound subflows, bandwidth mode changes attempt to
 *   figure a right fanout count. In such cases the fanout count will override
 *   the unbound cpu-binding behavior.
 * o In addition to this, while cycling between flow and link properties, we
 *   impose a restriction that if a link property has a subflow with
 *   user-specified attributes, we will not allow changing the link property.
 *   The administrator needs to reset all the user specified properties for the
 *   subflows before attempting a link property change.
 * Some of the above rules can be overridden by specifying additional command
 * line options while creating or modifying link or subflow properties.
 */

#include <sys/types.h>
#include <sys/conf.h>
#include <sys/id_space.h>
#include <sys/esunddi.h>
#include <sys/stat.h>
#include <sys/mkdev.h>
#include <sys/stream.h>
#include <sys/strsun.h>
#include <sys/strsubr.h>
#include <sys/dlpi.h>
#include <sys/modhash.h>
#include <sys/mac_provider.h>
#include <sys/mac_client_impl.h>
#include <sys/mac_soft_ring.h>
#include <sys/mac_impl.h>
#include <sys/mac.h>
#include <sys/dls.h>
#include <sys/dld.h>
#include <sys/modctl.h>
#include <sys/fs/dv_node.h>
#include <sys/thread.h>
#include <sys/proc.h>
#include <sys/callb.h>
#include <sys/cpuvar.h>
#include <sys/atomic.h>
#include <sys/bitmap.h>
#include <sys/sdt.h>
#include <sys/mac_flow.h>
#include <sys/ddi_intr_impl.h>
#include <sys/disp.h>
#include <sys/sdt.h>
#include <sys/vnic.h>
#include <sys/vnic_impl.h>
#include <sys/vlan.h>
#include <inet/ip.h>
#include <inet/ip6.h>
#include <sys/exacct.h>
#include <sys/exacct_impl.h>
#include <inet/nd.h>
#include <sys/ethernet.h>

#define	IMPL_HASHSZ	67	/* prime */

kmem_cache_t	*i_mac_impl_cachep;
mod_hash_t	*i_mac_impl_hash;
krwlock_t	i_mac_impl_lock;
uint_t		i_mac_impl_count;
static kmem_cache_t	*mac_ring_cache;
static id_space_t	*minor_ids;
static uint32_t		minor_count;

/*
 * Logging stuff. Perhaps mac_logging_interval could be broken into
 * mac_flow_log_interval and mac_link_log_interval if we want to be
 * able to schedule them differently.
 */
uint_t		mac_logging_interval;
boolean_t	mac_flow_log_enable;
boolean_t	mac_link_log_enable;
timeout_id_t	mac_logging_timer;

/* for debugging, see MAC_DBG_PRT() in mac_impl.h */
int mac_dbg = 0;

#define	MACTYPE_KMODDIR	"mac"
#define	MACTYPE_HASHSZ	67
static mod_hash_t	*i_mactype_hash;
/*
 * i_mactype_lock synchronizes threads that obtain references to mactype_t
 * structures through i_mactype_getplugin().
 */
static kmutex_t		i_mactype_lock;

/*
 * mac_tx_percpu_cnt
 *
 * Number of per cpu locks per mac_client_impl_t.
 * Used by the transmit side
 * in mac_tx to reduce lock contention. This is sized at boot time in mac_init.
 * mac_tx_percpu_cnt_max is settable in /etc/system and must be a power of 2.
 * Per cpu locks may be disabled by setting mac_tx_percpu_cnt_max to 1.
 */
int mac_tx_percpu_cnt;
int mac_tx_percpu_cnt_max = 128;

/*
 * Call back functions for the bridge module. These are guaranteed to be valid
 * when holding a reference on a link or when holding mip->mi_bridge_lock and
 * mi_bridge_link is non-NULL.
 */
mac_bridge_tx_t mac_bridge_tx_cb;
mac_bridge_rx_t mac_bridge_rx_cb;
mac_bridge_ref_t mac_bridge_ref_cb;
mac_bridge_ls_t mac_bridge_ls_cb;

static int i_mac_constructor(void *, void *, int);
static void i_mac_destructor(void *, void *);
static int i_mac_ring_ctor(void *, void *, int);
static void i_mac_ring_dtor(void *, void *);
static mblk_t *mac_rx_classify(mac_impl_t *, mac_resource_handle_t, mblk_t *);
void mac_tx_client_flush(mac_client_impl_t *);
void mac_tx_client_block(mac_client_impl_t *);
static void mac_rx_ring_quiesce(mac_ring_t *, uint_t);
static int mac_start_group_and_rings(mac_group_t *);
static void mac_stop_group_and_rings(mac_group_t *);

/*
 * Module initialization functions.
 */

void
mac_init(void)
{
	mac_tx_percpu_cnt = ((boot_max_ncpus == -1) ? max_ncpus :
	    boot_max_ncpus);

	/* Upper bound is mac_tx_percpu_cnt_max */
	if (mac_tx_percpu_cnt > mac_tx_percpu_cnt_max)
		mac_tx_percpu_cnt = mac_tx_percpu_cnt_max;

	if (mac_tx_percpu_cnt < 1) {
		/* Someone set mac_tx_percpu_cnt_max to 0 or less */
		mac_tx_percpu_cnt = 1;
	}

	ASSERT(mac_tx_percpu_cnt >= 1);
	mac_tx_percpu_cnt = (1 << highbit(mac_tx_percpu_cnt - 1));
	/*
	 * Make it of the form 2**N - 1 in the range
	 * [0 .. mac_tx_percpu_cnt_max - 1]
	 */
	mac_tx_percpu_cnt--;

	i_mac_impl_cachep = kmem_cache_create("mac_impl_cache",
	    sizeof (mac_impl_t), 0, i_mac_constructor, i_mac_destructor,
	    NULL, NULL, NULL, 0);
	ASSERT(i_mac_impl_cachep != NULL);

	mac_ring_cache = kmem_cache_create("mac_ring_cache",
	    sizeof (mac_ring_t), 0, i_mac_ring_ctor, i_mac_ring_dtor, NULL,
	    NULL, NULL, 0);
	ASSERT(mac_ring_cache != NULL);

	i_mac_impl_hash = mod_hash_create_extended("mac_impl_hash",
	    IMPL_HASHSZ, mod_hash_null_keydtor, mod_hash_null_valdtor,
	    mod_hash_bystr, NULL, mod_hash_strkey_cmp, KM_SLEEP);
	rw_init(&i_mac_impl_lock, NULL, RW_DEFAULT, NULL);

	mac_flow_init();
	mac_soft_ring_init();
	mac_bcast_init();
	mac_client_init();

	i_mac_impl_count = 0;

	i_mactype_hash = mod_hash_create_extended("mactype_hash",
	    MACTYPE_HASHSZ,
	    mod_hash_null_keydtor, mod_hash_null_valdtor,
	    mod_hash_bystr, NULL, mod_hash_strkey_cmp, KM_SLEEP);

	/*
	 * Allocate an id space to manage minor numbers. The range of the
	 * space will be from MAC_MAX_MINOR+1 to MAC_PRIVATE_MINOR-1. This
	 * leaves half of the 32-bit minors available for driver private use.
	 */
	minor_ids = id_space_create("mac_minor_ids", MAC_MAX_MINOR+1,
	    MAC_PRIVATE_MINOR-1);
	ASSERT(minor_ids != NULL);
	minor_count = 0;

	/* Let's default to 20 seconds */
	mac_logging_interval = 20;
	mac_flow_log_enable = B_FALSE;
	mac_link_log_enable = B_FALSE;
	mac_logging_timer = 0;
}

int
mac_fini(void)
{
	if (i_mac_impl_count > 0 || minor_count > 0)
		return (EBUSY);

	id_space_destroy(minor_ids);
	mac_flow_fini();

	mod_hash_destroy_hash(i_mac_impl_hash);
	rw_destroy(&i_mac_impl_lock);

	mac_client_fini();
	kmem_cache_destroy(mac_ring_cache);

	mod_hash_destroy_hash(i_mactype_hash);
	mac_soft_ring_finish();
	return (0);
}

/*
 * Initialize a GLDv3 driver's device
 * ops. A driver that manages its own ops
 * (e.g. softmac) may pass in a NULL ops argument.
 */
void
mac_init_ops(struct dev_ops *ops, const char *name)
{
	major_t major = ddi_name_to_major((char *)name);

	/*
	 * By returning on error below, we are not letting the driver continue
	 * in an undefined context. The mac_register() function will fail if
	 * DN_GLDV3_DRIVER isn't set.
	 */
	if (major == DDI_MAJOR_T_NONE)
		return;
	LOCK_DEV_OPS(&devnamesp[major].dn_lock);
	devnamesp[major].dn_flags |= (DN_GLDV3_DRIVER | DN_NETWORK_DRIVER);
	UNLOCK_DEV_OPS(&devnamesp[major].dn_lock);
	if (ops != NULL)
		dld_init_ops(ops, name);
}

void
mac_fini_ops(struct dev_ops *ops)
{
	dld_fini_ops(ops);
}

/*ARGSUSED*/
static int
i_mac_constructor(void *buf, void *arg, int kmflag)
{
	mac_impl_t	*mip = buf;

	bzero(buf, sizeof (mac_impl_t));

	mip->mi_linkstate = LINK_STATE_UNKNOWN;

	mutex_init(&mip->mi_lock, NULL, MUTEX_DRIVER, NULL);
	rw_init(&mip->mi_rw_lock, NULL, RW_DRIVER, NULL);
	mutex_init(&mip->mi_notify_lock, NULL, MUTEX_DRIVER, NULL);
	mutex_init(&mip->mi_promisc_lock, NULL, MUTEX_DRIVER, NULL);
	mutex_init(&mip->mi_ring_lock, NULL, MUTEX_DEFAULT, NULL);

	mip->mi_notify_cb_info.mcbi_lockp = &mip->mi_notify_lock;
	cv_init(&mip->mi_notify_cb_info.mcbi_cv, NULL, CV_DRIVER, NULL);
	mip->mi_promisc_cb_info.mcbi_lockp = &mip->mi_promisc_lock;
	cv_init(&mip->mi_promisc_cb_info.mcbi_cv, NULL, CV_DRIVER, NULL);

	mutex_init(&mip->mi_bridge_lock, NULL, MUTEX_DEFAULT, NULL);

	return (0);
}

/*ARGSUSED*/
static void
i_mac_destructor(void *buf, void *arg)
{
	mac_impl_t	*mip = buf;
	mac_cb_info_t	*mcbi;

	ASSERT(mip->mi_ref == 0);
	ASSERT(mip->mi_active == 0);
	ASSERT(mip->mi_linkstate == LINK_STATE_UNKNOWN);
	ASSERT(mip->mi_devpromisc == 0);
	ASSERT(mip->mi_ksp == NULL);
	ASSERT(mip->mi_kstat_count == 0);
	ASSERT(mip->mi_nclients == 0);
	ASSERT(mip->mi_nactiveclients == 0);
	ASSERT(mip->mi_single_active_client == NULL);
	ASSERT(mip->mi_state_flags == 0);
	ASSERT(mip->mi_factory_addr == NULL);
	ASSERT(mip->mi_factory_addr_num == 0);
	ASSERT(mip->mi_default_tx_ring == NULL);

	mcbi = &mip->mi_notify_cb_info;
	ASSERT(mcbi->mcbi_del_cnt == 0 && mcbi->mcbi_walker_cnt == 0);
	ASSERT(mip->mi_notify_bits == 0);
	ASSERT(mip->mi_notify_thread == NULL);
	ASSERT(mcbi->mcbi_lockp ==
	    &mip->mi_notify_lock);
	mcbi->mcbi_lockp = NULL;

	mcbi = &mip->mi_promisc_cb_info;
	ASSERT(mcbi->mcbi_del_cnt == 0 && mip->mi_promisc_list == NULL);
	ASSERT(mip->mi_promisc_list == NULL);
	ASSERT(mcbi->mcbi_lockp == &mip->mi_promisc_lock);
	mcbi->mcbi_lockp = NULL;

	ASSERT(mip->mi_bcast_ngrps == 0 && mip->mi_bcast_grp == NULL);
	ASSERT(mip->mi_perim_owner == NULL && mip->mi_perim_ocnt == 0);

	mutex_destroy(&mip->mi_lock);
	rw_destroy(&mip->mi_rw_lock);

	mutex_destroy(&mip->mi_promisc_lock);
	cv_destroy(&mip->mi_promisc_cb_info.mcbi_cv);
	mutex_destroy(&mip->mi_notify_lock);
	cv_destroy(&mip->mi_notify_cb_info.mcbi_cv);
	mutex_destroy(&mip->mi_ring_lock);

	ASSERT(mip->mi_bridge_link == NULL);
}

/* ARGSUSED */
static int
i_mac_ring_ctor(void *buf, void *arg, int kmflag)
{
	mac_ring_t *ring = (mac_ring_t *)buf;

	bzero(ring, sizeof (mac_ring_t));
	cv_init(&ring->mr_cv, NULL, CV_DEFAULT, NULL);
	mutex_init(&ring->mr_lock, NULL, MUTEX_DEFAULT, NULL);
	ring->mr_state = MR_FREE;
	return (0);
}

/* ARGSUSED */
static void
i_mac_ring_dtor(void *buf, void *arg)
{
	mac_ring_t *ring = (mac_ring_t *)buf;

	cv_destroy(&ring->mr_cv);
	mutex_destroy(&ring->mr_lock);
}

/*
 * Common functions to do mac callback addition and deletion. Currently this is
 * used by promisc callbacks and notify callbacks. List addition and deletion
 * need to take care of list walkers. List walkers in general, can't hold list
 * locks and make upcall callbacks due to potential lock order and recursive
 * reentry issues. Instead list walkers increment the list walker count to mark
 * the presence of a walker thread. Addition can be carefully done to ensure
 * that the list walker always sees either the old list or the new list.
 * However the deletion can't be done while the walker is active, instead the
 * deleting thread simply marks the entry as logically deleted. The last walker
 * physically deletes and frees up the logically deleted entries when the walk
 * is complete.
 */
void
mac_callback_add(mac_cb_info_t *mcbi, mac_cb_t **mcb_head,
    mac_cb_t *mcb_elem)
{
	mac_cb_t	*p;
	mac_cb_t	**pp;

	/* Verify it is not already in the list */
	for (pp = mcb_head; (p = *pp) != NULL; pp = &p->mcb_nextp) {
		if (p == mcb_elem)
			break;
	}
	VERIFY(p == NULL);

	/*
	 * Add it to the head of the callback list. The membar ensures that
	 * the following list pointer manipulations reach global visibility
	 * in exactly the program order below.
	 */
	ASSERT(MUTEX_HELD(mcbi->mcbi_lockp));

	mcb_elem->mcb_nextp = *mcb_head;
	membar_producer();
	*mcb_head = mcb_elem;
}

/*
 * Mark the entry as logically deleted. If there aren't any walkers unlink
 * from the list. In either case return the corresponding status.
 */
boolean_t
mac_callback_remove(mac_cb_info_t *mcbi, mac_cb_t **mcb_head,
    mac_cb_t *mcb_elem)
{
	mac_cb_t	*p;
	mac_cb_t	**pp;

	ASSERT(MUTEX_HELD(mcbi->mcbi_lockp));
	/*
	 * Search the callback list for the entry to be removed
	 */
	for (pp = mcb_head; (p = *pp) != NULL; pp = &p->mcb_nextp) {
		if (p == mcb_elem)
			break;
	}
	VERIFY(p != NULL);

	/*
	 * If there are walkers just mark it as deleted and the last walker
	 * will remove from the list and free it.
	 */
	if (mcbi->mcbi_walker_cnt != 0) {
		p->mcb_flags |= MCB_CONDEMNED;
		mcbi->mcbi_del_cnt++;
		return (B_FALSE);
	}

	ASSERT(mcbi->mcbi_del_cnt == 0);
	*pp = p->mcb_nextp;
	p->mcb_nextp = NULL;
	return (B_TRUE);
}

/*
 * Wait for all pending callback removals to be completed
 */
void
mac_callback_remove_wait(mac_cb_info_t *mcbi)
{
	ASSERT(MUTEX_HELD(mcbi->mcbi_lockp));
	while (mcbi->mcbi_del_cnt != 0) {
		DTRACE_PROBE1(need_wait, mac_cb_info_t *, mcbi);
		cv_wait(&mcbi->mcbi_cv, mcbi->mcbi_lockp);
	}
}

/*
 * The last mac callback walker does the cleanup. Walk the list and unlink
 * all the logically deleted entries and construct a temporary list of
 * removed entries. Return the list of removed entries to the caller.
 */
mac_cb_t *
mac_callback_walker_cleanup(mac_cb_info_t *mcbi, mac_cb_t **mcb_head)
{
	mac_cb_t	*p;
	mac_cb_t	**pp;
	mac_cb_t	*rmlist = NULL;		/* List of removed elements */
	int	cnt = 0;

	ASSERT(MUTEX_HELD(mcbi->mcbi_lockp));
	ASSERT(mcbi->mcbi_del_cnt != 0 && mcbi->mcbi_walker_cnt == 0);

	pp = mcb_head;
	while (*pp != NULL) {
		if ((*pp)->mcb_flags & MCB_CONDEMNED) {
			p = *pp;
			*pp = p->mcb_nextp;
			p->mcb_nextp = rmlist;
			rmlist = p;
			cnt++;
			continue;
		}
		pp = &(*pp)->mcb_nextp;
	}

	ASSERT(mcbi->mcbi_del_cnt == cnt);
	mcbi->mcbi_del_cnt = 0;
	return (rmlist);
}

boolean_t
mac_callback_lookup(mac_cb_t **mcb_headp, mac_cb_t *mcb_elem)
{
	mac_cb_t	*mcb;

	/* Search the list for the given element */
	for (mcb = *mcb_headp; mcb != NULL; mcb = mcb->mcb_nextp) {
		if (mcb == mcb_elem)
			return (B_TRUE);
	}

	return (B_FALSE);
}

boolean_t
mac_callback_find(mac_cb_info_t *mcbi, mac_cb_t **mcb_headp, mac_cb_t *mcb_elem)
{
	boolean_t	found;

	mutex_enter(mcbi->mcbi_lockp);
	found = mac_callback_lookup(mcb_headp, mcb_elem);
	mutex_exit(mcbi->mcbi_lockp);

	return (found);
}

/* Free the list of removed callbacks */
void
mac_callback_free(mac_cb_t *rmlist)
{
	mac_cb_t	*mcb;
	mac_cb_t	*mcb_next;

	for (mcb = rmlist; mcb != NULL; mcb = mcb_next) {
		mcb_next = mcb->mcb_nextp;
		kmem_free(mcb->mcb_objp, mcb->mcb_objsize);
	}
}

/*
 * The promisc callbacks are in 2 lists, one off the 'mip' and another off the
 * 'mcip' threaded by mpi_mi_link and mpi_mci_link respectively. However there
 * is only a single shared total walker count, and an entry can't be physically
 * unlinked if a walker is active on either list. The last walker does this
 * cleanup of logically deleted entries.
 */
void
i_mac_promisc_walker_cleanup(mac_impl_t *mip)
{
	mac_cb_t	*rmlist;
	mac_cb_t	*mcb;
	mac_cb_t	*mcb_next;
	mac_promisc_impl_t	*mpip;

	/*
	 * Construct a temporary list of deleted callbacks by walking the
	 * mi_promisc_list. Then for each entry in the temporary list,
	 * remove it from the mci_promisc_list and free the entry.
	 */
	rmlist = mac_callback_walker_cleanup(&mip->mi_promisc_cb_info,
	    &mip->mi_promisc_list);

	for (mcb = rmlist; mcb != NULL; mcb = mcb_next) {
		mcb_next = mcb->mcb_nextp;
		mpip = (mac_promisc_impl_t *)mcb->mcb_objp;
		VERIFY(mac_callback_remove(&mip->mi_promisc_cb_info,
		    &mpip->mpi_mcip->mci_promisc_list, &mpip->mpi_mci_link));
		mcb->mcb_flags = 0;
		mcb->mcb_nextp = NULL;
		kmem_cache_free(mac_promisc_impl_cache, mpip);
	}
}

void
i_mac_notify(mac_impl_t *mip, mac_notify_type_t type)
{
	mac_cb_info_t	*mcbi;

	/*
	 * Signal the notify thread even after mi_ref has become zero and
	 * mi_disabled is set. The synchronization with the notify thread
	 * happens in mac_unregister and that implies the driver must make
	 * sure it is single-threaded (with respect to mac calls) and that
	 * all pending mac calls have returned before it calls mac_unregister.
	 */
	rw_enter(&i_mac_impl_lock, RW_READER);
	if (mip->mi_state_flags & MIS_DISABLED)
		goto exit;

	/*
	 * Guard against incorrect notifications. (Running a newer
	 * mac client against an older implementation?)
	 */
	if (type >= MAC_NNOTE)
		goto exit;

	mcbi = &mip->mi_notify_cb_info;
	mutex_enter(mcbi->mcbi_lockp);
	mip->mi_notify_bits |= (1 << type);
	cv_broadcast(&mcbi->mcbi_cv);
	mutex_exit(mcbi->mcbi_lockp);

exit:
	rw_exit(&i_mac_impl_lock);
}

/*
 * Mac serialization primitives. Please see the block comment at the
 * top of the file.
 */
void
i_mac_perim_enter(mac_impl_t *mip)
{
	mac_client_impl_t	*mcip;

	if (mip->mi_state_flags & MIS_IS_VNIC) {
		/*
		 * This is a VNIC. Use the lower mac since that is what
		 * we want to serialize on.
		 */
		mcip = mac_vnic_lower(mip);
		mip = mcip->mci_mip;
	}

	mutex_enter(&mip->mi_perim_lock);
	if (mip->mi_perim_owner == curthread) {
		mip->mi_perim_ocnt++;
		mutex_exit(&mip->mi_perim_lock);
		return;
	}

	while (mip->mi_perim_owner != NULL)
		cv_wait(&mip->mi_perim_cv, &mip->mi_perim_lock);

	mip->mi_perim_owner = curthread;
	ASSERT(mip->mi_perim_ocnt == 0);
	mip->mi_perim_ocnt++;
#ifdef DEBUG
	mip->mi_perim_stack_depth = getpcstack(mip->mi_perim_stack,
	    MAC_PERIM_STACK_DEPTH);
#endif
	mutex_exit(&mip->mi_perim_lock);
}

int
i_mac_perim_enter_nowait(mac_impl_t *mip)
{
	/*
	 * The vnic is a special case, since the serialization is done based
	 * on the lower mac. If the lower mac is busy, it does not imply the
	 * vnic can't be unregistered. But in the case of other drivers,
	 * a busy perimeter or open mac handles implies that the mac is busy
	 * and can't be unregistered.
	 */
	if (mip->mi_state_flags & MIS_IS_VNIC) {
		i_mac_perim_enter(mip);
		return (0);
	}

	mutex_enter(&mip->mi_perim_lock);
	if (mip->mi_perim_owner != NULL) {
		mutex_exit(&mip->mi_perim_lock);
		return (EBUSY);
	}
	ASSERT(mip->mi_perim_ocnt == 0);
	mip->mi_perim_owner = curthread;
	mip->mi_perim_ocnt++;
	mutex_exit(&mip->mi_perim_lock);

	return (0);
}

void
i_mac_perim_exit(mac_impl_t *mip)
{
	mac_client_impl_t *mcip;

	if (mip->mi_state_flags & MIS_IS_VNIC) {
		/*
		 * This is a VNIC. Use the lower mac since that is what
		 * we want to serialize on.
		 */
		mcip = mac_vnic_lower(mip);
		mip = mcip->mci_mip;
	}

	ASSERT(mip->mi_perim_owner == curthread && mip->mi_perim_ocnt != 0);

	mutex_enter(&mip->mi_perim_lock);
	if (--mip->mi_perim_ocnt == 0) {
		mip->mi_perim_owner = NULL;
		cv_signal(&mip->mi_perim_cv);
	}
	mutex_exit(&mip->mi_perim_lock);
}

/*
 * Returns whether the current thread holds the mac perimeter. Used in making
 * assertions.
 */
boolean_t
mac_perim_held(mac_handle_t mh)
{
	mac_impl_t	*mip = (mac_impl_t *)mh;
	mac_client_impl_t *mcip;

	if (mip->mi_state_flags & MIS_IS_VNIC) {
		/*
		 * This is a VNIC. Use the lower mac since that is what
		 * we want to serialize on.
		 */
		mcip = mac_vnic_lower(mip);
		mip = mcip->mci_mip;
	}
	return (mip->mi_perim_owner == curthread);
}

/*
 * mac client interfaces to enter the mac perimeter of a mac end point, given
 * its mac handle, or macname or linkid.
 */
void
mac_perim_enter_by_mh(mac_handle_t mh, mac_perim_handle_t *mphp)
{
	mac_impl_t	*mip = (mac_impl_t *)mh;

	i_mac_perim_enter(mip);
	/*
	 * The mac_perim_handle_t returned encodes the 'mip' and whether a
	 * mac_open has been done internally while entering the perimeter.
	 * This information is used in mac_perim_exit
	 */
	MAC_ENCODE_MPH(*mphp, mip, 0);
}

int
mac_perim_enter_by_macname(const char *name, mac_perim_handle_t *mphp)
{
	int	err;
	mac_handle_t	mh;

	if ((err = mac_open(name, &mh)) != 0)
		return (err);

	mac_perim_enter_by_mh(mh, mphp);
	MAC_ENCODE_MPH(*mphp, mh, 1);
	return (0);
}

int
mac_perim_enter_by_linkid(datalink_id_t linkid, mac_perim_handle_t *mphp)
{
	int	err;
	mac_handle_t	mh;

	if ((err = mac_open_by_linkid(linkid, &mh)) != 0)
		return (err);

	mac_perim_enter_by_mh(mh, mphp);
	MAC_ENCODE_MPH(*mphp, mh, 1);
	return (0);
}

void
mac_perim_exit(mac_perim_handle_t mph)
{
	mac_impl_t	*mip;
	boolean_t	need_close;

	MAC_DECODE_MPH(mph, mip, need_close);
	i_mac_perim_exit(mip);
	if (need_close)
		mac_close((mac_handle_t)mip);
}

int
mac_hold(const char *macname, mac_impl_t **pmip)
{
	mac_impl_t	*mip;
	int		err;

	/*
	 * Check the device name length to make sure it won't overflow our
	 * buffer.
	 */
	if (strlen(macname) >= MAXNAMELEN)
		return (EINVAL);

	/*
	 * Look up its entry in the global hash table.
	 */
	rw_enter(&i_mac_impl_lock, RW_WRITER);
	err = mod_hash_find(i_mac_impl_hash, (mod_hash_key_t)macname,
	    (mod_hash_val_t *)&mip);

	if (err != 0) {
		rw_exit(&i_mac_impl_lock);
		return (ENOENT);
	}

	if (mip->mi_state_flags & MIS_DISABLED) {
		rw_exit(&i_mac_impl_lock);
		return (ENOENT);
	}

	if (mip->mi_state_flags & MIS_EXCLUSIVE_HELD) {
		rw_exit(&i_mac_impl_lock);
		return (EBUSY);
	}

	mip->mi_ref++;
	rw_exit(&i_mac_impl_lock);

	*pmip = mip;
	return (0);
}

void
mac_rele(mac_impl_t *mip)
{
	rw_enter(&i_mac_impl_lock, RW_WRITER);
	ASSERT(mip->mi_ref != 0);
	if (--mip->mi_ref == 0) {
		ASSERT(mip->mi_nactiveclients == 0 &&
		    !(mip->mi_state_flags & MIS_EXCLUSIVE));
	}
	rw_exit(&i_mac_impl_lock);
}

/*
 * Private GLDv3 function to start a MAC instance.
 */
int
mac_start(mac_handle_t mh)
{
	mac_impl_t	*mip = (mac_impl_t *)mh;
	int		err = 0;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
	ASSERT(mip->mi_start != NULL);

	/*
	 * Check whether the device is already started.
	 */
	if (mip->mi_active++ == 0) {
		mac_ring_t *ring = NULL;

		/*
		 * Start the device.
		 */
		err = mip->mi_start(mip->mi_driver);
		if (err != 0) {
			mip->mi_active--;
			return (err);
		}

		/*
		 * Start the default tx ring.
		 */
		if (mip->mi_default_tx_ring != NULL) {

			ring = (mac_ring_t *)mip->mi_default_tx_ring;
			err = mac_start_ring(ring);
			if (err != 0) {
				mip->mi_active--;
				return (err);
			}
			ring->mr_state = MR_INUSE;
		}

		if (mip->mi_rx_groups != NULL) {
			/*
			 * Start the default ring, since it will be needed
			 * to receive broadcast and multicast traffic for
			 * both primary and non-primary MAC clients.
			 */
			mac_group_t *grp = &mip->mi_rx_groups[0];

			ASSERT(grp->mrg_state == MAC_GROUP_STATE_REGISTERED);
			err = mac_start_group_and_rings(grp);
			if (err != 0) {
				mip->mi_active--;
				if (ring != NULL) {
					mac_stop_ring(ring);
					ring->mr_state = MR_FREE;
				}
				return (err);
			}
			mac_set_rx_group_state(grp, MAC_GROUP_STATE_SHARED);
		}
	}

	return (err);
}

/*
 * Private GLDv3 function to stop a MAC instance.
 */
void
mac_stop(mac_handle_t mh)
{
	mac_impl_t	*mip = (mac_impl_t *)mh;

	ASSERT(mip->mi_stop != NULL);
	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));

	/*
	 * Check whether the device is still needed.
	 */
	ASSERT(mip->mi_active != 0);
	if (--mip->mi_active == 0) {
		if (mip->mi_rx_groups != NULL) {
			/*
			 * There should be no more active clients since the
			 * MAC is being stopped. Stop the default RX group
			 * and transition it back to registered state.
			 */
			mac_group_t *grp = &mip->mi_rx_groups[0];

			/*
			 * When clients are torn down, the groups
			 * are released via mac_release_rx_group which
			 * knows that the default group is always in
			 * started mode since broadcast uses it. So
			 * we can assert that there are no clients
			 * (since mac_bcast_add doesn't register itself
			 * as a client) and group is in SHARED state.
			 */
			ASSERT(grp->mrg_state == MAC_GROUP_STATE_SHARED);
			ASSERT(MAC_RX_GROUP_NO_CLIENT(grp) &&
			    mip->mi_nactiveclients == 0);
			mac_stop_group_and_rings(grp);
			mac_set_rx_group_state(grp, MAC_GROUP_STATE_REGISTERED);
		}

		if (mip->mi_default_tx_ring != NULL) {
			mac_ring_t *ring;

			ring = (mac_ring_t *)mip->mi_default_tx_ring;
			mac_stop_ring(ring);
			ring->mr_state = MR_FREE;
		}

		/*
		 * Stop the device.
		 */
		mip->mi_stop(mip->mi_driver);
	}
}

int
i_mac_promisc_set(mac_impl_t *mip, boolean_t on)
{
	int		err = 0;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
	ASSERT(mip->mi_setpromisc != NULL);

	if (on) {
		/*
		 * Enable promiscuous mode on the device if not yet enabled.
		 */
		if (mip->mi_devpromisc++ == 0) {
			err = mip->mi_setpromisc(mip->mi_driver, B_TRUE);
			if (err != 0) {
				mip->mi_devpromisc--;
				return (err);
			}
			i_mac_notify(mip, MAC_NOTE_DEVPROMISC);
		}
	} else {
		if (mip->mi_devpromisc == 0)
			return (EPROTO);

		/*
		 * Disable promiscuous mode on the device if this is the last
		 * enabling.
		 */
		if (--mip->mi_devpromisc == 0) {
			err = mip->mi_setpromisc(mip->mi_driver, B_FALSE);
			if (err != 0) {
				mip->mi_devpromisc++;
				return (err);
			}
			i_mac_notify(mip, MAC_NOTE_DEVPROMISC);
		}
	}

	return (0);
}

/*
 * The promiscuity state can change any time. If the caller needs to take
 * actions that are atomic with the promiscuity state, then the caller needs
 * to bracket the entire sequence with mac_perim_enter/exit
 */
boolean_t
mac_promisc_get(mac_handle_t mh)
{
	mac_impl_t		*mip = (mac_impl_t *)mh;

	/*
	 * Return the current promiscuity.
	 */
	return (mip->mi_devpromisc != 0);
}

/*
 * Invoked at MAC instance attach time to initialize the list
 * of factory MAC addresses supported by a MAC instance. This function
 * builds a local cache in the mac_impl_t for the MAC addresses
 * supported by the underlying hardware. The MAC clients themselves
 * use the mac_addr_factory*() functions to query and reserve
 * factory MAC addresses.
 */
void
mac_addr_factory_init(mac_impl_t *mip)
{
	mac_capab_multifactaddr_t capab;
	uint8_t *addr;
	int i;

	/*
	 * First round to see how many factory MAC addresses are available.
	 */
	bzero(&capab, sizeof (capab));
	if (!i_mac_capab_get((mac_handle_t)mip, MAC_CAPAB_MULTIFACTADDR,
	    &capab) || (capab.mcm_naddr == 0)) {
		/*
		 * The MAC instance doesn't support multiple factory
		 * MAC addresses, we're done here.
		 */
		return;
	}

	/*
	 * Allocate the space and get all the factory addresses.
	 */
	addr = kmem_alloc(capab.mcm_naddr * MAXMACADDRLEN, KM_SLEEP);
	capab.mcm_getaddr(mip->mi_driver, capab.mcm_naddr, addr);

	mip->mi_factory_addr_num = capab.mcm_naddr;
	mip->mi_factory_addr = kmem_zalloc(mip->mi_factory_addr_num *
	    sizeof (mac_factory_addr_t), KM_SLEEP);

	for (i = 0; i < capab.mcm_naddr; i++) {
		bcopy(addr + i * MAXMACADDRLEN,
		    mip->mi_factory_addr[i].mfa_addr,
		    mip->mi_type->mt_addr_length);
		mip->mi_factory_addr[i].mfa_in_use = B_FALSE;
	}

	kmem_free(addr, capab.mcm_naddr * MAXMACADDRLEN);
}

void
mac_addr_factory_fini(mac_impl_t *mip)
{
	if (mip->mi_factory_addr == NULL) {
		ASSERT(mip->mi_factory_addr_num == 0);
		return;
	}

	kmem_free(mip->mi_factory_addr, mip->mi_factory_addr_num *
	    sizeof (mac_factory_addr_t));

	mip->mi_factory_addr = NULL;
	mip->mi_factory_addr_num = 0;
}

/*
 * Reserve a factory MAC address. If *slot is set to -1, the function
 * attempts to reserve any of the available factory MAC addresses and
 * returns the reserved slot id. If no slots are available, the function
 * returns ENOSPC. If *slot is not set to -1, the function reserves
 * the specified slot if it is available, or returns EBUSY if the slot
 * is already used.
 * Returns ENOTSUP if the underlying MAC does not
 * support multiple factory addresses. If the slot number is not -1 but
 * is invalid, returns EINVAL.
 */
int
mac_addr_factory_reserve(mac_client_handle_t mch, int *slot)
{
	mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
	mac_impl_t *mip = mcip->mci_mip;
	int i, ret = 0;

	i_mac_perim_enter(mip);
	/*
	 * Protect against concurrent readers that may need a self-consistent
	 * view of the factory addresses
	 */
	rw_enter(&mip->mi_rw_lock, RW_WRITER);

	if (mip->mi_factory_addr_num == 0) {
		ret = ENOTSUP;
		goto bail;
	}

	if (*slot != -1) {
		/* check the specified slot */
		if (*slot < 1 || *slot > mip->mi_factory_addr_num) {
			ret = EINVAL;
			goto bail;
		}
		if (mip->mi_factory_addr[*slot-1].mfa_in_use) {
			ret = EBUSY;
			goto bail;
		}
	} else {
		/* pick the next available slot */
		for (i = 0; i < mip->mi_factory_addr_num; i++) {
			if (!mip->mi_factory_addr[i].mfa_in_use)
				break;
		}

		if (i == mip->mi_factory_addr_num) {
			ret = ENOSPC;
			goto bail;
		}
		*slot = i+1;
	}

	mip->mi_factory_addr[*slot-1].mfa_in_use = B_TRUE;
	mip->mi_factory_addr[*slot-1].mfa_client = mcip;

bail:
	rw_exit(&mip->mi_rw_lock);
	i_mac_perim_exit(mip);
	return (ret);
}

/*
 * Release the specified factory MAC address slot.
 */
void
mac_addr_factory_release(mac_client_handle_t mch, uint_t slot)
{
	mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
	mac_impl_t *mip = mcip->mci_mip;

	i_mac_perim_enter(mip);
	/*
	 * Protect against concurrent readers that may need a self-consistent
	 * view of the factory addresses
	 */
	rw_enter(&mip->mi_rw_lock, RW_WRITER);

	ASSERT(slot > 0 && slot <= mip->mi_factory_addr_num);
	ASSERT(mip->mi_factory_addr[slot-1].mfa_in_use);

	mip->mi_factory_addr[slot-1].mfa_in_use = B_FALSE;

	rw_exit(&mip->mi_rw_lock);
	i_mac_perim_exit(mip);
}

/*
 * Stores in mac_addr the value of the specified factory MAC address and
 * in addr_len its length. The slot number must be valid for the MAC; this
 * is asserted rather than reported, since the function returns no status.
 * mac_addr must have room for MAXMACADDRLEN bytes. If client_name is not
 * NULL, the caller must provide a buffer of at least MAXNAMELEN bytes.
 */
void
mac_addr_factory_value(mac_handle_t mh, int slot, uchar_t *mac_addr,
    uint_t *addr_len, char *client_name, boolean_t *in_use_arg)
{
	mac_impl_t *mip = (mac_impl_t *)mh;
	boolean_t in_use;

	ASSERT(slot > 0 && slot <= mip->mi_factory_addr_num);

	/*
	 * Readers need to hold mi_rw_lock. Writers need to hold mac perimeter
	 * and mi_rw_lock
	 */
	rw_enter(&mip->mi_rw_lock, RW_READER);
	bcopy(mip->mi_factory_addr[slot-1].mfa_addr, mac_addr, MAXMACADDRLEN);
	*addr_len = mip->mi_type->mt_addr_length;
	in_use = mip->mi_factory_addr[slot-1].mfa_in_use;
	if (in_use && client_name != NULL) {
		bcopy(mip->mi_factory_addr[slot-1].mfa_client->mci_name,
		    client_name, MAXNAMELEN);
	}
	if (in_use_arg != NULL)
		*in_use_arg = in_use;
	rw_exit(&mip->mi_rw_lock);
}

/*
 * Returns the number of factory MAC addresses (in addition to the
 * primary MAC address), 0 if the underlying MAC doesn't support
 * that feature.
 */
uint_t
mac_addr_factory_num(mac_handle_t mh)
{
	mac_impl_t *mip = (mac_impl_t *)mh;

	return (mip->mi_factory_addr_num);
}


void
mac_rx_group_unmark(mac_group_t *grp, uint_t flag)
{
	mac_ring_t *ring;

	for (ring = grp->mrg_rings; ring != NULL; ring = ring->mr_next)
		ring->mr_flag &= ~flag;
}

/*
 * The following mac_hwrings_xxx() functions are private mac client functions
 * used by the aggr driver to access and control the underlying HW Rx group
 * and rings. In this case, the aggr driver has exclusive control of the
 * underlying HW Rx group/rings; it calls the following functions to
 * start/stop the HW Rx rings, disable/enable polling, add/remove MAC
 * addresses, or set up the Rx callback.
 */
/* ARGSUSED */
static void
mac_hwrings_rx_process(void *arg, mac_resource_handle_t srs,
    mblk_t *mp_chain, boolean_t loopback)
{
	mac_soft_ring_set_t *mac_srs = (mac_soft_ring_set_t *)srs;
	mac_srs_rx_t *srs_rx = &mac_srs->srs_rx;
	mac_direct_rx_t proc;
	void *arg1;
	mac_resource_handle_t arg2;

	proc = srs_rx->sr_func;
	arg1 = srs_rx->sr_arg1;
	arg2 = mac_srs->srs_mrh;

	proc(arg1, arg2, mp_chain, NULL);
}

/*
 * This function is called to get the list of HW rings that are reserved by
 * an exclusive mac client.
 *
 * Return value: the number of HW rings.
 */
int
mac_hwrings_get(mac_client_handle_t mch, mac_group_handle_t *hwgh,
    mac_ring_handle_t *hwrh, mac_ring_type_t rtype)
{
	mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
	int cnt = 0;

	switch (rtype) {
	case MAC_RING_TYPE_RX: {
		flow_entry_t *flent = mcip->mci_flent;
		mac_group_t *grp;
		mac_ring_t *ring;

		grp = flent->fe_rx_ring_group;
		/*
		 * The mac client did not reserve any RX group, return directly.
		 * This is probably because the underlying MAC does not support
		 * any groups.
		 */
		*hwgh = NULL;
		if (grp == NULL)
			return (0);
		/*
		 * This group must be reserved by this mac client.
		 */
		ASSERT((grp->mrg_state == MAC_GROUP_STATE_RESERVED) &&
		    (mch == (mac_client_handle_t)
		    (MAC_RX_GROUP_ONLY_CLIENT(grp))));
		for (ring = grp->mrg_rings;
		    ring != NULL; ring = ring->mr_next, cnt++) {
			ASSERT(cnt < MAX_RINGS_PER_GROUP);
			hwrh[cnt] = (mac_ring_handle_t)ring;
		}
		*hwgh = (mac_group_handle_t)grp;
		return (cnt);
	}
	case MAC_RING_TYPE_TX: {
		mac_soft_ring_set_t *tx_srs;
		mac_srs_tx_t *tx;

		tx_srs = MCIP_TX_SRS(mcip);
		tx = &tx_srs->srs_tx;
		for (; cnt < tx->st_ring_count; cnt++)
			hwrh[cnt] = tx->st_rings[cnt];
		return (cnt);
	}
	default:
		ASSERT(B_FALSE);
		return (-1);
	}
}

/*
 * Set up the RX callback of the mac client which exclusively controls the
 * HW ring.
 */
void
mac_hwring_setup(mac_ring_handle_t hwrh, mac_resource_handle_t prh)
{
	mac_ring_t *hw_ring = (mac_ring_t *)hwrh;
	mac_soft_ring_set_t *mac_srs = hw_ring->mr_srs;

	mac_srs->srs_mrh = prh;
	mac_srs->srs_rx.sr_lower_proc = mac_hwrings_rx_process;
}

void
mac_hwring_teardown(mac_ring_handle_t hwrh)
{
	mac_ring_t *hw_ring = (mac_ring_t *)hwrh;
	mac_soft_ring_set_t *mac_srs = hw_ring->mr_srs;

	mac_srs->srs_rx.sr_lower_proc = mac_rx_srs_process;
	mac_srs->srs_mrh = NULL;
}

int
mac_hwring_disable_intr(mac_ring_handle_t rh)
{
	mac_ring_t *rr_ring = (mac_ring_t *)rh;
	mac_intr_t *intr = &rr_ring->mr_info.mri_intr;

	return (intr->mi_disable(intr->mi_handle));
}

int
mac_hwring_enable_intr(mac_ring_handle_t rh)
{
	mac_ring_t *rr_ring = (mac_ring_t *)rh;
	mac_intr_t *intr = &rr_ring->mr_info.mri_intr;

	return (intr->mi_enable(intr->mi_handle));
}

int
mac_hwring_start(mac_ring_handle_t rh)
{
	mac_ring_t *rr_ring = (mac_ring_t *)rh;

	MAC_RING_UNMARK(rr_ring, MR_QUIESCE);
	return (0);
}

void
mac_hwring_stop(mac_ring_handle_t rh)
{
	mac_ring_t *rr_ring = (mac_ring_t *)rh;

	mac_rx_ring_quiesce(rr_ring, MR_QUIESCE);
}

mblk_t *
mac_hwring_poll(mac_ring_handle_t rh, int bytes_to_pickup)
{
	mac_ring_t *rr_ring = (mac_ring_t *)rh;
	mac_ring_info_t *info = &rr_ring->mr_info;

	return (info->mri_poll(info->mri_driver, bytes_to_pickup));
}

/*
 * Send packets through the selected tx ring.
 */
mblk_t *
mac_hwring_tx(mac_ring_handle_t rh, mblk_t *mp)
{
	mac_ring_t *ring = (mac_ring_t *)rh;
	mac_ring_info_t *info = &ring->mr_info;

	ASSERT(ring->mr_type == MAC_RING_TYPE_TX &&
	    ring->mr_state >= MR_INUSE);
	return (info->mri_tx(info->mri_driver, mp));
}

int
mac_hwgroup_addmac(mac_group_handle_t gh, const uint8_t *addr)
{
	mac_group_t *group = (mac_group_t *)gh;

	return (mac_group_addmac(group, addr));
}

int
mac_hwgroup_remmac(mac_group_handle_t gh, const uint8_t *addr)
{
	mac_group_t *group = (mac_group_t *)gh;

	return (mac_group_remmac(group, addr));
}

/*
 * Set the RX group to be shared/reserved. Note that the group must be
 * started/stopped outside of this function.
 */
void
mac_set_rx_group_state(mac_group_t *grp, mac_group_state_t state)
{
	/*
	 * If there is no change in the group state, just return.
	 */
	if (grp->mrg_state == state)
		return;

	switch (state) {
	case MAC_GROUP_STATE_RESERVED:
		/*
		 * Successfully reserved the group.
		 *
		 * Given that there is an exclusive client controlling this
		 * group, we enable the group level polling when available,
		 * so that SRSs get to turn on/off individual rings they are
		 * assigned to.
		 */
		ASSERT(MAC_PERIM_HELD(grp->mrg_mh));

		if (GROUP_INTR_DISABLE_FUNC(grp) != NULL)
			GROUP_INTR_DISABLE_FUNC(grp)(GROUP_INTR_HANDLE(grp));

		break;

	case MAC_GROUP_STATE_SHARED:
		/*
		 * Set all rings of this group to software classified.
		 * If the group has an overriding interrupt, then re-enable it.
		 */
		ASSERT(MAC_PERIM_HELD(grp->mrg_mh));

		if (GROUP_INTR_ENABLE_FUNC(grp) != NULL)
			GROUP_INTR_ENABLE_FUNC(grp)(GROUP_INTR_HANDLE(grp));

		/* The ring is not available for reservations any more */
		break;

	case MAC_GROUP_STATE_REGISTERED:
		/* Also callable from mac_register, perim is not held */
		break;

	default:
		ASSERT(B_FALSE);
		break;
	}

	grp->mrg_state = state;
}

/*
 * Quiesce future hardware classified packets for the specified Rx ring
 */
static void
mac_rx_ring_quiesce(mac_ring_t *rx_ring, uint_t ring_flag)
{
	ASSERT(rx_ring->mr_classify_type == MAC_HW_CLASSIFIER);
	ASSERT(ring_flag == MR_CONDEMNED || ring_flag == MR_QUIESCE);

	mutex_enter(&rx_ring->mr_lock);
	rx_ring->mr_flag |= ring_flag;
	while (rx_ring->mr_refcnt != 0)
		cv_wait(&rx_ring->mr_cv, &rx_ring->mr_lock);
	mutex_exit(&rx_ring->mr_lock);
}

/*
 * Please see mac_tx for details about the per cpu locking scheme
 */
static void
mac_tx_lock_all(mac_client_impl_t *mcip)
{
	int i;

	for (i = 0; i <= mac_tx_percpu_cnt; i++)
		mutex_enter(&mcip->mci_tx_pcpu[i].pcpu_tx_lock);
}

static void
mac_tx_unlock_all(mac_client_impl_t *mcip)
{
	int i;

	for (i = mac_tx_percpu_cnt; i >= 0; i--)
		mutex_exit(&mcip->mci_tx_pcpu[i].pcpu_tx_lock);
}

static void
mac_tx_unlock_allbutzero(mac_client_impl_t *mcip)
{
	int i;

	for (i = mac_tx_percpu_cnt; i > 0; i--)
		mutex_exit(&mcip->mci_tx_pcpu[i].pcpu_tx_lock);
}

static int
mac_tx_sum_refcnt(mac_client_impl_t *mcip)
{
	int i;
	int refcnt = 0;

	for (i = 0; i <= mac_tx_percpu_cnt; i++)
		refcnt += mcip->mci_tx_pcpu[i].pcpu_tx_refcnt;

	return (refcnt);
}

/*
 * Stop future Tx packets coming down from the client in preparation for
 * quiescing the Tx side.
 * This is needed for dynamic reclaim and reassignment
 * of rings between clients
 */
void
mac_tx_client_block(mac_client_impl_t *mcip)
{
	mac_tx_lock_all(mcip);
	mcip->mci_tx_flag |= MCI_TX_QUIESCE;
	while (mac_tx_sum_refcnt(mcip) != 0) {
		mac_tx_unlock_allbutzero(mcip);
		cv_wait(&mcip->mci_tx_cv, &mcip->mci_tx_pcpu[0].pcpu_tx_lock);
		mutex_exit(&mcip->mci_tx_pcpu[0].pcpu_tx_lock);
		mac_tx_lock_all(mcip);
	}
	mac_tx_unlock_all(mcip);
}

void
mac_tx_client_unblock(mac_client_impl_t *mcip)
{
	mac_tx_lock_all(mcip);
	mcip->mci_tx_flag &= ~MCI_TX_QUIESCE;
	mac_tx_unlock_all(mcip);
	/*
	 * We may fail to disable flow control for the last MAC_NOTE_TX
	 * notification because the MAC client is quiesced. Send the
	 * notification again.
	 */
	i_mac_notify(mcip->mci_mip, MAC_NOTE_TX);
}

/*
 * Wait for an SRS to quiesce. The SRS worker will signal us when the
 * quiesce is done.
 */
static void
mac_srs_quiesce_wait(mac_soft_ring_set_t *srs, uint_t srs_flag)
{
	mutex_enter(&srs->srs_lock);
	while (!(srs->srs_state & srs_flag))
		cv_wait(&srs->srs_quiesce_done_cv, &srs->srs_lock);
	mutex_exit(&srs->srs_lock);
}

/*
 * Quiescing an Rx SRS is achieved by the following sequence. The protocol
 * works bottom up by cutting off packet flow from the bottommost point in the
 * mac, then the SRS, and then the soft rings. There are 2 use cases of this
 * mechanism. One is a temporary quiesce of the SRS, such as while changing
 * the Rx callbacks. Another use case is Rx SRS teardown. In the former case
 * the QUIESCE prefix/suffix is used and in the latter CONDEMNED is used
 * for the SRS and MR flags. In the former case the threads pause waiting for
 * a restart, while in the latter case the threads exit. The Tx SRS teardown
 * is also mostly similar to the above.
 *
 * 1. Stop future hardware classified packets at the lowest level in the mac.
 *    Remove any hardware classification rule (CONDEMNED case) and mark the
 *    rings as CONDEMNED or QUIESCE as appropriate. This prevents the mr_refcnt
 *    from increasing. Upcalls from the driver that come through hardware
 *    classification will be dropped in mac_rx from now on. Then we wait for
 *    the mr_refcnt to drop to zero. When the mr_refcnt reaches zero we are
 *    sure there aren't any upcall threads from the driver through hardware
 *    classification.
 *    In the case of SRS teardown we also remove the
 *    classification rule in the driver.
 *
 * 2. Stop future software classified packets by marking the flow entry with
 *    FE_QUIESCE or FE_CONDEMNED as appropriate which prevents the refcnt from
 *    increasing. We also remove the flow entry from the table in the latter
 *    case. Then wait for the fe_refcnt to reach an appropriate quiescent value
 *    that indicates there aren't any active threads using that flow entry.
 *
 * 3. Quiesce the SRS and softrings by signaling the SRS. The SRS poll thread,
 *    SRS worker thread, and the soft ring threads are quiesced in sequence
 *    with the SRS worker thread serving as a master controller. This
 *    mechanism is explained in mac_srs_worker_quiesce().
 *
 * The restart mechanism to reactivate the SRS and softrings is explained
 * in mac_srs_worker_restart(). Here we just signal the SRS worker to start the
 * restart sequence.
 */
void
mac_rx_srs_quiesce(mac_soft_ring_set_t *srs, uint_t srs_quiesce_flag)
{
	flow_entry_t *flent = srs->srs_flent;
	uint_t mr_flag, srs_done_flag;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)FLENT_TO_MIP(flent)));
	ASSERT(!(srs->srs_type & SRST_TX));

	if (srs_quiesce_flag == SRS_CONDEMNED) {
		mr_flag = MR_CONDEMNED;
		srs_done_flag = SRS_CONDEMNED_DONE;
		if (srs->srs_type & SRST_CLIENT_POLL_ENABLED)
			mac_srs_client_poll_disable(srs->srs_mcip, srs);
	} else {
		ASSERT(srs_quiesce_flag == SRS_QUIESCE);
		mr_flag = MR_QUIESCE;
		srs_done_flag = SRS_QUIESCE_DONE;
		if (srs->srs_type & SRST_CLIENT_POLL_ENABLED)
			mac_srs_client_poll_quiesce(srs->srs_mcip, srs);
	}

	if (srs->srs_ring != NULL) {
		mac_rx_ring_quiesce(srs->srs_ring, mr_flag);
	} else {
		/*
		 * SRS is driven by software classification. In case
		 * of CONDEMNED, the top level teardown functions will
		 * deal with flow removal.
		 */
		if (srs_quiesce_flag != SRS_CONDEMNED) {
			FLOW_MARK(flent, FE_QUIESCE);
			mac_flow_wait(flent, FLOW_DRIVER_UPCALL);
		}
	}

	/*
	 * Signal the SRS to quiesce itself, and then cv_wait for the
	 * SRS quiesce to complete.
	 * The SRS worker thread will wake us
	 * up when the quiesce is complete
	 */
	mac_srs_signal(srs, srs_quiesce_flag);
	mac_srs_quiesce_wait(srs, srs_done_flag);
}

/*
 * Remove an SRS.
 */
void
mac_rx_srs_remove(mac_soft_ring_set_t *srs)
{
	flow_entry_t *flent = srs->srs_flent;
	int i;

	mac_rx_srs_quiesce(srs, SRS_CONDEMNED);
	/*
	 * Locate and remove our entry in the fe_rx_srs[] array, and
	 * adjust the fe_rx_srs array entries and array count by
	 * moving the last entry into the vacated spot.
	 */
	mutex_enter(&flent->fe_lock);
	for (i = 0; i < flent->fe_rx_srs_cnt; i++) {
		if (flent->fe_rx_srs[i] == srs)
			break;
	}

	ASSERT(i != 0 && i < flent->fe_rx_srs_cnt);
	if (i != flent->fe_rx_srs_cnt - 1) {
		flent->fe_rx_srs[i] =
		    flent->fe_rx_srs[flent->fe_rx_srs_cnt - 1];
		i = flent->fe_rx_srs_cnt - 1;
	}

	flent->fe_rx_srs[i] = NULL;
	flent->fe_rx_srs_cnt--;
	mutex_exit(&flent->fe_lock);

	mac_srs_free(srs);
}

static void
mac_srs_clear_flag(mac_soft_ring_set_t *srs, uint_t flag)
{
	mutex_enter(&srs->srs_lock);
	srs->srs_state &= ~flag;
	mutex_exit(&srs->srs_lock);
}

void
mac_rx_srs_restart(mac_soft_ring_set_t *srs)
{
	flow_entry_t *flent = srs->srs_flent;
	mac_ring_t *mr;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)FLENT_TO_MIP(flent)));
	ASSERT((srs->srs_type & SRST_TX) == 0);

	/*
	 * This handles a change in the number of SRSs between the quiesce
	 * and restart operation of a flow.
	 */
	if (!SRS_QUIESCED(srs))
		return;

	/*
	 * Signal the SRS to restart itself. Wait for the restart to complete.
	 * Note that we only restart the SRS if it is not marked as
	 * permanently quiesced.
	 */
	if (!SRS_QUIESCED_PERMANENT(srs)) {
		mac_srs_signal(srs, SRS_RESTART);
		mac_srs_quiesce_wait(srs, SRS_RESTART_DONE);
		mac_srs_clear_flag(srs, SRS_RESTART_DONE);

		mac_srs_client_poll_restart(srs->srs_mcip, srs);
	}

	/* Finally clear the flags to let the packets in */
	mr = srs->srs_ring;
	if (mr != NULL) {
		MAC_RING_UNMARK(mr, MR_QUIESCE);
		/* In case the ring was stopped, safely restart it */
		(void) mac_start_ring(mr);
	} else {
		FLOW_UNMARK(flent, FE_QUIESCE);
	}
}

/*
 * Temporary quiesce of a flow and associated Rx SRS.
 * Please see block comment above mac_rx_classify_flow_rem.
 */
/* ARGSUSED */
int
mac_rx_classify_flow_quiesce(flow_entry_t *flent, void *arg)
{
	int i;

	for (i = 0; i < flent->fe_rx_srs_cnt; i++) {
		mac_rx_srs_quiesce((mac_soft_ring_set_t *)flent->fe_rx_srs[i],
		    SRS_QUIESCE);
	}
	return (0);
}

/*
 * Restart a flow and associated Rx SRS that has been quiesced temporarily
 * Please see block comment above mac_rx_classify_flow_rem
 */
/* ARGSUSED */
int
mac_rx_classify_flow_restart(flow_entry_t *flent, void *arg)
{
	int i;

	for (i = 0; i < flent->fe_rx_srs_cnt; i++)
		mac_rx_srs_restart((mac_soft_ring_set_t *)flent->fe_rx_srs[i]);

	return (0);
}

void
mac_srs_perm_quiesce(mac_client_handle_t mch, boolean_t on)
{
	mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
	flow_entry_t *flent = mcip->mci_flent;
	mac_impl_t *mip = mcip->mci_mip;
	mac_soft_ring_set_t *mac_srs;
	int i;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));

	if (flent == NULL)
		return;

	for (i = 0; i < flent->fe_rx_srs_cnt; i++) {
		mac_srs = flent->fe_rx_srs[i];
		mutex_enter(&mac_srs->srs_lock);
		if (on)
			mac_srs->srs_state |= SRS_QUIESCE_PERM;
		else
			mac_srs->srs_state &= ~SRS_QUIESCE_PERM;
		mutex_exit(&mac_srs->srs_lock);
	}
}

void
mac_rx_client_quiesce(mac_client_handle_t mch)
{
	mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
	mac_impl_t *mip = mcip->mci_mip;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));

	if (MCIP_DATAPATH_SETUP(mcip)) {
		(void) mac_rx_classify_flow_quiesce(mcip->mci_flent,
		    NULL);
		(void) mac_flow_walk_nolock(mcip->mci_subflow_tab,
		    mac_rx_classify_flow_quiesce, NULL);
	}
}

void
mac_rx_client_restart(mac_client_handle_t mch)
{
	mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
	mac_impl_t *mip = mcip->mci_mip;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));

	if (MCIP_DATAPATH_SETUP(mcip)) {
		(void) mac_rx_classify_flow_restart(mcip->mci_flent, NULL);
		(void) mac_flow_walk_nolock(mcip->mci_subflow_tab,
		    mac_rx_classify_flow_restart, NULL);
	}
}

/*
 * This function only quiesces the Tx SRS and softring worker threads. Callers
 * need to make sure that there aren't any mac client threads doing current or
 * future transmits in the mac before calling this function.
 */
void
mac_tx_srs_quiesce(mac_soft_ring_set_t *srs, uint_t srs_quiesce_flag)
{
	mac_client_impl_t *mcip = srs->srs_mcip;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mcip->mci_mip));

	ASSERT(srs->srs_type & SRST_TX);
	ASSERT(srs_quiesce_flag == SRS_CONDEMNED ||
	    srs_quiesce_flag == SRS_QUIESCE);

	/*
	 * Signal the SRS to quiesce itself, and then cv_wait for the
	 * SRS quiesce to complete. The SRS worker thread will wake us
	 * up when the quiesce is complete
	 */
	mac_srs_signal(srs, srs_quiesce_flag);
	mac_srs_quiesce_wait(srs, srs_quiesce_flag == SRS_QUIESCE ?
	    SRS_QUIESCE_DONE : SRS_CONDEMNED_DONE);
}

void
mac_tx_srs_restart(mac_soft_ring_set_t *srs)
{
	/*
	 * Resizing the fanout could result in creation of new SRSs.
	 * They may not necessarily be in the quiesced state, in which
	 * case they need not be restarted.
	 */
	if (!SRS_QUIESCED(srs))
		return;

	mac_srs_signal(srs, SRS_RESTART);
	mac_srs_quiesce_wait(srs, SRS_RESTART_DONE);
	mac_srs_clear_flag(srs, SRS_RESTART_DONE);
}

/*
 * Temporary quiesce of a flow and associated Tx SRS.
 * Please see block comment above mac_rx_srs_quiesce
 */
/* ARGSUSED */
int
mac_tx_flow_quiesce(flow_entry_t *flent, void *arg)
{
	/*
	 * The fe_tx_srs is null for a subflow on an interface that is
	 * not plumbed
	 */
	if (flent->fe_tx_srs != NULL)
		mac_tx_srs_quiesce(flent->fe_tx_srs, SRS_QUIESCE);
	return (0);
}

/* ARGSUSED */
int
mac_tx_flow_restart(flow_entry_t *flent, void *arg)
{
	/*
	 * The fe_tx_srs is null for a subflow on an interface that is
	 * not plumbed
	 */
	if (flent->fe_tx_srs != NULL)
		mac_tx_srs_restart(flent->fe_tx_srs);
	return (0);
}

void
mac_tx_client_quiesce(mac_client_impl_t *mcip, uint_t srs_quiesce_flag)
{
	ASSERT(MAC_PERIM_HELD((mac_handle_t)mcip->mci_mip));

	mac_tx_client_block(mcip);
	if (MCIP_TX_SRS(mcip) != NULL) {
		mac_tx_srs_quiesce(MCIP_TX_SRS(mcip), srs_quiesce_flag);
		(void) mac_flow_walk_nolock(mcip->mci_subflow_tab,
		    mac_tx_flow_quiesce, NULL);
	}
}

void
mac_tx_client_restart(mac_client_impl_t *mcip)
{
	ASSERT(MAC_PERIM_HELD((mac_handle_t)mcip->mci_mip));

	mac_tx_client_unblock(mcip);
	if (MCIP_TX_SRS(mcip) != NULL) {
		mac_tx_srs_restart(MCIP_TX_SRS(mcip));
		(void) mac_flow_walk_nolock(mcip->mci_subflow_tab,
		    mac_tx_flow_restart, NULL);
	}
}

void
mac_tx_client_flush(mac_client_impl_t *mcip)
{
	ASSERT(MAC_PERIM_HELD((mac_handle_t)mcip->mci_mip));

	mac_tx_client_quiesce(mcip, SRS_QUIESCE);
	mac_tx_client_restart(mcip);
}

void
mac_client_quiesce(mac_client_impl_t *mcip)
{
	mac_rx_client_quiesce((mac_client_handle_t)mcip);
	mac_tx_client_quiesce(mcip, SRS_QUIESCE);
}

void
mac_client_restart(mac_client_impl_t *mcip)
{
	mac_rx_client_restart((mac_client_handle_t)mcip);
	mac_tx_client_restart(mcip);
}

/*
 * Allocate a minor number.
 */
minor_t
mac_minor_hold(boolean_t sleep)
{
	minor_t	minor;

	/*
	 * Grab a value from the arena.
	 */
	atomic_add_32(&minor_count, 1);

	if (sleep)
		minor = (uint_t)id_alloc(minor_ids);
	else
		minor = (uint_t)id_alloc_nosleep(minor_ids);

	if (minor == 0) {
		atomic_add_32(&minor_count, -1);
		return (0);
	}

	return (minor);
}

/*
 * Release a previously allocated minor number.
 */
void
mac_minor_rele(minor_t minor)
{
	/*
	 * Return the value to the arena.
	 */
	id_free(minor_ids, minor);
	atomic_add_32(&minor_count, -1);
}

uint32_t
mac_no_notification(mac_handle_t mh)
{
	mac_impl_t *mip = (mac_impl_t *)mh;

	return (((mip->mi_state_flags & MIS_LEGACY) != 0) ?
	    mip->mi_capab_legacy.ml_unsup_note : 0);
}

/*
 * Prevent any new opens of this mac in preparation for unregister
 */
int
i_mac_disable(mac_impl_t *mip)
{
	mac_client_impl_t	*mcip;

	rw_enter(&i_mac_impl_lock, RW_WRITER);
	if (mip->mi_state_flags & MIS_DISABLED) {
		/* Already disabled, return success */
		rw_exit(&i_mac_impl_lock);
		return (0);
	}
	/*
	 * See if there are any other references to this mac_t (e.g., VLAN's).
	 * If so return failure.
	 * If all the other checks below pass, then set MIS_DISABLED in
	 * mi_state_flags atomically under the i_mac_impl_lock to prevent
	 * any new VLAN's from being created or new mac client opens of this
	 * mac end point.
	 */
	if (mip->mi_ref > 0) {
		rw_exit(&i_mac_impl_lock);
		return (EBUSY);
	}

	/*
	 * mac clients must delete all multicast groups they join before
	 * closing. bcast groups are reference counted, the last client
	 * to delete the group will wait till the group is physically
	 * deleted. Since all clients have closed this mac end point
	 * mi_bcast_ngrps must be zero at this point
	 */
	ASSERT(mip->mi_bcast_ngrps == 0);

	/*
	 * Don't let go of this if it has some flows.
	 * All other code guarantees no flows are added to a disabled
	 * mac, therefore it is sufficient to check for the flow table
	 * only here.
	 */
	mcip = mac_primary_client_handle(mip);
	if ((mcip != NULL) && mac_link_has_flows((mac_client_handle_t)mcip)) {
		rw_exit(&i_mac_impl_lock);
		return (ENOTEMPTY);
	}

	mip->mi_state_flags |= MIS_DISABLED;
	rw_exit(&i_mac_impl_lock);
	return (0);
}

int
mac_disable_nowait(mac_handle_t mh)
{
	mac_impl_t	*mip = (mac_impl_t *)mh;
	int		err;

	if ((err = i_mac_perim_enter_nowait(mip)) != 0)
		return (err);
	err = i_mac_disable(mip);
	i_mac_perim_exit(mip);
	return (err);
}

int
mac_disable(mac_handle_t mh)
{
	mac_impl_t	*mip = (mac_impl_t *)mh;
	int		err;

	i_mac_perim_enter(mip);
	err = i_mac_disable(mip);
	i_mac_perim_exit(mip);

	/*
	 * Clean up notification thread and wait for it to exit.
	 */
	if (err == 0)
		i_mac_notify_exit(mip);

	return (err);
}

/*
 * Called when the MAC instance has a non empty flow table, to de-multiplex
 * incoming packets to the right flow.
 * The MAC's rw lock is assumed held as a READER.
 */
/* ARGSUSED */
static mblk_t *
mac_rx_classify(mac_impl_t *mip, mac_resource_handle_t mrh, mblk_t *mp)
{
	flow_entry_t	*flent = NULL;
	uint_t		flags = FLOW_INBOUND;
	int		err;

	/*
	 * If the mac is a port of an aggregation, pass FLOW_IGNORE_VLAN
	 * to mac_flow_lookup() so that the VLAN packets can be successfully
	 * passed to the non-VLAN aggregation flows.
	 *
	 * Note that there is possibly a race between this and
	 * mac_unicast_remove/add() and VLAN packets could be incorrectly
	 * classified to non-VLAN flows of non-aggregation mac clients. These
	 * VLAN packets will be then filtered out by the mac module.
	 */
	if ((mip->mi_state_flags & MIS_EXCLUSIVE) != 0)
		flags |= FLOW_IGNORE_VLAN;

	err = mac_flow_lookup(mip->mi_flow_tab, mp, flags, &flent);
	if (err != 0) {
		/* no registered receive function */
		return (mp);
	} else {
		mac_client_impl_t	*mcip;

		/*
		 * This flent might just be an additional one on the MAC client,
		 * i.e. for classification purposes (different fdesc), however
		 * the resources, SRS et al., are in the mci_flent, so if
		 * this isn't the mci_flent, we need to get it.
		 */
		if ((mcip = flent->fe_mcip) != NULL &&
		    mcip->mci_flent != flent) {
			FLOW_REFRELE(flent);
			flent = mcip->mci_flent;
			FLOW_TRY_REFHOLD(flent, err);
			if (err != 0)
				return (mp);
		}
		(flent->fe_cb_fn)(flent->fe_cb_arg1, flent->fe_cb_arg2, mp,
		    B_FALSE);
		FLOW_REFRELE(flent);
	}
	return (NULL);
}

mblk_t *
mac_rx_flow(mac_handle_t mh, mac_resource_handle_t mrh, mblk_t *mp_chain)
{
	mac_impl_t	*mip = (mac_impl_t *)mh;
	mblk_t		*bp, *bp1, **bpp, *list = NULL;

	/*
	 * We walk the chain and attempt to classify each packet.
	 * The packets that couldn't be classified will be returned
	 * back to the caller.
	 */
	bp = mp_chain;
	bpp = &list;
	while (bp != NULL) {
		bp1 = bp;
		bp = bp->b_next;
		bp1->b_next = NULL;

		if (mac_rx_classify(mip, mrh, bp1) != NULL) {
			*bpp = bp1;
			bpp = &bp1->b_next;
		}
	}
	return (list);
}

static int
mac_tx_flow_srs_wakeup(flow_entry_t *flent, void *arg)
{
	mac_ring_handle_t ring = arg;

	if (flent->fe_tx_srs)
		mac_tx_srs_wakeup(flent->fe_tx_srs, ring);
	return (0);
}

void
i_mac_tx_srs_notify(mac_impl_t *mip, mac_ring_handle_t ring)
{
	mac_client_impl_t	*cclient;
	mac_soft_ring_set_t	*mac_srs;

	/*
	 * After grabbing the mi_rw_lock, the list of clients can't change.
	 * If there are any clients mi_disabled must be B_FALSE and can't
	 * get set since there are clients. If there aren't any clients we
	 * don't do anything. In any case the mip has to be valid. The driver
	 * must make sure that it goes single threaded (with respect to mac
	 * calls) and wait for all pending mac calls to finish before calling
	 * mac_unregister.
	 */
	rw_enter(&i_mac_impl_lock, RW_READER);
	if (mip->mi_state_flags & MIS_DISABLED) {
		rw_exit(&i_mac_impl_lock);
		return;
	}

	/*
	 * Get MAC tx srs from walking mac_client_handle list.
	 */
	rw_enter(&mip->mi_rw_lock, RW_READER);
	for (cclient = mip->mi_clients_list; cclient != NULL;
	    cclient = cclient->mci_client_next) {
		if ((mac_srs = MCIP_TX_SRS(cclient)) != NULL)
			mac_tx_srs_wakeup(mac_srs, ring);
		(void) mac_flow_walk(cclient->mci_subflow_tab,
		    mac_tx_flow_srs_wakeup, ring);
	}
	rw_exit(&mip->mi_rw_lock);
	rw_exit(&i_mac_impl_lock);
}

/* ARGSUSED */
void
mac_multicast_refresh(mac_handle_t mh, mac_multicst_t refresh, void *arg,
    boolean_t add)
{
	mac_impl_t *mip = (mac_impl_t *)mh;

	i_mac_perim_enter((mac_impl_t *)mh);
	/*
	 * If no specific refresh function was given then default to the
	 * driver's m_multicst entry point.
	 */
	if (refresh == NULL) {
		refresh = mip->mi_multicst;
		arg = mip->mi_driver;
	}

	mac_bcast_refresh(mip, refresh, arg, add);
	i_mac_perim_exit((mac_impl_t *)mh);
}

void
mac_promisc_refresh(mac_handle_t mh, mac_setpromisc_t refresh, void *arg)
{
	mac_impl_t	*mip = (mac_impl_t *)mh;

	/*
	 * If no specific refresh function was given then default to the
	 * driver's m_promisc entry point.
	 */
	if (refresh == NULL) {
		refresh = mip->mi_setpromisc;
		arg = mip->mi_driver;
	}
	ASSERT(refresh != NULL);

	/*
	 * Call the refresh function with the current promiscuity.
	 */
	refresh(arg, (mip->mi_devpromisc != 0));
}

/*
 * The mac client requests that the mac not change its margin size to
 * be less than the specified value. If "current" is B_TRUE, then the client
 * requests that the mac not change its margin size to be smaller than the
 * current size. Further, return the current margin size value in this case.
 *
 * We keep every requested size in an ordered list from largest to smallest.
 */
int
mac_margin_add(mac_handle_t mh, uint32_t *marginp, boolean_t current)
{
	mac_impl_t		*mip = (mac_impl_t *)mh;
	mac_margin_req_t	**pp, *p;
	int			err = 0;

	rw_enter(&(mip->mi_rw_lock), RW_WRITER);
	if (current)
		*marginp = mip->mi_margin;

	/*
	 * If the current margin value cannot satisfy the margin requested,
	 * return ENOTSUP directly.
	 */
	if (*marginp > mip->mi_margin) {
		err = ENOTSUP;
		goto done;
	}

	/*
	 * Check whether the given margin is already in the list. If so,
	 * bump the reference count.
	 */
	for (pp = &mip->mi_mmrp; (p = *pp) != NULL; pp = &p->mmr_nextp) {
		if (p->mmr_margin == *marginp) {
			/*
			 * The margin requested is already in the list,
			 * so just bump the reference count.
			 */
			p->mmr_ref++;
			goto done;
		}
		if (p->mmr_margin < *marginp)
			break;
	}

	p = kmem_zalloc(sizeof (mac_margin_req_t), KM_SLEEP);
	p->mmr_margin = *marginp;
	p->mmr_ref++;
	p->mmr_nextp = *pp;
	*pp = p;

done:
	rw_exit(&(mip->mi_rw_lock));
	return (err);
}

/*
 * The mac client requests to cancel its previous mac_margin_add() request.
 * We remove the requested margin size from the list.
 */
int
mac_margin_remove(mac_handle_t mh, uint32_t margin)
{
	mac_impl_t		*mip = (mac_impl_t *)mh;
	mac_margin_req_t	**pp, *p;
	int			err = 0;

	rw_enter(&(mip->mi_rw_lock), RW_WRITER);
	/*
	 * Find the entry in the list for the given margin.
	 */
	for (pp = &(mip->mi_mmrp); (p = *pp) != NULL; pp = &(p->mmr_nextp)) {
		if (p->mmr_margin == margin) {
			if (--p->mmr_ref == 0)
				break;

			/*
			 * There is still a reference to this margin so
			 * there's nothing more to do.
			 */
			goto done;
		}
	}

	/*
	 * We did not find an entry for the given margin.
	 */
	if (p == NULL) {
		err = ENOENT;
		goto done;
	}

	ASSERT(p->mmr_ref == 0);

	/*
	 * Remove it from the list.
	 */
	*pp = p->mmr_nextp;
	kmem_free(p, sizeof (mac_margin_req_t));
done:
	rw_exit(&(mip->mi_rw_lock));
	return (err);
}

boolean_t
mac_margin_update(mac_handle_t mh, uint32_t margin)
{
	mac_impl_t	*mip = (mac_impl_t *)mh;
	uint32_t	margin_needed = 0;

	rw_enter(&(mip->mi_rw_lock), RW_WRITER);

	if (mip->mi_mmrp != NULL)
		margin_needed = mip->mi_mmrp->mmr_margin;

	if (margin_needed <= margin)
		mip->mi_margin = margin;

	rw_exit(&(mip->mi_rw_lock));

	if (margin_needed <= margin)
		i_mac_notify(mip, MAC_NOTE_MARGIN);

	return (margin_needed <= margin);
}

/*
 * MAC Type Plugin functions.
 */

mactype_t *
mactype_getplugin(const char *pname)
{
	mactype_t	*mtype = NULL;
	boolean_t	tried_modload = B_FALSE;

	mutex_enter(&i_mactype_lock);

find_registered_mactype:
	if (mod_hash_find(i_mactype_hash, (mod_hash_key_t)pname,
	    (mod_hash_val_t *)&mtype) != 0) {
		if (!tried_modload) {
			/*
			 * If the plugin has not yet been loaded, then
			 * attempt to load it now.
			 * If modload() succeeds,
			 * the plugin should have registered using
			 * mactype_register(), in which case we can go back
			 * and attempt to find it again.
			 */
			if (modload(MACTYPE_KMODDIR, (char *)pname) != -1) {
				tried_modload = B_TRUE;
				goto find_registered_mactype;
			}
		}
	} else {
		/*
		 * Note that there's no danger that the plugin we've loaded
		 * could be unloaded between the modload() step and the
		 * reference count bump here, as we're holding
		 * i_mactype_lock, which mactype_unregister() also holds.
		 */
		atomic_inc_32(&mtype->mt_ref);
	}

	mutex_exit(&i_mactype_lock);
	return (mtype);
}

mactype_register_t *
mactype_alloc(uint_t mactype_version)
{
	mactype_register_t *mtrp;

	/*
	 * Make sure there isn't a version mismatch between the plugin and
	 * the framework. In the future, if multiple versions are
	 * supported, this check could become more sophisticated.
	 */
	if (mactype_version != MACTYPE_VERSION)
		return (NULL);

	mtrp = kmem_zalloc(sizeof (mactype_register_t), KM_SLEEP);
	mtrp->mtr_version = mactype_version;
	return (mtrp);
}

void
mactype_free(mactype_register_t *mtrp)
{
	kmem_free(mtrp, sizeof (mactype_register_t));
}

int
mactype_register(mactype_register_t *mtrp)
{
	mactype_t	*mtp;
	mactype_ops_t	*ops = mtrp->mtr_ops;

	/* Do some sanity checking before we register this MAC type. */
	if (mtrp->mtr_ident == NULL || ops == NULL)
		return (EINVAL);

	/*
	 * Verify that all mandatory callbacks are set in the ops
	 * vector.
	 */
	if (ops->mtops_unicst_verify == NULL ||
	    ops->mtops_multicst_verify == NULL ||
	    ops->mtops_sap_verify == NULL ||
	    ops->mtops_header == NULL ||
	    ops->mtops_header_info == NULL) {
		return (EINVAL);
	}

	mtp = kmem_zalloc(sizeof (*mtp), KM_SLEEP);
	mtp->mt_ident = mtrp->mtr_ident;
	mtp->mt_ops = *ops;
	mtp->mt_type = mtrp->mtr_mactype;
	mtp->mt_nativetype = mtrp->mtr_nativetype;
	mtp->mt_addr_length = mtrp->mtr_addrlen;
	if (mtrp->mtr_brdcst_addr != NULL) {
		mtp->mt_brdcst_addr = kmem_alloc(mtrp->mtr_addrlen, KM_SLEEP);
		bcopy(mtrp->mtr_brdcst_addr, mtp->mt_brdcst_addr,
		    mtrp->mtr_addrlen);
	}

	mtp->mt_stats = mtrp->mtr_stats;
	mtp->mt_statcount = mtrp->mtr_statcount;

	mtp->mt_mapping = mtrp->mtr_mapping;
	mtp->mt_mappingcount =
	    mtrp->mtr_mappingcount;

	if (mod_hash_insert(i_mactype_hash,
	    (mod_hash_key_t)mtp->mt_ident, (mod_hash_val_t)mtp) != 0) {
		/* mt_brdcst_addr is allocated only if one was supplied. */
		if (mtp->mt_brdcst_addr != NULL)
			kmem_free(mtp->mt_brdcst_addr, mtp->mt_addr_length);
		kmem_free(mtp, sizeof (*mtp));
		return (EEXIST);
	}
	return (0);
}

int
mactype_unregister(const char *ident)
{
	mactype_t	*mtp;
	mod_hash_val_t	val;
	int		err;

	/*
	 * Let's not allow MAC drivers to use this plugin while we're
	 * trying to unregister it. Holding i_mactype_lock also prevents a
	 * plugin from unregistering while a MAC driver is attempting to
	 * hold a reference to it in mactype_getplugin().
	 */
	mutex_enter(&i_mactype_lock);

	if ((err = mod_hash_find(i_mactype_hash, (mod_hash_key_t)ident,
	    (mod_hash_val_t *)&mtp)) != 0) {
		/* A plugin is trying to unregister, but it never registered. */
		err = ENXIO;
		goto done;
	}

	if (mtp->mt_ref != 0) {
		err = EBUSY;
		goto done;
	}

	err = mod_hash_remove(i_mactype_hash, (mod_hash_key_t)ident, &val);
	ASSERT(err == 0);
	if (err != 0) {
		/*
		 * This should never happen, thus the ASSERT() above.
		 */
		err = EINVAL;
		goto done;
	}
	ASSERT(mtp == (mactype_t *)val);

	if (mtp->mt_brdcst_addr != NULL)
		kmem_free(mtp->mt_brdcst_addr, mtp->mt_addr_length);
	kmem_free(mtp, sizeof (mactype_t));
done:
	mutex_exit(&i_mactype_lock);
	return (err);
}

/*
 * mac_set_prop() sets mac or hardware driver properties:
 *	MAC resource properties include maxbw, priority, and cpu binding list.
 *	Driver properties are private properties to the hardware, such as mtu
 *	and speed. There's one other MAC property -- the PVID.
 * If the property is a driver property, mac_set_prop() calls driver's callback
 * function to set it.
 * If the property is a mac resource property, mac_set_prop() invokes
 * mac_set_resources() which will cache the property value in mac_impl_t and
 * may call mac_client_set_resource() to update property value of the primary
 * mac client, if it exists.
 */
int
mac_set_prop(mac_handle_t mh, mac_prop_t *macprop, void *val, uint_t valsize)
{
	int err = ENOTSUP;
	mac_impl_t *mip = (mac_impl_t *)mh;

	ASSERT(MAC_PERIM_HELD(mh));

	switch (macprop->mp_id) {
	case MAC_PROP_MAXBW:
	case MAC_PROP_PRIO:
	case MAC_PROP_PROTECT:
	case MAC_PROP_BIND_CPU: {
		mac_resource_props_t mrp;

		/* If it is mac property, call mac_set_resources() */
		if (valsize < sizeof (mac_resource_props_t))
			return (EINVAL);
		bcopy(val, &mrp, sizeof (mrp));
		err = mac_set_resources(mh, &mrp);
		break;
	}

	case MAC_PROP_PVID:
		if (valsize < sizeof (uint16_t) ||
		    (mip->mi_state_flags & MIS_IS_VNIC))
			return (EINVAL);
		err = mac_set_pvid(mh, *(uint16_t *)val);
		break;

	case MAC_PROP_MTU: {
		uint32_t mtu;

		if (valsize < sizeof (mtu))
			return (EINVAL);
		bcopy(val, &mtu, sizeof (mtu));
		err = mac_set_mtu(mh, mtu, NULL);
		break;
	}

	case MAC_PROP_LLIMIT:
	case MAC_PROP_LDECAY: {
		uint32_t learnval;

		if (valsize < sizeof (learnval) ||
		    (mip->mi_state_flags & MIS_IS_VNIC))
			return (EINVAL);
		bcopy(val, &learnval, sizeof (learnval));
		if (learnval == 0 && macprop->mp_id == MAC_PROP_LDECAY)
			return (EINVAL);
		if (macprop->mp_id == MAC_PROP_LLIMIT)
			mip->mi_llimit = learnval;
		else
			mip->mi_ldecay = learnval;
		err = 0;
		break;
	}

	default:
		/* For other driver properties, call driver's callback */
		if (mip->mi_callbacks->mc_callbacks & MC_SETPROP) {
			err = mip->mi_callbacks->mc_setprop(mip->mi_driver,
			    macprop->mp_name, macprop->mp_id, valsize, val);
		}
	}
	return (err);
}

/*
 * mac_get_prop() gets mac or hardware driver properties.
 *
 * If the property is a driver property, mac_get_prop() calls driver's callback
 * function to get it.
 * If the property is a mac property, mac_get_prop() invokes mac_get_resources()
 * which returns the cached value in mac_impl_t.
 */
int
mac_get_prop(mac_handle_t mh, mac_prop_t *macprop, void *val, uint_t valsize,
    uint_t *perm)
{
	int err = ENOTSUP;
	mac_impl_t *mip = (mac_impl_t *)mh;
	link_state_t link_state;
	boolean_t is_getprop, is_setprop;

	is_getprop = (mip->mi_callbacks->mc_callbacks & MC_GETPROP);
	is_setprop = (mip->mi_callbacks->mc_callbacks & MC_SETPROP);

	switch (macprop->mp_id) {
	case MAC_PROP_MAXBW:
	case MAC_PROP_PRIO:
	case MAC_PROP_PROTECT:
	case MAC_PROP_BIND_CPU: {
		mac_resource_props_t mrp;

		/* If mac property, read from cache */
		if (valsize < sizeof (mac_resource_props_t))
			return (EINVAL);
		mac_get_resources(mh, &mrp);
		bcopy(&mrp, val, sizeof (mac_resource_props_t));
		return (0);
	}

	case MAC_PROP_PVID:
		if (valsize < sizeof (uint16_t) ||
		    (mip->mi_state_flags & MIS_IS_VNIC))
			return (EINVAL);
		*(uint16_t *)val = mac_get_pvid(mh);
		return (0);

	case MAC_PROP_LLIMIT:
	case MAC_PROP_LDECAY:
		if (valsize < sizeof (uint32_t) ||
		    (mip->mi_state_flags & MIS_IS_VNIC))
			return (EINVAL);
		if (macprop->mp_id == MAC_PROP_LLIMIT)
			bcopy(&mip->mi_llimit, val, sizeof (mip->mi_llimit));
		else
			bcopy(&mip->mi_ldecay, val, sizeof (mip->mi_ldecay));
		return (0);

	case MAC_PROP_MTU: {
		uint32_t sdu;
		mac_propval_range_t range;

		if ((macprop->mp_flags & MAC_PROP_POSSIBLE) != 0) {
			if (valsize < sizeof (mac_propval_range_t))
				return (EINVAL);
			if (is_getprop) {
				err = mip->mi_callbacks->mc_getprop(mip->
				    mi_driver, macprop->mp_name, macprop->mp_id,
				    macprop->mp_flags, valsize, val, perm);
			}
			/*
			 * If the driver doesn't have *_m_getprop defined or
			 * if the driver doesn't support setting MTU then
			 * return the CURRENT value as POSSIBLE value.
			 */
			if (!is_getprop || err == ENOTSUP) {
				mac_sdu_get(mh, NULL, &sdu);
				range.mpr_count = 1;
				range.mpr_type = MAC_PROPVAL_UINT32;
				range.range_uint32[0].mpur_min =
				    range.range_uint32[0].mpur_max = sdu;
				bcopy(&range, val, sizeof (range));
				err = 0;
			}
			return (err);
		}
		if (valsize < sizeof (sdu))
			return (EINVAL);
		if ((macprop->mp_flags & MAC_PROP_DEFAULT) == 0) {
			mac_sdu_get(mh, NULL, &sdu);
			bcopy(&sdu, val, sizeof (sdu));
			if (is_setprop && (mip->mi_callbacks->mc_setprop(mip->
			    mi_driver, macprop->mp_name, macprop->mp_id,
			    valsize, val) == 0)) {
				*perm = MAC_PROP_PERM_RW;
			} else {
				*perm = MAC_PROP_PERM_READ;
			}
			return (0);
		} else {
			if (mip->mi_info.mi_media == DL_ETHER) {
				sdu = ETHERMTU;
				bcopy(&sdu, val, sizeof (sdu));

				return (0);
			}
			/*
			 * ask driver for its default.
			 */
			break;
		}
	}
	case MAC_PROP_STATUS:
		if (valsize < sizeof (link_state))
			return (EINVAL);
		*perm = MAC_PROP_PERM_READ;
		link_state = mac_link_get(mh);
		bcopy(&link_state, val, sizeof (link_state));
		return (0);
	default:
		break;

	}
	/* If driver property, request from driver */
	if (is_getprop) {
		err = mip->mi_callbacks->mc_getprop(mip->mi_driver,
		    macprop->mp_name, macprop->mp_id, macprop->mp_flags,
		    valsize, val, perm);
	}
	return (err);
}

int
mac_fastpath_disable(mac_handle_t mh)
{
	mac_impl_t *mip = (mac_impl_t *)mh;

	if ((mip->mi_state_flags & MIS_LEGACY) == 0)
		return (0);

	return (mip->mi_capab_legacy.ml_fastpath_disable(mip->mi_driver));
}

void
mac_fastpath_enable(mac_handle_t mh)
{
	mac_impl_t *mip = (mac_impl_t *)mh;

	if ((mip->mi_state_flags & MIS_LEGACY) == 0)
		return;

	mip->mi_capab_legacy.ml_fastpath_enable(mip->mi_driver);
}

void
mac_register_priv_prop(mac_impl_t *mip, mac_priv_prop_t *mpp, uint_t nprop)
{
	mac_priv_prop_t *mpriv;

	if (mpp == NULL)
		return;

	mpriv = kmem_zalloc(nprop * sizeof (*mpriv), KM_SLEEP);
	(void) memcpy(mpriv, mpp, nprop * sizeof (*mpriv));
	mip->mi_priv_prop = mpriv;
	mip->mi_priv_prop_count = nprop;
}

void
mac_unregister_priv_prop(mac_impl_t *mip)
{
	mac_priv_prop_t *mpriv;

	mpriv = mip->mi_priv_prop;
	if (mpriv != NULL) {
		kmem_free(mpriv, mip->mi_priv_prop_count * sizeof (*mpriv));
		mip->mi_priv_prop = NULL;
	}
	mip->mi_priv_prop_count = 0;
}

/*
 * mac_ring_t 'mr' macros. Some rogue drivers may access the ring structure
 * (by invoking mac_rx()) even after processing mac_stop_ring(). In such
 * cases, if MAC frees the ring structure after mac_stop_ring(), any
 * illegal access to the ring structure coming from the driver will panic
 * the system. In order to protect the system from such inadvertent access,
 * we maintain a cache of rings in the mac_impl_t after they are freed.
 * When packets are received on freed rings, MAC (through the generation
 * count mechanism) will drop such packets.
 */
static mac_ring_t *
mac_ring_alloc(mac_impl_t *mip, mac_capab_rings_t *cap_rings)
{
	mac_ring_t *ring;

	if (cap_rings->mr_type == MAC_RING_TYPE_RX) {
		mutex_enter(&mip->mi_ring_lock);
		if (mip->mi_ring_freelist != NULL) {
			ring = mip->mi_ring_freelist;
			mip->mi_ring_freelist = ring->mr_next;
			bzero(ring, sizeof (mac_ring_t));
		} else {
			ring = kmem_cache_alloc(mac_ring_cache, KM_SLEEP);
		}
		mutex_exit(&mip->mi_ring_lock);
	} else {
		ring = kmem_zalloc(sizeof (mac_ring_t), KM_SLEEP);
	}
	ASSERT((ring != NULL) && (ring->mr_state == MR_FREE));
	return (ring);
}

static void
mac_ring_free(mac_impl_t *mip, mac_ring_t *ring)
{
	if (ring->mr_type == MAC_RING_TYPE_RX) {
		mutex_enter(&mip->mi_ring_lock);
		ring->mr_state = MR_FREE;
		ring->mr_flag = 0;
		ring->mr_next = mip->mi_ring_freelist;
		mip->mi_ring_freelist = ring;
		mutex_exit(&mip->mi_ring_lock);
	} else {
		kmem_free(ring, sizeof (mac_ring_t));
	}
}

static void
mac_ring_freeall(mac_impl_t *mip)
{
	mac_ring_t *ring_next;
	mutex_enter(&mip->mi_ring_lock);
	mac_ring_t *ring = mip->mi_ring_freelist;
	while (ring != NULL) {
		ring_next =
		    ring->mr_next;
		kmem_cache_free(mac_ring_cache, ring);
		ring = ring_next;
	}
	mip->mi_ring_freelist = NULL;
	mutex_exit(&mip->mi_ring_lock);
}

int
mac_start_ring(mac_ring_t *ring)
{
	int rv = 0;

	if (ring->mr_start != NULL)
		rv = ring->mr_start(ring->mr_driver, ring->mr_gen_num);

	return (rv);
}

void
mac_stop_ring(mac_ring_t *ring)
{
	if (ring->mr_stop != NULL)
		ring->mr_stop(ring->mr_driver);

	/*
	 * Increment the ring generation number for this ring.
	 */
	ring->mr_gen_num++;
}

int
mac_start_group(mac_group_t *group)
{
	int rv = 0;

	if (group->mrg_start != NULL)
		rv = group->mrg_start(group->mrg_driver);

	return (rv);
}

void
mac_stop_group(mac_group_t *group)
{
	if (group->mrg_stop != NULL)
		group->mrg_stop(group->mrg_driver);
}

/*
 * Called from mac_start() on the default Rx group. Broadcast and multicast
 * packets are received only on the default group.
 * Hence the default group needs to be up even if the primary client is not
 * up, for the other groups to be functional. We do this by calling this
 * function at mac_start time itself. However the broadcast packets that are
 * received can't make their way beyond mac_rx until a mac client creates a
 * broadcast flow.
 */
static int
mac_start_group_and_rings(mac_group_t *group)
{
	mac_ring_t *ring;
	int rv = 0;

	ASSERT(group->mrg_state == MAC_GROUP_STATE_REGISTERED);
	if ((rv = mac_start_group(group)) != 0)
		return (rv);

	for (ring = group->mrg_rings; ring != NULL; ring = ring->mr_next) {
		ASSERT(ring->mr_state == MR_FREE);
		if ((rv = mac_start_ring(ring)) != 0)
			goto error;
		ring->mr_state = MR_INUSE;
		ring->mr_classify_type = MAC_SW_CLASSIFIER;
	}
	return (0);

error:
	mac_stop_group_and_rings(group);
	return (rv);
}

/* Called from mac_stop on the default Rx group */
static void
mac_stop_group_and_rings(mac_group_t *group)
{
	mac_ring_t *ring;

	for (ring = group->mrg_rings; ring != NULL; ring = ring->mr_next) {
		if (ring->mr_state != MR_FREE) {
			mac_stop_ring(ring);
			ring->mr_state = MR_FREE;
			ring->mr_flag = 0;
			ring->mr_classify_type = MAC_NO_CLASSIFIER;
		}
	}
	mac_stop_group(group);
}

static mac_ring_t *
mac_init_ring(mac_impl_t *mip, mac_group_t *group, int index,
    mac_capab_rings_t *cap_rings)
{
	mac_ring_t *ring;
	mac_ring_info_t ring_info;

	ring = mac_ring_alloc(mip, cap_rings);

	/* Prepare basic information of ring */
	ring->mr_index = index;
	ring->mr_type = group->mrg_type;
	ring->mr_gh = (mac_group_handle_t)group;

	/* Insert the new ring to the list. */
	ring->mr_next = group->mrg_rings;
	group->mrg_rings = ring;

	/* Zero to reuse the info data structure */
	bzero(&ring_info, sizeof (ring_info));

	/* Query ring information from driver */
	cap_rings->mr_rget(mip->mi_driver, group->mrg_type, group->mrg_index,
	    index, &ring_info, (mac_ring_handle_t)ring);

	ring->mr_info = ring_info;

	/* Update ring's status */
	ring->mr_state = MR_FREE;
	ring->mr_flag = 0;

	/* Update the ring count of the group */
	group->mrg_cur_count++;
	return (ring);
}

/*
 * Rings are chained together for easy regrouping.
 */
static void
mac_init_group(mac_impl_t *mip, mac_group_t *group, int size,
    mac_capab_rings_t *cap_rings)
{
	int index;

	/*
	 * Initialize all ring members of this group. Size of zero will not
	 * enter the loop, so it's safe for initializing an empty group.
	 */
	for (index = size - 1; index >= 0; index--)
		(void) mac_init_ring(mip, group, index, cap_rings);
}

int
mac_init_rings(mac_impl_t *mip, mac_ring_type_t rtype)
{
	mac_capab_rings_t *cap_rings;
	mac_group_t *group, *groups;
	mac_group_info_t group_info;
	uint_t group_free = 0;
	uint_t ring_left;
	mac_ring_t *ring;
	int g, err = 0;

	switch (rtype) {
	case MAC_RING_TYPE_RX:
		ASSERT(mip->mi_rx_groups == NULL);

		cap_rings = &mip->mi_rx_rings_cap;
		cap_rings->mr_type = MAC_RING_TYPE_RX;
		break;
	case MAC_RING_TYPE_TX:
		ASSERT(mip->mi_tx_groups == NULL);

		cap_rings = &mip->mi_tx_rings_cap;
		cap_rings->mr_type = MAC_RING_TYPE_TX;
		break;
	default:
		ASSERT(B_FALSE);
	}

	if (!i_mac_capab_get((mac_handle_t)mip, MAC_CAPAB_RINGS,
	    cap_rings))
		return (0);

	/*
	 * Allocate a contiguous buffer
	 * for all groups.
	 */
	groups = kmem_zalloc(sizeof (mac_group_t) * (cap_rings->mr_gnum + 1),
	    KM_SLEEP);

	ring_left = cap_rings->mr_rnum;

	/*
	 * Get all ring groups if any, and get their ring members
	 * if any.
	 */
	for (g = 0; g < cap_rings->mr_gnum; g++) {
		group = groups + g;

		/* Prepare basic information of the group */
		group->mrg_index = g;
		group->mrg_type = rtype;
		group->mrg_state = MAC_GROUP_STATE_UNINIT;
		group->mrg_mh = (mac_handle_t)mip;
		group->mrg_next = group + 1;

		/* Zero to reuse the info data structure */
		bzero(&group_info, sizeof (group_info));

		/* Query group information from driver */
		cap_rings->mr_gget(mip->mi_driver, rtype, g, &group_info,
		    (mac_group_handle_t)group);

		switch (cap_rings->mr_group_type) {
		case MAC_GROUP_TYPE_DYNAMIC:
			if (cap_rings->mr_gaddring == NULL ||
			    cap_rings->mr_gremring == NULL) {
				DTRACE_PROBE3(
				    mac__init__rings_no_addremring,
				    char *, mip->mi_name,
				    mac_group_add_ring_t,
				    cap_rings->mr_gaddring,
				    mac_group_add_ring_t,
				    cap_rings->mr_gremring);
				err = EINVAL;
				goto bail;
			}

			switch (rtype) {
			case MAC_RING_TYPE_RX:
				/*
				 * The first RX group must have
				 * non-zero rings, and the following groups
				 * must have zero rings.
				 */
				if (g == 0 && group_info.mgi_count == 0) {
					DTRACE_PROBE1(
					    mac__init__rings__rx__def__zero,
					    char *, mip->mi_name);
					err = EINVAL;
					goto bail;
				}
				if (g > 0 && group_info.mgi_count != 0) {
					DTRACE_PROBE3(
					    mac__init__rings__rx__nonzero,
					    char *, mip->mi_name,
					    int, g, int, group_info.mgi_count);
					err = EINVAL;
					goto bail;
				}
				break;
			case MAC_RING_TYPE_TX:
				/*
				 * All TX ring groups must have zero rings.
				 */
				if (group_info.mgi_count != 0) {
					DTRACE_PROBE3(
					    mac__init__rings__tx__nonzero,
					    char *, mip->mi_name,
					    int, g, int, group_info.mgi_count);
					err = EINVAL;
					goto bail;
				}
				break;
			}
			break;
		case MAC_GROUP_TYPE_STATIC:
			/*
			 * Note that an empty group is allowed, e.g., an aggr
			 * would start with an empty group.
			 */
			break;
		default:
			/* unknown group type */
			DTRACE_PROBE2(mac__init__rings__unknown__type,
			    char *, mip->mi_name,
			    int, cap_rings->mr_group_type);
			err = EINVAL;
			goto bail;
		}

		/*
		 * Driver must register group->mgi_addmac/remmac() for rx groups
		 * to support multiple MAC addresses.
		 */
		if (rtype == MAC_RING_TYPE_RX) {
			if ((group_info.mgi_addmac == NULL) ||
			    (group_info.mgi_remmac == NULL))
				goto bail;
		}

		/* Cache driver-supplied information */
		group->mrg_info = group_info;

		/* Update the group's status and group count.
		 */
		mac_set_rx_group_state(group, MAC_GROUP_STATE_REGISTERED);
		group_free++;

		group->mrg_rings = NULL;
		group->mrg_cur_count = 0;
		mac_init_group(mip, group, group_info.mgi_count, cap_rings);
		ring_left -= group_info.mgi_count;

		/* The current group size should be equal to default value */
		ASSERT(group->mrg_cur_count == group_info.mgi_count);
	}

	/* Build up a dummy group for free resources as a pool */
	group = groups + cap_rings->mr_gnum;

	/* Prepare basic information of the group */
	group->mrg_index = -1;
	group->mrg_type = rtype;
	group->mrg_state = MAC_GROUP_STATE_UNINIT;
	group->mrg_mh = (mac_handle_t)mip;
	group->mrg_next = NULL;

	/*
	 * If there are ungrouped rings, collect them in the dummy group,
	 * which serves as the pool of remaining resources.
	 */
	if (ring_left != 0) {
		group->mrg_rings = NULL;
		group->mrg_cur_count = 0;
		mac_init_group(mip, group, ring_left, cap_rings);

		/* The current group size should be equal to ring_left */
		ASSERT(group->mrg_cur_count == ring_left);

		ring_left = 0;

		/* Update this group's status */
		mac_set_rx_group_state(group, MAC_GROUP_STATE_REGISTERED);
	} else
		group->mrg_rings = NULL;

	ASSERT(ring_left == 0);

bail:
	/* Cache other important information to finalize the initialization */
	switch (rtype) {
	case MAC_RING_TYPE_RX:
		mip->mi_rx_group_type = cap_rings->mr_group_type;
		mip->mi_rx_group_count = cap_rings->mr_gnum;
		mip->mi_rx_groups = groups;
		break;
	case MAC_RING_TYPE_TX:
		mip->mi_tx_group_type = cap_rings->mr_group_type;
		mip->mi_tx_group_count = cap_rings->mr_gnum;
		mip->mi_tx_group_free = group_free;
		mip->mi_tx_groups = groups;

		/*
		 * Ring 0 is used as the default one and it could be assigned
		 * to a client as well.
		 */
		group = groups + cap_rings->mr_gnum;
		ring = group->mrg_rings;
		while ((ring->mr_index != 0) && (ring->mr_next != NULL))
			ring = ring->mr_next;
		ASSERT(ring->mr_index == 0);
		mip->mi_default_tx_ring = (mac_ring_handle_t)ring;
		break;
	default:
		ASSERT(B_FALSE);
	}

	if (err != 0)
		mac_free_rings(mip, rtype);

	return (err);
}

/*
 * Called to free all ring groups of a particular type. It is assumed that
 * all groups have been released by the client.
 */
void
mac_free_rings(mac_impl_t *mip, mac_ring_type_t rtype)
{
	mac_group_t *group, *groups;
	uint_t group_count;

	switch (rtype) {
	case MAC_RING_TYPE_RX:
		if (mip->mi_rx_groups == NULL)
			return;

		groups = mip->mi_rx_groups;
		group_count = mip->mi_rx_group_count;

		mip->mi_rx_groups = NULL;
		mip->mi_rx_group_count = 0;
		break;
	case MAC_RING_TYPE_TX:
		ASSERT(mip->mi_tx_group_count == mip->mi_tx_group_free);

		if (mip->mi_tx_groups == NULL)
			return;

		groups = mip->mi_tx_groups;
		group_count = mip->mi_tx_group_count;

		mip->mi_tx_groups = NULL;
		mip->mi_tx_group_count = 0;
		mip->mi_tx_group_free = 0;
		mip->mi_default_tx_ring = NULL;
		break;
	default:
		ASSERT(B_FALSE);
	}

	for (group = groups; group != NULL; group = group->mrg_next) {
		mac_ring_t *ring;

		if (group->mrg_cur_count == 0)
			continue;

		ASSERT(group->mrg_rings != NULL);

		while ((ring = group->mrg_rings) != NULL) {
			group->mrg_rings = ring->mr_next;
			mac_ring_free(mip, ring);
		}
	}

	/* Free all the cached rings */
	mac_ring_freeall(mip);
	/* Free the block of group data structures */
	kmem_free(groups, sizeof (mac_group_t) * (group_count + 1));
}

/*
 * Associate a MAC address with a receive group.
 *
 * The return value of this function should always be checked properly, because
 * any type of failure could cause unexpected results. A MAC address can be
 * added to or removed from a group only after the group has been reserved.
 * Ideally, a successful reservation always leads to calling
 * mac_group_addmac() to steer desired traffic. Failure to add a unicast MAC
 * address doesn't always imply that the group is functioning abnormally.
 *
 * Currently this function is called everywhere, and it reflects assumptions
 * about MAC addresses in the implementation. CR 6735196.
 */
int
mac_group_addmac(mac_group_t *group, const uint8_t *addr)
{
	ASSERT(group->mrg_type == MAC_RING_TYPE_RX);
	ASSERT(group->mrg_info.mgi_addmac != NULL);

	return (group->mrg_info.mgi_addmac(group->mrg_info.mgi_driver, addr));
}

/*
 * Remove the association between MAC address and receive group.
 */
int
mac_group_remmac(mac_group_t *group, const uint8_t *addr)
{
	ASSERT(group->mrg_type == MAC_RING_TYPE_RX);
	ASSERT(group->mrg_info.mgi_remmac != NULL);

	return (group->mrg_info.mgi_remmac(group->mrg_info.mgi_driver, addr));
}

/*
 * Release a ring in use by marking it MR_FREE.
 * Any other client may reserve it for its use.
 */
void
mac_release_tx_ring(mac_ring_handle_t rh)
{
	mac_ring_t *ring = (mac_ring_t *)rh;
	mac_group_t *group = (mac_group_t *)ring->mr_gh;
	mac_impl_t *mip = (mac_impl_t *)group->mrg_mh;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
	ASSERT(ring->mr_state != MR_FREE);

	/*
	 * Default tx ring will be released by mac_stop().
	 */
	if (rh == mip->mi_default_tx_ring)
		return;

	mac_stop_ring(ring);

	ring->mr_state = MR_FREE;
	ring->mr_flag = 0;
}

/*
 * This is the entry point for packets transmitted through the bridging code.
 * If no bridge is in place, MAC_RING_TX transmits using tx ring. The 'rh'
 * pointer may be NULL to select the default ring.
 */
mblk_t *
mac_bridge_tx(mac_impl_t *mip, mac_ring_handle_t rh, mblk_t *mp)
{
	mac_handle_t mh;

	/*
	 * Once we take a reference on the bridge link, the bridge
	 * module itself can't unload, so the callback pointers are
	 * stable.
	 */
	mutex_enter(&mip->mi_bridge_lock);
	if ((mh = mip->mi_bridge_link) != NULL)
		mac_bridge_ref_cb(mh, B_TRUE);
	mutex_exit(&mip->mi_bridge_lock);
	if (mh == NULL) {
		MAC_RING_TX(mip, rh, mp, mp);
	} else {
		mp = mac_bridge_tx_cb(mh, rh, mp);
		mac_bridge_ref_cb(mh, B_FALSE);
	}

	return (mp);
}

/*
 * Find a ring from its index.
 */
mac_ring_t *
mac_find_ring(mac_group_t *group, int index)
{
	mac_ring_t *ring = group->mrg_rings;

	for (ring = group->mrg_rings; ring != NULL; ring = ring->mr_next)
		if (ring->mr_index == index)
			break;

	return (ring);
}

/*
 * Add a ring to an existing group.
 *
 * The ring must be either passed directly (for example if the ring
 * movement is initiated by the framework), or specified through a driver
 * index (for example when the ring is added by the driver).
 *
 * The caller needs to call mac_perim_enter() before calling this function.
 */
int
i_mac_group_add_ring(mac_group_t *group, mac_ring_t *ring, int index)
{
	mac_impl_t *mip = (mac_impl_t *)group->mrg_mh;
	mac_capab_rings_t *cap_rings;
	boolean_t driver_call = (ring == NULL);
	mac_group_type_t group_type;
	int ret = 0;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));

	switch (group->mrg_type) {
	case MAC_RING_TYPE_RX:
		cap_rings = &mip->mi_rx_rings_cap;
		group_type = mip->mi_rx_group_type;
		break;
	case MAC_RING_TYPE_TX:
		cap_rings = &mip->mi_tx_rings_cap;
		group_type = mip->mi_tx_group_type;
		break;
	default:
		ASSERT(B_FALSE);
	}

	/*
	 * There should be no ring with the same ring index in the target
	 * group.
	 */
	ASSERT(mac_find_ring(group, driver_call ? index : ring->mr_index) ==
	    NULL);

	if (driver_call) {
		/*
		 * The function is called as a result of a request from
		 * a driver to add a ring to an existing group, for example
		 * from the aggregation driver. Allocate a new mac_ring_t
		 * for that ring.
		 */
		ring = mac_init_ring(mip, group, index, cap_rings);
		ASSERT(group->mrg_state > MAC_GROUP_STATE_UNINIT);
	} else {
		/*
		 * The function is called as a result of a MAC layer request
		 * to add a ring to an existing group. In this case the
		 * ring is being moved between groups, which requires
		 * the underlying driver to support dynamic grouping,
		 * and the mac_ring_t already exists.
		 */
		ASSERT(group_type == MAC_GROUP_TYPE_DYNAMIC);
		ASSERT(cap_rings->mr_gaddring != NULL);
		ASSERT(ring->mr_gh == NULL);
	}

	/*
	 * At this point the ring should not be in use, and it should be
	 * of the right type for the target group.
	 */
	ASSERT(ring->mr_state < MR_INUSE);
	ASSERT(ring->mr_srs == NULL);
	ASSERT(ring->mr_type == group->mrg_type);

	if (!driver_call) {
		/*
		 * Add the driver level hardware ring if the process was not
		 * initiated by the driver, and the target group is not the
		 * group.
		 */
		if (group->mrg_driver != NULL) {
			cap_rings->mr_gaddring(group->mrg_driver,
			    ring->mr_driver, ring->mr_type);
		}

		/*
		 * Insert the ring ahead of the existing rings.
		 */
		ring->mr_next = group->mrg_rings;
		group->mrg_rings = ring;
		ring->mr_gh = (mac_group_handle_t)group;
		group->mrg_cur_count++;
	}

	/*
	 * If the group has not been actively used, we're done.
	 */
	if (group->mrg_index != -1 &&
	    group->mrg_state < MAC_GROUP_STATE_RESERVED)
		return (0);

	/*
	 * Set up SRS/SR according to the ring type.
	 */
	switch (ring->mr_type) {
	case MAC_RING_TYPE_RX:
		/*
		 * Set up an SRS on top of the new ring if the group is
		 * reserved for someone's exclusive use.
		 */
		if (group->mrg_state == MAC_GROUP_STATE_RESERVED) {
			flow_entry_t *flent;
			mac_client_impl_t *mcip;

			mcip = MAC_RX_GROUP_ONLY_CLIENT(group);
			ASSERT(mcip != NULL);
			flent = mcip->mci_flent;
			ASSERT(flent->fe_rx_srs_cnt > 0);
			mac_srs_group_setup(mcip, flent, group, SRST_LINK);
		}
		break;
	case MAC_RING_TYPE_TX:
		/*
		 * For TX this function is only invoked during the
		 * initial creation of a group when a share is
		 * associated with a MAC client. So the datapath is not
		 * yet set up, and will be set up later after the
		 * group has been reserved and populated.
		 */
		break;
	default:
		ASSERT(B_FALSE);
	}

	/*
	 * Start the ring if needed. On failure, undo the grouping action.
	 */
	if ((ret = mac_start_ring(ring)) != 0) {
		if (ring->mr_type == MAC_RING_TYPE_RX) {
			if (ring->mr_srs != NULL) {
				mac_rx_srs_remove(ring->mr_srs);
				ring->mr_srs = NULL;
			}
		}
		if (!driver_call) {
			cap_rings->mr_gremring(group->mrg_driver,
			    ring->mr_driver, ring->mr_type);
		}
		group->mrg_cur_count--;
		group->mrg_rings = ring->mr_next;

		ring->mr_gh = NULL;

		if (driver_call)
			mac_ring_free(mip, ring);

		return (ret);
	}

	/*
	 * Update the ring's state.
	 */
	ring->mr_state = MR_INUSE;
	MAC_RING_UNMARK(ring, MR_INCIPIENT);
	return (0);
}

/*
 * Remove a ring from its current group. MAC internal function for dynamic
 * grouping.
 *
 * The caller needs to call mac_perim_enter() before calling this function.
 */
void
i_mac_group_rem_ring(mac_group_t *group, mac_ring_t *ring,
    boolean_t driver_call)
{
	mac_impl_t *mip = (mac_impl_t *)group->mrg_mh;
	mac_capab_rings_t *cap_rings = NULL;
	mac_group_type_t group_type;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));

	ASSERT(mac_find_ring(group, ring->mr_index) == ring);
	ASSERT((mac_group_t *)ring->mr_gh == group);
	ASSERT(ring->mr_type == group->mrg_type);

	switch (ring->mr_type) {
	case MAC_RING_TYPE_RX:
		group_type = mip->mi_rx_group_type;
		cap_rings = &mip->mi_rx_rings_cap;

		if (group->mrg_state >= MAC_GROUP_STATE_RESERVED)
			mac_stop_ring(ring);

		/*
		 * Only hardware classified packets hold a reference to the
		 * ring all the way up the Rx path. mac_rx_srs_remove()
		 * will take care of quiescing the Rx path and removing the
		 * SRS. The software classified path neither holds a reference
		 * nor any association with the ring in mac_rx.
		 */
		if (ring->mr_srs != NULL) {
			mac_rx_srs_remove(ring->mr_srs);
			ring->mr_srs = NULL;
		}
		ring->mr_state = MR_FREE;
		ring->mr_flag = 0;

		break;
	case MAC_RING_TYPE_TX:
		/*
		 * For TX this function is only invoked in two
		 * cases:
		 *
		 * 1) In the case of a failure during the
		 * initial creation of a group when a share is
		 * associated with a MAC client. So the SRS is not
		 * yet set up, and will be set up later after the
		 * group has been reserved and populated.
		 *
		 * 2) From mac_release_tx_group() when freeing
		 * a TX SRS.
		 *
		 * In both cases the SRS and its soft rings are
		 * already quiesced.
		 */
		ASSERT(!driver_call);
		group_type = mip->mi_tx_group_type;
		cap_rings = &mip->mi_tx_rings_cap;
		break;
	default:
		ASSERT(B_FALSE);
	}

	/*
	 * Remove the ring from the group.
	 */
	if (ring == group->mrg_rings)
		group->mrg_rings = ring->mr_next;
	else {
		mac_ring_t *pre;

		pre = group->mrg_rings;
		while (pre->mr_next != ring)
			pre = pre->mr_next;
		pre->mr_next = ring->mr_next;
	}
	group->mrg_cur_count--;

	if (!driver_call) {
		ASSERT(group_type == MAC_GROUP_TYPE_DYNAMIC);
		ASSERT(cap_rings->mr_gremring != NULL);

		/*
		 * Remove the driver level hardware ring.
		 */
		if (group->mrg_driver != NULL) {
			cap_rings->mr_gremring(group->mrg_driver,
			    ring->mr_driver, ring->mr_type);
		}
	}

	ring->mr_gh = NULL;
	if (driver_call) {
		mac_ring_free(mip, ring);
	} else {
		ring->mr_state = MR_FREE;
		ring->mr_flag = 0;
	}
}

/*
 * Move a ring to the target group. If needed, remove the ring from the group
 * that it currently belongs to.
 *
 * The caller needs to enter MAC's perimeter by calling mac_perim_enter().
 */
static int
mac_group_mov_ring(mac_impl_t *mip, mac_group_t *d_group, mac_ring_t *ring)
{
	mac_group_t *s_group = (mac_group_t *)ring->mr_gh;
	int rv;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
	ASSERT(d_group != NULL);
	ASSERT(s_group->mrg_mh == d_group->mrg_mh);

	if (s_group == d_group)
		return (0);

	/*
	 * Remove it from current group first.
	 */
	if (s_group != NULL)
		i_mac_group_rem_ring(s_group, ring, B_FALSE);

	/*
	 * Add it to the new group.
	 */
	rv = i_mac_group_add_ring(d_group, ring, 0);
	if (rv != 0) {
		/*
		 * Failed to add the ring to the new group, so try to add
		 * it back to the source group. If that fails too, the
		 * ring is stuck in limbo; log a message.
		 */
		if (i_mac_group_add_ring(s_group, ring, 0)) {
			cmn_err(CE_WARN, "%s: failed to move ring %p\n",
			    mip->mi_name, (void *)ring);
		}
	}

	return (rv);
}

/*
 * Find a MAC address according to its value.
 */
mac_address_t *
mac_find_macaddr(mac_impl_t *mip, uint8_t *mac_addr)
{
	mac_address_t *map;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));

	for (map = mip->mi_addresses; map != NULL; map = map->ma_next) {
		if (bcmp(mac_addr, map->ma_addr, map->ma_len) == 0)
			break;
	}

	return (map);
}

/*
 * Check whether the MAC address is shared by multiple clients.
 */
boolean_t
mac_check_macaddr_shared(mac_address_t *map)
{
	ASSERT(MAC_PERIM_HELD((mac_handle_t)map->ma_mip));

	return (map->ma_nusers > 1);
}

/*
 * Remove the specified MAC address from the MAC address list and free it.
 */
static void
mac_free_macaddr(mac_address_t *map)
{
	mac_impl_t *mip = map->ma_mip;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
	ASSERT(mip->mi_addresses != NULL);

	map = mac_find_macaddr(mip, map->ma_addr);

	ASSERT(map != NULL);
	ASSERT(map->ma_nusers == 0);

	if (map == mip->mi_addresses) {
		mip->mi_addresses = map->ma_next;
	} else {
		mac_address_t *pre;

		pre = mip->mi_addresses;
		while (pre->ma_next != map)
			pre = pre->ma_next;
		pre->ma_next = map->ma_next;
	}

	kmem_free(map, sizeof (mac_address_t));
}

/*
 * Add a MAC address reference for a client. If the desired MAC address
 * exists, add a reference to it. Otherwise, add the new address by adding
 * it to a reserved group or setting promiscuous mode. Won't try a different
 * group if the group is non-NULL, so the caller must explicitly share the
 * default group when needed.
 *
 * Note, the primary MAC address is initialized at registration time, so
 * adding it to the default group only requires activating it if its
 * reference count is still zero. Also, some drivers may not have advertised
 * the RINGS capability.
 */
int
mac_add_macaddr(mac_impl_t *mip, mac_group_t *group, uint8_t *mac_addr,
    boolean_t use_hw)
{
	mac_address_t *map;
	int err = 0;
	boolean_t allocated_map = B_FALSE;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));

	map = mac_find_macaddr(mip, mac_addr);

	/*
	 * If the new MAC address has not been added, allocate a new one
	 * and set it up.
	 */
	if (map == NULL) {
		map = kmem_zalloc(sizeof (mac_address_t), KM_SLEEP);
		map->ma_len = mip->mi_type->mt_addr_length;
		bcopy(mac_addr, map->ma_addr, map->ma_len);
		map->ma_nusers = 0;
		map->ma_group = group;
		map->ma_mip = mip;

		/* add the new MAC address to the head of the address list */
		map->ma_next = mip->mi_addresses;
		mip->mi_addresses = map;

		allocated_map = B_TRUE;
	}

	ASSERT(map->ma_group == group);

	/*
	 * If the MAC address is already in use, simply account for the
	 * new client.
	 */
	if (map->ma_nusers++ > 0)
		return (0);

	/*
	 * Activate this MAC address by adding it to the reserved group.
	 */
	if (group != NULL) {
		err = mac_group_addmac(group, (const uint8_t *)mac_addr);
		if (err == 0) {
			map->ma_type = MAC_ADDRESS_TYPE_UNICAST_CLASSIFIED;
			return (0);
		}
	}

	/*
	 * The MAC address addition failed. If the client requires a
	 * hardware classified MAC address, fail the operation.
	 */
	if (use_hw) {
		err = ENOSPC;
		goto bail;
	}

	/*
	 * Try promiscuous mode.
	 *
	 * For drivers that don't advertise RINGS capability, do
	 * nothing for the primary address.
	 */
	if ((group == NULL) &&
	    (bcmp(map->ma_addr, mip->mi_addr, map->ma_len) == 0)) {
		map->ma_type = MAC_ADDRESS_TYPE_UNICAST_CLASSIFIED;
		return (0);
	}

	/*
	 * Enable promiscuous mode in order to receive traffic
	 * to the new MAC address.
	 */
	if ((err = i_mac_promisc_set(mip, B_TRUE)) == 0) {
		map->ma_type = MAC_ADDRESS_TYPE_UNICAST_PROMISC;
		return (0);
	}

	/*
	 * Free the MAC address that could not be added.
	 * Don't free a pre-existing address: it could have been the entry
	 * for the primary MAC address which was pre-allocated by
	 * mac_init_macaddr(), and which must remain on the list.
	 */
bail:
	map->ma_nusers--;
	if (allocated_map)
		mac_free_macaddr(map);
	return (err);
}

/*
 * Remove a reference to a MAC address. This may remove the MAC address
 * from an associated group or turn off promiscuous mode.
 * The caller needs to handle the failure properly.
 */
int
mac_remove_macaddr(mac_address_t *map)
{
	mac_impl_t *mip = map->ma_mip;
	int err = 0;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));

	ASSERT(map == mac_find_macaddr(mip, map->ma_addr));

	/*
	 * If it's not the last client using this MAC address, only update
	 * the MAC clients count.
	 */
	if (--map->ma_nusers > 0)
		return (0);

	/*
	 * The MAC address is no longer used by any MAC client, so remove
	 * it from its associated group, or turn off promiscuous mode
	 * if it was enabled for the MAC address.
	 */
	switch (map->ma_type) {
	case MAC_ADDRESS_TYPE_UNICAST_CLASSIFIED:
		/*
		 * Don't free the preset primary address for drivers that
		 * don't advertise RINGS capability.
		 */
		if (map->ma_group == NULL)
			return (0);

		err = mac_group_remmac(map->ma_group, map->ma_addr);
		break;
	case MAC_ADDRESS_TYPE_UNICAST_PROMISC:
		err = i_mac_promisc_set(mip, B_FALSE);
		break;
	default:
		ASSERT(B_FALSE);
	}

	if (err != 0)
		return (err);

	/*
	 * The entry for the primary MAC address was created at
	 * registration, so we won't free it here. mac_fini_macaddr() will
	 * take care of it.
	 */
	if (bcmp(map->ma_addr, mip->mi_addr, map->ma_len) != 0)
		mac_free_macaddr(map);

	return (0);
}

/*
 * Update an existing MAC address. The caller needs to make sure that the new
 * value has not been used.
 */
int
mac_update_macaddr(mac_address_t *map, uint8_t *mac_addr)
{
	mac_impl_t *mip = map->ma_mip;
	int err = 0;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
	ASSERT(mac_find_macaddr(mip, mac_addr) == NULL);

	switch (map->ma_type) {
	case MAC_ADDRESS_TYPE_UNICAST_CLASSIFIED:
		/*
		 * Update the primary address for drivers that are not
		 * RINGS capable.
		 */
		if (map->ma_group == NULL) {
			err = mip->mi_unicst(mip->mi_driver, (const uint8_t *)
			    mac_addr);
			if (err != 0)
				return (err);
			break;
		}

		/*
		 * If this MAC address is not currently in use,
		 * simply break out and update the value.
		 */
		if (map->ma_nusers == 0)
			break;

		/*
		 * Need to replace the MAC address associated with a group.
		 */
		err = mac_group_remmac(map->ma_group, map->ma_addr);
		if (err != 0)
			return (err);

		err = mac_group_addmac(map->ma_group, mac_addr);

		/*
		 * Failure hints at a hardware error. The MAC layer needs
		 * an error notification facility to handle this.
		 * For now, simply try to restore the old value.
		 */
		if (err != 0)
			(void) mac_group_addmac(map->ma_group, map->ma_addr);

		break;
	case MAC_ADDRESS_TYPE_UNICAST_PROMISC:
		/*
		 * Need to do nothing more if in promiscuous mode.
		 */
		break;
	default:
		ASSERT(B_FALSE);
	}

	/*
	 * Successfully replaced the MAC address.
	 */
	if (err == 0)
		bcopy(mac_addr, map->ma_addr, map->ma_len);

	return (err);
}

/*
 * Freshen the MAC address with a new value. The caller must have updated the
 * hardware MAC address before calling this function.
 * This function is supposed to be used to handle the MAC address change
 * notification from underlying drivers.
 */
void
mac_freshen_macaddr(mac_address_t *map, uint8_t *mac_addr)
{
	mac_impl_t *mip = map->ma_mip;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));
	ASSERT(mac_find_macaddr(mip, mac_addr) == NULL);

	/*
	 * Freshen the MAC address with the new value.
	 */
	bcopy(mac_addr, map->ma_addr, map->ma_len);
	bcopy(mac_addr, mip->mi_addr, map->ma_len);

	/*
	 * Update all MAC clients that share this MAC address.
	 */
	mac_unicast_update_clients(mip, map);
}

/*
 * Set up the primary MAC address.
 */
void
mac_init_macaddr(mac_impl_t *mip)
{
	mac_address_t *map;

	/*
	 * The reference count is initialized to zero, until it's really
	 * activated.
	 */
	map = kmem_zalloc(sizeof (mac_address_t), KM_SLEEP);
	map->ma_len = mip->mi_type->mt_addr_length;
	bcopy(mip->mi_addr, map->ma_addr, map->ma_len);

	/*
	 * If driver advertises RINGS capability, it shouldn't have initialized
	 * its primary MAC address. For other drivers, including VNIC, the
	 * primary address must work after registration.
	 */
	if (mip->mi_rx_groups == NULL)
		map->ma_type = MAC_ADDRESS_TYPE_UNICAST_CLASSIFIED;

	/*
	 * The primary MAC address is reserved for default group according
	 * to current design.
	 */
	map->ma_group = mip->mi_rx_groups;
	map->ma_mip = mip;

	mip->mi_addresses = map;
}

/*
 * Clean up the primary MAC address. Note, only one primary MAC address
 * is allowed. All other MAC addresses must have been freed appropriately.
 */
void
mac_fini_macaddr(mac_impl_t *mip)
{
	mac_address_t *map = mip->mi_addresses;

	if (map == NULL)
		return;

	/*
	 * If mi_addresses is initialized, there should be exactly one
	 * entry left on the list with no users.
	 */
	ASSERT(map->ma_nusers == 0);
	ASSERT(map->ma_next == NULL);

	kmem_free(map, sizeof (mac_address_t));
	mip->mi_addresses = NULL;
}

/*
 * Logging related functions.
 */

/* Write the Flow description to the log file */
int
mac_write_flow_desc(flow_entry_t *flent, mac_client_impl_t *mcip)
{
	flow_desc_t *fdesc;
	mac_resource_props_t *mrp;
	net_desc_t ndesc;

	bzero(&ndesc, sizeof (net_desc_t));

	/*
	 * Grab the fe_lock to see a self-consistent fe_flow_desc.
	 * Updates to the fe_flow_desc are done under the fe_lock
	 */
	mutex_enter(&flent->fe_lock);
	fdesc = &flent->fe_flow_desc;
	mrp = &flent->fe_resource_props;

	ndesc.nd_name = flent->fe_flow_name;
	ndesc.nd_devname = mcip->mci_name;
	bcopy(fdesc->fd_src_mac, ndesc.nd_ehost, ETHERADDRL);
	bcopy(fdesc->fd_dst_mac, ndesc.nd_edest, ETHERADDRL);
	ndesc.nd_sap = htonl(fdesc->fd_sap);
	ndesc.nd_isv4 = (uint8_t)fdesc->fd_ipversion == IPV4_VERSION;
	ndesc.nd_bw_limit = mrp->mrp_maxbw;
	if (ndesc.nd_isv4) {
		ndesc.nd_saddr[3] = htonl(fdesc->fd_local_addr.s6_addr32[3]);
		ndesc.nd_daddr[3] = htonl(fdesc->fd_remote_addr.s6_addr32[3]);
	} else {
		bcopy(&fdesc->fd_local_addr, ndesc.nd_saddr, IPV6_ADDR_LEN);
		bcopy(&fdesc->fd_remote_addr, ndesc.nd_daddr, IPV6_ADDR_LEN);
	}
	ndesc.nd_sport = htons(fdesc->fd_local_port);
	ndesc.nd_dport = htons(fdesc->fd_remote_port);
	ndesc.nd_protocol = (uint8_t)fdesc->fd_protocol;
	mutex_exit(&flent->fe_lock);

	return (exacct_commit_netinfo((void *)&ndesc, EX_NET_FLDESC_REC));
}

/* Write the Flow statistics to the log file */
int
mac_write_flow_stats(flow_entry_t *flent)
{
	flow_stats_t *fl_stats;
	net_stat_t nstat;

	fl_stats = &flent->fe_flowstats;
	nstat.ns_name = flent->fe_flow_name;
	nstat.ns_ibytes = fl_stats->fs_rbytes;
	nstat.ns_obytes = fl_stats->fs_obytes;
	nstat.ns_ipackets = fl_stats->fs_ipackets;
	nstat.ns_opackets = fl_stats->fs_opackets;
	nstat.ns_ierrors = fl_stats->fs_ierrors;
	nstat.ns_oerrors = fl_stats->fs_oerrors;

	return (exacct_commit_netinfo((void *)&nstat, EX_NET_FLSTAT_REC));
}

/* Write the Link Description to the log file */
int
mac_write_link_desc(mac_client_impl_t *mcip)
{
	net_desc_t ndesc;
	flow_entry_t *flent = mcip->mci_flent;

	bzero(&ndesc, sizeof (net_desc_t));

	ndesc.nd_name = mcip->mci_name;
	ndesc.nd_devname = mcip->mci_name;
	ndesc.nd_isv4 = B_TRUE;
	/*
	 * Grab the fe_lock to see a self-consistent
	 * fe_flow_desc.
	 * Updates to the fe_flow_desc are done under the fe_lock
	 * after removing the flent from the flow table.
	 */
	mutex_enter(&flent->fe_lock);
	bcopy(flent->fe_flow_desc.fd_src_mac, ndesc.nd_ehost, ETHERADDRL);
	mutex_exit(&flent->fe_lock);

	return (exacct_commit_netinfo((void *)&ndesc, EX_NET_LNDESC_REC));
}

/* Write the Link statistics to the log file */
int
mac_write_link_stats(mac_client_impl_t *mcip)
{
	net_stat_t	nstat;

	nstat.ns_name = mcip->mci_name;
	nstat.ns_ibytes = mcip->mci_stat_ibytes;
	nstat.ns_obytes = mcip->mci_stat_obytes;
	nstat.ns_ipackets = mcip->mci_stat_ipackets;
	nstat.ns_opackets = mcip->mci_stat_opackets;
	nstat.ns_ierrors = mcip->mci_stat_ierrors;
	nstat.ns_oerrors = mcip->mci_stat_oerrors;

	return (exacct_commit_netinfo((void *)&nstat, EX_NET_LNSTAT_REC));
}

/*
 * For a given flow, if the description has not been logged before, do it now.
 * If it is a VNIC, then we have collected information about it from the MAC
 * table, so skip it.
 */
/*ARGSUSED*/
static int
mac_log_flowinfo(flow_entry_t *flent, void *args)
{
	mac_client_impl_t	*mcip = flent->fe_mcip;

	if (mcip == NULL)
		return (0);

	/*
	 * If the name starts with "vnic", and fe_user_generated is true (to
	 * exclude the mcast and active flow entries created implicitly for
	 * a vnic), it is a VNIC flow. i.e. vnic1 is a vnic flow,
	 * vnic/bge1/mcast1 is not and neither is vnic/bge1/active.
	 */
	if (strncasecmp(flent->fe_flow_name, "vnic", 4) == 0 &&
	    (flent->fe_type & FLOW_USER) != 0) {
		return (0);
	}

	if (!flent->fe_desc_logged) {
		/*
		 * We don't return error because we want to continue the
		 * walk in case this is the last walk which means we
		 * need to reset fe_desc_logged in all the flows.
		 */
		if (mac_write_flow_desc(flent, mcip) != 0)
			return (0);
		flent->fe_desc_logged = B_TRUE;
	}

	/*
	 * Regardless of the error, we want to proceed in case we have to
	 * reset fe_desc_logged.
	 */
	(void) mac_write_flow_stats(flent);

	if (mcip != NULL && !(mcip->mci_state_flags & MCIS_DESC_LOGGED))
		flent->fe_desc_logged = B_FALSE;

	return (0);
}

typedef struct i_mac_log_state_s {
	boolean_t	mi_last;
	int		mi_fenable;
	int		mi_lenable;
} i_mac_log_state_t;

/*
 * Walk the mac_impl_ts and log the description for each mac client of this mac,
 * if it hasn't already been done. Additionally, log statistics for the link as
 * well. Walk the flow table and log information for each flow as well.
 * If it is the last walk (mi_last), then we turn off MCIS_DESC_LOGGED (and
 * also fe_desc_logged, if flow logging is on) since we want to log the
 * description if and when logging is restarted.
 */
/*ARGSUSED*/
static uint_t
i_mac_log_walker(mod_hash_key_t key, mod_hash_val_t *val, void *arg)
{
	mac_impl_t		*mip = (mac_impl_t *)val;
	i_mac_log_state_t	*lstate = (i_mac_log_state_t *)arg;
	int			ret;
	mac_client_impl_t	*mcip;

	/*
	 * Only walk the client list for NIC and etherstub
	 */
	if ((mip->mi_state_flags & MIS_DISABLED) ||
	    ((mip->mi_state_flags & MIS_IS_VNIC) &&
	    (mac_get_lower_mac_handle((mac_handle_t)mip) != NULL)))
		return (MH_WALK_CONTINUE);

	for (mcip = mip->mi_clients_list; mcip != NULL;
	    mcip = mcip->mci_client_next) {
		if (!MCIP_DATAPATH_SETUP(mcip))
			continue;
		if (lstate->mi_lenable) {
			if (!(mcip->mci_state_flags & MCIS_DESC_LOGGED)) {
				ret = mac_write_link_desc(mcip);
				if (ret != 0) {
				/*
				 * We can't terminate it if this is the last
				 * walk, else there might be some links with
				 * MCIS_DESC_LOGGED set, which means
				 * their description won't be logged the next
				 * time logging is started (similarly for the
				 * flows within such links). We can continue
				 * without walking the flow table (i.e. to
				 * set fe_desc_logged to false) because we
				 * won't have written any flow stuff for this
				 * link as we haven't logged the link itself.
				 */
					if (lstate->mi_last)
						return (MH_WALK_CONTINUE);
					else
						return (MH_WALK_TERMINATE);
				}
				mcip->mci_state_flags |= MCIS_DESC_LOGGED;
			}
		}

		if (mac_write_link_stats(mcip) != 0 && !lstate->mi_last)
			return (MH_WALK_TERMINATE);

		if (lstate->mi_last)
			mcip->mci_state_flags &= ~MCIS_DESC_LOGGED;

		if (lstate->mi_fenable) {
			if (mcip->mci_subflow_tab != NULL) {
				(void) mac_flow_walk(mcip->mci_subflow_tab,
				    mac_log_flowinfo, mip);
			}
		}
	}
	return (MH_WALK_CONTINUE);
}

/*
 * The timer thread that runs every mac_logging_interval seconds and logs
 * link and/or flow information.
 */
/* ARGSUSED */
void
mac_log_linkinfo(void *arg)
{
	i_mac_log_state_t	lstate;

	rw_enter(&i_mac_impl_lock, RW_READER);
	if (!mac_flow_log_enable && !mac_link_log_enable) {
		rw_exit(&i_mac_impl_lock);
		return;
	}
	lstate.mi_fenable = mac_flow_log_enable;
	lstate.mi_lenable = mac_link_log_enable;
	lstate.mi_last = B_FALSE;
	rw_exit(&i_mac_impl_lock);

	mod_hash_walk(i_mac_impl_hash, i_mac_log_walker, &lstate);

	rw_enter(&i_mac_impl_lock, RW_WRITER);
	if (mac_flow_log_enable || mac_link_log_enable) {
		mac_logging_timer = timeout(mac_log_linkinfo, NULL,
		    SEC_TO_TICK(mac_logging_interval));
	}
	rw_exit(&i_mac_impl_lock);
}

typedef struct i_mac_fastpath_state_s {
	boolean_t	mf_disable;
	int		mf_err;
} i_mac_fastpath_state_t;

/*ARGSUSED*/
static uint_t
i_mac_fastpath_disable_walker(mod_hash_key_t key, mod_hash_val_t *val,
    void *arg)
{
	i_mac_fastpath_state_t	*state = arg;
	mac_handle_t		mh = (mac_handle_t)val;

	if (state->mf_disable)
		state->mf_err = mac_fastpath_disable(mh);
	else
		mac_fastpath_enable(mh);

	return (state->mf_err == 0 ? MH_WALK_CONTINUE : MH_WALK_TERMINATE);
}

/*
 * Start the logging timer.
 */
int
mac_start_logusage(mac_logtype_t type, uint_t interval)
{
	i_mac_fastpath_state_t	state = {B_TRUE, 0};
	int			err;

	rw_enter(&i_mac_impl_lock, RW_WRITER);
	switch (type) {
	case MAC_LOGTYPE_FLOW:
		if (mac_flow_log_enable) {
			rw_exit(&i_mac_impl_lock);
			return (0);
		}
		/* FALLTHRU */
	case MAC_LOGTYPE_LINK:
		if (mac_link_log_enable) {
			rw_exit(&i_mac_impl_lock);
			return (0);
		}
		break;
	default:
		ASSERT(0);
	}

	/* Disable fastpath */
	mod_hash_walk(i_mac_impl_hash, i_mac_fastpath_disable_walker, &state);
	if ((err = state.mf_err) != 0) {
		/* Reenable fastpath */
		state.mf_disable = B_FALSE;
		state.mf_err = 0;
		mod_hash_walk(i_mac_impl_hash,
		    i_mac_fastpath_disable_walker, &state);
		rw_exit(&i_mac_impl_lock);
		return (err);
	}

	switch (type) {
	case MAC_LOGTYPE_FLOW:
		mac_flow_log_enable = B_TRUE;
		/* FALLTHRU */
	case MAC_LOGTYPE_LINK:
		mac_link_log_enable = B_TRUE;
		break;
	}

	mac_logging_interval = interval;
	rw_exit(&i_mac_impl_lock);
	mac_log_linkinfo(NULL);
	return (0);
}

/*
 * Stop the logging timer if both Link and Flow logging are turned off.
 */
void
mac_stop_logusage(mac_logtype_t type)
{
	i_mac_log_state_t	lstate;
	i_mac_fastpath_state_t	state = {B_FALSE, 0};

	rw_enter(&i_mac_impl_lock, RW_WRITER);
	lstate.mi_fenable = mac_flow_log_enable;
	lstate.mi_lenable = mac_link_log_enable;

	/* Last walk */
	lstate.mi_last = B_TRUE;

	switch (type) {
	case MAC_LOGTYPE_FLOW:
		if (lstate.mi_fenable) {
			ASSERT(mac_link_log_enable);
			mac_flow_log_enable = B_FALSE;
			mac_link_log_enable = B_FALSE;
			break;
		}
		/* FALLTHRU */
	case MAC_LOGTYPE_LINK:
		if (!lstate.mi_lenable || mac_flow_log_enable) {
			rw_exit(&i_mac_impl_lock);
			return;
		}
		mac_link_log_enable = B_FALSE;
		break;
	default:
		ASSERT(0);
	}

	/*
	 * Reenable fastpath
	 */
	mod_hash_walk(i_mac_impl_hash, i_mac_fastpath_disable_walker, &state);

	rw_exit(&i_mac_impl_lock);
	(void) untimeout(mac_logging_timer);
	mac_logging_timer = 0;

	/* Last walk */
	mod_hash_walk(i_mac_impl_hash, i_mac_log_walker, &lstate);
}

/*
 * Walk the rx and tx SRS/SRs for a flow and update the priority value.
 */
void
mac_flow_update_priority(mac_client_impl_t *mcip, flow_entry_t *flent)
{
	pri_t			pri;
	int			count;
	mac_soft_ring_set_t	*mac_srs;

	if (flent->fe_rx_srs_cnt <= 0)
		return;

	if (((mac_soft_ring_set_t *)flent->fe_rx_srs[0])->srs_type ==
	    SRST_FLOW) {
		pri = FLOW_PRIORITY(mcip->mci_min_pri,
		    mcip->mci_max_pri,
		    flent->fe_resource_props.mrp_priority);
	} else {
		pri = mcip->mci_max_pri;
	}

	for (count = 0; count < flent->fe_rx_srs_cnt; count++) {
		mac_srs = flent->fe_rx_srs[count];
		mac_update_srs_priority(mac_srs, pri);
	}
	/*
	 * If we have a Tx SRS, we need to modify all the threads associated
	 * with it.
	 */
	if (flent->fe_tx_srs != NULL)
		mac_update_srs_priority(flent->fe_tx_srs, pri);
}

/*
 * RX and TX rings are reserved according to different semantics depending
 * on the requests from the MAC clients and type of rings:
 *
 * On the Tx side, by default we reserve individual rings, independently from
 * the groups.
 *
 * On the Rx side, the reservation is at the granularity of the group
 * of rings, and used for v12n level 1 only. It has a special case for the
 * primary client.
 *
 * If a share is allocated to a MAC client, we allocate a TX group and an
 * RX group to the client, and assign TX rings and RX rings to these
 * groups according to information gathered from the driver through
 * the share capability.
 *
 * The foreseeable evolution of Rx rings will handle v12n level 2 and higher
 * to allocate individual rings out of a group and program the hw classifier
 * based on IP address or higher level criteria.
 */

/*
 * mac_reserve_tx_ring()
 * Reserve an unused ring by marking it with MR_INUSE state.
 * Once reserved, the ring is ready to function.
 *
 * Notes for Hybrid I/O:
 *
 * If a specific ring is needed, it is specified through the desired_ring
 * argument. Otherwise that argument is set to NULL.
 * If the desired ring was previously allocated to another client, this
 * function swaps it with a new ring from the group of unassigned rings.
 */
mac_ring_t *
mac_reserve_tx_ring(mac_impl_t *mip, mac_ring_t *desired_ring)
{
	mac_group_t	*group;
	mac_ring_t	*ring;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));

	if (mip->mi_tx_groups == NULL)
		return (NULL);

	/*
	 * Find an available ring and start it before changing its status.
	 * The unassigned rings are at the end of the mi_tx_groups
	 * array.
	 */
	group = mip->mi_tx_groups + mip->mi_tx_group_count;

	for (ring = group->mrg_rings; ring != NULL;
	    ring = ring->mr_next) {
		if (desired_ring == NULL) {
			if (ring->mr_state == MR_FREE)
				/* wanted any free ring and found one */
				break;
		} else {
			mac_ring_t		*sring;
			mac_client_impl_t	*client;
			mac_soft_ring_set_t	*srs;

			if (ring != desired_ring)
				/* wants a desired ring but this one ain't it */
				continue;

			if (ring->mr_state == MR_FREE)
				break;

			/*
			 * Found the desired ring but it's already in use.
			 * Swap it with a new ring.
			 */

			/* find the client which owns that ring */
			for (client = mip->mi_clients_list; client != NULL;
			    client = client->mci_client_next) {
				srs = MCIP_TX_SRS(client);
				if (srs != NULL && mac_tx_srs_ring_present(srs,
				    desired_ring)) {
					/* found our ring */
					break;
				}
			}
			if (client == NULL) {
				/*
				 * The TX ring is in use, but it's not
				 * associated with any clients, so it
				 * has to be the default ring. In that
				 * case we can simply assign a new ring
				 * as the default ring, and we're done.
				 */
				ASSERT(mip->mi_default_tx_ring ==
				    (mac_ring_handle_t)desired_ring);

				/*
				 * Quiesce all clients on top of
				 * the NIC to make sure there are no
				 * pending threads still relying on
				 * that default ring, for example
				 * the multicast path.
				 */
				for (client = mip->mi_clients_list;
				    client != NULL;
				    client = client->mci_client_next) {
					mac_tx_client_quiesce(client,
					    SRS_QUIESCE);
				}

				mip->mi_default_tx_ring = (mac_ring_handle_t)
				    mac_reserve_tx_ring(mip, NULL);

				/* resume the clients */
				for (client = mip->mi_clients_list;
				    client != NULL;
				    client = client->mci_client_next)
					mac_tx_client_restart(client);

				break;
			}

			/*
			 * Note that we cannot simply invoke the group
			 * add/rem routines since the client doesn't have a
			 * TX group. So we need to instead add/remove
			 * the rings from the SRS.
			 */
			ASSERT(client->mci_share == NULL);

			/* first quiesce the client */
			mac_tx_client_quiesce(client, SRS_QUIESCE);

			/* give a new ring to the client... */
			sring = mac_reserve_tx_ring(mip, NULL);
			if (sring != NULL) {
				/*
				 * A replacement ring was found; add it
				 * to the client's SRS. If no other ring
				 * is available on that MAC instance,
				 * the client will fall back to the
				 * shared TX ring.
				 */
				mac_tx_srs_add_ring(srs, sring);
			}

			/* ...
			 * in exchange for our desired ring */
			mac_tx_srs_del_ring(srs, desired_ring);

			/* restart the client */
			mac_tx_client_restart(client);

			if (mip->mi_default_tx_ring ==
			    (mac_ring_handle_t)desired_ring) {
				/*
				 * The desired ring is the default ring,
				 * and there are one or more clients
				 * using that default ring directly.
				 */
				mip->mi_default_tx_ring =
				    (mac_ring_handle_t)sring;
				/*
				 * Find clients using default ring and
				 * swap it with the new default ring.
				 */
				for (client = mip->mi_clients_list;
				    client != NULL;
				    client = client->mci_client_next) {
					srs = MCIP_TX_SRS(client);
					if (srs != NULL &&
					    mac_tx_srs_ring_present(srs,
					    desired_ring)) {
						/* first quiesce the client */
						mac_tx_client_quiesce(client,
						    SRS_QUIESCE);

						/*
						 * Give it the new default
						 * ring, and remove the old
						 * one.
						 */
						if (sring != NULL) {
							mac_tx_srs_add_ring(srs,
							    sring);
						}
						mac_tx_srs_del_ring(srs,
						    desired_ring);

						/* restart the client */
						mac_tx_client_restart(client);
					}
				}
			}
			break;
		}
	}

	if (ring != NULL) {
		if (mac_start_ring(ring) != 0)
			return (NULL);
		ring->mr_state = MR_INUSE;
	}

	return (ring);
}

/*
 * Minimum number of rings to leave in the default RX group when allocating
 * rings to new clients.
 */
static uint_t mac_min_rx_default_rings = 1;

/*
 * Populate a zero-ring group with rings. If the share is non-NULL,
 * the rings are chosen according to that share.
 * Invoked after allocating a new RX or TX group through
 * mac_reserve_rx_group() or mac_reserve_tx_group(), respectively.
 * Returns zero on success, an errno otherwise.
 */
int
i_mac_group_allocate_rings(mac_impl_t *mip, mac_ring_type_t ring_type,
    mac_group_t *src_group, mac_group_t *new_group, mac_share_handle_t share)
{
	mac_ring_t	**rings, *tmp_ring[1], *ring;
	uint_t		nrings;
	int		rv, i, j;

	ASSERT(mip->mi_rx_group_type == MAC_GROUP_TYPE_DYNAMIC &&
	    mip->mi_tx_group_type == MAC_GROUP_TYPE_DYNAMIC);
	ASSERT(new_group->mrg_cur_count == 0);

	/*
	 * First find the rings to allocate to the group.
	 */
	if (share != NULL) {
		/* get rings through ms_squery() */
		mip->mi_share_capab.ms_squery(share, ring_type, NULL, &nrings);
		ASSERT(nrings != 0);
		rings = kmem_alloc(nrings * sizeof (mac_ring_handle_t),
		    KM_SLEEP);
		mip->mi_share_capab.ms_squery(share, ring_type,
		    (mac_ring_handle_t *)rings, &nrings);
	} else {
		/* this function is called for TX only with a share */
		ASSERT(ring_type == MAC_RING_TYPE_RX);
		/*
		 * Pick one ring from default group.
		 *
		 * for now pick the second ring which requires the first ring
		 * at index 0 to stay in the default group, since it is the
		 * ring which carries the multicast traffic.
		 * We need a better way for a driver to indicate this,
		 * for example a per-ring flag.
		 */
		for (ring = src_group->mrg_rings; ring != NULL;
		    ring = ring->mr_next) {
			if (ring->mr_index != 0)
				break;
		}
		ASSERT(ring != NULL);
		nrings = 1;
		tmp_ring[0] = ring;
		rings = tmp_ring;
	}

	switch (ring_type) {
	case MAC_RING_TYPE_RX:
		if (src_group->mrg_cur_count - nrings <
		    mac_min_rx_default_rings) {
			/* we ran out of rings */
			return (ENOSPC);
		}

		/* move receive rings to new group */
		for (i = 0; i < nrings; i++) {
			rv = mac_group_mov_ring(mip, new_group, rings[i]);
			if (rv != 0) {
				/* move rings back on failure */
				for (j = 0; j < i; j++) {
					(void) mac_group_mov_ring(mip,
					    src_group, rings[j]);
				}
				return (rv);
			}
		}
		break;

	case MAC_RING_TYPE_TX: {
		mac_ring_t *tmp_ring;

		/* move the TX rings to the new group */
		ASSERT(src_group == NULL);
		for (i = 0; i < nrings; i++) {
			/* get the desired ring */
			tmp_ring = mac_reserve_tx_ring(mip, rings[i]);
			ASSERT(tmp_ring == rings[i]);
			rv = mac_group_mov_ring(mip, new_group, rings[i]);
			if (rv != 0) {
				/* cleanup on failure */
				for (j = 0; j < i; j++) {
					(void) mac_group_mov_ring(mip,
					    mip->mi_tx_groups +
					    mip->mi_tx_group_count, rings[j]);
				}
			}
		}
		break;
	}
	}

	if (share != NULL) {
		/* add group to share */
		mip->mi_share_capab.ms_sadd(share, new_group->mrg_driver);
		/* free temporary array of rings */
		kmem_free(rings, nrings * sizeof (mac_ring_handle_t));
	}

	return (0);
}

void
mac_rx_group_add_client(mac_group_t *grp, mac_client_impl_t *mcip)
{
	mac_grp_client_t *mgcp;

	for (mgcp = grp->mrg_clients; mgcp != NULL; mgcp = mgcp->mgc_next) {
		if (mgcp->mgc_client == mcip)
			break;
	}

	VERIFY(mgcp == NULL);

	mgcp = kmem_zalloc(sizeof (mac_grp_client_t), KM_SLEEP);
	mgcp->mgc_client = mcip;
	mgcp->mgc_next = grp->mrg_clients;
	grp->mrg_clients = mgcp;

}

void
mac_rx_group_remove_client(mac_group_t *grp, mac_client_impl_t *mcip)
{
	mac_grp_client_t *mgcp, **pprev;

	for (pprev = &grp->mrg_clients, mgcp = *pprev; mgcp != NULL;
	    pprev = &mgcp->mgc_next, mgcp = *pprev) {
		if (mgcp->mgc_client == mcip)
			break;
	}

	ASSERT(mgcp != NULL);

	*pprev = mgcp->mgc_next;
	kmem_free(mgcp, sizeof (mac_grp_client_t));
}

/*
 * mac_reserve_rx_group()
 *
 * Finds an available group and exclusively reserves it for a client.
 * The group is chosen to suit the flow's resource controls (bandwidth and
 * fanout requirements) and the address type.
 * If the requestor is the primary MAC then return the group with the
 * largest number of rings, otherwise the default ring when available.
 */
mac_group_t *
mac_reserve_rx_group(mac_client_impl_t *mcip, uint8_t *mac_addr,
    mac_rx_group_reserve_type_t rtype)
{
	mac_share_handle_t	share = mcip->mci_share;
	mac_impl_t		*mip = mcip->mci_mip;
	mac_group_t		*grp = NULL;
	int			i, start, loopcount;
	int			err;
	mac_address_t		*map;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mip));

	/* Check if a group already has this mac address (case of VLANs) */
	if ((map = mac_find_macaddr(mip, mac_addr)) != NULL)
		return (map->ma_group);

	if (mip->mi_rx_groups == NULL || mip->mi_rx_group_count == 0 ||
	    rtype == MAC_RX_NO_RESERVE)
		return (NULL);

	/*
	 * Try to exclusively reserve a RX group.
	 *
	 * Flows that require SW_RING always go to the default group
	 * (until we can explicitly call out default groups (CR 6695600),
	 * we assume that the default group is always at position zero).
	 *
	 * For flows that require HW_DEFAULT_RING (unicast flow of the
	 * primary client), try to reserve the default RX group only.
	 *
	 * For flows that require HW_RING (unicast flow of other clients),
	 * try to reserve a non-default RX group, then the default group.
	 */
	switch (rtype) {
	case MAC_RX_RESERVE_DEFAULT:
		start = 0;
		loopcount = 1;
		break;
	case MAC_RX_RESERVE_NONDEFAULT:
		start = 1;
		loopcount = mip->mi_rx_group_count;
		break;
	}

	for (i = start; i < start + loopcount; i++) {
		grp = &mip->mi_rx_groups[i % mip->mi_rx_group_count];

		DTRACE_PROBE3(rx__group__trying, char *, mip->mi_name,
		    int, grp->mrg_index, mac_group_state_t, grp->mrg_state);

		/*
		 * Check to see whether this mac client is the only client
		 * on this RX group. If not, we cannot exclusively reserve
		 * this RX group.
		 */
		if (!MAC_RX_GROUP_NO_CLIENT(grp) &&
		    (MAC_RX_GROUP_ONLY_CLIENT(grp) != mcip)) {
			continue;
		}

		/*
		 * This group could already be SHARED by other multicast
		 * flows on this client.
		 * In that case, the group would be shared and has already
		 * been started.
		 */
		ASSERT(grp->mrg_state != MAC_GROUP_STATE_UNINIT);

		if ((grp->mrg_state == MAC_GROUP_STATE_REGISTERED) &&
		    (mac_start_group(grp) != 0)) {
			continue;
		}

		if ((i % mip->mi_rx_group_count) == 0 ||
		    mip->mi_rx_group_type != MAC_GROUP_TYPE_DYNAMIC) {
			break;
		}

		ASSERT(grp->mrg_cur_count == 0);

		/*
		 * Populate the group. Rings should be taken
		 * from the default group at position 0 for now.
		 */

		err = i_mac_group_allocate_rings(mip, MAC_RING_TYPE_RX,
		    &mip->mi_rx_groups[0], grp, share);
		if (err == 0)
			break;

		DTRACE_PROBE3(rx__group__reserve__alloc__rings, char *,
		    mip->mi_name, int, grp->mrg_index, int, err);

		/*
		 * It's a dynamic group but the grouping operation failed.
		 */
		mac_stop_group(grp);
	}

	if (i == start + loopcount)
		return (NULL);

	ASSERT(grp != NULL);

	DTRACE_PROBE2(rx__group__reserved,
	    char *, mip->mi_name, int, grp->mrg_index);
	return (grp);
}

/*
 * mac_release_rx_group()
 *
 * This is called when there are no clients left for the group.
 * The group is stopped and marked MAC_GROUP_STATE_REGISTERED,
 * and if it is a non-default group, the shares are removed and
 * all rings are assigned back to the default group.
 */
void
mac_release_rx_group(mac_client_impl_t *mcip, mac_group_t *group)
{
	mac_impl_t *mip = mcip->mci_mip;
	mac_ring_t *ring;

	ASSERT(group != &mip->mi_rx_groups[0]);

	/*
	 * This is the case where there are no clients left. Any
	 * SRS etc. on this group have also been quiesced.
	 */
	for (ring = group->mrg_rings; ring != NULL; ring = ring->mr_next) {
		if (ring->mr_classify_type == MAC_HW_CLASSIFIER) {
			ASSERT(group->mrg_state == MAC_GROUP_STATE_RESERVED);
			/*
			 * Remove the SRS associated with the HW ring.
			 * As a result, polling will be disabled.
			 */
			ring->mr_srs = NULL;
		}
		ASSERT(ring->mr_state == MR_INUSE);
		mac_stop_ring(ring);
		ring->mr_state = MR_FREE;
		ring->mr_flag = 0;
	}

	/* remove group from share */
	if (mcip->mci_share != NULL) {
		mip->mi_share_capab.ms_sremove(mcip->mci_share,
		    group->mrg_driver);
	}

	if (mip->mi_rx_group_type == MAC_GROUP_TYPE_DYNAMIC) {
		mac_ring_t *ring;

		/*
		 * Rings were dynamically allocated to group.
		 * Move rings back to default group.
		 */
		while ((ring = group->mrg_rings) != NULL) {
			(void) mac_group_mov_ring(mip,
			    &mip->mi_rx_groups[0], ring);
		}
	}
	mac_stop_group(group);
	/*
	 * Possible improvement: See if we can assign the group just released
	 * to another client of the mip.
	 */
}

/*
 * Reserves a TX group for the specified share. Invoked by mac_tx_srs_setup()
 * when a share was allocated to the client.
 */
mac_group_t *
mac_reserve_tx_group(mac_impl_t *mip, mac_share_handle_t share)
{
	mac_group_t *grp;
	int rv, i;

	/*
	 * TX groups are currently allocated only to MAC clients
	 * which are associated with a share. Since we have a fixed
	 * number of shares and groups, and we already successfully
	 * allocated a share, find an available TX group.
	 */
	ASSERT(share != NULL);
	ASSERT(mip->mi_tx_group_free > 0);

	for (i = 0; i < mip->mi_tx_group_count; i++) {
		grp = &mip->mi_tx_groups[i];

		if ((grp->mrg_state == MAC_GROUP_STATE_RESERVED) ||
		    (grp->mrg_state == MAC_GROUP_STATE_UNINIT))
			continue;

		rv = mac_start_group(grp);
		ASSERT(rv == 0);

		grp->mrg_state = MAC_GROUP_STATE_RESERVED;
		break;
	}

	ASSERT(grp != NULL);

	/*
	 * Populate the group. Rings should be taken from the group
	 * of unassigned rings, which is past the array of TX
	 * groups advertised by the driver.
	 */
	rv = i_mac_group_allocate_rings(mip, MAC_RING_TYPE_TX, NULL,
	    grp, share);
	if (rv != 0) {
		DTRACE_PROBE3(tx__group__reserve__alloc__rings,
		    char *, mip->mi_name, int, grp->mrg_index, int, rv);

		mac_stop_group(grp);
		grp->mrg_state = MAC_GROUP_STATE_UNINIT;

		return (NULL);
	}

	mip->mi_tx_group_free--;

	return (grp);
}

void
mac_release_tx_group(mac_impl_t *mip, mac_group_t *grp)
{
	mac_client_impl_t *mcip = grp->mrg_tx_client;
	mac_share_handle_t share = mcip->mci_share;
	mac_ring_t *ring;

	ASSERT(mip->mi_tx_group_type == MAC_GROUP_TYPE_DYNAMIC);
	ASSERT(share != NULL);
	ASSERT(grp->mrg_state == MAC_GROUP_STATE_RESERVED);

	mip->mi_share_capab.ms_sremove(share, grp->mrg_driver);
	while ((ring = grp->mrg_rings) != NULL) {
		/* move the ring back to the pool */
		(void) mac_group_mov_ring(mip, mip->mi_tx_groups +
		    mip->mi_tx_group_count, ring);
	}
	mac_stop_group(grp);
	mac_set_rx_group_state(grp, MAC_GROUP_STATE_REGISTERED);
	grp->mrg_tx_client = NULL;
	mip->mi_tx_group_free++;
}

/*
 * This is a 1-time control path activity initiated by the client (IP).
 * The mac perimeter protects against other simultaneous control activities,
 * for example an ioctl that attempts to change the degree of fanout and
 * increase or decrease the number of softrings associated with this Tx SRS.
 */
static mac_tx_notify_cb_t *
mac_client_tx_notify_add(mac_client_impl_t *mcip,
    mac_tx_notify_t notify, void *arg)
{
	mac_cb_info_t *mcbi;
	mac_tx_notify_cb_t *mtnfp;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mcip->mci_mip));

	mtnfp = kmem_zalloc(sizeof (mac_tx_notify_cb_t), KM_SLEEP);
	mtnfp->mtnf_fn = notify;
	mtnfp->mtnf_arg = arg;
	mtnfp->mtnf_link.mcb_objp = mtnfp;
	mtnfp->mtnf_link.mcb_objsize = sizeof (mac_tx_notify_cb_t);
	mtnfp->mtnf_link.mcb_flags = MCB_TX_NOTIFY_CB_T;

	mcbi = &mcip->mci_tx_notify_cb_info;
	mutex_enter(mcbi->mcbi_lockp);
	mac_callback_add(mcbi, &mcip->mci_tx_notify_cb_list, &mtnfp->mtnf_link);
	mutex_exit(mcbi->mcbi_lockp);
	return (mtnfp);
}

static void
mac_client_tx_notify_remove(mac_client_impl_t *mcip, mac_tx_notify_cb_t *mtnfp)
{
	mac_cb_info_t *mcbi;
	mac_cb_t **cblist;

	ASSERT(MAC_PERIM_HELD((mac_handle_t)mcip->mci_mip));

	if (!mac_callback_find(&mcip->mci_tx_notify_cb_info,
	    &mcip->mci_tx_notify_cb_list, &mtnfp->mtnf_link)) {
		cmn_err(CE_WARN,
		    "mac_client_tx_notify_remove: callback not "
		    "found, mcip 0x%p mtnfp 0x%p", (void *)mcip, (void *)mtnfp);
		return;
	}

	mcbi = &mcip->mci_tx_notify_cb_info;
	cblist =
	    &mcip->mci_tx_notify_cb_list;
	mutex_enter(mcbi->mcbi_lockp);
	if (mac_callback_remove(mcbi, cblist, &mtnfp->mtnf_link))
		kmem_free(mtnfp, sizeof (mac_tx_notify_cb_t));
	else
		mac_callback_remove_wait(&mcip->mci_tx_notify_cb_info);
	mutex_exit(mcbi->mcbi_lockp);
}

/*
 * mac_client_tx_notify():
 * Adds a flow control callback routine when callb_func is non-NULL;
 * otherwise removes the callback identified by ptr.
 */
mac_tx_notify_handle_t
mac_client_tx_notify(mac_client_handle_t mch, mac_tx_notify_t callb_func,
    void *ptr)
{
	mac_client_impl_t *mcip = (mac_client_impl_t *)mch;
	mac_tx_notify_cb_t *mtnfp = NULL;

	i_mac_perim_enter(mcip->mci_mip);

	if (callb_func != NULL) {
		/* Add a notify callback */
		mtnfp = mac_client_tx_notify_add(mcip, callb_func, ptr);
	} else {
		mac_client_tx_notify_remove(mcip, (mac_tx_notify_cb_t *)ptr);
	}
	i_mac_perim_exit(mcip->mci_mip);

	return ((mac_tx_notify_handle_t)mtnfp);
}

void
mac_bridge_vectors(mac_bridge_tx_t txf, mac_bridge_rx_t rxf,
    mac_bridge_ref_t reff, mac_bridge_ls_t lsf)
{
	mac_bridge_tx_cb = txf;
	mac_bridge_rx_cb = rxf;
	mac_bridge_ref_cb = reff;
	mac_bridge_ls_cb = lsf;
}

int
mac_bridge_set(mac_handle_t mh, mac_handle_t link)
{
	mac_impl_t *mip = (mac_impl_t *)mh;
	int retv;

	mutex_enter(&mip->mi_bridge_lock);
	if (mip->mi_bridge_link == NULL) {
		mip->mi_bridge_link = link;
		retv = 0;
	} else {
		retv = EBUSY;
	}
	mutex_exit(&mip->mi_bridge_lock);
	if (retv == 0) {
		mac_poll_state_change(mh, B_FALSE);
		mac_capab_update(mh);
	}
	return (retv);
}

/*
 * Disable bridging on the indicated link.
 */
void
mac_bridge_clear(mac_handle_t mh, mac_handle_t link)
{
	mac_impl_t *mip = (mac_impl_t *)mh;

	mutex_enter(&mip->mi_bridge_lock);
	ASSERT(mip->mi_bridge_link == link);
	mip->mi_bridge_link = NULL;
	mutex_exit(&mip->mi_bridge_lock);
	mac_poll_state_change(mh, B_TRUE);
	mac_capab_update(mh);
}

void
mac_no_active(mac_handle_t mh)
{
	mac_impl_t *mip = (mac_impl_t *)mh;

	i_mac_perim_enter(mip);
	mip->mi_state_flags |= MIS_NO_ACTIVE;
	i_mac_perim_exit(mip);
}