/*
 * Copyright © 2014 Intel Corporation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice (including the next
 * paragraph) shall be included in all copies or substantial portions of the
 * Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 * IN THE SOFTWARE.
 *
 * Authors:
 *    Ben Widawsky <ben@bwidawsk.net>
 *    Michel Thierry <michel.thierry@intel.com>
 *    Thomas Daniel <thomas.daniel@intel.com>
 *    Oscar Mateo <oscar.mateo@intel.com>
 *
 */

/**
 * DOC: Logical Rings, Logical Ring Contexts and Execlists
 *
 * Motivation:
 * GEN8 brings an expansion of the HW contexts: "Logical Ring Contexts".
 * These expanded contexts enable a number of new abilities, especially
 * "Execlists" (also implemented in this file).
 *
 * One of the main differences with the legacy HW contexts is that logical
 * ring contexts incorporate many more things into the context's state, like
 * PDPs or ringbuffer control registers:
 *
 * The reason why PDPs are included in the context is straightforward: as
 * PPGTTs (per-process GTTs) are actually per-context, having the PDPs
 * contained there means you don't need to do a ppgtt->switch_mm yourself;
 * instead, the GPU will do it for you on the context switch.
 *
 * But what about the ringbuffer control registers (head, tail, etc.)?
 * Shouldn't we just need a set of those per engine command streamer? This is
 * where the name "Logical Rings" starts to make sense: by virtualizing the
 * rings, the engine cs shifts to a new "ring buffer" with every context
 * switch. When you want to submit a workload to the GPU you: A) choose your
 * context, B) find its appropriate virtualized ring, C) write commands to it
 * and then, finally, D) tell the GPU to switch to that context.
 *
 * Instead of the legacy MI_SET_CONTEXT, the way you tell the GPU to switch
 * to a context is via a context execution list, ergo "Execlists".
 *
 * LRC implementation:
 * Regarding the creation of contexts, we have:
 *
 * - One global default context.
 * - One local default context for each opened fd.
 * - One local extra context for each context create ioctl call.
 *
 * Now that ringbuffers belong per-context (and not per-engine, like before)
 * and that contexts are uniquely tied to a given engine (and not reusable,
 * like before) we need:
 *
 * - One ringbuffer per-engine inside each context.
 * - One backing object per-engine inside each context.
 *
 * The global default context starts its life with these new objects fully
 * allocated and populated. The local default context for each opened fd is
 * more complex, because we don't know at creation time which engine is going
 * to use them. To handle this, we have implemented a deferred creation of LR
 * contexts:
 *
 * The local context starts its life as a hollow or blank holder, that only
 * gets populated for a given engine once we receive an execbuffer. If later
 * on we receive another execbuffer ioctl for the same context but a different
 * engine, we allocate/populate a new ringbuffer and context backing object and
 * so on.
 *
 * Finally, regarding local contexts created using the ioctl call: as they are
 * only allowed with the render ring, we can allocate & populate them right
 * away (no need to defer anything, at least for now).
 *
 * Execlists implementation:
 * Execlists are the new method by which, on gen8+ hardware, workloads are
 * submitted for execution (as opposed to the legacy, ringbuffer-based, method).
 * This method works as follows:
 *
 * When a request is committed, its commands (the BB start and any leading or
 * trailing commands, like the seqno breadcrumbs) are placed in the ringbuffer
 * for the appropriate context. The tail pointer in the hardware context is not
 * updated at this time, but instead, kept by the driver in the ringbuffer
 * structure. A structure representing this request is added to a request queue
 * for the appropriate engine: this structure contains a copy of the context's
 * tail after the request was written to the ring buffer and a pointer to the
 * context itself.
 *
 * If the engine's request queue was empty before the request was added, the
 * queue is processed immediately. Otherwise the queue will be processed during
 * a context switch interrupt. In any case, elements on the queue will get sent
 * (in pairs) to the GPU's ExecLists Submit Port (ELSP, for short) with a
 * globally unique 20-bit submission ID.
 *
 * When execution of a request completes, the GPU updates the context status
 * buffer with a context complete event and generates a context switch interrupt.
 * During the interrupt handling, the driver examines the events in the buffer:
 * for each context complete event, if the announced ID matches that on the head
 * of the request queue, then that request is retired and removed from the queue.
 *
 * After processing, if any requests were retired and the queue is not empty
 * then a new execution list can be submitted. The two requests at the front of
 * the queue are next to be submitted but since a context may not occur twice in
 * an execution list, if subsequent requests have the same ID as the first then
 * the two requests must be combined. This is done simply by discarding requests
 * at the head of the queue until either only one request is left (in which case
 * we use a NULL second context) or the first two requests have unique IDs.
 *
 * By always executing the first two requests in the queue the driver ensures
 * that the GPU is kept as busy as possible. In the case where a single context
 * completes but a second context is still executing, the request for this second
 * context will be at the head of the queue when we remove the first one. This
 * request will then be resubmitted along with a new request for a different context,
 * which will cause the hardware to continue executing the second request and queue
 * the new request (the GPU detects the condition of a context getting preempted
 * with the same context and optimizes the context switch flow by not doing
 * preemption, but just sampling the new tail pointer).
 *
 */

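/*
 * For illustration: if the execlist queue for an engine holds requests A1,
 * A2 and B1, where A1 and A2 share a context and B1 uses another, A1 is
 * discarded in favour of A2 (whose tail already covers A1's commands) and
 * the pair (A2, B1) is written to the ELSP. With only A1 queued, the pair
 * (A1, NULL) is submitted instead. See execlists_context_unqueue() below.
 */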

#include <drm/drmP.h>
#include <drm/i915_drm.h>
#include "i915_drv.h"
#include "intel_drv.h"
#include "intel_mocs.h"

#define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
#define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
#define GEN8_LR_CONTEXT_OTHER_SIZE (2 * PAGE_SIZE)

#define RING_EXECLIST_QFULL		(1 << 0x2)
#define RING_EXECLIST1_VALID		(1 << 0x3)
#define RING_EXECLIST0_VALID		(1 << 0x4)
#define RING_EXECLIST_ACTIVE_STATUS	(3 << 0xE)
#define RING_EXECLIST1_ACTIVE		(1 << 0x11)
#define RING_EXECLIST0_ACTIVE		(1 << 0x12)

#define GEN8_CTX_STATUS_IDLE_ACTIVE	(1 << 0)
#define GEN8_CTX_STATUS_PREEMPTED	(1 << 1)
#define GEN8_CTX_STATUS_ELEMENT_SWITCH	(1 << 2)
#define GEN8_CTX_STATUS_ACTIVE_IDLE	(1 << 3)
#define GEN8_CTX_STATUS_COMPLETE	(1 << 4)
#define GEN8_CTX_STATUS_LITE_RESTORE	(1 << 15)

#define CTX_LRI_HEADER_0		0x01
#define CTX_CONTEXT_CONTROL		0x02
#define CTX_RING_HEAD			0x04
#define CTX_RING_TAIL			0x06
#define CTX_RING_BUFFER_START		0x08
#define CTX_RING_BUFFER_CONTROL		0x0a
#define CTX_BB_HEAD_U			0x0c
#define CTX_BB_HEAD_L			0x0e
#define CTX_BB_STATE			0x10
#define CTX_SECOND_BB_HEAD_U		0x12
#define CTX_SECOND_BB_HEAD_L		0x14
#define CTX_SECOND_BB_STATE		0x16
#define CTX_BB_PER_CTX_PTR		0x18
#define CTX_RCS_INDIRECT_CTX		0x1a
#define CTX_RCS_INDIRECT_CTX_OFFSET	0x1c
#define CTX_LRI_HEADER_1		0x21
#define CTX_CTX_TIMESTAMP		0x22
#define CTX_PDP3_UDW			0x24
#define CTX_PDP3_LDW			0x26
#define CTX_PDP2_UDW			0x28
#define CTX_PDP2_LDW			0x2a
#define CTX_PDP1_UDW			0x2c
#define CTX_PDP1_LDW			0x2e
#define CTX_PDP0_UDW			0x30
#define CTX_PDP0_LDW			0x32
#define CTX_LRI_HEADER_2		0x41
#define CTX_R_PWR_CLK_STATE		0x42
#define CTX_GPGPU_CSR_BASE_ADDRESS	0x44

#define GEN8_CTX_VALID (1<<0)
#define GEN8_CTX_FORCE_PD_RESTORE (1<<1)
#define GEN8_CTX_FORCE_RESTORE (1<<2)
#define GEN8_CTX_L3LLC_COHERENT (1<<5)
#define GEN8_CTX_PRIVILEGE (1<<8)

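/*
 * The CTX_* values above are dword offsets into the per-context register
 * state page (page 1 of the context backing object): each offset names the
 * MI_LOAD_REGISTER_IMM slot for that register, so the register's value lives
 * at offset + 1 (see execlists_update_context(), e.g.
 * reg_state[CTX_RING_TAIL+1]). The GEN8_CTX_* bits are OR'ed into the 64-bit
 * context descriptor that gets written to the ELSP; see
 * execlists_ctx_descriptor() for how the descriptor is composed.
 */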

#define ASSIGN_CTX_PDP(ppgtt, reg_state, n) { \
	const u64 _addr = i915_page_dir_dma_addr((ppgtt), (n));	\
	reg_state[CTX_PDP ## n ## _UDW+1] = upper_32_bits(_addr); \
	reg_state[CTX_PDP ## n ## _LDW+1] = lower_32_bits(_addr); \
}

enum {
	ADVANCED_CONTEXT = 0,
	LEGACY_CONTEXT,
	ADVANCED_AD_CONTEXT,
	LEGACY_64B_CONTEXT
};
#define GEN8_CTX_MODE_SHIFT 3
enum {
	FAULT_AND_HANG = 0,
	FAULT_AND_HALT, /* Debug only */
	FAULT_AND_STREAM,
	FAULT_AND_CONTINUE /* Unsupported */
};
#define GEN8_CTX_ID_SHIFT 32
#define CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT  0x17

static int intel_lr_context_pin(struct drm_i915_gem_request *rq);

/**
 * intel_sanitize_enable_execlists() - sanitize i915.enable_execlists
 * @dev: DRM device.
 * @enable_execlists: value of i915.enable_execlists module parameter.
 *
 * Only certain platforms support Execlists (the prerequisites being
 * support for Logical Ring Contexts and Aliasing PPGTT or better).
 *
 * Return: 1 if Execlists is supported and has to be enabled.
 */
int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists)
{
	WARN_ON(i915.enable_ppgtt == -1);

	if (INTEL_INFO(dev)->gen >= 9)
		return 1;

	if (enable_execlists == 0)
		return 0;

	if (HAS_LOGICAL_RING_CONTEXTS(dev) && USES_PPGTT(dev) &&
	    i915.use_mmio_flip >= 0)
		return 1;

	return 0;
}

/**
 * intel_execlists_ctx_id() - get the Execlists Context ID
 * @ctx_obj: Logical Ring Context backing object.
 *
 * Do not confuse with ctx->id! Unfortunately we have a name overload
 * here: the old context ID we pass to userspace as a handler so that
 * they can refer to a context, and the new context ID we pass to the
 * ELSP so that the GPU can inform us of the context status via
 * interrupts.
 *
 * Return: 20-bit globally unique context ID.
 */
u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj)
{
	u32 lrca = i915_gem_obj_ggtt_offset(ctx_obj);

	/* LRCA is required to be 4K aligned so the more significant 20 bits
	 * are globally unique */
	return lrca >> 12;
}

static uint64_t execlists_ctx_descriptor(struct drm_i915_gem_request *rq)
{
	struct intel_engine_cs *ring = rq->ring;
	struct drm_device *dev = ring->dev;
	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
	uint64_t desc;
	uint64_t lrca = i915_gem_obj_ggtt_offset(ctx_obj);

	WARN_ON(lrca & 0xFFFFFFFF00000FFFULL);

	desc = GEN8_CTX_VALID;
	desc |= LEGACY_CONTEXT << GEN8_CTX_MODE_SHIFT;
	if (IS_GEN8(ctx_obj->base.dev))
		desc |= GEN8_CTX_L3LLC_COHERENT;
	desc |= GEN8_CTX_PRIVILEGE;
	desc |= lrca;
	desc |= (u64)intel_execlists_ctx_id(ctx_obj) << GEN8_CTX_ID_SHIFT;

	/* TODO: WaDisableLiteRestore when we start using semaphore
	 * signalling between Command Streamers */
	/* desc |= GEN8_CTX_FORCE_RESTORE; */

	/* WaEnableForceRestoreInCtxtDescForVCS:skl */
	if (IS_GEN9(dev) &&
	    INTEL_REVID(dev) <= SKL_REVID_B0 &&
	    (ring->id == BCS || ring->id == VCS ||
	     ring->id == VECS || ring->id == VCS2))
		desc |= GEN8_CTX_FORCE_RESTORE;

	return desc;
}

static void execlists_elsp_write(struct drm_i915_gem_request *rq0,
				 struct drm_i915_gem_request *rq1)
{
	struct intel_engine_cs *ring = rq0->ring;
	struct drm_device *dev = ring->dev;
	struct drm_i915_private *dev_priv = dev->dev_private;
	uint64_t desc[2];

	if (rq1) {
		desc[1] = execlists_ctx_descriptor(rq1);
		rq1->elsp_submitted++;
	} else {
		desc[1] = 0;
	}

	desc[0] = execlists_ctx_descriptor(rq0);
	rq0->elsp_submitted++;

	/* You must always write both descriptors in the order below. */
	lockmgr(&dev_priv->uncore.lock, LK_EXCLUSIVE);
	intel_uncore_forcewake_get__locked(dev_priv, FORCEWAKE_ALL);
	I915_WRITE_FW(RING_ELSP(ring), upper_32_bits(desc[1]));
	I915_WRITE_FW(RING_ELSP(ring), lower_32_bits(desc[1]));

	I915_WRITE_FW(RING_ELSP(ring), upper_32_bits(desc[0]));
	/* The context is automatically loaded after the following */
	I915_WRITE_FW(RING_ELSP(ring), lower_32_bits(desc[0]));

	/* ELSP is a write-only register, so use another nearby register for posting */
	POSTING_READ_FW(RING_EXECLIST_STATUS(ring));
	intel_uncore_forcewake_put__locked(dev_priv, FORCEWAKE_ALL);
	lockmgr(&dev_priv->uncore.lock, LK_RELEASE);
}

static int execlists_update_context(struct drm_i915_gem_request *rq)
{
	struct intel_engine_cs *ring = rq->ring;
	struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
	struct drm_i915_gem_object *rb_obj = rq->ringbuf->obj;
	struct vm_page *page;
	uint32_t *reg_state;

	BUG_ON(!ctx_obj);
	WARN_ON(!i915_gem_obj_is_pinned(ctx_obj));
	WARN_ON(!i915_gem_obj_is_pinned(rb_obj));

	page = i915_gem_object_get_page(ctx_obj, 1);
	reg_state = kmap_atomic(page);

	reg_state[CTX_RING_TAIL+1] = rq->tail;
	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);

	/* True PPGTT with dynamic page allocation: update PDP registers and
	 * point the unallocated PDPs to the scratch page
	 */
	if (ppgtt) {
		ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
		ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
		ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
		ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
	}

	kunmap_atomic(reg_state);

	return 0;
}

static void execlists_submit_requests(struct drm_i915_gem_request *rq0,
				      struct drm_i915_gem_request *rq1)
{
	execlists_update_context(rq0);

	if (rq1)
		execlists_update_context(rq1);

	execlists_elsp_write(rq0, rq1);
}

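/*
 * Dequeueing path, in outline: pick up to two requests with distinct
 * contexts from the head of the execlist queue, retiring any older request
 * that shares a context with a newer one (the newer tail already covers its
 * commands). On gen8/gen9, a request being resubmitted gets its tail bumped
 * past the padding NOOPs (WaIdleLiteRestore) so a lite restore never sees
 * HEAD == TAIL.
 */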
static void execlists_context_unqueue(struct intel_engine_cs *ring)
{
	struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
	struct drm_i915_gem_request *cursor = NULL, *tmp = NULL;

	assert_spin_locked(&ring->execlist_lock);

	/*
	 * If irqs are not active generate a warning as batches that finish
	 * without the irqs may get lost and a GPU Hang may occur.
	 */
	WARN_ON(!intel_irqs_enabled(ring->dev->dev_private));

	if (list_empty(&ring->execlist_queue))
		return;

	/* Try to read in pairs */
	list_for_each_entry_safe(cursor, tmp, &ring->execlist_queue,
				 execlist_link) {
		if (!req0) {
			req0 = cursor;
		} else if (req0->ctx == cursor->ctx) {
			/* Same ctx: ignore first request, as second request
			 * will update tail past first request's workload */
			cursor->elsp_submitted = req0->elsp_submitted;
			list_del(&req0->execlist_link);
			list_add_tail(&req0->execlist_link,
				&ring->execlist_retired_req_list);
			req0 = cursor;
		} else {
			req1 = cursor;
			break;
		}
	}

	if (IS_GEN8(ring->dev) || IS_GEN9(ring->dev)) {
		/*
		 * WaIdleLiteRestore: make sure we never cause a lite
		 * restore with HEAD==TAIL
		 */
		if (req0->elsp_submitted) {
			/*
			 * Apply the wa NOOPS to prevent ring:HEAD == req:TAIL
			 * as we resubmit the request. See gen8_emit_request()
			 * for where we prepare the padding after the end of the
			 * request.
			 */
			struct intel_ringbuffer *ringbuf;

			ringbuf = req0->ctx->engine[ring->id].ringbuf;
			req0->tail += 8;
			req0->tail &= ringbuf->size - 1;
		}
	}

	WARN_ON(req1 && req1->elsp_submitted);

	execlists_submit_requests(req0, req1);
}

static bool execlists_check_remove_request(struct intel_engine_cs *ring,
					   u32 request_id)
{
	struct drm_i915_gem_request *head_req;

	assert_spin_locked(&ring->execlist_lock);

	head_req = list_first_entry_or_null(&ring->execlist_queue,
					    struct drm_i915_gem_request,
					    execlist_link);

	if (head_req != NULL) {
		struct drm_i915_gem_object *ctx_obj =
				head_req->ctx->engine[ring->id].state;
		if (intel_execlists_ctx_id(ctx_obj) == request_id) {
			WARN(head_req->elsp_submitted == 0,
			     "Never submitted head request\n");

			if (--head_req->elsp_submitted <= 0) {
				list_del(&head_req->execlist_link);
				list_add_tail(&head_req->execlist_link,
					&ring->execlist_retired_req_list);
				return true;
			}
		}
	}

	return false;
}

/**
 * intel_lrc_irq_handler() - handle Context Switch interrupts
 * @ring: Engine Command Streamer to handle.
 *
 * Check the unread Context Status Buffers and manage the submission of new
 * contexts to the ELSP accordingly.
 */
void intel_lrc_irq_handler(struct intel_engine_cs *ring)
{
	struct drm_i915_private *dev_priv = ring->dev->dev_private;
	u32 status_pointer;
	u8 read_pointer;
	u8 write_pointer;
	u32 status;
	u32 status_id;
	u32 submit_contexts = 0;

	status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));

	read_pointer = ring->next_context_status_buffer;
	write_pointer = status_pointer & GEN8_CSB_PTR_MASK;
	if (read_pointer > write_pointer)
		write_pointer += GEN8_CSB_ENTRIES;

	lockmgr(&ring->execlist_lock, LK_EXCLUSIVE);

	while (read_pointer < write_pointer) {
		read_pointer++;
		status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
				(read_pointer % GEN8_CSB_ENTRIES) * 8);
		status_id = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
				(read_pointer % GEN8_CSB_ENTRIES) * 8 + 4);

		if (status & GEN8_CTX_STATUS_IDLE_ACTIVE)
			continue;

		if (status & GEN8_CTX_STATUS_PREEMPTED) {
			if (status & GEN8_CTX_STATUS_LITE_RESTORE) {
				if (execlists_check_remove_request(ring, status_id))
					WARN(1, "Lite Restored request removed from queue\n");
			} else
				WARN(1, "Preemption without Lite Restore\n");
		}

		if ((status & GEN8_CTX_STATUS_ACTIVE_IDLE) ||
		    (status & GEN8_CTX_STATUS_ELEMENT_SWITCH)) {
			if (execlists_check_remove_request(ring, status_id))
				submit_contexts++;
		}
	}

	if (submit_contexts != 0)
		execlists_context_unqueue(ring);

	lockmgr(&ring->execlist_lock, LK_RELEASE);

	WARN(submit_contexts > 2, "More than two context complete events?\n");
	ring->next_context_status_buffer = write_pointer % GEN8_CSB_ENTRIES;

	I915_WRITE(RING_CONTEXT_STATUS_PTR(ring),
		   _MASKED_FIELD(GEN8_CSB_PTR_MASK << 8,
				 ((u32)ring->next_context_status_buffer &
				  GEN8_CSB_PTR_MASK) << 8));
}

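/*
 * Queueing path, in outline: pin the context (unless it is the default one),
 * take a reference on the request, snapshot the ringbuffer tail and append
 * the request to the per-engine execlist queue. If the queue is already more
 * than two deep and the last queued request uses the same context, that older
 * request is retired in favour of the new one; if the queue was empty
 * beforehand, the new request is submitted to the ELSP immediately.
 */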
static int execlists_context_queue(struct drm_i915_gem_request *request)
{
	struct intel_engine_cs *ring = request->ring;
	struct drm_i915_gem_request *cursor;
	int num_elements = 0;

	if (request->ctx != ring->default_context)
		intel_lr_context_pin(request);

	i915_gem_request_reference(request);

	request->tail = request->ringbuf->tail;

	lockmgr(&ring->execlist_lock, LK_EXCLUSIVE);

	list_for_each_entry(cursor, &ring->execlist_queue, execlist_link)
		if (++num_elements > 2)
			break;

	if (num_elements > 2) {
		struct drm_i915_gem_request *tail_req;

		tail_req = list_last_entry(&ring->execlist_queue,
					   struct drm_i915_gem_request,
					   execlist_link);

		if (request->ctx == tail_req->ctx) {
			WARN(tail_req->elsp_submitted != 0,
				"More than 2 already-submitted reqs queued\n");
			list_del(&tail_req->execlist_link);
			list_add_tail(&tail_req->execlist_link,
				&ring->execlist_retired_req_list);
		}
	}

	list_add_tail(&request->execlist_link, &ring->execlist_queue);
	if (num_elements == 0)
		execlists_context_unqueue(ring);

	lockmgr(&ring->execlist_lock, LK_RELEASE);

	return 0;
}

static int logical_ring_invalidate_all_caches(struct drm_i915_gem_request *req)
{
	struct intel_engine_cs *ring = req->ring;
	uint32_t flush_domains;
	int ret;

	flush_domains = 0;
	if (ring->gpu_caches_dirty)
		flush_domains = I915_GEM_GPU_DOMAINS;

	ret = ring->emit_flush(req, I915_GEM_GPU_DOMAINS, flush_domains);
	if (ret)
		return ret;

	ring->gpu_caches_dirty = false;
	return 0;
}

static int execlists_move_to_gpu(struct drm_i915_gem_request *req,
				 struct list_head *vmas)
{
	const unsigned other_rings = ~intel_ring_flag(req->ring);
	struct i915_vma *vma;
	uint32_t flush_domains = 0;
	bool flush_chipset = false;
	int ret;

	list_for_each_entry(vma, vmas, exec_list) {
		struct drm_i915_gem_object *obj = vma->obj;

		if (obj->active & other_rings) {
			ret = i915_gem_object_sync(obj, req->ring, &req);
			if (ret)
				return ret;
		}

		if (obj->base.write_domain & I915_GEM_DOMAIN_CPU)
			flush_chipset |= i915_gem_clflush_object(obj, false);

		flush_domains |= obj->base.write_domain;
	}

	if (flush_domains & I915_GEM_DOMAIN_GTT)
		wmb();

	/* Unconditionally invalidate gpu caches and ensure that we do flush
	 * any residual writes from the previous batch.
	 */
	return logical_ring_invalidate_all_caches(req);
}

int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request)
{
	int ret;

	request->ringbuf = request->ctx->engine[request->ring->id].ringbuf;

	if (request->ctx != request->ring->default_context) {
		ret = intel_lr_context_pin(request);
		if (ret)
			return ret;
	}

	return 0;
}

static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
				       int bytes)
{
	struct intel_ringbuffer *ringbuf = req->ringbuf;
	struct intel_engine_cs *ring = req->ring;
	struct drm_i915_gem_request *target;
	unsigned space;
	int ret;

	if (intel_ring_space(ringbuf) >= bytes)
		return 0;

	/* The whole point of reserving space is to not wait! */
	WARN_ON(ringbuf->reserved_in_use);

	list_for_each_entry(target, &ring->request_list, list) {
		/*
		 * The request queue is per-engine, so can contain requests
		 * from multiple ringbuffers. Here, we must ignore any that
		 * aren't from the ringbuffer we're considering.
		 */
		if (target->ringbuf != ringbuf)
			continue;

		/* Would completion of this request free enough space? */
		space = __intel_ring_space(target->postfix, ringbuf->tail,
					   ringbuf->size);
		if (space >= bytes)
			break;
	}

	if (WARN_ON(&target->list == &ring->request_list))
		return -ENOSPC;

	ret = i915_wait_request(target);
	if (ret)
		return ret;

	ringbuf->space = space;
	return 0;
}

/*
 * intel_logical_ring_advance_and_submit() - advance the tail and submit the workload
 * @request: Request to advance the logical ringbuffer of.
 *
 * The tail is updated in our logical ringbuffer struct, not in the actual context. What
 * really happens during submission is that the context and current tail will be placed
 * on a queue waiting for the ELSP to be ready to accept a new context submission. At that
 * point, the tail *inside* the context is updated and the ELSP written to.
 */
static void
intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
{
	struct intel_engine_cs *ring = request->ring;

	intel_logical_ring_advance(request->ringbuf);

	if (intel_ring_stopped(ring))
		return;

	execlists_context_queue(request);
}

static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
{
	uint32_t __iomem *virt;
	int rem = ringbuf->size - ringbuf->tail;

	virt = (uint32_t *)(ringbuf->virtual_start + ringbuf->tail);
	rem /= 4;
	while (rem--)
		iowrite32(MI_NOOP, virt++);

	ringbuf->tail = 0;
	intel_ring_update_space(ringbuf);
}

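/*
 * Ring-space accounting for the function below, in outline: the request
 * needs "bytes" of new commands plus (unless already claimed) the reserved
 * completion space. If that does not fit before effective_size, the tail
 * must wrap, so we wait until the remainder up to the physical end plus the
 * needed bytes are free and then pad to the end with MI_NOOPs via
 * __wrap_ring_buffer(); otherwise we only wait for enough space in place.
 */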
static int logical_ring_prepare(struct drm_i915_gem_request *req, int bytes)
{
	struct intel_ringbuffer *ringbuf = req->ringbuf;
	int remain_usable = ringbuf->effective_size - ringbuf->tail;
	int remain_actual = ringbuf->size - ringbuf->tail;
	int ret, total_bytes, wait_bytes = 0;
	bool need_wrap = false;

	if (ringbuf->reserved_in_use)
		total_bytes = bytes;
	else
		total_bytes = bytes + ringbuf->reserved_size;

	if (unlikely(bytes > remain_usable)) {
		/*
		 * Not enough space for the basic request. So need to flush
		 * out the remainder and then wait for base + reserved.
		 */
		wait_bytes = remain_actual + total_bytes;
		need_wrap = true;
	} else {
		if (unlikely(total_bytes > remain_usable)) {
			/*
			 * The base request will fit but the reserved space
			 * falls off the end. So only need to wait for the
			 * reserved size after flushing out the remainder.
			 */
			wait_bytes = remain_actual + ringbuf->reserved_size;
			need_wrap = true;
		} else if (total_bytes > ringbuf->space) {
			/* No wrapping required, just waiting. */
			wait_bytes = total_bytes;
		}
	}

	if (wait_bytes) {
		ret = logical_ring_wait_for_space(req, wait_bytes);
		if (unlikely(ret))
			return ret;

		if (need_wrap)
			__wrap_ring_buffer(ringbuf);
	}

	return 0;
}

/**
 * intel_logical_ring_begin() - prepare the logical ringbuffer to accept some commands
 *
 * @req: The request to start some new work for
 * @num_dwords: number of DWORDs that we plan to write to the ringbuffer.
 *
 * The ringbuffer might not be ready to accept the commands right away (maybe it needs to
 * be wrapped, or wait a bit for the tail to be updated). This function takes care of that
 * and also preallocates a request (every workload submission is still mediated through
 * requests, same as it did with legacy ringbuffer submission).
 *
 * Return: non-zero if the ringbuffer is not ready to be written to.
 */
int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
{
	struct drm_i915_private *dev_priv;
	int ret;

	WARN_ON(req == NULL);
	dev_priv = req->ring->dev->dev_private;

	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
				   dev_priv->mm.interruptible);
	if (ret)
		return ret;

	ret = logical_ring_prepare(req, num_dwords * sizeof(uint32_t));
	if (ret)
		return ret;

	req->ringbuf->space -= num_dwords * sizeof(uint32_t);
	return 0;
}

int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
{
	/*
	 * The first call merely notes the reserve request and is common for
	 * all back ends. The subsequent localised _begin() call actually
	 * ensures that the reservation is available. Without the begin, if
	 * the request creator immediately submitted the request without
	 * adding any commands to it then there might not actually be
	 * sufficient room for the submission commands.
	 */
	intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);

	return intel_logical_ring_begin(request, 0);
}

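/*
 * Typical usage of the helpers above (a sketch, mirroring the INSTPM LRI
 * emission done further down in this file): reserve space for the dwords,
 * emit them, then advance the tail.
 *
 *	ret = intel_logical_ring_begin(req, 4);
 *	if (ret)
 *		return ret;
 *	intel_logical_ring_emit(ringbuf, MI_NOOP);
 *	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
 *	intel_logical_ring_emit(ringbuf, INSTPM);
 *	intel_logical_ring_emit(ringbuf, instp_mask << 16 | instp_mode);
 *	intel_logical_ring_advance(ringbuf);
 */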
/**
 * execlists_submission() - submit a batchbuffer for execution, Execlists style
 * @dev: DRM device.
 * @file: DRM file.
 * @ring: Engine Command Streamer to submit to.
 * @ctx: Context to employ for this submission.
 * @args: execbuffer call arguments.
 * @vmas: list of vmas.
 * @batch_obj: the batchbuffer to submit.
 * @exec_start: batchbuffer start virtual address pointer.
 * @dispatch_flags: translated execbuffer call flags.
 *
 * This is the evil twin version of i915_gem_ringbuffer_submission. It abstracts
 * away the submission details of the execbuffer ioctl call.
 *
 * Return: non-zero if the submission fails.
 */
int intel_execlists_submission(struct i915_execbuffer_params *params,
			       struct drm_i915_gem_execbuffer2 *args,
			       struct list_head *vmas)
{
	struct drm_device *dev = params->dev;
	struct intel_engine_cs *ring = params->ring;
	struct drm_i915_private *dev_priv = dev->dev_private;
	struct intel_ringbuffer *ringbuf = params->ctx->engine[ring->id].ringbuf;
	u64 exec_start;
	int instp_mode;
	u32 instp_mask;
	int ret;

	instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK;
	instp_mask = I915_EXEC_CONSTANTS_MASK;
	switch (instp_mode) {
	case I915_EXEC_CONSTANTS_REL_GENERAL:
	case I915_EXEC_CONSTANTS_ABSOLUTE:
	case I915_EXEC_CONSTANTS_REL_SURFACE:
		if (instp_mode != 0 && ring != &dev_priv->ring[RCS]) {
			DRM_DEBUG("non-0 rel constants mode on non-RCS\n");
			return -EINVAL;
		}

		if (instp_mode != dev_priv->relative_constants_mode) {
			if (instp_mode == I915_EXEC_CONSTANTS_REL_SURFACE) {
				DRM_DEBUG("rel surface constants mode invalid on gen5+\n");
				return -EINVAL;
			}

			/* The HW changed the meaning on this bit on gen6 */
			instp_mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
		}
		break;
	default:
		DRM_DEBUG("execbuf with unknown constants: %d\n", instp_mode);
		return -EINVAL;
	}

	if (args->num_cliprects != 0) {
		DRM_DEBUG("clip rectangles are only valid on pre-gen5\n");
		return -EINVAL;
	} else {
		if (args->DR4 == 0xffffffff) {
			DRM_DEBUG("UXA submitting garbage DR4, fixing up\n");
			args->DR4 = 0;
		}

		if (args->DR1 || args->DR4 || args->cliprects_ptr) {
			DRM_DEBUG("0 cliprects but dirt in cliprects fields\n");
			return -EINVAL;
		}
	}

	if (args->flags & I915_EXEC_GEN7_SOL_RESET) {
		DRM_DEBUG("sol reset is gen7 only\n");
		return -EINVAL;
	}

	ret = execlists_move_to_gpu(params->request, vmas);
	if (ret)
		return ret;

	if (ring == &dev_priv->ring[RCS] &&
	    instp_mode != dev_priv->relative_constants_mode) {
		ret = intel_logical_ring_begin(params->request, 4);
		if (ret)
			return ret;

		intel_logical_ring_emit(ringbuf, MI_NOOP);
		intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
		intel_logical_ring_emit(ringbuf, INSTPM);
		intel_logical_ring_emit(ringbuf, instp_mask << 16 | instp_mode);
		intel_logical_ring_advance(ringbuf);

		dev_priv->relative_constants_mode = instp_mode;
	}

	exec_start = params->batch_obj_vm_offset +
		     args->batch_start_offset;

	ret = ring->emit_bb_start(params->request, exec_start, params->dispatch_flags);
	if (ret)
		return ret;

	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);

	i915_gem_execbuffer_move_to_active(vmas, params->request);
	i915_gem_execbuffer_retire_commands(params);

	return 0;
}

void intel_execlists_retire_requests(struct intel_engine_cs *ring)
{
	struct drm_i915_gem_request *req, *tmp;
	struct list_head retired_list;

	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
	if (list_empty(&ring->execlist_retired_req_list))
		return;

	INIT_LIST_HEAD(&retired_list);
	lockmgr(&ring->execlist_lock, LK_EXCLUSIVE);
	list_replace_init(&ring->execlist_retired_req_list, &retired_list);
	lockmgr(&ring->execlist_lock, LK_RELEASE);

	list_for_each_entry_safe(req, tmp, &retired_list, execlist_link) {
		struct intel_context *ctx = req->ctx;
		struct drm_i915_gem_object *ctx_obj =
				ctx->engine[ring->id].state;

		if (ctx_obj && (ctx != ring->default_context))
			intel_lr_context_unpin(req);
		list_del(&req->execlist_link);
		i915_gem_request_unreference(req);
	}
}

void intel_logical_ring_stop(struct intel_engine_cs *ring)
{
	struct drm_i915_private *dev_priv = ring->dev->dev_private;
	int ret;

	if (!intel_ring_initialized(ring))
		return;

	ret = intel_ring_idle(ring);
	if (ret && !i915_reset_in_progress(&to_i915(ring->dev)->gpu_error))
		DRM_ERROR("failed to quiesce %s whilst cleaning up: %d\n",
			  ring->name, ret);

	/* TODO: Is this correct with Execlists enabled? */
	I915_WRITE_MODE(ring, _MASKED_BIT_ENABLE(STOP_RING));
	if (wait_for_atomic((I915_READ_MODE(ring) & MODE_IDLE) != 0, 1000)) {
		DRM_ERROR("%s :timed out trying to stop ring\n", ring->name);
		return;
	}
	I915_WRITE_MODE(ring, _MASKED_BIT_DISABLE(STOP_RING));
}

int logical_ring_flush_all_caches(struct drm_i915_gem_request *req)
{
	struct intel_engine_cs *ring = req->ring;
	int ret;

	if (!ring->gpu_caches_dirty)
		return 0;

	ret = ring->emit_flush(req, 0, I915_GEM_GPU_DOMAINS);
	if (ret)
		return ret;

	ring->gpu_caches_dirty = false;
	return 0;
}

static int intel_lr_context_pin(struct drm_i915_gem_request *rq)
{
	struct intel_engine_cs *ring = rq->ring;
	struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
	struct intel_ringbuffer *ringbuf = rq->ringbuf;
	int ret = 0;

	WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
	if (rq->ctx->engine[ring->id].pin_count++ == 0) {
		ret = i915_gem_obj_ggtt_pin(ctx_obj,
				GEN8_LR_CONTEXT_ALIGN, 0);
		if (ret)
			goto reset_pin_count;

		ret = intel_pin_and_map_ringbuffer_obj(ring->dev, ringbuf);
		if (ret)
			goto unpin_ctx_obj;

		ctx_obj->dirty = true;
	}

	return ret;

unpin_ctx_obj:
	i915_gem_object_ggtt_unpin(ctx_obj);
reset_pin_count:
	rq->ctx->engine[ring->id].pin_count = 0;

	return ret;
}

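/*
 * pin_count above is a plain per-engine reference count, protected by
 * struct_mutex: the first pin maps the context image and its ringbuffer
 * into the GGTT, and the matching unpin below drops those mappings once
 * the count falls back to zero.
 */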
10302c9916cdSFrançois Tigeot WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex)); 1031*a05eeebfSFrançois Tigeot if (--rq->ctx->engine[ring->id].pin_count == 0) { 10322c9916cdSFrançois Tigeot intel_unpin_ringbuffer_obj(ringbuf); 10332c9916cdSFrançois Tigeot i915_gem_object_ggtt_unpin(ctx_obj); 10342c9916cdSFrançois Tigeot } 10352c9916cdSFrançois Tigeot } 10362c9916cdSFrançois Tigeot } 10372c9916cdSFrançois Tigeot 1038*a05eeebfSFrançois Tigeot static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req) 10392c9916cdSFrançois Tigeot { 10402c9916cdSFrançois Tigeot int ret, i; 1041*a05eeebfSFrançois Tigeot struct intel_engine_cs *ring = req->ring; 1042*a05eeebfSFrançois Tigeot struct intel_ringbuffer *ringbuf = req->ringbuf; 10432c9916cdSFrançois Tigeot struct drm_device *dev = ring->dev; 10442c9916cdSFrançois Tigeot struct drm_i915_private *dev_priv = dev->dev_private; 10452c9916cdSFrançois Tigeot struct i915_workarounds *w = &dev_priv->workarounds; 10462c9916cdSFrançois Tigeot 10472c9916cdSFrançois Tigeot if (WARN_ON_ONCE(w->count == 0)) 10482c9916cdSFrançois Tigeot return 0; 10492c9916cdSFrançois Tigeot 10502c9916cdSFrançois Tigeot ring->gpu_caches_dirty = true; 1051*a05eeebfSFrançois Tigeot ret = logical_ring_flush_all_caches(req); 10522c9916cdSFrançois Tigeot if (ret) 10532c9916cdSFrançois Tigeot return ret; 10542c9916cdSFrançois Tigeot 1055*a05eeebfSFrançois Tigeot ret = intel_logical_ring_begin(req, w->count * 2 + 2); 10562c9916cdSFrançois Tigeot if (ret) 10572c9916cdSFrançois Tigeot return ret; 10582c9916cdSFrançois Tigeot 10592c9916cdSFrançois Tigeot intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(w->count)); 10602c9916cdSFrançois Tigeot for (i = 0; i < w->count; i++) { 10612c9916cdSFrançois Tigeot intel_logical_ring_emit(ringbuf, w->reg[i].addr); 10622c9916cdSFrançois Tigeot intel_logical_ring_emit(ringbuf, w->reg[i].value); 10632c9916cdSFrançois Tigeot } 10642c9916cdSFrançois Tigeot intel_logical_ring_emit(ringbuf, MI_NOOP); 10652c9916cdSFrançois Tigeot 10662c9916cdSFrançois Tigeot intel_logical_ring_advance(ringbuf); 10672c9916cdSFrançois Tigeot 10682c9916cdSFrançois Tigeot ring->gpu_caches_dirty = true; 1069*a05eeebfSFrançois Tigeot ret = logical_ring_flush_all_caches(req); 10702c9916cdSFrançois Tigeot if (ret) 10712c9916cdSFrançois Tigeot return ret; 10722c9916cdSFrançois Tigeot 10732c9916cdSFrançois Tigeot return 0; 10742c9916cdSFrançois Tigeot } 10752c9916cdSFrançois Tigeot 1076*a05eeebfSFrançois Tigeot #define wa_ctx_emit(batch, index, cmd) \ 1077*a05eeebfSFrançois Tigeot do { \ 1078*a05eeebfSFrançois Tigeot int __index = (index)++; \ 1079*a05eeebfSFrançois Tigeot if (WARN_ON(__index >= (PAGE_SIZE / sizeof(uint32_t)))) { \ 1080*a05eeebfSFrançois Tigeot return -ENOSPC; \ 1081*a05eeebfSFrançois Tigeot } \ 1082*a05eeebfSFrançois Tigeot batch[__index] = (cmd); \ 1083*a05eeebfSFrançois Tigeot } while (0) 1084*a05eeebfSFrançois Tigeot 1085*a05eeebfSFrançois Tigeot 1086*a05eeebfSFrançois Tigeot /* 1087*a05eeebfSFrançois Tigeot * In this WA we need to set GEN8_L3SQCREG4[21:21] and reset it after 1088*a05eeebfSFrançois Tigeot * PIPE_CONTROL instruction. 
This is required for the flush to happen correctly 1089*a05eeebfSFrançois Tigeot  * but there is a slight complication as this is applied in the WA batch where the 1090*a05eeebfSFrançois Tigeot  * values are only initialized once so we cannot take the register value at the 1091*a05eeebfSFrançois Tigeot  * beginning and reuse it further; hence we save its value to memory, upload a 1092*a05eeebfSFrançois Tigeot  * constant value with bit21 set and then we restore it back with the saved value. 1093*a05eeebfSFrançois Tigeot  * To simplify the WA, a constant value is formed by using the default value 1094*a05eeebfSFrançois Tigeot  * of this register. This shouldn't be a problem because we are only modifying 1095*a05eeebfSFrançois Tigeot  * it for a short period and this batch is non-preemptible. We can of course 1096*a05eeebfSFrançois Tigeot  * use additional instructions that read the actual value of the register 1097*a05eeebfSFrançois Tigeot  * at that time and set our bit of interest but it makes the WA complicated. 1098*a05eeebfSFrançois Tigeot  * 1099*a05eeebfSFrançois Tigeot  * This WA is also required for Gen9 so extracting as a function avoids 1100*a05eeebfSFrançois Tigeot  * code duplication. 1101*a05eeebfSFrançois Tigeot  */ 1102*a05eeebfSFrançois Tigeot static inline int gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *ring, 1103*a05eeebfSFrançois Tigeot  uint32_t *const batch, 1104*a05eeebfSFrançois Tigeot  uint32_t index) 1105*a05eeebfSFrançois Tigeot { 1106*a05eeebfSFrançois Tigeot  uint32_t l3sqc4_flush = (0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES); 1107*a05eeebfSFrançois Tigeot  1108*a05eeebfSFrançois Tigeot  /* 1109*a05eeebfSFrançois Tigeot  * WaDisableLSQCROPERFforOCL:skl 1110*a05eeebfSFrançois Tigeot  * This WA is implemented in skl_init_clock_gating() but since 1111*a05eeebfSFrançois Tigeot  * this batch updates GEN8_L3SQCREG4 with default value we need to 1112*a05eeebfSFrançois Tigeot  * set this bit here to retain the WA during flush.
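/*
 * Illustrative sketch only (not part of the driver): the save/modify/restore
 * pattern described in the comment above, written as a tiny standalone helper.
 * The opcode values and field names below are invented placeholders, not the
 * real MI_* encodings; the actual emission is done with wa_ctx_emit() in
 * gen8_emit_flush_coherentl3_wa() itself.
 */
#include <stdint.h>

#define SKETCH_OP_SAVE_REG	0x1u	/* stands in for MI_STORE_REGISTER_MEM */
#define SKETCH_OP_LOAD_IMM	0x2u	/* stands in for MI_LOAD_REGISTER_IMM */
#define SKETCH_OP_RESTORE_REG	0x3u	/* stands in for MI_LOAD_REGISTER_MEM */

/* Emit the three-step sequence into a dword buffer, return the new index. */
static uint32_t sketch_save_modify_restore(uint32_t *buf, uint32_t i,
					   uint32_t reg, uint32_t scratch,
					   uint32_t constant_with_bit21)
{
	buf[i++] = SKETCH_OP_SAVE_REG;    buf[i++] = reg; buf[i++] = scratch;
	buf[i++] = SKETCH_OP_LOAD_IMM;    buf[i++] = reg; buf[i++] = constant_with_bit21;
	/* ... the PIPE_CONTROL flush would be emitted at this point ... */
	buf[i++] = SKETCH_OP_RESTORE_REG; buf[i++] = reg; buf[i++] = scratch;
	return i;
}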
1113*a05eeebfSFrançois Tigeot */ 1114*a05eeebfSFrançois Tigeot if (IS_SKYLAKE(ring->dev) && INTEL_REVID(ring->dev) <= SKL_REVID_E0) 1115*a05eeebfSFrançois Tigeot l3sqc4_flush |= GEN8_LQSC_RO_PERF_DIS; 1116*a05eeebfSFrançois Tigeot 1117*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, (MI_STORE_REGISTER_MEM_GEN8(1) | 1118*a05eeebfSFrançois Tigeot MI_SRM_LRM_GLOBAL_GTT)); 1119*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, GEN8_L3SQCREG4); 1120*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, ring->scratch.gtt_offset + 256); 1121*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, 0); 1122*a05eeebfSFrançois Tigeot 1123*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, MI_LOAD_REGISTER_IMM(1)); 1124*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, GEN8_L3SQCREG4); 1125*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, l3sqc4_flush); 1126*a05eeebfSFrançois Tigeot 1127*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, GFX_OP_PIPE_CONTROL(6)); 1128*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, (PIPE_CONTROL_CS_STALL | 1129*a05eeebfSFrançois Tigeot PIPE_CONTROL_DC_FLUSH_ENABLE)); 1130*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, 0); 1131*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, 0); 1132*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, 0); 1133*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, 0); 1134*a05eeebfSFrançois Tigeot 1135*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, (MI_LOAD_REGISTER_MEM_GEN8(1) | 1136*a05eeebfSFrançois Tigeot MI_SRM_LRM_GLOBAL_GTT)); 1137*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, GEN8_L3SQCREG4); 1138*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, ring->scratch.gtt_offset + 256); 1139*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, 0); 1140*a05eeebfSFrançois Tigeot 1141*a05eeebfSFrançois Tigeot return index; 1142*a05eeebfSFrançois Tigeot } 1143*a05eeebfSFrançois Tigeot 1144*a05eeebfSFrançois Tigeot static inline uint32_t wa_ctx_start(struct i915_wa_ctx_bb *wa_ctx, 1145*a05eeebfSFrançois Tigeot uint32_t offset, 1146*a05eeebfSFrançois Tigeot uint32_t start_alignment) 1147*a05eeebfSFrançois Tigeot { 1148*a05eeebfSFrançois Tigeot return wa_ctx->offset = ALIGN(offset, start_alignment); 1149*a05eeebfSFrançois Tigeot } 1150*a05eeebfSFrançois Tigeot 1151*a05eeebfSFrançois Tigeot static inline int wa_ctx_end(struct i915_wa_ctx_bb *wa_ctx, 1152*a05eeebfSFrançois Tigeot uint32_t offset, 1153*a05eeebfSFrançois Tigeot uint32_t size_alignment) 1154*a05eeebfSFrançois Tigeot { 1155*a05eeebfSFrançois Tigeot wa_ctx->size = offset - wa_ctx->offset; 1156*a05eeebfSFrançois Tigeot 1157*a05eeebfSFrançois Tigeot WARN(wa_ctx->size % size_alignment, 1158*a05eeebfSFrançois Tigeot "wa_ctx_bb failed sanity checks: size %d is not aligned to %d\n", 1159*a05eeebfSFrançois Tigeot wa_ctx->size, size_alignment); 1160*a05eeebfSFrançois Tigeot return 0; 1161*a05eeebfSFrançois Tigeot } 1162*a05eeebfSFrançois Tigeot 1163*a05eeebfSFrançois Tigeot /** 1164*a05eeebfSFrançois Tigeot * gen8_init_indirectctx_bb() - initialize indirect ctx batch with WA 1165*a05eeebfSFrançois Tigeot * 1166*a05eeebfSFrançois Tigeot * @ring: only applicable for RCS 1167*a05eeebfSFrançois Tigeot * @wa_ctx: structure representing wa_ctx 1168*a05eeebfSFrançois Tigeot * offset: specifies start of the batch, should be cache-aligned. This is updated 1169*a05eeebfSFrançois Tigeot * with the offset value received as input. 
1170*a05eeebfSFrançois Tigeot * size: size of the batch in DWORDS but HW expects in terms of cachelines 1171*a05eeebfSFrançois Tigeot * @batch: page in which WA are loaded 1172*a05eeebfSFrançois Tigeot * @offset: This field specifies the start of the batch, it should be 1173*a05eeebfSFrançois Tigeot * cache-aligned otherwise it is adjusted accordingly. 1174*a05eeebfSFrançois Tigeot * Typically we only have one indirect_ctx and per_ctx batch buffer which are 1175*a05eeebfSFrançois Tigeot * initialized at the beginning and shared across all contexts but this field 1176*a05eeebfSFrançois Tigeot * helps us to have multiple batches at different offsets and select them based 1177*a05eeebfSFrançois Tigeot * on a criteria. At the moment this batch always start at the beginning of the page 1178*a05eeebfSFrançois Tigeot * and at this point we don't have multiple wa_ctx batch buffers. 1179*a05eeebfSFrançois Tigeot * 1180*a05eeebfSFrançois Tigeot * The number of WA applied are not known at the beginning; we use this field 1181*a05eeebfSFrançois Tigeot * to return the no of DWORDS written. 1182*a05eeebfSFrançois Tigeot * 1183*a05eeebfSFrançois Tigeot * It is to be noted that this batch does not contain MI_BATCH_BUFFER_END 1184*a05eeebfSFrançois Tigeot * so it adds NOOPs as padding to make it cacheline aligned. 1185*a05eeebfSFrançois Tigeot * MI_BATCH_BUFFER_END will be added to perctx batch and both of them together 1186*a05eeebfSFrançois Tigeot * makes a complete batch buffer. 1187*a05eeebfSFrançois Tigeot * 1188*a05eeebfSFrançois Tigeot * Return: non-zero if we exceed the PAGE_SIZE limit. 1189*a05eeebfSFrançois Tigeot */ 1190*a05eeebfSFrançois Tigeot 1191*a05eeebfSFrançois Tigeot static int gen8_init_indirectctx_bb(struct intel_engine_cs *ring, 1192*a05eeebfSFrançois Tigeot struct i915_wa_ctx_bb *wa_ctx, 1193*a05eeebfSFrançois Tigeot uint32_t *const batch, 1194*a05eeebfSFrançois Tigeot uint32_t *offset) 1195*a05eeebfSFrançois Tigeot { 1196*a05eeebfSFrançois Tigeot uint32_t scratch_addr; 1197*a05eeebfSFrançois Tigeot uint32_t index = wa_ctx_start(wa_ctx, *offset, CACHELINE_DWORDS); 1198*a05eeebfSFrançois Tigeot 1199*a05eeebfSFrançois Tigeot /* WaDisableCtxRestoreArbitration:bdw,chv */ 1200*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, MI_ARB_ON_OFF | MI_ARB_DISABLE); 1201*a05eeebfSFrançois Tigeot 1202*a05eeebfSFrançois Tigeot /* WaFlushCoherentL3CacheLinesAtContextSwitch:bdw */ 1203*a05eeebfSFrançois Tigeot if (IS_BROADWELL(ring->dev)) { 1204*a05eeebfSFrançois Tigeot index = gen8_emit_flush_coherentl3_wa(ring, batch, index); 1205*a05eeebfSFrançois Tigeot if (index < 0) 1206*a05eeebfSFrançois Tigeot return index; 1207*a05eeebfSFrançois Tigeot } 1208*a05eeebfSFrançois Tigeot 1209*a05eeebfSFrançois Tigeot /* WaClearSlmSpaceAtContextSwitch:bdw,chv */ 1210*a05eeebfSFrançois Tigeot /* Actual scratch location is at 128 bytes offset */ 1211*a05eeebfSFrançois Tigeot scratch_addr = ring->scratch.gtt_offset + 2*CACHELINE_BYTES; 1212*a05eeebfSFrançois Tigeot 1213*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, GFX_OP_PIPE_CONTROL(6)); 1214*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, (PIPE_CONTROL_FLUSH_L3 | 1215*a05eeebfSFrançois Tigeot PIPE_CONTROL_GLOBAL_GTT_IVB | 1216*a05eeebfSFrançois Tigeot PIPE_CONTROL_CS_STALL | 1217*a05eeebfSFrançois Tigeot PIPE_CONTROL_QW_WRITE)); 1218*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, scratch_addr); 1219*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, 0); 1220*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, 0); 1221*a05eeebfSFrançois Tigeot 
wa_ctx_emit(batch, index, 0); 1222*a05eeebfSFrançois Tigeot 1223*a05eeebfSFrançois Tigeot /* Pad to end of cacheline */ 1224*a05eeebfSFrançois Tigeot while (index % CACHELINE_DWORDS) 1225*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, MI_NOOP); 1226*a05eeebfSFrançois Tigeot 1227*a05eeebfSFrançois Tigeot /* 1228*a05eeebfSFrançois Tigeot * MI_BATCH_BUFFER_END is not required in Indirect ctx BB because 1229*a05eeebfSFrançois Tigeot * execution depends on the length specified in terms of cache lines 1230*a05eeebfSFrançois Tigeot * in the register CTX_RCS_INDIRECT_CTX 1231*a05eeebfSFrançois Tigeot */ 1232*a05eeebfSFrançois Tigeot 1233*a05eeebfSFrançois Tigeot return wa_ctx_end(wa_ctx, *offset = index, CACHELINE_DWORDS); 1234*a05eeebfSFrançois Tigeot } 1235*a05eeebfSFrançois Tigeot 1236*a05eeebfSFrançois Tigeot /** 1237*a05eeebfSFrançois Tigeot * gen8_init_perctx_bb() - initialize per ctx batch with WA 1238*a05eeebfSFrançois Tigeot * 1239*a05eeebfSFrançois Tigeot * @ring: only applicable for RCS 1240*a05eeebfSFrançois Tigeot * @wa_ctx: structure representing wa_ctx 1241*a05eeebfSFrançois Tigeot * offset: specifies start of the batch, should be cache-aligned. 1242*a05eeebfSFrançois Tigeot * size: size of the batch in DWORDS but HW expects in terms of cachelines 1243*a05eeebfSFrançois Tigeot * @batch: page in which WA are loaded 1244*a05eeebfSFrançois Tigeot * @offset: This field specifies the start of this batch. 1245*a05eeebfSFrançois Tigeot * This batch is started immediately after indirect_ctx batch. Since we ensure 1246*a05eeebfSFrançois Tigeot * that indirect_ctx ends on a cacheline this batch is aligned automatically. 1247*a05eeebfSFrançois Tigeot * 1248*a05eeebfSFrançois Tigeot * The number of DWORDS written are returned using this field. 1249*a05eeebfSFrançois Tigeot * 1250*a05eeebfSFrançois Tigeot * This batch is terminated with MI_BATCH_BUFFER_END and so we need not add padding 1251*a05eeebfSFrançois Tigeot * to align it with cacheline as padding after MI_BATCH_BUFFER_END is redundant. 
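/*
 * Illustrative sketch only: how a batch offset is rounded up to a cacheline
 * boundary and then padded with NOOPs until its size is a whole number of
 * cachelines, mirroring wa_ctx_start(), the MI_NOOP padding loop and
 * wa_ctx_end() above. The cacheline size and the NOOP encoding are
 * simplified placeholders, not the driver's definitions.
 */
#include <stdint.h>

#define SKETCH_CACHELINE_DWORDS	16u	/* assumed dwords per cacheline */
#define SKETCH_NOOP		0u	/* stands in for MI_NOOP */

/* Equivalent of wa_ctx_start(): align the start offset up. */
static uint32_t sketch_align_up(uint32_t offset, uint32_t align)
{
	return (offset + align - 1) & ~(align - 1);
}

/* Equivalent of the padding loop: fill with NOOPs to the next cacheline. */
static uint32_t sketch_pad_to_cacheline(uint32_t *batch, uint32_t index)
{
	while (index % SKETCH_CACHELINE_DWORDS)
		batch[index++] = SKETCH_NOOP;
	return index;	/* size in dwords is now a multiple of a cacheline */
}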
1252*a05eeebfSFrançois Tigeot */ 1253*a05eeebfSFrançois Tigeot static int gen8_init_perctx_bb(struct intel_engine_cs *ring, 1254*a05eeebfSFrançois Tigeot struct i915_wa_ctx_bb *wa_ctx, 1255*a05eeebfSFrançois Tigeot uint32_t *const batch, 1256*a05eeebfSFrançois Tigeot uint32_t *offset) 1257*a05eeebfSFrançois Tigeot { 1258*a05eeebfSFrançois Tigeot uint32_t index = wa_ctx_start(wa_ctx, *offset, CACHELINE_DWORDS); 1259*a05eeebfSFrançois Tigeot 1260*a05eeebfSFrançois Tigeot /* WaDisableCtxRestoreArbitration:bdw,chv */ 1261*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, MI_ARB_ON_OFF | MI_ARB_ENABLE); 1262*a05eeebfSFrançois Tigeot 1263*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, MI_BATCH_BUFFER_END); 1264*a05eeebfSFrançois Tigeot 1265*a05eeebfSFrançois Tigeot return wa_ctx_end(wa_ctx, *offset = index, 1); 1266*a05eeebfSFrançois Tigeot } 1267*a05eeebfSFrançois Tigeot 1268*a05eeebfSFrançois Tigeot static int gen9_init_indirectctx_bb(struct intel_engine_cs *ring, 1269*a05eeebfSFrançois Tigeot struct i915_wa_ctx_bb *wa_ctx, 1270*a05eeebfSFrançois Tigeot uint32_t *const batch, 1271*a05eeebfSFrançois Tigeot uint32_t *offset) 1272*a05eeebfSFrançois Tigeot { 1273*a05eeebfSFrançois Tigeot int ret; 1274*a05eeebfSFrançois Tigeot struct drm_device *dev = ring->dev; 1275*a05eeebfSFrançois Tigeot uint32_t index = wa_ctx_start(wa_ctx, *offset, CACHELINE_DWORDS); 1276*a05eeebfSFrançois Tigeot 1277*a05eeebfSFrançois Tigeot /* WaDisableCtxRestoreArbitration:skl,bxt */ 1278*a05eeebfSFrançois Tigeot if ((IS_SKYLAKE(dev) && (INTEL_REVID(dev) <= SKL_REVID_D0)) || 1279*a05eeebfSFrançois Tigeot (IS_BROXTON(dev) && (INTEL_REVID(dev) == BXT_REVID_A0))) 1280*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, MI_ARB_ON_OFF | MI_ARB_DISABLE); 1281*a05eeebfSFrançois Tigeot 1282*a05eeebfSFrançois Tigeot /* WaFlushCoherentL3CacheLinesAtContextSwitch:skl,bxt */ 1283*a05eeebfSFrançois Tigeot ret = gen8_emit_flush_coherentl3_wa(ring, batch, index); 1284*a05eeebfSFrançois Tigeot if (ret < 0) 1285*a05eeebfSFrançois Tigeot return ret; 1286*a05eeebfSFrançois Tigeot index = ret; 1287*a05eeebfSFrançois Tigeot 1288*a05eeebfSFrançois Tigeot /* Pad to end of cacheline */ 1289*a05eeebfSFrançois Tigeot while (index % CACHELINE_DWORDS) 1290*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, MI_NOOP); 1291*a05eeebfSFrançois Tigeot 1292*a05eeebfSFrançois Tigeot return wa_ctx_end(wa_ctx, *offset = index, CACHELINE_DWORDS); 1293*a05eeebfSFrançois Tigeot } 1294*a05eeebfSFrançois Tigeot 1295*a05eeebfSFrançois Tigeot static int gen9_init_perctx_bb(struct intel_engine_cs *ring, 1296*a05eeebfSFrançois Tigeot struct i915_wa_ctx_bb *wa_ctx, 1297*a05eeebfSFrançois Tigeot uint32_t *const batch, 1298*a05eeebfSFrançois Tigeot uint32_t *offset) 1299*a05eeebfSFrançois Tigeot { 1300*a05eeebfSFrançois Tigeot struct drm_device *dev = ring->dev; 1301*a05eeebfSFrançois Tigeot uint32_t index = wa_ctx_start(wa_ctx, *offset, CACHELINE_DWORDS); 1302*a05eeebfSFrançois Tigeot 1303*a05eeebfSFrançois Tigeot /* WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken:skl,bxt */ 1304*a05eeebfSFrançois Tigeot if ((IS_SKYLAKE(dev) && (INTEL_REVID(dev) <= SKL_REVID_B0)) || 1305*a05eeebfSFrançois Tigeot (IS_BROXTON(dev) && (INTEL_REVID(dev) == BXT_REVID_A0))) { 1306*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, MI_LOAD_REGISTER_IMM(1)); 1307*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, GEN9_SLICE_COMMON_ECO_CHICKEN0); 1308*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, 1309*a05eeebfSFrançois Tigeot _MASKED_BIT_ENABLE(DISABLE_PIXEL_MASK_CAMMING)); 
1310*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, MI_NOOP); 1311*a05eeebfSFrançois Tigeot } 1312*a05eeebfSFrançois Tigeot 1313*a05eeebfSFrançois Tigeot /* WaDisableCtxRestoreArbitration:skl,bxt */ 1314*a05eeebfSFrançois Tigeot if ((IS_SKYLAKE(dev) && (INTEL_REVID(dev) <= SKL_REVID_D0)) || 1315*a05eeebfSFrançois Tigeot (IS_BROXTON(dev) && (INTEL_REVID(dev) == BXT_REVID_A0))) 1316*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, MI_ARB_ON_OFF | MI_ARB_ENABLE); 1317*a05eeebfSFrançois Tigeot 1318*a05eeebfSFrançois Tigeot wa_ctx_emit(batch, index, MI_BATCH_BUFFER_END); 1319*a05eeebfSFrançois Tigeot 1320*a05eeebfSFrançois Tigeot return wa_ctx_end(wa_ctx, *offset = index, 1); 1321*a05eeebfSFrançois Tigeot } 1322*a05eeebfSFrançois Tigeot 1323*a05eeebfSFrançois Tigeot static int lrc_setup_wa_ctx_obj(struct intel_engine_cs *ring, u32 size) 1324*a05eeebfSFrançois Tigeot { 1325*a05eeebfSFrançois Tigeot int ret; 1326*a05eeebfSFrançois Tigeot 1327*a05eeebfSFrançois Tigeot ring->wa_ctx.obj = i915_gem_alloc_object(ring->dev, PAGE_ALIGN(size)); 1328*a05eeebfSFrançois Tigeot if (!ring->wa_ctx.obj) { 1329*a05eeebfSFrançois Tigeot DRM_DEBUG_DRIVER("alloc LRC WA ctx backing obj failed.\n"); 1330*a05eeebfSFrançois Tigeot return -ENOMEM; 1331*a05eeebfSFrançois Tigeot } 1332*a05eeebfSFrançois Tigeot 1333*a05eeebfSFrançois Tigeot ret = i915_gem_obj_ggtt_pin(ring->wa_ctx.obj, PAGE_SIZE, 0); 1334*a05eeebfSFrançois Tigeot if (ret) { 1335*a05eeebfSFrançois Tigeot DRM_DEBUG_DRIVER("pin LRC WA ctx backing obj failed: %d\n", 1336*a05eeebfSFrançois Tigeot ret); 1337*a05eeebfSFrançois Tigeot drm_gem_object_unreference(&ring->wa_ctx.obj->base); 1338*a05eeebfSFrançois Tigeot return ret; 1339*a05eeebfSFrançois Tigeot } 1340*a05eeebfSFrançois Tigeot 1341*a05eeebfSFrançois Tigeot return 0; 1342*a05eeebfSFrançois Tigeot } 1343*a05eeebfSFrançois Tigeot 1344*a05eeebfSFrançois Tigeot static void lrc_destroy_wa_ctx_obj(struct intel_engine_cs *ring) 1345*a05eeebfSFrançois Tigeot { 1346*a05eeebfSFrançois Tigeot if (ring->wa_ctx.obj) { 1347*a05eeebfSFrançois Tigeot i915_gem_object_ggtt_unpin(ring->wa_ctx.obj); 1348*a05eeebfSFrançois Tigeot drm_gem_object_unreference(&ring->wa_ctx.obj->base); 1349*a05eeebfSFrançois Tigeot ring->wa_ctx.obj = NULL; 1350*a05eeebfSFrançois Tigeot } 1351*a05eeebfSFrançois Tigeot } 1352*a05eeebfSFrançois Tigeot 1353*a05eeebfSFrançois Tigeot static int intel_init_workaround_bb(struct intel_engine_cs *ring) 1354*a05eeebfSFrançois Tigeot { 1355*a05eeebfSFrançois Tigeot int ret; 1356*a05eeebfSFrançois Tigeot uint32_t *batch; 1357*a05eeebfSFrançois Tigeot uint32_t offset; 1358*a05eeebfSFrançois Tigeot struct vm_page *page; 1359*a05eeebfSFrançois Tigeot struct i915_ctx_workarounds *wa_ctx = &ring->wa_ctx; 1360*a05eeebfSFrançois Tigeot 1361*a05eeebfSFrançois Tigeot WARN_ON(ring->id != RCS); 1362*a05eeebfSFrançois Tigeot 1363*a05eeebfSFrançois Tigeot /* update this when WA for higher Gen are added */ 1364*a05eeebfSFrançois Tigeot if (INTEL_INFO(ring->dev)->gen > 9) { 1365*a05eeebfSFrançois Tigeot DRM_ERROR("WA batch buffer is not initialized for Gen%d\n", 1366*a05eeebfSFrançois Tigeot INTEL_INFO(ring->dev)->gen); 1367*a05eeebfSFrançois Tigeot return 0; 1368*a05eeebfSFrançois Tigeot } 1369*a05eeebfSFrançois Tigeot 1370*a05eeebfSFrançois Tigeot /* some WA perform writes to scratch page, ensure it is valid */ 1371*a05eeebfSFrançois Tigeot if (ring->scratch.obj == NULL) { 1372*a05eeebfSFrançois Tigeot DRM_ERROR("scratch page not allocated for %s\n", ring->name); 1373*a05eeebfSFrançois Tigeot return -EINVAL; 
1374*a05eeebfSFrançois Tigeot } 1375*a05eeebfSFrançois Tigeot 1376*a05eeebfSFrançois Tigeot ret = lrc_setup_wa_ctx_obj(ring, PAGE_SIZE); 1377*a05eeebfSFrançois Tigeot if (ret) { 1378*a05eeebfSFrançois Tigeot DRM_DEBUG_DRIVER("Failed to setup context WA page: %d\n", ret); 1379*a05eeebfSFrançois Tigeot return ret; 1380*a05eeebfSFrançois Tigeot } 1381*a05eeebfSFrançois Tigeot 1382*a05eeebfSFrançois Tigeot page = i915_gem_object_get_page(wa_ctx->obj, 0); 1383*a05eeebfSFrançois Tigeot batch = kmap_atomic(page); 1384*a05eeebfSFrançois Tigeot offset = 0; 1385*a05eeebfSFrançois Tigeot 1386*a05eeebfSFrançois Tigeot if (INTEL_INFO(ring->dev)->gen == 8) { 1387*a05eeebfSFrançois Tigeot ret = gen8_init_indirectctx_bb(ring, 1388*a05eeebfSFrançois Tigeot &wa_ctx->indirect_ctx, 1389*a05eeebfSFrançois Tigeot batch, 1390*a05eeebfSFrançois Tigeot &offset); 1391*a05eeebfSFrançois Tigeot if (ret) 1392*a05eeebfSFrançois Tigeot goto out; 1393*a05eeebfSFrançois Tigeot 1394*a05eeebfSFrançois Tigeot ret = gen8_init_perctx_bb(ring, 1395*a05eeebfSFrançois Tigeot &wa_ctx->per_ctx, 1396*a05eeebfSFrançois Tigeot batch, 1397*a05eeebfSFrançois Tigeot &offset); 1398*a05eeebfSFrançois Tigeot if (ret) 1399*a05eeebfSFrançois Tigeot goto out; 1400*a05eeebfSFrançois Tigeot } else if (INTEL_INFO(ring->dev)->gen == 9) { 1401*a05eeebfSFrançois Tigeot ret = gen9_init_indirectctx_bb(ring, 1402*a05eeebfSFrançois Tigeot &wa_ctx->indirect_ctx, 1403*a05eeebfSFrançois Tigeot batch, 1404*a05eeebfSFrançois Tigeot &offset); 1405*a05eeebfSFrançois Tigeot if (ret) 1406*a05eeebfSFrançois Tigeot goto out; 1407*a05eeebfSFrançois Tigeot 1408*a05eeebfSFrançois Tigeot ret = gen9_init_perctx_bb(ring, 1409*a05eeebfSFrançois Tigeot &wa_ctx->per_ctx, 1410*a05eeebfSFrançois Tigeot batch, 1411*a05eeebfSFrançois Tigeot &offset); 1412*a05eeebfSFrançois Tigeot if (ret) 1413*a05eeebfSFrançois Tigeot goto out; 1414*a05eeebfSFrançois Tigeot } 1415*a05eeebfSFrançois Tigeot 1416*a05eeebfSFrançois Tigeot out: 1417*a05eeebfSFrançois Tigeot kunmap_atomic(batch); 1418*a05eeebfSFrançois Tigeot if (ret) 1419*a05eeebfSFrançois Tigeot lrc_destroy_wa_ctx_obj(ring); 1420*a05eeebfSFrançois Tigeot 1421*a05eeebfSFrançois Tigeot return ret; 1422*a05eeebfSFrançois Tigeot } 1423*a05eeebfSFrançois Tigeot 14241b13d190SFrançois Tigeot static int gen8_init_common_ring(struct intel_engine_cs *ring) 14251b13d190SFrançois Tigeot { 14261b13d190SFrançois Tigeot struct drm_device *dev = ring->dev; 14271b13d190SFrançois Tigeot struct drm_i915_private *dev_priv = dev->dev_private; 1428*a05eeebfSFrançois Tigeot u8 next_context_status_buffer_hw; 14291b13d190SFrançois Tigeot 14301b13d190SFrançois Tigeot I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask)); 14311b13d190SFrançois Tigeot I915_WRITE(RING_HWSTAM(ring->mmio_base), 0xffffffff); 14321b13d190SFrançois Tigeot 1433477eb7f9SFrançois Tigeot if (ring->status_page.obj) { 1434477eb7f9SFrançois Tigeot I915_WRITE(RING_HWS_PGA(ring->mmio_base), 1435477eb7f9SFrançois Tigeot (u32)ring->status_page.gfx_addr); 1436477eb7f9SFrançois Tigeot POSTING_READ(RING_HWS_PGA(ring->mmio_base)); 1437477eb7f9SFrançois Tigeot } 1438477eb7f9SFrançois Tigeot 14391b13d190SFrançois Tigeot I915_WRITE(RING_MODE_GEN7(ring), 14401b13d190SFrançois Tigeot _MASKED_BIT_DISABLE(GFX_REPLAY_MODE) | 14411b13d190SFrançois Tigeot _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE)); 14421b13d190SFrançois Tigeot POSTING_READ(RING_MODE_GEN7(ring)); 1443*a05eeebfSFrançois Tigeot 1444*a05eeebfSFrançois Tigeot /* 1445*a05eeebfSFrançois Tigeot * Instead of resetting the Context 
Status Buffer (CSB) read pointer to 1446*a05eeebfSFrançois Tigeot * zero, we need to read the write pointer from hardware and use its 1447*a05eeebfSFrançois Tigeot * value because "this register is power context save restored". 1448*a05eeebfSFrançois Tigeot * Effectively, these states have been observed: 1449*a05eeebfSFrançois Tigeot * 1450*a05eeebfSFrançois Tigeot * | Suspend-to-idle (freeze) | Suspend-to-RAM (mem) | 1451*a05eeebfSFrançois Tigeot * BDW | CSB regs not reset | CSB regs reset | 1452*a05eeebfSFrançois Tigeot * CHT | CSB regs not reset | CSB regs not reset | 1453*a05eeebfSFrançois Tigeot */ 1454*a05eeebfSFrançois Tigeot next_context_status_buffer_hw = (I915_READ(RING_CONTEXT_STATUS_PTR(ring)) 1455*a05eeebfSFrançois Tigeot & GEN8_CSB_PTR_MASK); 1456*a05eeebfSFrançois Tigeot 1457*a05eeebfSFrançois Tigeot /* 1458*a05eeebfSFrançois Tigeot * When the CSB registers are reset (also after power-up / gpu reset), 1459*a05eeebfSFrançois Tigeot * CSB write pointer is set to all 1's, which is not valid, use '5' in 1460*a05eeebfSFrançois Tigeot * this special case, so the first element read is CSB[0]. 1461*a05eeebfSFrançois Tigeot */ 1462*a05eeebfSFrançois Tigeot if (next_context_status_buffer_hw == GEN8_CSB_PTR_MASK) 1463*a05eeebfSFrançois Tigeot next_context_status_buffer_hw = (GEN8_CSB_ENTRIES - 1); 1464*a05eeebfSFrançois Tigeot 1465*a05eeebfSFrançois Tigeot ring->next_context_status_buffer = next_context_status_buffer_hw; 14661b13d190SFrançois Tigeot DRM_DEBUG_DRIVER("Execlists enabled for %s\n", ring->name); 14671b13d190SFrançois Tigeot 14681b13d190SFrançois Tigeot memset(&ring->hangcheck, 0, sizeof(ring->hangcheck)); 14691b13d190SFrançois Tigeot 14701b13d190SFrançois Tigeot return 0; 14711b13d190SFrançois Tigeot } 14721b13d190SFrançois Tigeot 14731b13d190SFrançois Tigeot static int gen8_init_render_ring(struct intel_engine_cs *ring) 14741b13d190SFrançois Tigeot { 14751b13d190SFrançois Tigeot struct drm_device *dev = ring->dev; 14761b13d190SFrançois Tigeot struct drm_i915_private *dev_priv = dev->dev_private; 14771b13d190SFrançois Tigeot int ret; 14781b13d190SFrançois Tigeot 14791b13d190SFrançois Tigeot ret = gen8_init_common_ring(ring); 14801b13d190SFrançois Tigeot if (ret) 14811b13d190SFrançois Tigeot return ret; 14821b13d190SFrançois Tigeot 14831b13d190SFrançois Tigeot /* We need to disable the AsyncFlip performance optimisations in order 14841b13d190SFrançois Tigeot * to use MI_WAIT_FOR_EVENT within the CS. It should already be 14851b13d190SFrançois Tigeot * programmed to '1' on all products. 
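/*
 * Illustrative sketch only: deriving the next Context Status Buffer index
 * from a raw status-pointer register value, as gen8_init_common_ring() does
 * above, including the "all ones after reset" special case. The mask and
 * entry count are written here as assumed plain numbers; the driver uses
 * GEN8_CSB_PTR_MASK and GEN8_CSB_ENTRIES.
 */
#include <stdint.h>

#define SKETCH_CSB_PTR_MASK	0x07u	/* assumed write-pointer mask */
#define SKETCH_CSB_ENTRIES	6u	/* assumed number of CSB entries */

static uint8_t sketch_next_csb_index(uint32_t status_ptr_reg)
{
	uint8_t next = status_ptr_reg & SKETCH_CSB_PTR_MASK;

	/* After reset the pointer reads back as all ones; start at the last
	 * entry so that the first element consumed is CSB[0]. */
	if (next == SKETCH_CSB_PTR_MASK)
		next = SKETCH_CSB_ENTRIES - 1;
	return next;
}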
14861b13d190SFrançois Tigeot * 14871b13d190SFrançois Tigeot * WaDisableAsyncFlipPerfMode:snb,ivb,hsw,vlv,bdw,chv 14881b13d190SFrançois Tigeot */ 14891b13d190SFrançois Tigeot I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(ASYNC_FLIP_PERF_DISABLE)); 14901b13d190SFrançois Tigeot 14911b13d190SFrançois Tigeot I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING)); 14921b13d190SFrançois Tigeot 14932c9916cdSFrançois Tigeot return init_workarounds_ring(ring); 14941b13d190SFrançois Tigeot } 14951b13d190SFrançois Tigeot 1496477eb7f9SFrançois Tigeot static int gen9_init_render_ring(struct intel_engine_cs *ring) 1497477eb7f9SFrançois Tigeot { 1498477eb7f9SFrançois Tigeot int ret; 1499477eb7f9SFrançois Tigeot 1500477eb7f9SFrançois Tigeot ret = gen8_init_common_ring(ring); 1501477eb7f9SFrançois Tigeot if (ret) 1502477eb7f9SFrançois Tigeot return ret; 1503477eb7f9SFrançois Tigeot 1504477eb7f9SFrançois Tigeot return init_workarounds_ring(ring); 1505477eb7f9SFrançois Tigeot } 1506477eb7f9SFrançois Tigeot 1507*a05eeebfSFrançois Tigeot static int intel_logical_ring_emit_pdps(struct drm_i915_gem_request *req) 1508*a05eeebfSFrançois Tigeot { 1509*a05eeebfSFrançois Tigeot struct i915_hw_ppgtt *ppgtt = req->ctx->ppgtt; 1510*a05eeebfSFrançois Tigeot struct intel_engine_cs *ring = req->ring; 1511*a05eeebfSFrançois Tigeot struct intel_ringbuffer *ringbuf = req->ringbuf; 1512*a05eeebfSFrançois Tigeot const int num_lri_cmds = GEN8_LEGACY_PDPES * 2; 1513*a05eeebfSFrançois Tigeot int i, ret; 1514*a05eeebfSFrançois Tigeot 1515*a05eeebfSFrançois Tigeot ret = intel_logical_ring_begin(req, num_lri_cmds * 2 + 2); 1516*a05eeebfSFrançois Tigeot if (ret) 1517*a05eeebfSFrançois Tigeot return ret; 1518*a05eeebfSFrançois Tigeot 1519*a05eeebfSFrançois Tigeot intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(num_lri_cmds)); 1520*a05eeebfSFrançois Tigeot for (i = GEN8_LEGACY_PDPES - 1; i >= 0; i--) { 1521*a05eeebfSFrançois Tigeot const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i); 1522*a05eeebfSFrançois Tigeot 1523*a05eeebfSFrançois Tigeot intel_logical_ring_emit(ringbuf, GEN8_RING_PDP_UDW(ring, i)); 1524*a05eeebfSFrançois Tigeot intel_logical_ring_emit(ringbuf, upper_32_bits(pd_daddr)); 1525*a05eeebfSFrançois Tigeot intel_logical_ring_emit(ringbuf, GEN8_RING_PDP_LDW(ring, i)); 1526*a05eeebfSFrançois Tigeot intel_logical_ring_emit(ringbuf, lower_32_bits(pd_daddr)); 1527*a05eeebfSFrançois Tigeot } 1528*a05eeebfSFrançois Tigeot 1529*a05eeebfSFrançois Tigeot intel_logical_ring_emit(ringbuf, MI_NOOP); 1530*a05eeebfSFrançois Tigeot intel_logical_ring_advance(ringbuf); 1531*a05eeebfSFrançois Tigeot 1532*a05eeebfSFrançois Tigeot return 0; 1533*a05eeebfSFrançois Tigeot } 1534*a05eeebfSFrançois Tigeot 1535*a05eeebfSFrançois Tigeot static int gen8_emit_bb_start(struct drm_i915_gem_request *req, 1536477eb7f9SFrançois Tigeot u64 offset, unsigned dispatch_flags) 15371b13d190SFrançois Tigeot { 1538*a05eeebfSFrançois Tigeot struct intel_ringbuffer *ringbuf = req->ringbuf; 1539477eb7f9SFrançois Tigeot bool ppgtt = !(dispatch_flags & I915_DISPATCH_SECURE); 15401b13d190SFrançois Tigeot int ret; 15411b13d190SFrançois Tigeot 1542*a05eeebfSFrançois Tigeot /* Don't rely in hw updating PDPs, specially in lite-restore. 1543*a05eeebfSFrançois Tigeot * Ideally, we should set Force PD Restore in ctx descriptor, 1544*a05eeebfSFrançois Tigeot * but we can't. Force Restore would be a second option, but 1545*a05eeebfSFrançois Tigeot * it is unsafe in case of lite-restore (because the ctx is 1546*a05eeebfSFrançois Tigeot * not idle). 
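/*
 * Illustrative sketch only: how a 64-bit page-directory address is split into
 * the (upper, lower) register pair that intel_logical_ring_emit_pdps() above
 * writes under a single MI_LOAD_REGISTER_IMM. The header encoding and the
 * register offsets below are invented placeholders for the sketch.
 */
#include <stdint.h>

#define SKETCH_LRI_HEADER(n)	(0x10000000u + (n))	/* placeholder, not the real MI encoding */
#define SKETCH_PDP_UDW_REG(n)	(0x274u + 8u * (n))	/* placeholder register offsets */
#define SKETCH_PDP_LDW_REG(n)	(0x270u + 8u * (n))

static uint32_t sketch_emit_pdps(uint32_t *buf, uint32_t i,
				 const uint64_t *pd_addr, unsigned int num_pdps)
{
	unsigned int n;

	buf[i++] = SKETCH_LRI_HEADER(2 * num_pdps);	/* one LRI covers all pairs */
	for (n = num_pdps; n-- > 0;) {
		buf[i++] = SKETCH_PDP_UDW_REG(n);
		buf[i++] = (uint32_t)(pd_addr[n] >> 32);
		buf[i++] = SKETCH_PDP_LDW_REG(n);
		buf[i++] = (uint32_t)(pd_addr[n] & 0xffffffffu);
	}
	return i;
}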
*/ 1547*a05eeebfSFrançois Tigeot if (req->ctx->ppgtt && 1548*a05eeebfSFrançois Tigeot (intel_ring_flag(req->ring) & req->ctx->ppgtt->pd_dirty_rings)) { 1549*a05eeebfSFrançois Tigeot ret = intel_logical_ring_emit_pdps(req); 1550*a05eeebfSFrançois Tigeot if (ret) 1551*a05eeebfSFrançois Tigeot return ret; 1552*a05eeebfSFrançois Tigeot 1553*a05eeebfSFrançois Tigeot req->ctx->ppgtt->pd_dirty_rings &= ~intel_ring_flag(req->ring); 1554*a05eeebfSFrançois Tigeot } 1555*a05eeebfSFrançois Tigeot 1556*a05eeebfSFrançois Tigeot ret = intel_logical_ring_begin(req, 4); 15571b13d190SFrançois Tigeot if (ret) 15581b13d190SFrançois Tigeot return ret; 15591b13d190SFrançois Tigeot 15601b13d190SFrançois Tigeot /* FIXME(BDW): Address space and security selectors. */ 1561*a05eeebfSFrançois Tigeot intel_logical_ring_emit(ringbuf, MI_BATCH_BUFFER_START_GEN8 | 1562*a05eeebfSFrançois Tigeot (ppgtt<<8) | 1563*a05eeebfSFrançois Tigeot (dispatch_flags & I915_DISPATCH_RS ? 1564*a05eeebfSFrançois Tigeot MI_BATCH_RESOURCE_STREAMER : 0)); 15651b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, lower_32_bits(offset)); 15661b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, upper_32_bits(offset)); 15671b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, MI_NOOP); 15681b13d190SFrançois Tigeot intel_logical_ring_advance(ringbuf); 15691b13d190SFrançois Tigeot 15701b13d190SFrançois Tigeot return 0; 15711b13d190SFrançois Tigeot } 15721b13d190SFrançois Tigeot 15731b13d190SFrançois Tigeot static bool gen8_logical_ring_get_irq(struct intel_engine_cs *ring) 15741b13d190SFrançois Tigeot { 15751b13d190SFrançois Tigeot struct drm_device *dev = ring->dev; 15761b13d190SFrançois Tigeot struct drm_i915_private *dev_priv = dev->dev_private; 15771b13d190SFrançois Tigeot 15782c9916cdSFrançois Tigeot if (WARN_ON(!intel_irqs_enabled(dev_priv))) 15791b13d190SFrançois Tigeot return false; 15801b13d190SFrançois Tigeot 15811b13d190SFrançois Tigeot lockmgr(&dev_priv->irq_lock, LK_EXCLUSIVE); 15821b13d190SFrançois Tigeot if (ring->irq_refcount++ == 0) { 15831b13d190SFrançois Tigeot I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask)); 15841b13d190SFrançois Tigeot POSTING_READ(RING_IMR(ring->mmio_base)); 15851b13d190SFrançois Tigeot } 15861b13d190SFrançois Tigeot lockmgr(&dev_priv->irq_lock, LK_RELEASE); 15871b13d190SFrançois Tigeot 15881b13d190SFrançois Tigeot return true; 15891b13d190SFrançois Tigeot } 15901b13d190SFrançois Tigeot 15911b13d190SFrançois Tigeot static void gen8_logical_ring_put_irq(struct intel_engine_cs *ring) 15921b13d190SFrançois Tigeot { 15931b13d190SFrançois Tigeot struct drm_device *dev = ring->dev; 15941b13d190SFrançois Tigeot struct drm_i915_private *dev_priv = dev->dev_private; 15951b13d190SFrançois Tigeot 15961b13d190SFrançois Tigeot lockmgr(&dev_priv->irq_lock, LK_EXCLUSIVE); 15971b13d190SFrançois Tigeot if (--ring->irq_refcount == 0) { 15981b13d190SFrançois Tigeot I915_WRITE_IMR(ring, ~ring->irq_keep_mask); 15991b13d190SFrançois Tigeot POSTING_READ(RING_IMR(ring->mmio_base)); 16001b13d190SFrançois Tigeot } 16011b13d190SFrançois Tigeot lockmgr(&dev_priv->irq_lock, LK_RELEASE); 16021b13d190SFrançois Tigeot } 16031b13d190SFrançois Tigeot 1604*a05eeebfSFrançois Tigeot static int gen8_emit_flush(struct drm_i915_gem_request *request, 16051b13d190SFrançois Tigeot u32 invalidate_domains, 16061b13d190SFrançois Tigeot u32 unused) 16071b13d190SFrançois Tigeot { 1608*a05eeebfSFrançois Tigeot struct intel_ringbuffer *ringbuf = request->ringbuf; 16091b13d190SFrançois Tigeot struct intel_engine_cs *ring = 
ringbuf->ring; 16101b13d190SFrançois Tigeot struct drm_device *dev = ring->dev; 16111b13d190SFrançois Tigeot struct drm_i915_private *dev_priv = dev->dev_private; 16121b13d190SFrançois Tigeot uint32_t cmd; 16131b13d190SFrançois Tigeot int ret; 16141b13d190SFrançois Tigeot 1615*a05eeebfSFrançois Tigeot ret = intel_logical_ring_begin(request, 4); 16161b13d190SFrançois Tigeot if (ret) 16171b13d190SFrançois Tigeot return ret; 16181b13d190SFrançois Tigeot 16191b13d190SFrançois Tigeot cmd = MI_FLUSH_DW + 1; 16201b13d190SFrançois Tigeot 16212c9916cdSFrançois Tigeot /* We always require a command barrier so that subsequent 16222c9916cdSFrançois Tigeot * commands, such as breadcrumb interrupts, are strictly ordered 16232c9916cdSFrançois Tigeot * wrt the contents of the write cache being flushed to memory 16242c9916cdSFrançois Tigeot * (and thus being coherent from the CPU). 16252c9916cdSFrançois Tigeot */ 16262c9916cdSFrançois Tigeot cmd |= MI_FLUSH_DW_STORE_INDEX | MI_FLUSH_DW_OP_STOREDW; 16272c9916cdSFrançois Tigeot 16282c9916cdSFrançois Tigeot if (invalidate_domains & I915_GEM_GPU_DOMAINS) { 16292c9916cdSFrançois Tigeot cmd |= MI_INVALIDATE_TLB; 16302c9916cdSFrançois Tigeot if (ring == &dev_priv->ring[VCS]) 16312c9916cdSFrançois Tigeot cmd |= MI_INVALIDATE_BSD; 16321b13d190SFrançois Tigeot } 16331b13d190SFrançois Tigeot 16341b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, cmd); 16351b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, 16361b13d190SFrançois Tigeot I915_GEM_HWS_SCRATCH_ADDR | 16371b13d190SFrançois Tigeot MI_FLUSH_DW_USE_GTT); 16381b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, 0); /* upper addr */ 16391b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, 0); /* value */ 16401b13d190SFrançois Tigeot intel_logical_ring_advance(ringbuf); 16411b13d190SFrançois Tigeot 16421b13d190SFrançois Tigeot return 0; 16431b13d190SFrançois Tigeot } 16441b13d190SFrançois Tigeot 1645*a05eeebfSFrançois Tigeot static int gen8_emit_flush_render(struct drm_i915_gem_request *request, 16461b13d190SFrançois Tigeot u32 invalidate_domains, 16471b13d190SFrançois Tigeot u32 flush_domains) 16481b13d190SFrançois Tigeot { 1649*a05eeebfSFrançois Tigeot struct intel_ringbuffer *ringbuf = request->ringbuf; 16501b13d190SFrançois Tigeot struct intel_engine_cs *ring = ringbuf->ring; 16511b13d190SFrançois Tigeot u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES; 165219c468b4SFrançois Tigeot bool vf_flush_wa; 16531b13d190SFrançois Tigeot u32 flags = 0; 16541b13d190SFrançois Tigeot int ret; 16551b13d190SFrançois Tigeot 16561b13d190SFrançois Tigeot flags |= PIPE_CONTROL_CS_STALL; 16571b13d190SFrançois Tigeot 16581b13d190SFrançois Tigeot if (flush_domains) { 16591b13d190SFrançois Tigeot flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH; 16601b13d190SFrançois Tigeot flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH; 1661b49c8cf9SFrançois Tigeot flags |= PIPE_CONTROL_FLUSH_ENABLE; 16621b13d190SFrançois Tigeot } 16631b13d190SFrançois Tigeot 16641b13d190SFrançois Tigeot if (invalidate_domains) { 16651b13d190SFrançois Tigeot flags |= PIPE_CONTROL_TLB_INVALIDATE; 16661b13d190SFrançois Tigeot flags |= PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE; 16671b13d190SFrançois Tigeot flags |= PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE; 16681b13d190SFrançois Tigeot flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE; 16691b13d190SFrançois Tigeot flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE; 16701b13d190SFrançois Tigeot flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE; 16711b13d190SFrançois Tigeot flags |= PIPE_CONTROL_QW_WRITE; 
16721b13d190SFrançois Tigeot flags |= PIPE_CONTROL_GLOBAL_GTT_IVB; 16731b13d190SFrançois Tigeot } 16741b13d190SFrançois Tigeot 167519c468b4SFrançois Tigeot /* 167619c468b4SFrançois Tigeot * On GEN9+ Before VF_CACHE_INVALIDATE we need to emit a NULL pipe 167719c468b4SFrançois Tigeot * control. 167819c468b4SFrançois Tigeot */ 167919c468b4SFrançois Tigeot vf_flush_wa = INTEL_INFO(ring->dev)->gen >= 9 && 168019c468b4SFrançois Tigeot flags & PIPE_CONTROL_VF_CACHE_INVALIDATE; 168119c468b4SFrançois Tigeot 1682*a05eeebfSFrançois Tigeot ret = intel_logical_ring_begin(request, vf_flush_wa ? 12 : 6); 16831b13d190SFrançois Tigeot if (ret) 16841b13d190SFrançois Tigeot return ret; 16851b13d190SFrançois Tigeot 168619c468b4SFrançois Tigeot if (vf_flush_wa) { 168719c468b4SFrançois Tigeot intel_logical_ring_emit(ringbuf, GFX_OP_PIPE_CONTROL(6)); 168819c468b4SFrançois Tigeot intel_logical_ring_emit(ringbuf, 0); 168919c468b4SFrançois Tigeot intel_logical_ring_emit(ringbuf, 0); 169019c468b4SFrançois Tigeot intel_logical_ring_emit(ringbuf, 0); 169119c468b4SFrançois Tigeot intel_logical_ring_emit(ringbuf, 0); 169219c468b4SFrançois Tigeot intel_logical_ring_emit(ringbuf, 0); 169319c468b4SFrançois Tigeot } 169419c468b4SFrançois Tigeot 16951b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, GFX_OP_PIPE_CONTROL(6)); 16961b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, flags); 16971b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, scratch_addr); 16981b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, 0); 16991b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, 0); 17001b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, 0); 17011b13d190SFrançois Tigeot intel_logical_ring_advance(ringbuf); 17021b13d190SFrançois Tigeot 17031b13d190SFrançois Tigeot return 0; 17041b13d190SFrançois Tigeot } 17051b13d190SFrançois Tigeot 17061b13d190SFrançois Tigeot static u32 gen8_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency) 17071b13d190SFrançois Tigeot { 17081b13d190SFrançois Tigeot return intel_read_status_page(ring, I915_GEM_HWS_INDEX); 17091b13d190SFrançois Tigeot } 17101b13d190SFrançois Tigeot 17111b13d190SFrançois Tigeot static void gen8_set_seqno(struct intel_engine_cs *ring, u32 seqno) 17121b13d190SFrançois Tigeot { 17131b13d190SFrançois Tigeot intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno); 17141b13d190SFrançois Tigeot } 17151b13d190SFrançois Tigeot 1716*a05eeebfSFrançois Tigeot static int gen8_emit_request(struct drm_i915_gem_request *request) 17171b13d190SFrançois Tigeot { 1718*a05eeebfSFrançois Tigeot struct intel_ringbuffer *ringbuf = request->ringbuf; 17191b13d190SFrançois Tigeot struct intel_engine_cs *ring = ringbuf->ring; 17201b13d190SFrançois Tigeot u32 cmd; 17211b13d190SFrançois Tigeot int ret; 17221b13d190SFrançois Tigeot 1723477eb7f9SFrançois Tigeot /* 1724477eb7f9SFrançois Tigeot * Reserve space for 2 NOOPs at the end of each request to be 1725477eb7f9SFrançois Tigeot * used as a workaround for not being allowed to do lite 1726477eb7f9SFrançois Tigeot * restore with HEAD==TAIL (WaIdleLiteRestore). 
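/*
 * Illustrative sketch only: the generic ring-buffer property behind the
 * WaIdleLiteRestore comment above. With the usual head/tail convention,
 * head == tail reads as "nothing pending", so the request code pads with
 * NOOPs to keep a lite restore from ever observing HEAD == TAIL. The size
 * below is arbitrary and must be a power of two.
 */
#include <stdbool.h>
#include <stdint.h>

static bool sketch_ring_looks_idle(uint32_t head, uint32_t tail)
{
	return head == tail;	/* what the hardware would conclude */
}

static uint32_t sketch_bytes_pending(uint32_t head, uint32_t tail, uint32_t size)
{
	return (tail - head) & (size - 1);	/* 0 only when head == tail */
}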
1727477eb7f9SFrançois Tigeot */ 1728*a05eeebfSFrançois Tigeot ret = intel_logical_ring_begin(request, 8); 17291b13d190SFrançois Tigeot if (ret) 17301b13d190SFrançois Tigeot return ret; 17311b13d190SFrançois Tigeot 17322c9916cdSFrançois Tigeot cmd = MI_STORE_DWORD_IMM_GEN4; 17331b13d190SFrançois Tigeot cmd |= MI_GLOBAL_GTT; 17341b13d190SFrançois Tigeot 17351b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, cmd); 17361b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, 17371b13d190SFrançois Tigeot (ring->status_page.gfx_addr + 17381b13d190SFrançois Tigeot (I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT))); 17391b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, 0); 1740*a05eeebfSFrançois Tigeot intel_logical_ring_emit(ringbuf, i915_gem_request_get_seqno(request)); 17411b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT); 17421b13d190SFrançois Tigeot intel_logical_ring_emit(ringbuf, MI_NOOP); 1743*a05eeebfSFrançois Tigeot intel_logical_ring_advance_and_submit(request); 17441b13d190SFrançois Tigeot 1745477eb7f9SFrançois Tigeot /* 1746477eb7f9SFrançois Tigeot * Here we add two extra NOOPs as padding to avoid 1747477eb7f9SFrançois Tigeot * lite restore of a context with HEAD==TAIL. 1748477eb7f9SFrançois Tigeot */ 1749477eb7f9SFrançois Tigeot intel_logical_ring_emit(ringbuf, MI_NOOP); 1750477eb7f9SFrançois Tigeot intel_logical_ring_emit(ringbuf, MI_NOOP); 1751477eb7f9SFrançois Tigeot intel_logical_ring_advance(ringbuf); 1752477eb7f9SFrançois Tigeot 17531b13d190SFrançois Tigeot return 0; 17541b13d190SFrançois Tigeot } 17551b13d190SFrançois Tigeot 1756*a05eeebfSFrançois Tigeot static int intel_lr_context_render_state_init(struct drm_i915_gem_request *req) 1757477eb7f9SFrançois Tigeot { 1758477eb7f9SFrançois Tigeot struct render_state so; 1759477eb7f9SFrançois Tigeot int ret; 1760477eb7f9SFrançois Tigeot 1761*a05eeebfSFrançois Tigeot ret = i915_gem_render_state_prepare(req->ring, &so); 1762477eb7f9SFrançois Tigeot if (ret) 1763477eb7f9SFrançois Tigeot return ret; 1764477eb7f9SFrançois Tigeot 1765477eb7f9SFrançois Tigeot if (so.rodata == NULL) 1766477eb7f9SFrançois Tigeot return 0; 1767477eb7f9SFrançois Tigeot 1768*a05eeebfSFrançois Tigeot ret = req->ring->emit_bb_start(req, so.ggtt_offset, 1769477eb7f9SFrançois Tigeot I915_DISPATCH_SECURE); 1770477eb7f9SFrançois Tigeot if (ret) 1771477eb7f9SFrançois Tigeot goto out; 1772477eb7f9SFrançois Tigeot 1773*a05eeebfSFrançois Tigeot ret = req->ring->emit_bb_start(req, 1774*a05eeebfSFrançois Tigeot (so.ggtt_offset + so.aux_batch_offset), 1775*a05eeebfSFrançois Tigeot I915_DISPATCH_SECURE); 1776*a05eeebfSFrançois Tigeot if (ret) 1777*a05eeebfSFrançois Tigeot goto out; 1778477eb7f9SFrançois Tigeot 1779*a05eeebfSFrançois Tigeot i915_vma_move_to_active(i915_gem_obj_to_ggtt(so.obj), req); 1780*a05eeebfSFrançois Tigeot 1781477eb7f9SFrançois Tigeot out: 1782477eb7f9SFrançois Tigeot i915_gem_render_state_fini(&so); 1783477eb7f9SFrançois Tigeot return ret; 1784477eb7f9SFrançois Tigeot } 1785477eb7f9SFrançois Tigeot 1786*a05eeebfSFrançois Tigeot static int gen8_init_rcs_context(struct drm_i915_gem_request *req) 17872c9916cdSFrançois Tigeot { 17882c9916cdSFrançois Tigeot int ret; 17892c9916cdSFrançois Tigeot 1790*a05eeebfSFrançois Tigeot ret = intel_logical_ring_workarounds_emit(req); 17912c9916cdSFrançois Tigeot if (ret) 17922c9916cdSFrançois Tigeot return ret; 17932c9916cdSFrançois Tigeot 1794*a05eeebfSFrançois Tigeot ret = intel_rcs_context_init_mocs(req); 1795*a05eeebfSFrançois Tigeot /* 1796*a05eeebfSFrançois Tigeot * 
Failing to program the MOCS is non-fatal.The system will not 1797*a05eeebfSFrançois Tigeot * run at peak performance. So generate an error and carry on. 1798*a05eeebfSFrançois Tigeot */ 1799*a05eeebfSFrançois Tigeot if (ret) 1800*a05eeebfSFrançois Tigeot DRM_ERROR("MOCS failed to program: expect performance issues.\n"); 1801*a05eeebfSFrançois Tigeot 1802*a05eeebfSFrançois Tigeot return intel_lr_context_render_state_init(req); 18032c9916cdSFrançois Tigeot } 18042c9916cdSFrançois Tigeot 18051b13d190SFrançois Tigeot /** 18061b13d190SFrançois Tigeot * intel_logical_ring_cleanup() - deallocate the Engine Command Streamer 18071b13d190SFrançois Tigeot * 18081b13d190SFrançois Tigeot * @ring: Engine Command Streamer. 18091b13d190SFrançois Tigeot * 18101b13d190SFrançois Tigeot */ 18111b13d190SFrançois Tigeot void intel_logical_ring_cleanup(struct intel_engine_cs *ring) 18121b13d190SFrançois Tigeot { 18132c9916cdSFrançois Tigeot struct drm_i915_private *dev_priv; 18141b13d190SFrançois Tigeot 18151b13d190SFrançois Tigeot if (!intel_ring_initialized(ring)) 18161b13d190SFrançois Tigeot return; 18171b13d190SFrançois Tigeot 18182c9916cdSFrançois Tigeot dev_priv = ring->dev->dev_private; 18192c9916cdSFrançois Tigeot 18201b13d190SFrançois Tigeot intel_logical_ring_stop(ring); 18211b13d190SFrançois Tigeot WARN_ON((I915_READ_MODE(ring) & MODE_IDLE) == 0); 18221b13d190SFrançois Tigeot 18231b13d190SFrançois Tigeot if (ring->cleanup) 18241b13d190SFrançois Tigeot ring->cleanup(ring); 18251b13d190SFrançois Tigeot 18261b13d190SFrançois Tigeot i915_cmd_parser_fini_ring(ring); 182719c468b4SFrançois Tigeot i915_gem_batch_pool_fini(&ring->batch_pool); 18281b13d190SFrançois Tigeot 18291b13d190SFrançois Tigeot if (ring->status_page.obj) { 18307ec9f8e5SFrançois Tigeot kunmap(sg_page(ring->status_page.obj->pages->sgl)); 18311b13d190SFrançois Tigeot ring->status_page.obj = NULL; 18321b13d190SFrançois Tigeot } 1833*a05eeebfSFrançois Tigeot 1834*a05eeebfSFrançois Tigeot lrc_destroy_wa_ctx_obj(ring); 18351b13d190SFrançois Tigeot } 18361b13d190SFrançois Tigeot 18371b13d190SFrançois Tigeot static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *ring) 18381b13d190SFrançois Tigeot { 18391b13d190SFrançois Tigeot int ret; 18401b13d190SFrançois Tigeot 18411b13d190SFrançois Tigeot /* Intentionally left blank. 
*/ 18421b13d190SFrançois Tigeot ring->buffer = NULL; 18431b13d190SFrançois Tigeot 18441b13d190SFrançois Tigeot ring->dev = dev; 18451b13d190SFrançois Tigeot INIT_LIST_HEAD(&ring->active_list); 18461b13d190SFrançois Tigeot INIT_LIST_HEAD(&ring->request_list); 184719c468b4SFrançois Tigeot i915_gem_batch_pool_init(dev, &ring->batch_pool); 18481b13d190SFrançois Tigeot init_waitqueue_head(&ring->irq_queue); 18491b13d190SFrançois Tigeot 18501b13d190SFrançois Tigeot INIT_LIST_HEAD(&ring->execlist_queue); 18512c9916cdSFrançois Tigeot INIT_LIST_HEAD(&ring->execlist_retired_req_list); 18521b13d190SFrançois Tigeot lockinit(&ring->execlist_lock, "i915el", 0, LK_CANRECURSE); 18531b13d190SFrançois Tigeot 18541b13d190SFrançois Tigeot ret = i915_cmd_parser_init_ring(ring); 18551b13d190SFrançois Tigeot if (ret) 18561b13d190SFrançois Tigeot return ret; 18571b13d190SFrançois Tigeot 18581b13d190SFrançois Tigeot ret = intel_lr_context_deferred_create(ring->default_context, ring); 18591b13d190SFrançois Tigeot 18601b13d190SFrançois Tigeot return ret; 18611b13d190SFrançois Tigeot } 18621b13d190SFrançois Tigeot 18631b13d190SFrançois Tigeot static int logical_render_ring_init(struct drm_device *dev) 18641b13d190SFrançois Tigeot { 18651b13d190SFrançois Tigeot struct drm_i915_private *dev_priv = dev->dev_private; 18661b13d190SFrançois Tigeot struct intel_engine_cs *ring = &dev_priv->ring[RCS]; 18672c9916cdSFrançois Tigeot int ret; 18681b13d190SFrançois Tigeot 18691b13d190SFrançois Tigeot ring->name = "render ring"; 18701b13d190SFrançois Tigeot ring->id = RCS; 18711b13d190SFrançois Tigeot ring->mmio_base = RENDER_RING_BASE; 18721b13d190SFrançois Tigeot ring->irq_enable_mask = 18731b13d190SFrançois Tigeot GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT; 18741b13d190SFrançois Tigeot ring->irq_keep_mask = 18751b13d190SFrançois Tigeot GT_CONTEXT_SWITCH_INTERRUPT << GEN8_RCS_IRQ_SHIFT; 18761b13d190SFrançois Tigeot if (HAS_L3_DPF(dev)) 18771b13d190SFrançois Tigeot ring->irq_keep_mask |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT; 18781b13d190SFrançois Tigeot 1879477eb7f9SFrançois Tigeot if (INTEL_INFO(dev)->gen >= 9) 1880477eb7f9SFrançois Tigeot ring->init_hw = gen9_init_render_ring; 1881477eb7f9SFrançois Tigeot else 18822c9916cdSFrançois Tigeot ring->init_hw = gen8_init_render_ring; 18832c9916cdSFrançois Tigeot ring->init_context = gen8_init_rcs_context; 18841b13d190SFrançois Tigeot ring->cleanup = intel_fini_pipe_control; 18851b13d190SFrançois Tigeot ring->get_seqno = gen8_get_seqno; 18861b13d190SFrançois Tigeot ring->set_seqno = gen8_set_seqno; 18871b13d190SFrançois Tigeot ring->emit_request = gen8_emit_request; 18881b13d190SFrançois Tigeot ring->emit_flush = gen8_emit_flush_render; 18891b13d190SFrançois Tigeot ring->irq_get = gen8_logical_ring_get_irq; 18901b13d190SFrançois Tigeot ring->irq_put = gen8_logical_ring_put_irq; 18911b13d190SFrançois Tigeot ring->emit_bb_start = gen8_emit_bb_start; 18921b13d190SFrançois Tigeot 18932c9916cdSFrançois Tigeot ring->dev = dev; 1894*a05eeebfSFrançois Tigeot 1895*a05eeebfSFrançois Tigeot ret = intel_init_pipe_control(ring); 18962c9916cdSFrançois Tigeot if (ret) 18972c9916cdSFrançois Tigeot return ret; 18982c9916cdSFrançois Tigeot 1899*a05eeebfSFrançois Tigeot ret = intel_init_workaround_bb(ring); 1900*a05eeebfSFrançois Tigeot if (ret) { 1901*a05eeebfSFrançois Tigeot /* 1902*a05eeebfSFrançois Tigeot * We continue even if we fail to initialize WA batch 1903*a05eeebfSFrançois Tigeot * because we only expect rare glitches but nothing 1904*a05eeebfSFrançois Tigeot * critical to prevent us from 
using GPU 1905*a05eeebfSFrançois Tigeot */ 1906*a05eeebfSFrançois Tigeot DRM_ERROR("WA batch buffer initialization failed: %d\n", 1907*a05eeebfSFrançois Tigeot ret); 1908*a05eeebfSFrançois Tigeot } 1909*a05eeebfSFrançois Tigeot 1910*a05eeebfSFrançois Tigeot ret = logical_ring_init(dev, ring); 1911*a05eeebfSFrançois Tigeot if (ret) { 1912*a05eeebfSFrançois Tigeot lrc_destroy_wa_ctx_obj(ring); 1913*a05eeebfSFrançois Tigeot } 1914*a05eeebfSFrançois Tigeot 1915*a05eeebfSFrançois Tigeot return ret; 19161b13d190SFrançois Tigeot } 19171b13d190SFrançois Tigeot 19181b13d190SFrançois Tigeot static int logical_bsd_ring_init(struct drm_device *dev) 19191b13d190SFrançois Tigeot { 19201b13d190SFrançois Tigeot struct drm_i915_private *dev_priv = dev->dev_private; 19211b13d190SFrançois Tigeot struct intel_engine_cs *ring = &dev_priv->ring[VCS]; 19221b13d190SFrançois Tigeot 19231b13d190SFrançois Tigeot ring->name = "bsd ring"; 19241b13d190SFrançois Tigeot ring->id = VCS; 19251b13d190SFrançois Tigeot ring->mmio_base = GEN6_BSD_RING_BASE; 19261b13d190SFrançois Tigeot ring->irq_enable_mask = 19271b13d190SFrançois Tigeot GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT; 19281b13d190SFrançois Tigeot ring->irq_keep_mask = 19291b13d190SFrançois Tigeot GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT; 19301b13d190SFrançois Tigeot 19312c9916cdSFrançois Tigeot ring->init_hw = gen8_init_common_ring; 19321b13d190SFrançois Tigeot ring->get_seqno = gen8_get_seqno; 19331b13d190SFrançois Tigeot ring->set_seqno = gen8_set_seqno; 19341b13d190SFrançois Tigeot ring->emit_request = gen8_emit_request; 19351b13d190SFrançois Tigeot ring->emit_flush = gen8_emit_flush; 19361b13d190SFrançois Tigeot ring->irq_get = gen8_logical_ring_get_irq; 19371b13d190SFrançois Tigeot ring->irq_put = gen8_logical_ring_put_irq; 19381b13d190SFrançois Tigeot ring->emit_bb_start = gen8_emit_bb_start; 19391b13d190SFrançois Tigeot 19401b13d190SFrançois Tigeot return logical_ring_init(dev, ring); 19411b13d190SFrançois Tigeot } 19421b13d190SFrançois Tigeot 19431b13d190SFrançois Tigeot static int logical_bsd2_ring_init(struct drm_device *dev) 19441b13d190SFrançois Tigeot { 19451b13d190SFrançois Tigeot struct drm_i915_private *dev_priv = dev->dev_private; 19461b13d190SFrançois Tigeot struct intel_engine_cs *ring = &dev_priv->ring[VCS2]; 19471b13d190SFrançois Tigeot 19481b13d190SFrançois Tigeot ring->name = "bds2 ring"; 19491b13d190SFrançois Tigeot ring->id = VCS2; 19501b13d190SFrançois Tigeot ring->mmio_base = GEN8_BSD2_RING_BASE; 19511b13d190SFrançois Tigeot ring->irq_enable_mask = 19521b13d190SFrançois Tigeot GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT; 19531b13d190SFrançois Tigeot ring->irq_keep_mask = 19541b13d190SFrançois Tigeot GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT; 19551b13d190SFrançois Tigeot 19562c9916cdSFrançois Tigeot ring->init_hw = gen8_init_common_ring; 19571b13d190SFrançois Tigeot ring->get_seqno = gen8_get_seqno; 19581b13d190SFrançois Tigeot ring->set_seqno = gen8_set_seqno; 19591b13d190SFrançois Tigeot ring->emit_request = gen8_emit_request; 19601b13d190SFrançois Tigeot ring->emit_flush = gen8_emit_flush; 19611b13d190SFrançois Tigeot ring->irq_get = gen8_logical_ring_get_irq; 19621b13d190SFrançois Tigeot ring->irq_put = gen8_logical_ring_put_irq; 19631b13d190SFrançois Tigeot ring->emit_bb_start = gen8_emit_bb_start; 19641b13d190SFrançois Tigeot 19651b13d190SFrançois Tigeot return logical_ring_init(dev, ring); 19661b13d190SFrançois Tigeot } 19671b13d190SFrançois Tigeot 19681b13d190SFrançois Tigeot static int 
logical_blt_ring_init(struct drm_device *dev) 19691b13d190SFrançois Tigeot { 19701b13d190SFrançois Tigeot struct drm_i915_private *dev_priv = dev->dev_private; 19711b13d190SFrançois Tigeot struct intel_engine_cs *ring = &dev_priv->ring[BCS]; 19721b13d190SFrançois Tigeot 19731b13d190SFrançois Tigeot ring->name = "blitter ring"; 19741b13d190SFrançois Tigeot ring->id = BCS; 19751b13d190SFrançois Tigeot ring->mmio_base = BLT_RING_BASE; 19761b13d190SFrançois Tigeot ring->irq_enable_mask = 19771b13d190SFrançois Tigeot GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT; 19781b13d190SFrançois Tigeot ring->irq_keep_mask = 19791b13d190SFrançois Tigeot GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT; 19801b13d190SFrançois Tigeot 19812c9916cdSFrançois Tigeot ring->init_hw = gen8_init_common_ring; 19821b13d190SFrançois Tigeot ring->get_seqno = gen8_get_seqno; 19831b13d190SFrançois Tigeot ring->set_seqno = gen8_set_seqno; 19841b13d190SFrançois Tigeot ring->emit_request = gen8_emit_request; 19851b13d190SFrançois Tigeot ring->emit_flush = gen8_emit_flush; 19861b13d190SFrançois Tigeot ring->irq_get = gen8_logical_ring_get_irq; 19871b13d190SFrançois Tigeot ring->irq_put = gen8_logical_ring_put_irq; 19881b13d190SFrançois Tigeot ring->emit_bb_start = gen8_emit_bb_start; 19891b13d190SFrançois Tigeot 19901b13d190SFrançois Tigeot return logical_ring_init(dev, ring); 19911b13d190SFrançois Tigeot } 19921b13d190SFrançois Tigeot 19931b13d190SFrançois Tigeot static int logical_vebox_ring_init(struct drm_device *dev) 19941b13d190SFrançois Tigeot { 19951b13d190SFrançois Tigeot struct drm_i915_private *dev_priv = dev->dev_private; 19961b13d190SFrançois Tigeot struct intel_engine_cs *ring = &dev_priv->ring[VECS]; 19971b13d190SFrançois Tigeot 19981b13d190SFrançois Tigeot ring->name = "video enhancement ring"; 19991b13d190SFrançois Tigeot ring->id = VECS; 20001b13d190SFrançois Tigeot ring->mmio_base = VEBOX_RING_BASE; 20011b13d190SFrançois Tigeot ring->irq_enable_mask = 20021b13d190SFrançois Tigeot GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT; 20031b13d190SFrançois Tigeot ring->irq_keep_mask = 20041b13d190SFrançois Tigeot GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT; 20051b13d190SFrançois Tigeot 20062c9916cdSFrançois Tigeot ring->init_hw = gen8_init_common_ring; 20071b13d190SFrançois Tigeot ring->get_seqno = gen8_get_seqno; 20081b13d190SFrançois Tigeot ring->set_seqno = gen8_set_seqno; 20091b13d190SFrançois Tigeot ring->emit_request = gen8_emit_request; 20101b13d190SFrançois Tigeot ring->emit_flush = gen8_emit_flush; 20111b13d190SFrançois Tigeot ring->irq_get = gen8_logical_ring_get_irq; 20121b13d190SFrançois Tigeot ring->irq_put = gen8_logical_ring_put_irq; 20131b13d190SFrançois Tigeot ring->emit_bb_start = gen8_emit_bb_start; 20141b13d190SFrançois Tigeot 20151b13d190SFrançois Tigeot return logical_ring_init(dev, ring); 20161b13d190SFrançois Tigeot } 20171b13d190SFrançois Tigeot 20181b13d190SFrançois Tigeot /** 20191b13d190SFrançois Tigeot * intel_logical_rings_init() - allocate, populate and init the Engine Command Streamers 20201b13d190SFrançois Tigeot * @dev: DRM device. 20211b13d190SFrançois Tigeot * 20221b13d190SFrançois Tigeot * This function inits the engines for an Execlists submission style (the equivalent in the 20231b13d190SFrançois Tigeot * legacy ringbuffer submission world would be i915_gem_init_rings). It does it only for 20241b13d190SFrançois Tigeot * those engines that are present in the hardware. 
 *
 * Return: non-zero if the initialization failed.
 */
int intel_logical_rings_init(struct drm_device *dev)
{
	struct drm_i915_private *dev_priv = dev->dev_private;
	int ret;

	ret = logical_render_ring_init(dev);
	if (ret)
		return ret;

	if (HAS_BSD(dev)) {
		ret = logical_bsd_ring_init(dev);
		if (ret)
			goto cleanup_render_ring;
	}

	if (HAS_BLT(dev)) {
		ret = logical_blt_ring_init(dev);
		if (ret)
			goto cleanup_bsd_ring;
	}

	if (HAS_VEBOX(dev)) {
		ret = logical_vebox_ring_init(dev);
		if (ret)
			goto cleanup_blt_ring;
	}

	if (HAS_BSD2(dev)) {
		ret = logical_bsd2_ring_init(dev);
		if (ret)
			goto cleanup_vebox_ring;
	}

	ret = i915_gem_set_seqno(dev, ((u32)~0 - 0x1000));
	if (ret)
		goto cleanup_bsd2_ring;

	return 0;

cleanup_bsd2_ring:
	intel_logical_ring_cleanup(&dev_priv->ring[VCS2]);
cleanup_vebox_ring:
	intel_logical_ring_cleanup(&dev_priv->ring[VECS]);
cleanup_blt_ring:
	intel_logical_ring_cleanup(&dev_priv->ring[BCS]);
cleanup_bsd_ring:
	intel_logical_ring_cleanup(&dev_priv->ring[VCS]);
cleanup_render_ring:
	intel_logical_ring_cleanup(&dev_priv->ring[RCS]);

	return ret;
}

static u32
make_rpcs(struct drm_device *dev)
{
	u32 rpcs = 0;

	/*
	 * No explicit RPCS request is needed to ensure full
	 * slice/subslice/EU enablement prior to Gen9.
	 */
	if (INTEL_INFO(dev)->gen < 9)
		return 0;

	/*
	 * Starting in Gen9, render power gating can leave
	 * slice/subslice/EU in a partially enabled state. We
	 * must make an explicit request through RPCS for full
	 * enablement.
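	 *
	 * As a worked example with made-up counts (not taken from any real
	 * SKU): on a part with all three power-gating flags set, one slice,
	 * three subslices and eight EUs per subslice, the value composed
	 * below would be
	 *
	 *	GEN8_RPCS_ENABLE |
	 *	GEN8_RPCS_S_CNT_ENABLE  | (1 << GEN8_RPCS_S_CNT_SHIFT) |
	 *	GEN8_RPCS_SS_CNT_ENABLE | (3 << GEN8_RPCS_SS_CNT_SHIFT) |
	 *	(8 << GEN8_RPCS_EU_MIN_SHIFT) | (8 << GEN8_RPCS_EU_MAX_SHIFT)
	 *
	 * i.e. an explicit count request for each feature that can be power
	 * gated, plus the global enable bit.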
	 */
	if (INTEL_INFO(dev)->has_slice_pg) {
		rpcs |= GEN8_RPCS_S_CNT_ENABLE;
		rpcs |= INTEL_INFO(dev)->slice_total <<
			GEN8_RPCS_S_CNT_SHIFT;
		rpcs |= GEN8_RPCS_ENABLE;
	}

	if (INTEL_INFO(dev)->has_subslice_pg) {
		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
		rpcs |= INTEL_INFO(dev)->subslice_per_slice <<
			GEN8_RPCS_SS_CNT_SHIFT;
		rpcs |= GEN8_RPCS_ENABLE;
	}

	if (INTEL_INFO(dev)->has_eu_pg) {
		rpcs |= INTEL_INFO(dev)->eu_per_subslice <<
			GEN8_RPCS_EU_MIN_SHIFT;
		rpcs |= INTEL_INFO(dev)->eu_per_subslice <<
			GEN8_RPCS_EU_MAX_SHIFT;
		rpcs |= GEN8_RPCS_ENABLE;
	}

	return rpcs;
}

static int
populate_lr_context(struct intel_context *ctx, struct drm_i915_gem_object *ctx_obj,
		    struct intel_engine_cs *ring, struct intel_ringbuffer *ringbuf)
{
	struct drm_device *dev = ring->dev;
	struct drm_i915_private *dev_priv = dev->dev_private;
	struct i915_hw_ppgtt *ppgtt = ctx->ppgtt;
	struct vm_page *page;
	uint32_t *reg_state;
	int ret;

	if (!ppgtt)
		ppgtt = dev_priv->mm.aliasing_ppgtt;

	ret = i915_gem_object_set_to_cpu_domain(ctx_obj, true);
	if (ret) {
		DRM_DEBUG_DRIVER("Could not set to CPU domain\n");
		return ret;
	}

	ret = i915_gem_object_get_pages(ctx_obj);
	if (ret) {
		DRM_DEBUG_DRIVER("Could not get object pages\n");
		return ret;
	}

	i915_gem_object_pin_pages(ctx_obj);

	/* The second page of the context object contains some fields which must
	 * be set up prior to the first execution. */
	page = i915_gem_object_get_page(ctx_obj, 1);
	reg_state = kmap_atomic(page);

	/* A context is actually a big batch buffer with several MI_LOAD_REGISTER_IMM
	 * commands followed by (reg, value) pairs.
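	 * Schematically, the portion of the image initialized here looks like:
	 *
	 *	MI_LOAD_REGISTER_IMM(N)
	 *	    <reg offset>, <value>	(N pairs)
	 *	MI_LOAD_REGISTER_IMM(M)
	 *	    <reg offset>, <value>	(M pairs)
	 *	...
	 *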
	 * The values we are setting here are only for the first context restore:
	 * on a subsequent save, the GPU will recreate this batchbuffer with new
	 * values (including all the missing MI_LOAD_REGISTER_IMM commands that
	 * we are not initializing here). */
	if (ring->id == RCS)
		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(14);
	else
		reg_state[CTX_LRI_HEADER_0] = MI_LOAD_REGISTER_IMM(11);
	reg_state[CTX_LRI_HEADER_0] |= MI_LRI_FORCE_POSTED;
	reg_state[CTX_CONTEXT_CONTROL] = RING_CONTEXT_CONTROL(ring);
	reg_state[CTX_CONTEXT_CONTROL+1] =
		_MASKED_BIT_ENABLE(CTX_CTRL_INHIBIT_SYN_CTX_SWITCH |
				   CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT |
				   CTX_CTRL_RS_CTX_ENABLE);
	reg_state[CTX_RING_HEAD] = RING_HEAD(ring->mmio_base);
	reg_state[CTX_RING_HEAD+1] = 0;
	reg_state[CTX_RING_TAIL] = RING_TAIL(ring->mmio_base);
	reg_state[CTX_RING_TAIL+1] = 0;
	reg_state[CTX_RING_BUFFER_START] = RING_START(ring->mmio_base);
	/* Ring buffer start address is not known until the buffer is pinned.
	 * It is written to the context image in execlists_update_context()
	 */
	reg_state[CTX_RING_BUFFER_CONTROL] = RING_CTL(ring->mmio_base);
	reg_state[CTX_RING_BUFFER_CONTROL+1] =
		((ringbuf->size - PAGE_SIZE) & RING_NR_PAGES) | RING_VALID;
	reg_state[CTX_BB_HEAD_U] = ring->mmio_base + 0x168;
	reg_state[CTX_BB_HEAD_U+1] = 0;
	reg_state[CTX_BB_HEAD_L] = ring->mmio_base + 0x140;
	reg_state[CTX_BB_HEAD_L+1] = 0;
	reg_state[CTX_BB_STATE] = ring->mmio_base + 0x110;
	reg_state[CTX_BB_STATE+1] = (1<<5);
	reg_state[CTX_SECOND_BB_HEAD_U] = ring->mmio_base + 0x11c;
	reg_state[CTX_SECOND_BB_HEAD_U+1] = 0;
	reg_state[CTX_SECOND_BB_HEAD_L] = ring->mmio_base + 0x114;
	reg_state[CTX_SECOND_BB_HEAD_L+1] = 0;
	reg_state[CTX_SECOND_BB_STATE] = ring->mmio_base + 0x118;
	reg_state[CTX_SECOND_BB_STATE+1] = 0;
	if (ring->id == RCS) {
		reg_state[CTX_BB_PER_CTX_PTR] = ring->mmio_base + 0x1c0;
		reg_state[CTX_BB_PER_CTX_PTR+1] = 0;
		reg_state[CTX_RCS_INDIRECT_CTX] = ring->mmio_base + 0x1c4;
		reg_state[CTX_RCS_INDIRECT_CTX+1] = 0;
		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET] = ring->mmio_base + 0x1c8;
		reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] = 0;
		if (ring->wa_ctx.obj) {
			struct i915_ctx_workarounds *wa_ctx = &ring->wa_ctx;
			uint32_t ggtt_offset = i915_gem_obj_ggtt_offset(wa_ctx->obj);
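
			/* The indirect-context entry below packs the GGTT
			 * address of the per-context WA batch together with
			 * its size in cachelines, and the per-context batch
			 * pointer sets bit 0, which appears to act as a
			 * valid bit. */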
			reg_state[CTX_RCS_INDIRECT_CTX+1] =
				(ggtt_offset + wa_ctx->indirect_ctx.offset * sizeof(uint32_t)) |
				(wa_ctx->indirect_ctx.size / CACHELINE_DWORDS);

			reg_state[CTX_RCS_INDIRECT_CTX_OFFSET+1] =
				CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT << 6;

			reg_state[CTX_BB_PER_CTX_PTR+1] =
				(ggtt_offset + wa_ctx->per_ctx.offset * sizeof(uint32_t)) |
				0x01;
		}
	}
	reg_state[CTX_LRI_HEADER_1] = MI_LOAD_REGISTER_IMM(9);
	reg_state[CTX_LRI_HEADER_1] |= MI_LRI_FORCE_POSTED;
	reg_state[CTX_CTX_TIMESTAMP] = ring->mmio_base + 0x3a8;
	reg_state[CTX_CTX_TIMESTAMP+1] = 0;
	reg_state[CTX_PDP3_UDW] = GEN8_RING_PDP_UDW(ring, 3);
	reg_state[CTX_PDP3_LDW] = GEN8_RING_PDP_LDW(ring, 3);
	reg_state[CTX_PDP2_UDW] = GEN8_RING_PDP_UDW(ring, 2);
	reg_state[CTX_PDP2_LDW] = GEN8_RING_PDP_LDW(ring, 2);
	reg_state[CTX_PDP1_UDW] = GEN8_RING_PDP_UDW(ring, 1);
	reg_state[CTX_PDP1_LDW] = GEN8_RING_PDP_LDW(ring, 1);
	reg_state[CTX_PDP0_UDW] = GEN8_RING_PDP_UDW(ring, 0);
	reg_state[CTX_PDP0_LDW] = GEN8_RING_PDP_LDW(ring, 0);

	/* With dynamic page allocation, PDPs may not be allocated at this point.
	 * Point the unallocated PDPs to the scratch page.
	 */
	ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
	ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
	ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
	ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
	if (ring->id == RCS) {
		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
		reg_state[CTX_R_PWR_CLK_STATE] = GEN8_R_PWR_CLK_STATE;
		reg_state[CTX_R_PWR_CLK_STATE+1] = make_rpcs(dev);
	}

	kunmap_atomic(reg_state);

	ctx_obj->dirty = 1;
	set_page_dirty(page);
	i915_gem_object_unpin_pages(ctx_obj);

	return 0;
}

/**
 * intel_lr_context_free() - free the LRC specific bits of a context
 * @ctx: the LR context to free.
 *
 * The real context freeing is done in i915_gem_context_free: this only
 * takes care of the bits that are LRC related: the per-engine backing
 * objects and the logical ringbuffer.
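 *
 * A minimal sketch of the intended call site (assuming the usual context
 * release path in i915_gem_context.c; the exact sequence there is not part
 * of this file) would be:
 *
 *	if (i915.enable_execlists)
 *		intel_lr_context_free(ctx);
 *	... (drop legacy state, then free the context struct itself)
 *
 * i.e. callers never tear the LRC bits down directly; they drop their
 * context reference and this helper runs as part of the final release.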
 */
void intel_lr_context_free(struct intel_context *ctx)
{
	int i;

	for (i = 0; i < I915_NUM_RINGS; i++) {
		struct drm_i915_gem_object *ctx_obj = ctx->engine[i].state;

		if (ctx_obj) {
			struct intel_ringbuffer *ringbuf =
					ctx->engine[i].ringbuf;
			struct intel_engine_cs *ring = ringbuf->ring;

			if (ctx == ring->default_context) {
				intel_unpin_ringbuffer_obj(ringbuf);
				i915_gem_object_ggtt_unpin(ctx_obj);
			}
			WARN_ON(ctx->engine[ring->id].pin_count);
			intel_destroy_ringbuffer_obj(ringbuf);
			kfree(ringbuf);
			drm_gem_object_unreference(&ctx_obj->base);
		}
	}
}

static uint32_t get_lr_context_size(struct intel_engine_cs *ring)
{
	int ret = 0;

	WARN_ON(INTEL_INFO(ring->dev)->gen < 8);

	switch (ring->id) {
	case RCS:
		if (INTEL_INFO(ring->dev)->gen >= 9)
			ret = GEN9_LR_CONTEXT_RENDER_SIZE;
		else
			ret = GEN8_LR_CONTEXT_RENDER_SIZE;
		break;
	case VCS:
	case BCS:
	case VECS:
	case VCS2:
		ret = GEN8_LR_CONTEXT_OTHER_SIZE;
		break;
	}

	return ret;
}

static void lrc_setup_hardware_status_page(struct intel_engine_cs *ring,
		struct drm_i915_gem_object *default_ctx_obj)
{
	struct drm_i915_private *dev_priv = ring->dev->dev_private;

	/* The status page is offset 0 from the default context object
	 * in LRC mode.
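	 *
	 * In other words, the first page of the default context object doubles
	 * as the hardware status page: it is CPU-mapped here, and its GGTT
	 * address is what gets written to RING_HWS_PGA below.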
	 */
	ring->status_page.gfx_addr = i915_gem_obj_ggtt_offset(default_ctx_obj);
	ring->status_page.page_addr =
			kmap(sg_page(default_ctx_obj->pages->sgl));
	ring->status_page.obj = default_ctx_obj;

	I915_WRITE(RING_HWS_PGA(ring->mmio_base),
		   (u32)ring->status_page.gfx_addr);
	POSTING_READ(RING_HWS_PGA(ring->mmio_base));
}

/**
 * intel_lr_context_deferred_create() - create the LRC specific bits of a context
 * @ctx: LR context to create.
 * @ring: engine to be used with the context.
 *
 * This function can be called more than once, with different engines, if we plan
 * to use the context with them. The context backing objects and the ringbuffers
 * (especially the ringbuffer backing objects) consume a lot of memory, which is
 * why the creation is a deferred call: it's better to make sure first that we
 * actually need to use a given ring with the context.
 *
 * Return: non-zero on error.
 */
int intel_lr_context_deferred_create(struct intel_context *ctx,
				     struct intel_engine_cs *ring)
{
	const bool is_global_default_ctx = (ctx == ring->default_context);
	struct drm_device *dev = ring->dev;
	struct drm_i915_gem_object *ctx_obj;
	uint32_t context_size;
	struct intel_ringbuffer *ringbuf;
	int ret;

	WARN_ON(ctx->legacy_hw_ctx.rcs_state != NULL);
	WARN_ON(ctx->engine[ring->id].state);

	context_size = round_up(get_lr_context_size(ring), 4096);

	ctx_obj = i915_gem_alloc_object(dev, context_size);
	if (!ctx_obj) {
		DRM_DEBUG_DRIVER("Alloc LRC backing obj failed.\n");
		return -ENOMEM;
	}

	if (is_global_default_ctx) {
		ret = i915_gem_obj_ggtt_pin(ctx_obj, GEN8_LR_CONTEXT_ALIGN, 0);
		if (ret) {
			DRM_DEBUG_DRIVER("Pin LRC backing obj failed: %d\n",
					ret);
			drm_gem_object_unreference(&ctx_obj->base);
			return ret;
		}
	}

	ringbuf = kzalloc(sizeof(*ringbuf), GFP_KERNEL);
	if (!ringbuf) {
		DRM_DEBUG_DRIVER("Failed to allocate ringbuffer %s\n",
				ring->name);
		ret = -ENOMEM;
		goto error_unpin_ctx;
	}

	ringbuf->ring = ring;

	ringbuf->size = 32 * PAGE_SIZE;
	ringbuf->effective_size = ringbuf->size;
	ringbuf->head = 0;
	ringbuf->tail = 0;
	ringbuf->last_retired_head = -1;
	intel_ring_update_space(ringbuf);

	if (ringbuf->obj == NULL) {
		ret = intel_alloc_ringbuffer_obj(dev, ringbuf);
		if (ret) {
			DRM_DEBUG_DRIVER(
				"Failed to allocate ringbuffer obj %s: %d\n",
				ring->name, ret);
			goto error_free_rbuf;
		}

		if (is_global_default_ctx) {
			ret = intel_pin_and_map_ringbuffer_obj(dev, ringbuf);
			if (ret) {
				DRM_ERROR(
					"Failed to pin and map ringbuffer %s: %d\n",
					ring->name, ret);
				goto error_destroy_rbuf;
			}
		}

	}

	ret = populate_lr_context(ctx, ctx_obj, ring, ringbuf);
	if (ret) {
		DRM_DEBUG_DRIVER("Failed to populate LRC: %d\n", ret);
		goto error;
	}

	ctx->engine[ring->id].ringbuf = ringbuf;
	ctx->engine[ring->id].state = ctx_obj;

	if (ctx == ring->default_context)
		lrc_setup_hardware_status_page(ring, ctx_obj);
	else if (ring->id == RCS && !ctx->rcs_initialized) {
		if (ring->init_context) {
			struct drm_i915_gem_request *req;

			ret = i915_gem_request_alloc(ring, ctx, &req);
			if (ret)
				return ret;

			ret = ring->init_context(req);
			if (ret) {
				DRM_ERROR("ring init context: %d\n", ret);
				i915_gem_request_cancel(req);
				ctx->engine[ring->id].ringbuf = NULL;
				ctx->engine[ring->id].state = NULL;
				goto error;
			}

			i915_add_request_no_flush(req);
		}

		ctx->rcs_initialized = true;
	}

	return 0;

error:
	if (is_global_default_ctx)
		intel_unpin_ringbuffer_obj(ringbuf);
error_destroy_rbuf:
	intel_destroy_ringbuffer_obj(ringbuf);
error_free_rbuf:
	kfree(ringbuf);
error_unpin_ctx:
	if (is_global_default_ctx)
		i915_gem_object_ggtt_unpin(ctx_obj);
	drm_gem_object_unreference(&ctx_obj->base);
	return ret;
}

void intel_lr_context_reset(struct drm_device *dev,
			struct intel_context *ctx)
{
	struct drm_i915_private *dev_priv = dev->dev_private;
	struct intel_engine_cs *ring;
	int i;

	for_each_ring(ring, dev_priv, i) {
		struct drm_i915_gem_object *ctx_obj =
				ctx->engine[ring->id].state;
		struct intel_ringbuffer *ringbuf =
				ctx->engine[ring->id].ringbuf;
		uint32_t *reg_state;
		struct vm_page *page;

		if (!ctx_obj)
			continue;

		if (i915_gem_object_get_pages(ctx_obj)) {
			WARN(1, "Failed get_pages for context obj\n");
			continue;
		}
		page = i915_gem_object_get_page(ctx_obj, 1);
		reg_state = kmap_atomic(page);

		reg_state[CTX_RING_HEAD+1] = 0;
		reg_state[CTX_RING_TAIL+1] = 0;

		kunmap_atomic(reg_state);

		ringbuf->head = 0;
		ringbuf->tail = 0;
	}
}
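
/*
 * Illustrative only: a minimal, non-compiled sketch (guarded out below, and
 * not part of this file's API) of how the deferred LRC creation above is
 * expected to be driven from an execbuffer-style submission path. The
 * example_prepare_lr_context() name is hypothetical; only
 * intel_lr_context_deferred_create() and the ctx->engine[] bookkeeping come
 * from this file.
 */
#if 0
static int example_prepare_lr_context(struct intel_context *ctx,
				      struct intel_engine_cs *ring)
{
	int ret;

	/* First submission on this ctx/engine pair: allocate the context
	 * backing object and the ringbuffer now rather than at context
	 * creation time. */
	if (ctx->engine[ring->id].state == NULL) {
		ret = intel_lr_context_deferred_create(ctx, ring);
		if (ret)
			return ret;
	}

	/* From here on, ctx->engine[ring->id].ringbuf can be used to emit
	 * commands for this engine. */
	return 0;
}
#endif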