Lines Matching full:we (in drivers/gpu/drm/i915/i915_request.c)
73 * We could extend the life of a context to beyond that of all in i915_fence_get_timeline_name()
75 * or we just give them a false name. Since in i915_fence_get_timeline_name()
137 * freed when the slab cache itself is freed, and so we would get in i915_fence_release()
146 * We do not hold a reference to the engine here and so have to be in i915_fence_release()
147 * very careful in what rq->engine we poke. The virtual engine is in i915_fence_release()
148 * referenced via the rq->context and we released that ref during in i915_fence_release()
149 * i915_request_retire(), ergo we must not dereference a virtual in i915_fence_release()
150 * engine here. Not that we would want to, as the only consumer of in i915_fence_release()
155 * we know that it will have been processed by the HW and will in i915_fence_release()
161 * power-of-two we assume that rq->engine may still be a virtual in i915_fence_release()
162 * engine and so a dangling invalid pointer that we cannot dereference in i915_fence_release()
166 * that we might execute on). On processing the bond, the request mask in i915_fence_release()
170 * after timeslicing away, see __unwind_incomplete_requests(). Thus we in i915_fence_release()
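The i915_fence_release() fragments above all hinge on one cheap test: only a power-of-two execution_mask guarantees the request was bound to exactly one physical engine, so only then is rq->engine safe to dereference. A minimal sketch of that guard (a fragment, assuming the usual is_power_of_2() helper from linux/log2.h):

	if (!is_power_of_2(rq->execution_mask))
		return;	/* rq->engine may be a stale virtual engine pointer */

	/* exactly one engine bit set: rq->engine is a physical engine */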
272 * is-banned?, or we know the request is already inflight. in i915_request_active_engine()
274 * Note that rq->engine is unstable, and so we double in i915_request_active_engine()
275 * check that we have acquired the lock on the final engine. in i915_request_active_engine()
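Those i915_request_active_engine() fragments describe the standard lock-and-recheck loop for an unstable pointer: lock whichever engine you read, then verify the request still points at it, and repeat until the two agree. A sketch of that shape, where engine->lock is a stand-in for the driver's real per-engine scheduling lock:

	struct intel_engine_cs *engine, *locked;

	locked = READ_ONCE(rq->engine);
	spin_lock_irq(&locked->lock);		/* stand-in lock member */
	while ((engine = READ_ONCE(rq->engine)) != locked) {
		spin_unlock(&locked->lock);
		locked = engine;
		spin_lock(&locked->lock);
	}
	/* rq->engine == locked is now stable for as long as the lock is held */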
415 * We know the GPU must have read the request to have in i915_request_retire()
420 * Note this requires that we are always called in request in i915_request_retire()
426 /* Poison before we release our space in the ring */ in i915_request_retire()
440 * We only loosely track inflight requests across preemption, in i915_request_retire()
441 * and so we may find ourselves attempting to retire a _completed_ in i915_request_retire()
442 * request that we have removed from the HW and put back on a run in i915_request_retire()
445 * As we set I915_FENCE_FLAG_ACTIVE on the request, this should be in i915_request_retire()
446 * after removing the breadcrumb and signaling it, so that we do not in i915_request_retire()
492 * Even if we have unwound the request, it may still be on in __request_in_flight()
499 * As we know that there are always preemption points between in __request_in_flight()
500 * requests, we know that only the currently executing request in __request_in_flight()
501 * may be still active even though we have cleared the flag. in __request_in_flight()
502 * However, we can't rely on our tracking of ELSP[0] to know in __request_in_flight()
511 * latter, it may send the ACK and we process the event copying the in __request_in_flight()
513 * this implies the HW is arbitrating and not stuck in *active, we do in __request_in_flight()
514 * not worry about complete accuracy, but we do require no read/write in __request_in_flight()
516 * as the array is being overwritten, for which we require the writes in __request_in_flight()
522 * that we received an ACK from the HW, and so the context is not in __request_in_flight()
523 * stuck -- if we do not see ourselves in *active, the inflight status in __request_in_flight()
524 * is valid. If instead we see ourselves being copied into *active, in __request_in_flight()
525 * we are inflight and may signal the callback. in __request_in_flight()
570 * active. This ensures that if we race with the in __await_execution()
571 * __notify_execute_cb from i915_request_submit() and we are not in __await_execution()
572 * included in that list, we get a second bite of the cherry and in __await_execution()
576 * In i915_request_retire() we set the ACTIVE bit on a completed in __await_execution()
578 * callback first, then checking the ACTIVE bit, we serialise with in __await_execution()
614 * breadcrumb at the end (so we get the fence notifications). in __i915_request_skip()
665 * With the advent of preempt-to-busy, we frequently encounter in __i915_request_submit()
666 * requests that we have unsubmitted from HW, but left running in __i915_request_submit()
668 * resubmission of that completed request, we can skip in __i915_request_submit()
672 * We must remove the request from the caller's priority queue, in __i915_request_submit()
675 * request has *not* yet been retired and we can safely move in __i915_request_submit()
692 * Are we using semaphores when the gpu is already saturated? in __i915_request_submit()
700 * If we installed a semaphore on this request and we only submit in __i915_request_submit()
703 * increases the amount of work we are doing. If so, we disable in __i915_request_submit()
704 * further use of semaphores until we are idle again, whence we in __i915_request_submit()
731 * In the future, perhaps when we have an active time-slicing scheduler, in __i915_request_submit()
734 * quite hairy, we have to carefully rollback the fence and do a in __i915_request_submit()
740 /* We may be recursing from the signal callback of another i915 fence */ in __i915_request_submit()
774 * Before we remove this breadcrumb from the signal list, we have in __i915_request_unsubmit()
776 * attach itself. We first mark the request as no longer active and in __i915_request_unsubmit()
785 /* We've already spun, don't charge on resubmitting. */ in __i915_request_unsubmit()
790 * We don't need to wake_up any waiters on request->execute, they in __i915_request_unsubmit()
837 * We need to serialize use of the submit_request() callback in submit_notify()
839 * i915_gem_set_wedged(). We use the RCU mechanism to mark the in submit_notify()
892 /* If we cannot wait, dip into our reserves */ in request_alloc_slow()
924 /* Retire our old requests in the hope that we free some */ in request_alloc_slow()
980 * We use RCU to look up requests in flight. The lookups may in __i915_request_create()
982 * That is, the request we are writing to here may be in the process in __i915_request_create()
984 * we have to be very careful when overwriting the contents. During in __i915_request_create()
985 * the RCU lookup, we chase the request->engine pointer, in __i915_request_create()
991 * with dma_fence_init(). This increment is safe for release as we in __i915_request_create()
992 * check that the request we have a reference to and matches the active in __i915_request_create()
995 * Before we increment the refcount, we chase the request->engine in __i915_request_create()
996 * pointer. We must not call kmem_cache_zalloc() or else we set in __i915_request_create()
998 * we see the request is completed (based on the value of the in __i915_request_create()
1000 * If we decide the request is not completed (new engine or seqno), in __i915_request_create()
1001 * then we grab a reference and double check that it is still the in __i915_request_create()
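Taken together, these __i915_request_create() fragments describe the SLAB_TYPESAFE_BY_RCU lookup dance: dereference under rcu_read_lock(), pin the object only via kref_get_unless_zero(), then re-check its identity, since the slab may have recycled it for a new request in the meantime. A minimal sketch of that pattern; get_active_request() and the tl->active_request field are hypothetical stand-ins for whatever pointer the real lookup chases:

static struct i915_request *
get_active_request(struct intel_timeline *tl)	/* hypothetical helper */
{
	struct i915_request *rq;

	rcu_read_lock();
	rq = READ_ONCE(tl->active_request);	/* hypothetical field */
	/* Only pin the request if its refcount has not already hit zero. */
	if (rq && !kref_get_unless_zero(&rq->fence.refcount))
		rq = NULL;
	rcu_read_unlock();

	/* The slab may have reused rq for a new request: re-check identity. */
	if (rq && READ_ONCE(tl->active_request) != rq) {
		i915_request_put(rq);
		rq = NULL;
	}

	return rq;
}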
1044 /* We bump the ref for the fence chain */ in __i915_request_create()
1064 * Note that due to how we add reserved_space to intel_ring_begin() in __i915_request_create()
1065 * we need to double our request to ensure that if we need to wrap in __i915_request_create()
1074 * should we detect the updated seqno part-way through the in __i915_request_create()
1075 * GPU processing the request, we never over-estimate the in __i915_request_create()
1094 /* Make sure we didn't add ourselves to external state before freeing */ in __i915_request_create()
1130 /* Check that we do not interrupt ourselves with a new request */ in i915_request_create()
1153 * The caller holds a reference on @signal, but we do not serialise in i915_request_await_start()
1156 * We do not hold a reference to the request before @signal, and in i915_request_await_start()
1158 * we follow the link backwards. in i915_request_await_start()
1211 * both the GPU and CPU. We want to limit the impact on others, in already_busywaiting()
1213 * latency. Therefore we restrict ourselves to not using more in already_busywaiting()
1215 * if we have detected the engine is saturated (i.e. would not be in already_busywaiting()
1219 * See the are-we-too-late? check in __i915_request_submit(). in already_busywaiting()
1237 /* We need to pin the signaler's HWSP until we are finished reading. */ in __emit_semaphore_wait()
1251 * Using greater-than-or-equal here means we have to worry in __emit_semaphore_wait()
1252 * about seqno wraparound. To side step that issue, we swap in __emit_semaphore_wait()
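As a CPU-side aside, wrap-safe seqno comparison is normally done with a signed difference (the driver's i915_seqno_passed() helper has this shape); the hardware greater-than-or-equal compare used by the semaphore cannot, which is why the comment has to worry about wraparound at all:

static inline bool seqno_passed(u32 seq1, u32 seq2)
{
	/* Wrap-safe "has seq1 passed seq2?" for 32-bit seqnos. */
	return (s32)(seq1 - seq2) >= 0;
}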
1300 * that may fail catastrophically, then we want to avoid using in emit_semaphore_wait()
1301 * semaphores as they bypass the fence signaling metadata, and we in emit_semaphore_wait()
1307 /* Just emit the first semaphore we see as request space is limited. */ in emit_semaphore_wait()
1365 * The execution cb fires when we submit the request to HW. But in in __i915_request_await_execution()
1367 * run (consider that we submit 2 requests for the same context, where in __i915_request_await_execution()
1368 * the request of interest is behind an indefinite spinner). So we hook in __i915_request_await_execution()
1370 * in the worst case, though we hope that the await_start is elided. in __i915_request_await_execution()
1379 * Now that we are queued to the HW at roughly the same time (thanks in __i915_request_await_execution()
1383 * signaler depends on a semaphore, so indirectly do we, and we do not in __i915_request_await_execution()
1385 * So we wait. in __i915_request_await_execution()
1387 * However, there is also a second condition for which we need to wait in __i915_request_await_execution()
1392 * immediate execution, and so we must wait until it reaches the in __i915_request_await_execution()
1419 * The downside of using semaphores is that we lose metadata passing in mark_external()
1420 * along the signaling chain. This is particularly nasty when we in mark_external()
1422 * fatal errors we want to scrub the request before it is executed, in mark_external()
1423 * which means that we cannot preload the request onto HW and have in mark_external()
1511 * We don't squash repeated fence dependencies here as we in i915_request_await_execution()
1534 * If we are waiting on a virtual engine, then it may be in await_request_submit()
1537 * engine and then passed to the physical engine. We cannot allow in await_request_submit()
1590 * we should *not* decompose it into its individual fences. However, in i915_request_await_dma_fence()
1591 * we don't currently store which mode the fence-array is operating in i915_request_await_dma_fence()
1593 * amdgpu and we should not see any incoming fence-array from in i915_request_await_dma_fence()
1645 * @rq: request we wish to use
1665 * @to: request we wish to use
1670 * Conceptually we serialise writes between engines inside the GPU.
1671 * We only allow one engine to write into a buffer at any time, but
1672 * multiple readers. To ensure each has a coherent view of memory, we must:
1678 * - If we are a write request (pending_write_domain is set), the new
1770 * we need to be wary in case the timeline->last_request in __i915_request_ensure_ordering()
1810 * we still expect the window between us starting to accept submissions in __i915_request_add_to_timeline()
1818 * is special cased so that we can eliminate redundant ordering in __i915_request_add_to_timeline()
1819 * operations while building the request (we know that the timeline in __i915_request_add_to_timeline()
1820 * itself is ordered, and here we guarantee it). in __i915_request_add_to_timeline()
1822 * As we know we will need to emit tracking along the timeline, in __i915_request_add_to_timeline()
1823 * we embed the hooks into our request struct -- at the cost of in __i915_request_add_to_timeline()
1828 * that we can apply a slight variant of the rules specialised in __i915_request_add_to_timeline()
1830 * If we consider the case of virtual engine, we must emit a dma-fence in __i915_request_add_to_timeline()
1836 * We do not order parallel submission requests on the timeline as each in __i915_request_add_to_timeline()
1840 * timeline we store a pointer to the last request submitted in the in __i915_request_add_to_timeline()
1843 * alternatively we use completion fence if gem context has a single in __i915_request_add_to_timeline()
1887 * should we detect the updated seqno part-way through the in __i915_request_commit()
1888 * GPU processing the request, we never over-estimate the in __i915_request_commit()
1910 * request - i.e. we may want to preempt the current request in order in __i915_request_queue()
1911 * to run a high priority dependency chain *before* we can execute this in __i915_request_queue()
1914 * This is called before the request is ready to run so that we can in __i915_request_queue()
1961 * the comparisons are no longer valid if we switch CPUs. Instead of in local_clock_ns()
1962 * blocking preemption for the entire busywait, we can detect the CPU in local_clock_ns()
1989 * Only wait for the request if we know it is likely to complete. in __i915_spin_request()
1991 * We don't track the timestamps around requests, nor the average in __i915_spin_request()
1992 * request length, so we do not have a good indicator that this in __i915_spin_request()
1993 * request will complete within the timeout. What we do know is the in __i915_spin_request()
1994 * order in which requests are executed by the context and so we can in __i915_spin_request()
2006 * rate. By busywaiting on the request completion for a short while we in __i915_spin_request()
2008 * if it is a slow request, we want to sleep as quickly as possible. in __i915_spin_request()
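The local_clock_ns() and __i915_spin_request() fragments above describe a short, bounded busywait before committing to a sleep. A simplified sketch of that shape; spin_briefly() and its exit conditions are illustrative, not the driver's implementation:

/*
 * Poll for completion for at most @timeout_ns. Bail out early on a
 * pending signal, on need_resched(), or if we migrate to another CPU
 * (local_clock() values are only comparable on the same CPU).
 */
static bool spin_briefly(const struct i915_request *rq,
			 u64 timeout_ns, int state)
{
	unsigned int cpu = raw_smp_processor_id();
	u64 deadline = local_clock() + timeout_ns;

	do {
		if (i915_request_completed(rq))
			return true;

		if (signal_pending_state(state, current))
			break;

		if (raw_smp_processor_id() != cpu)
			break;

		cpu_relax();
	} while (local_clock() < deadline && !need_resched());

	return false;
}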
2086 * We must never wait on the GPU while holding a lock as we in i915_request_wait_timeout()
2087 * may need to perform a GPU reset. So while we don't need to in i915_request_wait_timeout()
2088 * serialise wait/reset with an explicit lock, we do want in i915_request_wait_timeout()
2096 * We may use a rather large value here to offset the penalty of in i915_request_wait_timeout()
2103 * short wait, we first spin to see if the request would have completed in i915_request_wait_timeout()
2106 * We need up to 5us to enable the irq, and up to 20us to hide the in i915_request_wait_timeout()
2114 * duration, which we currently lack. in i915_request_wait_timeout()
2126 * We can circumvent that by promoting the GPU frequency to maximum in i915_request_wait_timeout()
2127 * before we sleep. This makes the GPU throttle up much more quickly in i915_request_wait_timeout()
2146 * We sometimes experience some latency between the HW interrupts and in i915_request_wait_timeout()
2153 * If the HW is being lazy, this is the last chance before we go to in i915_request_wait_timeout()
2154 * sleep to catch any pending events. We will check periodically in in i915_request_wait_timeout()
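Putting the i915_request_wait_timeout() fragments together, the overall shape is spin-then-sleep. A sketch using the hypothetical spin_briefly() helper from above, an illustrative spin budget, and the generic dma_fence_wait_timeout() as the sleeping fallback (the real driver uses its own breadcrumb-based wait plus the frequency boost mentioned above):

	/* Optimistically spin for a few tens of microseconds... */
	if (spin_briefly(rq, 20 * NSEC_PER_USEC, TASK_INTERRUPTIBLE))
		return timeout;		/* completed without sleeping */

	/* ...otherwise fall back to an interruptible, bounded sleep. */
	return dma_fence_wait_timeout(&rq->fence, true, timeout);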
2282 * The prefix is used to show the queue status, for which we use in i915_request_show()