Age | Commit message (Collapse) | Author | Files | Lines |
|
As the tracepoint is now decoupled from when the actual register is
assigned and was never complemented by detailing when the object lost
its fence, it has outlived its limited usefulness. Profiling the actual
stalls is a far more profitable venture anyway.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
As the userspace mappings are torn down on every GPU write, we prefer to
track when the buffer is activated (via a fresh i915_gem_fault). This
makes the LRU conceptually simpler. With coherent mappings, the
remaining use-case for set_domain_ioctl is GPU synchronisation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
With this change, every batchbuffer can use all available fences (save
pinned and scanout, of course) without ever stalling the gpu!
In theory. Currently the actual pipelined update of the register is
disabled due to some stability issues. However, just the deferred update
is a significant win.
Based on a series of patches by Daniel Vetter.
The premise is that before every access to a buffer through the GTT we
have to declare whether we need a register or not. If the access is by
the GPU, a pipelined update to the register is made via the ringbuffer,
and we track the last seqno of the batches that access it. If by the
CPU we wait for the last GPU access and update the register (either
to clear or to set it for the current buffer).
One advantage of being able to pipeline changes is that we can defer the
actual updating of the fence register until we first need to access the
object through the GTT, i.e. we can eliminate the stall on set_tiling.
This is important as the userspace bo cache does not track the tiling
status of active buffers which generate frequent stalls on gen3 when
enabling tiling for an already bound buffer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This makes the various rings more consistent by removing the anomalous
handing of the rendering ring execbuffer dispatch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
... so that upon first use after resume we will reacquire the fence reg.
Reported-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Conflicts:
drivers/gpu/drm/i915/i915_gem.c
|
|
As the error occurred on the current object, it means that its state was
not changed and so it should be excluded from the unwind.
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
We don't track gpu flush request in any special way. So even with
obj->write_domain == 0, a gpu flush might be outstanding but no
yet executed. Even worse, the latest request might use the object
only for reading. So and unconditional call to object_wait_rendering
is needed for !pipelined.
Hence revert that patch fully and untangle the flushing from the
synchronization again.
Reported-by: Keith Packard <keithp@keithp.com>
Tested-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Besides the minimal improvement in reducing the execbuffer overhead, the
real benefit is clarifying a few routines.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
A number of dragons have been seen lurking within the execbuffer code.
The first step is then to isolate them from the rest and begin to
scrutinise them in depth. Suggested by Daniel Vetter.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Simply remove our accounting of objects inside the aperture, keeping
only track of what is in the aperture and its current usage. This
removes the over-complication of BUGs that were attempting to keep the
accounting correct and also removes the overhead of the accounting on
the hot-paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
... to benefit from the compiler checking that we remember to handle
and propagate errors.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
With KMS, we can simply relinquish the fence when we idle the GPU and
reassign it upon first use.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Not employed just yet...
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Commit d09c23de intended to add a 30ms delay to give the ADD time to
detect any TVs connected. However, it used the sdvo->is_tv flag to do so
which is dependent upon the previous detection result and not whether the
output supports TVs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
|
|
Based on a patch by Daniel Vetter.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Avoid evicting buffers that will be used later in the batch in order to
make room for the initial buffers by pinning all bound buffers in a
single pass before binding (and evicting for) fresh buffer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Mark it so.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
On a few devices, like the Mac Mini, the CRT DDC pins are shared between
the analog connector and the digital connector. In this scenario, rely
on the EDID to determine if a digital panel is connected to the digital
connector.
Reported-and-tested-by: Tino Keitel <tino.keitel@tikei.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
... reduce the frequency of checking to further reduce the wakeups and
CPU overhead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This used to check the precondition that all fences were to be located
in a mappable area, redundant now as those two parameters are combined
into one.
After pinning, we assert that the buffer is bound into the desired
region.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The pipe control object is allocated by the device for the sole use of the
render ringbuffer. Move this detail from the general code to the render
ring buffer initialisation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Having seen the effects of erroneous fencing on the batchbuffer, a
useful sanity check is to record the fence registers at the time of an
error.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
drivers/char/agp/intel-gtt.c:340:48: warning: duplicate const
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Combining map_and_fenceable revealed a bug in
i915_gem_object_gtt_size() in that it always computed the appropriate
fence size for the object regardless of tiling state which caused us to
over-allocate linear buffers when binding to the GTT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
A glorified s/obj_priv/obj/ with a net reduction of over a 100 lines and
many characters!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Just some minor shuffling to get rid of any agp traces in the
exported functions.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
No more drm_*_agp in i915_gem.c!
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Its only user, intel-gtt.c is now gone.
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This still uses the agp functions to actually reinstate the mappings
(with a gross hack to make agp cooperate), but it wires everything
up correctly for the switchover.
The call to agp_rebind_memory can be dropped because all non-kms drivers
do all their rebinding on EnterVT.
v2: Be more paranoid and flush the chipset cache after restoring gtt
mappings.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
This is required to restore gtt mappings on resume when agp is gone.
The right way to do this would be to make sturct drm_mm_node embeddable
and use the allocation list maintained by the drm memory manager. But
that's a bigger project. Getting rid of the per bo agp_mem will save
more memory than this wastes, anyway.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
The intel drm calls the chipset functions now directly. Userspace
never called the corresponding ioctl, hence it can be killed, too.
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
No longer used.
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
... and a few other defines.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Now the intel-gtt.c rewrite is complete!
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Still a separate agp_bridge_driver because of the i81x-only
dedicated vram support.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Initialization is still done with the old code with a few
added things sprinkled in to make the intel_fake_agp helper
functions work.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Used for the now dead agp type_to_mask stuff.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
i830_check_flags already disallows it, so no need to implement it
in the write_entry function. Seems to be a remnant from i810 support.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
|
|
Currently if we hit a pagefault when applying a user relocation for the
execbuffer, we bail and return EFAULT to the application. Instead, we
need to unwind, drop the dev->struct_mutex, copy all the relocation
entries to a vmalloc array (to avoid any potential circular deadlocks
when resolving the pagefault), retake the mutex and then apply the
relocations. Afterwards, we need to again drop the lock and copy the
vmalloc array back to userspace.
v2: Incorporate feedback from Daniel Vetter.
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
|
... and no need to perform a linear search for the index.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
|
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|