diff options
author | Christian König <christian.koenig@amd.com> | 2014-05-10 14:17:55 +0400 |
---|---|---|
committer | Alex Deucher <alexander.deucher@amd.com> | 2014-06-02 18:25:02 +0400 |
commit | ec3dbbcbd7a6ee165ca7eeafec8dbc733901ab2f (patch) | |
tree | f0a715cec425e8815870874892cce639b25eda1f /drivers/gpu/drm/radeon/radeon.h | |
parent | 831719d62f692e28699a7acd7b441c6f0c01b6f7 (diff) | |
download | linux-ec3dbbcbd7a6ee165ca7eeafec8dbc733901ab2f.tar.xz |
drm/radeon: add large PTE support for NI, SI and CIK v5
This patch implements support for VRAM page table entry compression.
PTE construction is enhanced to identify physically contiguous page
ranges and mark them in the PTE fragment field. L1/L2 TLB support is
enabled for 64KB (SI/CIK) and 256KB (NI) PTE fragments, significantly
improving TLB utilization for VRAM allocations.
Linear store bandwidth is improved from 60GB/s to 125GB/s on Pitcairn.
Unigine Heaven 3.0 sees an average improvement from 24.7 to 27.7 FPS
on default settings at 1920x1200 resolution with vsync disabled.
See main comment in radeon_vm.c for a technical description.
v2 (chk): rebased and simplified.
v3 (chk): add missing hw setup
v4 (chk): rebased on current drm-fixes-3.15
v5 (chk): fix comments and commit text
Signed-off-by: Jay Cornwall <jay@jcornwall.me>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Diffstat (limited to 'drivers/gpu/drm/radeon/radeon.h')
-rw-r--r-- | drivers/gpu/drm/radeon/radeon.h | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index b58e1afdda76..325f3a586cb7 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -848,6 +848,11 @@ struct radeon_mec { #define R600_PTE_READABLE (1 << 5) #define R600_PTE_WRITEABLE (1 << 6) +/* PTE (Page Table Entry) fragment field for different page sizes */ +#define R600_PTE_FRAG_4KB (0 << 7) +#define R600_PTE_FRAG_64KB (4 << 7) +#define R600_PTE_FRAG_256KB (6 << 7) + struct radeon_vm_pt { struct radeon_bo *bo; uint64_t addr; |