Commit graph

705 commits

Author SHA1 Message Date
ReinUsesLisp ffceca76c9 shader/arithmetic: Add FCMP_CR variant
Adds another variant of FCMP.
2020-04-14 19:11:04 -03:00
ReinUsesLisp cbbaeeb630 gl_rasterizer: Implement line widths and smooth lines
Implements "legacy" features from OpenGL present on hardware such as
smooth lines and line width.
2020-04-13 01:30:34 -03:00
Fernando Sahmkow 1b20368a0a Merge pull request #3578 from ReinUsesLisp/vmnmx
shader/video: Partially implement VMNMX
2020-04-12 10:44:03 -04:00
ReinUsesLisp 44e1a2c490 shader/video: Partially implement VMNMX
Implements the common usages for VMNMX. Inputs with a different size
than 32 bits are not supported and sign mismatches aren't supported
either.

VMNMX works as follows:
It grabs Ra and Rb and applies a maximum/minimum on them (this is
defined by .MX), having in mind the input sign. This result can then be
saturated. After the intermediate result is calculated, it applies
another operation on it using Rc. These operations are merges,
accumulations or another min/max pass.

This instruction allows to implement with a more flexible approach GCN's
min3 and max3 instructions (for instance).
2020-04-12 00:34:42 -03:00
ReinUsesLisp 17b14be027 video_core: Add MSAA registers in 3D engine and TIC
This adds the registers used for multisampling. It doesn't implement
anything for now.
2020-04-12 00:21:27 -03:00
bunnei 85d89e0758 Merge pull request #3601 from ReinUsesLisp/some-shader-encodings
video_core/shader: Add some instruction and S2R encodings
2020-04-09 00:17:39 -04:00
ReinUsesLisp 2eef8d7249 shader_bytecode: Rename MOV_SYS to S2R 2020-04-04 03:37:51 -03:00
ReinUsesLisp 1163fe0034 shader_bytecode: Add encoding for BAR 2020-04-04 03:36:21 -03:00
ReinUsesLisp 8a635a351b shader_bytecode: Add encoding for VOTE.VTG 2020-04-04 03:28:11 -03:00
ReinUsesLisp d66cae7bd5 shader_decompiler: Remove FragCoord.w hack and change IPA implementation
Credits go to gdkchan and Ryujinx. The pull request used for this can
be found here: https://github.com/Ryujinx/Ryujinx/pull/1082

yuzu was already using the header for interpolation, but it was missing
the FragCoord.w multiplication described in the linked pull request.
This commit finally removes the FragCoord.w == 1.0f hack from the shader
decompiler.

While we are at it, this commit renames some enumerations to match
Nvidia's documentation (linked below) and fixes component declaration
order in the shader program header (z and w were swapped).

https://github.com/NVIDIA/open-gpu-doc/blob/master/Shader-Program-Header/Shader-Program-Header.html
2020-04-01 21:48:55 -03:00
namkazy 3a65713029 shader_decode: merge GlobalAtomicOp to AtomicOp 2020-03-30 18:47:00 +07:00
ReinUsesLisp 1f7c51a57b shader_bytecode: Fix I2I_IMM encoding 2020-03-28 18:49:07 -03:00
ReinUsesLisp 8993217f01 engines/const_buffer_engine_interface: Store image format type
This information is required to properly implement SULD.B. It might also
be handy for all image operations, since it would allow us to implement
them on devices that require the image format to be specified (on
desktop, this would be AMD on OpenGL and Intel on OpenGL and Vulkan).
2020-03-27 00:36:22 -03:00
bunnei b769578c49 Merge pull request #3520 from ReinUsesLisp/legacy-varyings
gl_shader_decompiler: Implement legacy varyings
2020-03-25 19:27:51 -04:00
namkazy fa1b60cc8c apply replay logic to all writes. remove replay from MacroInterpreter::Send (@fincs) 2020-03-22 22:25:44 +07:00
namkazy 51e13ff50c maxwell_3d: change declaration order 2020-03-22 13:41:16 +07:00
namkazy 30112fcb3c maxwell_3d: init shadow_state 2020-03-22 13:35:11 +07:00
namkazy e89c5935f4 maxwell_3d: this seem more correct. 2020-03-22 12:02:54 +07:00
namkazy 854fb1ed2b maxwell_3d: update comments for shadow ram usage 2020-03-22 11:35:26 +07:00
Nguyen Dac Nam 771d117869 maxwell_3d: track shadow ram ctrl and hw reg value 2020-03-22 10:53:41 +07:00
Nguyen Dac Nam f2d9dacfed maxwell_3d: implement MME shadow RAM 2020-03-22 10:53:35 +07:00
ReinUsesLisp 07b52b1307 kepler_compute: Remove unused variables 2020-03-18 20:03:19 -03:00
Rodrigo Locatti 944d38efc8 Merge pull request #3502 from namkazt/patch-3
shader_decode: Reimplement BFE instructions
2020-03-15 21:23:04 -03:00
ReinUsesLisp bba58f7272 shader/shader_ir: Track usage in input attribute and of legacy varyings 2020-03-15 21:01:52 -03:00
ReinUsesLisp e7dfc5d8a3 maxwell_3d: Add padding words to XFB entries
Use INSERT_UNION_PADDING_WORDS instead of alignas to ensure a size
requirement.
2020-03-13 18:33:05 -03:00
ReinUsesLisp fcc4b81079 gl_rasterizer: Implement transform feedback bindings 2020-03-13 18:33:04 -03:00
Rodrigo Locatti e836473754 Merge branch 'master' into shader-purge 2020-03-13 16:44:06 -03:00
Nguyen Dac Nam 3f688622d7 shader_bytecode: update BFE instructions struct. 2020-03-13 12:52:16 +07:00
ReinUsesLisp 96fdbc638a gl_rasterizer: Implement polygon modes and fill rectangles 2020-03-09 20:39:58 -03:00
ReinUsesLisp 207b9ba28c engines/maxwell_3d: Add TFB registers and store them in shader registry 2020-03-09 18:40:53 -03:00
ReinUsesLisp 7a93d38e0f const_buffer_engine_interface: Store component types
This is required for Vulkan. Sampling integer textures with float
handles is illegal.
2020-03-09 18:40:53 -03:00
ReinUsesLisp 2eb2855b05 state_tracker: Remove type traits with named structures 2020-02-28 17:56:43 -03:00
ReinUsesLisp 31daa97fce maxwell_3d: Use two tables instead of three for dirty flags 2020-02-28 17:56:42 -03:00
ReinUsesLisp e94ea8758d maxwell_3d: Change write dirty flags to a bitset 2020-02-28 17:56:42 -03:00
ReinUsesLisp 95596b787e maxwell_3d: Flatten cull and front face registers 2020-02-28 17:56:41 -03:00
ReinUsesLisp 005f5ca883 video_core: Reintroduce dirty flags infrastructure 2020-02-28 17:56:41 -03:00
ReinUsesLisp 8b76fc1fff gl_state: Remove clip distances tracking 2020-02-28 17:26:26 -03:00
ReinUsesLisp b810da8400 gl_state: Remove viewport and depth range tracking 2020-02-28 17:25:18 -03:00
ReinUsesLisp c2d3732176 gl_rasterizer: Remove dirty flags 2020-02-28 16:39:27 -03:00
bunnei 80e577e2a6 Merge pull request #3425 from ReinUsesLisp/layered-framebuffer
texture_cache: Implement layered framebuffer attachments
2020-02-24 10:14:50 -05:00
bunnei 5c5b6cf721 Merge pull request #3414 from ReinUsesLisp/maxwell-3d-draw
maxwell_3d: Unify draw methods
2020-02-19 16:13:50 -05:00
Fernando Sahmkow d67c538d68 Merge pull request #3409 from ReinUsesLisp/host-queries
query_cache: Implement a query cache and query 21 (samples passed)
2020-02-18 11:31:06 -04:00
ReinUsesLisp 505f30fdaf texture_cache: Implement layered framebuffer attachments
Layered framebuffer attachments is a feature that allows applications to
write attach layered textures to a single attachment. What layer the
fragments are written to is decided from the shader using gl_Layer.
2020-02-16 04:19:32 -03:00
ReinUsesLisp 518a6182f9 maxwell_3d: Unify draw methods
Pass instanced state of a draw invocation as an argument instead of
having two separate virtual methods.
2020-02-14 18:09:40 -03:00
ReinUsesLisp d8a42816d7 gl_query_cache: Optimize query cache
Use a custom cache instead of relying on a ranged cache.
2020-02-14 17:38:27 -03:00
ReinUsesLisp 339a227a5e gl_query_cache: Implement host queries using a deferred cache
Instead of waiting immediately for executed commands, defer the query
until the guest CPU reads it. This way we get closer to what the guest
program is doing.

To archive this we have to build a dependency queue, because host APIs
(like OpenGL and Vulkan) use ranged queries instead of counters like
NVN.

Waiting for queries implicitly uses fences and this requires a command
being queued, otherwise the driver will lock waiting until a timeout. To
fix this when there are no commands queued, we explicitly call glFlush.
2020-02-14 17:33:13 -03:00
ReinUsesLisp 11206f8a28 maxwell_3d: Slow implementation of passed samples (query 21)
Implements GL_SAMPLES_PASSED by waiting immediately for queries.
2020-02-14 17:27:17 -03:00
bunnei 082ba6fc64 Merge pull request #3379 from ReinUsesLisp/cbuf-offset
shader/decode: Fix constant buffer offsets
2020-02-14 13:22:53 -05:00
bunnei 12b89ac3d0 Merge pull request #3395 from FernandoS27/queries
GPU: Refactor queries implementation and correct GPU Clock.
2020-02-13 20:18:26 -05:00
Fernando Sahmkow 2dd9d660e3 GPU: Address Feedback. 2020-02-13 18:16:07 -04:00