Commit graph

4318 commits

Author SHA1 Message Date
ReinUsesLisp ff7b0d7329 maxwell_3d: Add viewport swizzles 2020-05-04 17:50:59 -03:00
bunnei 99075400e3 Merge pull request #3808 from ReinUsesLisp/wait-for-idle
{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
2020-05-03 02:43:18 -04:00
bunnei 18a1e885a1 Merge pull request #3732 from lioncash/header
vulkan: Remove unnecessary includes
2020-05-02 01:36:57 -04:00
bunnei 716a999346 Merge pull request #3809 from ReinUsesLisp/empty-index
vk_rasterizer: Skip index buffer setup when vertices are zero
2020-05-02 01:21:57 -04:00
bunnei 33a218bea2 Merge pull request #3693 from ReinUsesLisp/clean-samplers
shader/texture: Support multiple unknown sampler properties
2020-05-02 00:45:41 -04:00
Jan Beich 1d30ce7b5b fixed_pipeline_state: explicitly use template keyword after be8742e286
In file included from src/video_core/renderer_opengl/renderer_opengl.cpp:25:
In file included from src/./video_core/renderer_opengl/gl_rasterizer.h:26:
In file included from src/./video_core/renderer_opengl/gl_fence_manager.h:11:
src/./video_core/fence_manager.h:91:32: error: use 'template' keyword
      to treat 'Write' as a dependent template name
                memory_manager.Write<u32>(current_fence->GetAddress(), current_fence->GetPayload());
                               ^
                               template
src/./video_core/fence_manager.h:137:32: error: use 'template'
      keyword to treat 'Write' as a dependent template name
                memory_manager.Write<u32>(current_fence->GetAddress(), current_fence->GetPayload());
                               ^
                               template
2020-05-01 23:38:23 +00:00
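A minimal sketch of the kind of code that triggers this diagnostic, assuming the memory manager's type depends on a template parameter. FakeMemoryManager and FakeFenceManager are hypothetical stand-ins, not yuzu's classes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using u32 = std::uint32_t;

// Hypothetical stand-in for a memory manager with templated accessors.
class FakeMemoryManager {
public:
    template <typename T>
    void Write(std::size_t address, T value) {
        assert(address + sizeof(T) <= storage.size());
        std::memcpy(storage.data() + address, &value, sizeof(T));
    }

    template <typename T>
    T Read(std::size_t address) const {
        assert(address + sizeof(T) <= storage.size());
        T value;
        std::memcpy(&value, storage.data() + address, sizeof(T));
        return value;
    }

private:
    std::array<unsigned char, 64> storage{};
};

// The manager type is a template parameter, so `memory_manager.Write`
// is a dependent name inside this class.
template <typename TMemoryManager>
class FakeFenceManager {
public:
    explicit FakeFenceManager(TMemoryManager& memory_manager_)
        : memory_manager{memory_manager_} {}

    void SignalFence(std::size_t address, u32 payload) {
        // Without `template`, the `<` after Write is parsed as less-than
        // and conforming compilers such as Clang reject the call.
        memory_manager.template Write<u32>(address, payload);
    }

private:
    TMemoryManager& memory_manager;
};
```

Some compilers accept the call without the keyword, which is likely why the problem only showed up in a Clang build.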
bunnei 6a57e5fc49 Merge pull request #3807 from ReinUsesLisp/fix-depth-clamp
maxwell_3d: Fix depth clamping register
2020-04-30 13:07:31 -04:00
bunnei e043a955f1 Merge pull request #3799 from ReinUsesLisp/iadd-cc
shader: Implement P2R CC, IADD Rd.CC and IADD.X
2020-04-30 12:56:36 -04:00
bunnei 2765bb73a2 Merge pull request #3805 from ReinUsesLisp/preserve-contents
texture_cache: Reintroduce preserve_contents accurately
2020-04-30 12:56:19 -04:00
bunnei e20f3161a3 Merge pull request #3788 from FernandoS27/revert
Revert: shader_decode: Fix LD, LDG when track constant buffer.
2020-04-30 12:55:39 -04:00
Lioncash a35345d217 vulkan: Remove unnecessary includes
Reduces some header churn and reduces rebuilds when some header
internals change.

While we're at it we can also resolve a missing include in buffer_cache.
2020-04-28 21:54:46 -04:00
ReinUsesLisp 8d28a56a6c shader/arithmetic_integer: Fix tracking issue in temporary
This temporary is not needed as we mark Rd.CC + IADD.X as unimplemented.
It caused issues when tracking global buffers.
2020-04-28 17:14:53 -03:00
bunnei 510842d827 Merge pull request #3784 from ReinUsesLisp/shader-memory-util
shader/memory_util: Deduplicate code
2020-04-28 12:05:50 -04:00
ReinUsesLisp 6e79375f46 vk_rasterizer: Skip index buffer setup when vertices are zero
Xenoblade 2 invokes a draw call with zero vertices.
This is likely due to indirect drawing (glDrawArraysIndirect).

This causes a crash in the staging buffer pool when trying to create a
buffer with a size of zero. To work around this, skip index buffer setup
entirely when the number of indices is zero.
2020-04-28 02:24:33 -03:00
ReinUsesLisp 8835d40024 {maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
Drop MemoryBarrier from the buffer cache and use Maxwell3D's register
WaitForIdle.

To implement this on OpenGL we just call glMemoryBarrier with the
necessary bits.

Vulkan lacks this synchronization primitive, so we set an event and
immediately wait for it. This is not a pretty solution, but it's what
Vulkan can do without submitting the current command buffer to the queue
(which ends up being more expensive on the CPU).
2020-04-28 02:18:12 -03:00
ReinUsesLisp b82e61dff0 maxwell_3d: Fix depth clamping register
Using deko3d as reference:
4e47ba0013/source/maxwell/gpu_3d_state.cpp (L42)

We were using bits 3 and 4 to determine depth clamping, but these are
the same whether depth clamping is enabled or disabled:

state->depthClampEnable ? 0x101A : 0x181D

The same happens on Nvidia's OpenGL driver, where they do something like
this (default capabilities, GL 4.5 compatibility):

(state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c

There's always a difference between the first bits in this register, but
bit 11 is consistently disabled on both deko3d/NVN and OpenGL. This
commit changes yuzu's behaviour to use bit 11 to determine depth
clamping.

- Fixes depth issues on Super Mario Odyssey's intro.
2020-04-27 20:50:14 -03:00
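The register values quoted in this commit message can be checked with a few bit tests. This is only a sketch: the helper names are hypothetical, and reading bit 11 as "depth clamp disabled" is inferred from the four sample values above, not from hardware documentation.

```cpp
#include <cassert>
#include <cstdint>

// Returns a mask with only the given bit set.
constexpr std::uint32_t Bit(unsigned index) {
    assert(index < 32);
    return std::uint32_t{1} << index;
}

// Bits 3 and 4, which the previous implementation inspected.
constexpr std::uint32_t OldClampBits(std::uint32_t reg) {
    return reg & (Bit(3) | Bit(4));
}

// Assumption based on the samples: bit 11 is clear when clamping is enabled.
constexpr bool IsDepthClampEnabled(std::uint32_t reg) {
    return (reg & Bit(11)) == 0;
}

// deko3d values: enabled = 0x101A, disabled = 0x181D.
static_assert(OldClampBits(0x101A) == OldClampBits(0x181D), "bits 3 and 4 do not differ");
static_assert(IsDepthClampEnabled(0x101A) && !IsDepthClampEnabled(0x181D), "bit 11 differs");

// Nvidia OpenGL values: enabled = 0x201a, disabled = 0x281c.
static_assert(OldClampBits(0x201a) == OldClampBits(0x281c), "bits 3 and 4 do not differ");
static_assert(IsDepthClampEnabled(0x201a) && !IsDepthClampEnabled(0x281c), "bit 11 differs");
```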
Fernando Sahmkow fe8ea9e0c9 Merge pull request #3766 from ReinUsesLisp/renderpass-cache-key
vk_renderpass_cache: Pack renderpass cache key and unify keys
2020-04-27 16:05:14 -04:00
Fernando Sahmkow 2043198e50 Merge pull request #3756 from ReinUsesLisp/integrated-devices
vk_memory_manager: Remove unified memory model flag
2020-04-27 16:04:22 -04:00
bunnei d952b2e1da Merge pull request #3742 from FernandoS27/command-list
Optimize GPU Command Lists and Introduce Fast GPU Time Option
2020-04-27 00:18:46 -04:00
ReinUsesLisp 8e3af5d3ca texture_cache: Reintroduce preserve_contents accurately
This reverts commit 1f1e80c67d.

preserve_contents proved to be a meaningful optimization. This commit
reintroduces it but properly implemented on OpenGL.

We have to make sure the clear removes all the previous contents of the
image.

It's not currently implemented on Vulkan because we can do smarter things
there, which are better introduced in a separate commit.
2020-04-26 19:53:02 -03:00
Rodrigo Locatti 0f7a89c2ef Merge pull request #3753 from ReinUsesLisp/ac-vulkan
{gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers
2020-04-26 01:55:43 -03:00
ReinUsesLisp 9b433b2467 shader/memory_util: Deduplicate code
Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as
well as shader decoder code.

While we are at it, fix a bug in gl_shader_cache where compute shaders
had a start offset of a stage shader.
2020-04-26 01:38:51 -03:00
ReinUsesLisp 7442f21c5b shader/arithmetic_integer: Fix edge case and mark IADD.X Rd.CC as unimplemented
IADD.X Rd.CC requires some extra logic that is not currently
implemented. Abort when this is hit.
2020-04-25 22:58:33 -03:00
ReinUsesLisp 45cb8fc72a shader/arithmetic_integer: Change IAdd to UAdd to avoid signed overflow
Signed integer addition overflow might be undefined behavior. It's free
to change operations to UAdd and use unsigned integers to avoid
potential bugs.
2020-04-25 22:57:54 -03:00
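The same idea expressed in host C++ rather than in shader IR, as an analogy only: doing the addition on unsigned operands gives well-defined wraparound. WrappingAdd is a hypothetical helper, and the sketch assumes a platform where int is at most 32 bits wide.

```cpp
#include <cstdint>
#include <cstring>

// Adds two signed 32-bit values with two's-complement wraparound.
// `a + b` on std::int32_t would be undefined behavior on overflow;
// unsigned arithmetic is defined to wrap modulo 2^32.
inline std::int32_t WrappingAdd(std::int32_t a, std::int32_t b) {
    const std::uint32_t sum =
        static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
    // Reinterpret the bits as signed without relying on conversion rules.
    std::int32_t result;
    std::memcpy(&result, &sum, sizeof(result));
    return result;
}
```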
ReinUsesLisp 6404cd824b shader/arithmetic_integer: Implement IADD.X
IADD.X takes the carry flag and adds it to the result. This is generally
used to emulate 64-bit operations with 32-bit registers.
2020-04-25 22:56:11 -03:00
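A sketch of how a carry-out followed by a carry-in addition emulates a 64-bit add on 32-bit halves. AddResult, AddWithCarry and Add64 are hypothetical names that model the instruction pair in plain C++; this is not yuzu's shader decoder code.

```cpp
#include <cstdint>

struct AddResult {
    std::uint32_t value; // Low 32 bits of the sum
    bool carry;          // Carry out of bit 31
};

// Models a 32-bit add that reports its carry (as IADD with .CC would)
// and optionally consumes an incoming carry (as IADD.X would).
constexpr AddResult AddWithCarry(std::uint32_t a, std::uint32_t b, bool carry_in = false) {
    const std::uint64_t wide = std::uint64_t{a} + b + (carry_in ? 1u : 0u);
    return {static_cast<std::uint32_t>(wide), (wide >> 32) != 0};
}

// 64-bit addition built from two 32-bit additions.
constexpr std::uint64_t Add64(std::uint32_t lo_a, std::uint32_t hi_a,
                              std::uint32_t lo_b, std::uint32_t hi_b) {
    const AddResult lo = AddWithCarry(lo_a, lo_b);           // low halves, sets carry
    const AddResult hi = AddWithCarry(hi_a, hi_b, lo.carry); // high halves plus carry
    return (std::uint64_t{hi.value} << 32) | lo.value;
}
```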
ReinUsesLisp 7e8f51273c shader/arithmetic_integer: Implement CC for IADD 2020-04-25 22:55:26 -03:00
ReinUsesLisp 26dc95c7bc decode/register_set_predicate: Implement CC
P2R CC takes the state of condition codes and puts them into a register.
We already have this implemented for PR (predicates). This commit
implements CC over that.
2020-04-25 22:54:42 -03:00
ReinUsesLisp 491e2cbfd7 decode/register_set_predicate: Use move for shared pointers
Avoid atomic counters used by shared pointers.
2020-04-25 22:54:14 -03:00
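A small illustration of the difference, using a hypothetical Node type and helper functions. Copying a std::shared_ptr bumps the shared reference count, while moving it transfers ownership and leaves the count untouched.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Node {
    int value = 0;
};

using NodePtr = std::shared_ptr<Node>;

// Copies the pointer into the block: one reference count increment,
// then a decrement when `node` goes out of scope.
inline void AppendCopied(std::vector<NodePtr>& block, NodePtr node) {
    assert(node != nullptr);
    block.push_back(node);
}

// Moves the pointer into the block: ownership is transferred and the
// reference count is not modified.
inline void AppendMoved(std::vector<NodePtr>& block, NodePtr node) {
    assert(node != nullptr);
    block.push_back(std::move(node));
}
```

A caller that no longer needs its own handle can also pass it with std::move, so that no copy happens at the call site either.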
bunnei 6cd0fc2a95 Merge pull request #3721 from ReinUsesLisp/sort-devices
vulkan/wrapper: Sort physical devices
2020-04-25 03:27:40 -04:00
bunnei 05a54192f8 Merge pull request #3734 from ReinUsesLisp/half-float-mods
decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits
2020-04-25 00:41:43 -04:00
ReinUsesLisp 88a6c10687 vk_rasterizer: Pack texceptions and color formats on invalid formats
Sometimes for unknown reasons NVN games can bind a render target format
of 0. This may be a yuzu bug.

With the commits before this the formats were specified without being
"packed", assuming all formats and texceptions will be written like in
the color_attachments vector.

To address this issue, iterate all render targets and pack them as they
are valid. This way they will match color_attachments.

- Fixes validation errors and graphical issues on Breath of the Wild.
2020-04-24 22:21:29 -03:00
bunnei d1294ad83b Merge pull request #3749 from ReinUsesLisp/lea-imm
shader/arithmetic_integer: Fix LEA_IMM encoding
2020-04-24 14:30:13 -04:00
Fernando Sahmkow b19e1d2c09 Revert: shader_decode: Fix LD, LDG when track constant buffer. 2020-04-24 11:00:54 -04:00
Markus Wick 1acd6b34e9 Fix -Wdeprecated-copy warning. 2020-04-24 09:33:04 +02:00
Markus Wick ac24f0506c Fix -Werror=conversion error. 2020-04-24 09:33:04 +02:00
ReinUsesLisp 3e808936a8 decode/arithmetic_half: Fix HADD2 and HMUL2 absolute and negation bits
The encoding for negation and absolute value was wrong.
Extracting is now done manually. Similar instructions having different
encodings is the rule, not the exception. To keep sanity and readability
I preferred to extract the desired bit manually.

This is implemented against nxas:
8dbc389957/table.h (L68)

That is itself tested against nvdisasm (Nvidia's official disassembler).
2020-04-23 18:29:38 -03:00
ReinUsesLisp 0034e6310d shader/texture: Support multiple unknown sampler properties
This allows deducing some properties from the texture instruction before
asking the runtime. By doing this we can handle type mismatches in some
instructions from the renderer instead of the shader decoder.

Fixes texelFetch issues with games using 2D texture instructions on a 1D
sampler.
2020-04-23 18:04:13 -03:00
ReinUsesLisp c9b4c56d69 shader_ir: Turn classes into data structures 2020-04-23 18:00:06 -03:00
ReinUsesLisp f78f26b75a vk_rasterizer: Fix framebuffer creation validation errors
Framebuffer creation was ignoring the number of color attachments.
2020-04-23 17:34:16 -03:00
ReinUsesLisp ab7eae6fff vk_pipeline_cache: Unify pipeline cache keys into a single operation
This allows us to call Common::CityHash and std::memcmp only once for
GraphicsPipelineCacheKey. While we are at it, do the same for compute.
2020-04-23 17:34:16 -03:00
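One way to picture a key that needs only a single hash and a single std::memcmp is a padding-free struct whose raw bytes are hashed and compared as a whole. ExampleKey and its fields are hypothetical, and FNV-1a is used here purely as a stand-in for Common::CityHash.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical key with no padding bytes, so its object representation
// is fully determined by its fields.
struct ExampleKey {
    std::uint32_t renderpass_bits;
    std::uint32_t shader_count;
    std::uint64_t fixed_state_bits;

    // Hashes the whole key in one pass over its bytes (FNV-1a stand-in).
    std::size_t Hash() const noexcept {
        unsigned char bytes[sizeof(ExampleKey)];
        std::memcpy(bytes, this, sizeof(bytes));
        std::uint64_t hash = 14695981039346656037ull;
        for (const unsigned char byte : bytes) {
            hash ^= byte;
            hash *= 1099511628211ull;
        }
        return static_cast<std::size_t>(hash);
    }

    // Compares the whole key with a single memcmp.
    bool operator==(const ExampleKey& rhs) const noexcept {
        return std::memcmp(this, &rhs, sizeof(ExampleKey)) == 0;
    }
};

// Byte-wise hashing and comparison are only sound without padding.
static_assert(std::has_unique_object_representations_v<ExampleKey>);
static_assert(std::is_trivially_copyable_v<ExampleKey>);
```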
ReinUsesLisp 7b76c67803 vk_renderpass_cache: Pack renderpass cache key to 12 bytes 2020-04-23 17:34:16 -03:00
bunnei da893629a0 kernel: memory: Improve implementation of device shared memory. (#3707)
* kernel: memory: Improve implementation of device shared memory.

* fixup! kernel: memory: Improve implementation of device shared memory.

* fixup! kernel: memory: Improve implementation of device shared memory.
2020-04-23 11:37:12 -04:00
Fernando Sahmkow 0cf32d6184 Clang Format. 2020-04-23 08:52:58 -04:00
Fernando Sahmkow c8f4549d43 GPU: Add Fast GPU Time Option. 2020-04-23 08:52:57 -04:00
Fernando Sahmkow 9311983f3d Maxwell3D: Process Macros on MultiMethod. 2020-04-23 08:52:56 -04:00
Fernando Sahmkow ef3a0ae64a DMAPusher: Propagate multimethod writes into the engines. 2020-04-23 08:52:55 -04:00
bunnei 9c753735c5 Merge pull request #3697 from lioncash/declarations
CMakeLists: Enable -Wmissing-declarations on Linux builds
2020-04-23 02:18:52 -04:00
bunnei c916ad62e7 Merge pull request #3677 from FernandoS27/better-sync
Introduce Predictive Flushing and Improve ASYNC GPU
2020-04-22 22:09:38 -04:00
ReinUsesLisp 910decd9cb vk_pipeline_cache: Fix unintentional memcpy into optional
The intention behind this was to assign a float from a uint32_t, but
it was unintentionally being copied directly into the std::optional.

Copy to a temporary and assign that temporary to std::optional. This can
be replaced with std::bit_cast<float> once we are in C++20.
2020-04-22 21:36:05 -03:00
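A sketch of the pattern described in this commit message, with hypothetical names (SetFromRawBits, raw_bits). It assumes float is 32 bits wide.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>

// Stores the float encoded by `raw_bits` in `target`.
inline void SetFromRawBits(std::optional<float>& target, std::uint32_t raw_bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t));

    // Wrong: std::memcpy(&target, &raw_bits, sizeof(raw_bits));
    // That overwrites the optional's own storage and never marks it as engaged.

    // Copy into a temporary float first, then assign through the optional.
    float value;
    std::memcpy(&value, &raw_bits, sizeof(value));
    target = value;
}
```

With C++20 the temporary and the memcpy collapse into `target = std::bit_cast<float>(raw_bits);`.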
Fernando Sahmkow e211e30093 GL_Fence_Manager: use GL_TIMEOUT_IGNORED instead of a loop 2020-04-22 20:34:32 -04:00