
Add in windows support #7

Open
Mythra wants to merge 3 commits into Exzap:main from Mythra:mythra/windows-compat

Mythra commented Dec 12, 2025

This commit adds support for building, and compiling shaders, on Windows using MSVC. This would've been much easier with clang on Windows, but windows clang/cl2 is unsupported by meson (and older clang is so unsupported that setup is a nightmare), so we had to do quite a bit to get it to work with MSVC and cl.exe. This probably has much more potential for problems, but we at least seem to be getting a compile. I've worked on fixing all the warnings that were generated.

(As a side note, the compilation issue on Windows that I mentioned in the issue turned out to be two things:

  1. Debug code did have a particular AddressSanitizer fault; I've fixed this.
  2. It turned out some of my bit math was wrong, which led to some poor id generation. I was doing a count leading zeroes instead of a count trailing zeroes.)
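
To illustrate why swapping the two counts produces bad ids, here is a minimal sketch, not the code from this PR. The function names and the free-slot bitmask allocator are hypothetical; the only real APIs used are `_BitScanForward`/`_BitScanReverse` from `<intrin.h>` on MSVC and `__builtin_ctz`/`__builtin_clz` on GCC/clang. It assumes a 32-bit mask that is never zero.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Count trailing zero bits, i.e. the index of the LOWEST set bit.
 * The mask must be non-zero (the result is undefined otherwise). */
static inline unsigned count_trailing_zeros32(uint32_t mask)
{
    assert(mask != 0);
#if defined(_MSC_VER)
    unsigned long index;
    _BitScanForward(&index, mask);   /* index of lowest set bit */
    return (unsigned)index;
#else
    return (unsigned)__builtin_ctz(mask);
#endif
}

/* Count leading zero bits, i.e. 31 minus the index of the HIGHEST set bit.
 * The mask must be non-zero. */
static inline unsigned count_leading_zeros32(uint32_t mask)
{
    assert(mask != 0);
#if defined(_MSC_VER)
    unsigned long index;
    _BitScanReverse(&index, mask);   /* index of highest set bit */
    return 31u - (unsigned)index;
#else
    return (unsigned)__builtin_clz(mask);
#endif
}

/* Hypothetical id allocator: each set bit marks a free slot, and the
 * lowest free slot is handed out. Using count_leading_zeros32 here by
 * mistake would return a number unrelated to any free slot. */
static inline unsigned take_lowest_free_id(uint32_t *free_mask)
{
    unsigned id = count_trailing_zeros32(*free_mask);
    *free_mask &= ~(UINT32_C(1) << id);   /* mark the slot as used */
    return id;
}
```

With a mask of `0x28` (bits 3 and 5 set), the trailing count is 3, which is a valid slot, while the leading count is 26, which is not a slot index at all.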

Warning

While compilations do succeed, the outputs of the test sets differ slightly: test set 1 is exactly the same, but test set 2 generates slightly different register use. This doesn't seem to be an issue as far as I can tell, but it'd be good for someone else to confirm, or to run their own test shaders if they have any, to check that they all load okay.

Linux block:

[DBG] [TexInstr::emit_lowered_tex] samplerId 0 textureId 0
bytecode 56 dw -- 4 gprs -- 0 nstack -------------
shader 1 -- 7
0000 00000004 80800000  ALU 18 @8 KC0[CB143:0-32]
 0008 00020010 F00D1002     1 x:     ADD                    R4.x,  R0.w, R3.w
 0010 FC800000 00000000       y:     ADD                    R4.y,  R0.x, R3.x
 0012 80106082 0F800010       z:     ADD                    R4.z,  R0.y, R3.y
 0014 8010807C 0F800010       w:     ADD                    R4.w,  R0.z, R3.z
 0016 80102080 2F800010       t:     ADD                    R5.y,  KC0[0].x, KC0[1].x
 0018 800F847C 0F800010     2 x:     ADD                    R3.x,  PV.z, R1.y
 0020 8010A07C 0F800010       y:     ADD                    R3.y,  PV.y, R1.x
 0022 8011207C 0F800010       z:     ADD                    R0.z,  KC0[2].x, KC0[3].x
 0024 8011407C 0F800010       w:     ADD                    R2.w,  PV.w, R1.z
 0026 8011C07C 0F800010       t:     ADD                    R5.x,  PV.x, R1.w
 0028 00006000 0F800010     3 x:     ADD                    R0.x,  PV.z, KC0[4].x
 0030 00806400 2F800010     4 y:     ADD                    R0.y,  R5.y, PV.x
 0032 01006800 4F800010     5 w:     ADD                    R0.w,  PV.y, KC0[5].x
 0034 01806C00 6F800010     6 z:     ADD                    R0.z,  PV.w, KC0[9].x
 0036 8012407C 0FA00010     7 x:     ADD                    R0.x,  PV.z, KC0[10].x
 0038 0000207C 0F800010     8 z:     ADD                    R0.z,  PV.x, KC0[14].x
 0040 0080247C 2F800010     9 x:     ADD                    R0.x,  PV.z, KC0[18].x
 0042 0100287C 4F800010    10 z:     ADD                    R2.z,  PV.x, KC0[22].y
0002 83C00006 A0540000  TEX 1 @44
 0044 01802C7C 6F800010 8092C07D   SAMPLE         R0.xyzw, R2.xy__,  RID:0, SID:0  CT:NNNN
0004 C0000000 94200688  ALU 4 @48
 0048 0000407C 0002807D    11 x:     MUL_IEEE               R1.x,  R3.x, R0.y  BS:5
 0050 0080447C 20000110       y:     MUL_IEEE               R1.y,  R2.w, R0.z
 0052 0100487C 40000110       z:     MUL_IEEE               R1.z,  R5.x, R0.w
 0054 81804C7C 60000110       w:     MULADD_IEEE            R1.w,  R3.y, R0.x, R2.z
0006 00000000 00000000  EXPORT_DONE        PIXEL 0     R1.wxyz      ES:3 EOP
--------------------------------------

Windows block:

[DBG] [TexInstr::emit_lowered_tex] samplerId 0 textureId 0
bytecode 56 dw -- 4 gprs -- 0 nstack -------------
shader 1 -- 7
0000 00000004 80800000  ALU 18 @8 KC0[CB143:0-32]
 0008 00010010 F00D1001     1 x:     ADD                    R4.x,  R2.w, R3.w
 0010 FC800000 00000000       y:     ADD                    R4.y,  R2.x, R3.x
 0012 80106082 0F800010       z:     ADD                    R4.z,  R2.y, R3.y
 0014 8010807C 0F800010       w:     ADD                    R4.w,  R2.z, R3.z
 0016 80102080 2F800010       t:     ADD                    R5.y,  KC0[0].x, KC0[1].x
 0018 800F847C 0F800010     2 x:     ADD                    R2.x,  PV.z, R0.y
 0020 8010A07C 0F800010       y:     ADD                    R2.y,  PV.y, R0.x
 0022 8011207C 0F800010       z:     ADD                    R1.z,  KC0[2].x, KC0[3].x
 0024 8011407C 0F800010       w:     ADD                    R2.w,  PV.w, R0.z
 0026 8011C07C 0F800010       t:     ADD                    R3.x,  PV.x, R0.w
 0028 00006002 0F800010     3 x:     ADD                    R0.x,  PV.z, KC0[4].x
 0030 00806402 2F800010     4 y:     ADD                    R0.y,  R5.y, PV.x
 0032 01006802 4F800010     5 w:     ADD                    R0.w,  PV.y, KC0[5].x
 0034 01806C02 6F800010     6 z:     ADD                    R0.z,  PV.w, KC0[9].x
 0036 8012407C 0FA00010     7 x:     ADD                    R0.x,  PV.z, KC0[10].x
 0038 0000007C 0F800010     8 z:     ADD                    R0.z,  PV.x, KC0[14].x
 0040 0080047C 2F800010     9 x:     ADD                    R0.x,  PV.z, KC0[18].x
 0042 0100087C 4F800010    10 z:     ADD                    R2.z,  PV.x, KC0[22].y
0002 83C00006 A0540000  TEX 1 @44
 0044 01800C7C 6F800010 8092C07D   SAMPLE         R0.xyzw, R1.xy__,  RID:0, SID:0  CT:NNNN
0004 C0000000 94200688  ALU 4 @48
 0048 0000207C 0002807D    11 x:     MUL_IEEE               R1.x,  R2.x, R0.y  BS:5
 0050 0080247C 20000110       y:     MUL_IEEE               R1.y,  R2.w, R0.z
 0052 0100287C 40000110       z:     MUL_IEEE               R1.z,  R3.x, R0.w
 0054 81802C7C 60000110       w:     MULADD_IEEE            R1.w,  R2.y, R0.x, R2.z
0006 00000000 00000000  EXPORT_DONE        PIXEL 0     R1.wxyz      ES:3 EOP

I did validate that all the test shaders I could find compile.


Some of the changes that went through:

  • Updated the README to be a bit more accurate, and added a PowerShell equivalent of the bash compile script.
  • Properly mark not only declarations but also implementations with the appropriate dllexport/dllimport, as MSVC requires this.
  • Use _aligned_malloc since aligned_malloc doesn't exist on Windows, and ensure the alignment is always a power of 2, since Windows requires that.
  • Undo the undef __cplusplus hack, which is not supported at all in MSVC and frankly is very messy behavior even on the compilers where it does work.
    • As far as I can tell the hack existed for two reasons: 1. incorrect include order, and 2. trying to avoid defining a specific set of functions which are not supported.
    • I've put the includes in the correct order, and added a define to skip defining the functions that don't work.
  • Remove the <bit> header, because MSVC only supports it in C++20 mode, which we aren't compiling with.
  • Clean up extra spaces.
  • Make glslcompiler have the "correct" file extension for each platform (no extension on Linux/macOS/BSDs, .exe on Windows, and .elf on CafeOS).
  • Tests always fire off asserts, even when assert() isn't compiled in.
  • Move off of deprecated yacc functions.
  • Comment out all initialized-but-unused variables, as this is an error for MSVC.
  • Create a Python script to help define static arrays, which on MSVC need an explicit size and cannot use designated initializers (which is really unfortunate). This makes it easy to take a Linux array and produce a Windows-compatible equivalent for the one shader that uses it.
  • Use alternative builtins for __builtin_ctz and __builtin_clz.
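
For the _aligned_malloc item above, a rough sketch of what a cross-platform aligned allocation wrapper could look like is shown here. It is an illustration only and not necessarily how this PR does it: `round_up_pow2`, `portable_aligned_alloc`, and `portable_aligned_free` are hypothetical names, and the non-Windows branch assumes a POSIX system that provides `posix_memalign`. The real calls are `_aligned_malloc`/`_aligned_free` from `<malloc.h>` on MSVC.

```c
#if !defined(_MSC_VER) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign under strict -std= modes */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#if defined(_MSC_VER)
#include <malloc.h>
#endif

/* Round value up to the next power of two (values that already are a
 * power of two are returned unchanged; 0 is rounded up to 1). */
static inline size_t round_up_pow2(size_t value)
{
    size_t result = 1;
    assert(value <= (SIZE_MAX >> 1) + 1);   /* larger values cannot be rounded up */
    while (result < value)
        result <<= 1;
    return result;
}

/* Allocate size bytes aligned to at least the requested alignment.
 * The alignment is forced to a power of two that is also at least
 * sizeof(void *), which satisfies both _aligned_malloc and posix_memalign. */
static inline void *portable_aligned_alloc(size_t size, size_t alignment)
{
    alignment = round_up_pow2(alignment);
    if (alignment < sizeof(void *))
        alignment = sizeof(void *);
#if defined(_MSC_VER)
    return _aligned_malloc(size, alignment);
#else
    void *ptr = NULL;
    if (posix_memalign(&ptr, alignment, size) != 0)
        return NULL;
    return ptr;
#endif
}

/* Memory from _aligned_malloc must be released with _aligned_free,
 * so the free side needs a wrapper as well. */
static inline void portable_aligned_free(void *ptr)
{
#if defined(_MSC_VER)
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}
```

A request such as `portable_aligned_alloc(100, 24)` would be rounded up to a 32-byte alignment before reaching the platform allocator.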

this commit adds in support for building, and compiling shaders on
windows.
Author

Mythra commented Dec 12, 2025

FWIW I've confirmed this builds on Linux/macOS/Windows, but it'd be good for someone else to confirm that this still builds for CafeOS too. My CafeOS build is trying to look for wiiuvars.sh and portlibs_prefix.sh, which I can't find a definitive source for anywhere. Since this is a larger PR than the mac one, it'd be good to confirm. 😄

Owner

Exzap commented Dec 23, 2025

When compiling for Wii U via compile_for_cafe.sh I now get:

bmesa.a.p/state_tracker_st_nir_lower_tex_src_plane.c.o -c ../src/mesa/state_tracker/st_nir_lower_tex_src_plane.c
../src/mesa/state_tracker/st_nir_lower_tex_src_plane.c: In function 'add_sampler':
../src/mesa/state_tracker/st_nir_lower_tex_src_plane.c:75:4: error: implicit declaration of function 'asprintf'; did you mean 'sprintf'? [-Werror=implicit-function-declaration]
   75 |    asprintf(&name, "%s:%s", orig_sampler->name, ext);
      |    ^~~~~~~~
      |    sprintf

wiiuvars.sh and portlibs_prefix.sh are from devkitpro package dkp-toolchain-vars.

Author

Mythra commented Dec 23, 2025

Ah, I thought I had done a package search, but I guess not. Thanks for that pointer. Luckily that's a really easy fix for this PR. Assuming I can compile successfully, I'll build and test the whole stack on the Wii U as well.

Author

Mythra commented Dec 23, 2025

Thanks for the pointer! I've fixed the functions that the devkitPro standard library doesn't have, and also a linking error that popped up.
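
For reference, one common way to cover a missing `asprintf` (the function the Cafe build error above complained about) is a small fallback built on `vsnprintf`. This is only a sketch under that assumption and may not match the actual fix in this PR; `fallback_asprintf` is a hypothetical name chosen so it cannot clash with a libc that does provide `asprintf`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly malloc'd string, like GNU/BSD asprintf.
 * On success, *out points at the string (caller frees it) and the
 * length without the terminating NUL is returned. On failure, -1 is
 * returned and *out is set to NULL. */
static inline int fallback_asprintf(char **out, const char *fmt, ...)
{
    va_list args, args_copy;
    int needed, written;
    char *buffer;

    assert(out != NULL && fmt != NULL);
    *out = NULL;

    va_start(args, fmt);

    /* First pass: measure the output on a copy of the argument list,
     * since a va_list cannot be reused after vsnprintf consumes it. */
    va_copy(args_copy, args);
    needed = vsnprintf(NULL, 0, fmt, args_copy);
    va_end(args_copy);
    if (needed < 0) {
        va_end(args);
        return -1;
    }

    buffer = (char *)malloc((size_t)needed + 1);
    if (buffer == NULL) {
        va_end(args);
        return -1;
    }

    /* Second pass: actually format into the buffer. */
    written = vsnprintf(buffer, (size_t)needed + 1, fmt, args);
    va_end(args);
    if (written < 0) {
        free(buffer);
        return -1;
    }

    *out = buffer;
    return written;
}
```

A call shaped like the one in st_nir_lower_tex_src_plane.c would then read `fallback_asprintf(&name, "%s:%s", orig_sampler->name, ext);`, with the result released through `free(name)`.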

Though I'm gonna be honest, I have zero idea how this worked before. The rpx/rpl files were not linking in libglslcompiler, which provides the actual implementation for things like InitGLSLCompiler; main.cpp on its own doesn't include the function implementations. I saw there was an #ifndef GLSL_COMPILER_CAFE_RPL that prevented those functions from being defined at all (which might make sense when not linking in the implementations), but those functions ARE exported in exports.def, which is what the RPL uses to tell which functions are available.

Anyway, setting aside whatever was happening there: I've removed the ifndef for RPL compilation so those functions are defined when building an RPL, and I've linked in libglslcompiler so those functions have an implementation, as exports.def says they do.

Things seem to be working, but I'd appreciate it if folks with RPL tests could run them to make sure nothing insidious is sneaking in.
