Crate onedrop_hlsl

HLSL to WGSL Translation

Pragmatic, MilkDrop-2-targeted translator. The MD2 user shader body lives inside a shader_body { ... } block, samples the previous frame via tex2D / GetPixel / GetBlur1..3, and uses HLSL-style typed local declarations (float2 uv2;). The translator turns that into a WGSL fragment-shader body snippet that the codegen wrapper can paste inside its fs_main.

The rewrites are still string-driven (no AST), but they understand the MD2 conventions well enough to handle the dominant cases.

Re-exports

pub use types::SymbolTable;
pub use types::WgslType;

Modules

ast
HLSL AST.
lex
HLSL tokenizer.
parse
HLSL recursive-descent parser.
rewrite
AST-driven HLSL→HLSL rewrites.
texture_plan 🔒
Per-preset user-texture binding plan + the tex2D rewriter that consumes it.
types
Type-aware post-translator passes for the HLSL→WGSL pipeline.

Structs

TextureBindingPlan
Per-preset mapping from HLSL sampler names to the comp pass's user-texture slots.
TextureSlot
One slot in a TextureBindingPlan. Carries the resolved pool texture name (or None when the renderer should fall back to the 1×1 white fallback) and the texture's vec4<f32>(w, h, 1/w, 1/h) for the texsize_<NAME> constant the wrapper emits.
UserSamplerRef
One parsed sampler sampler_X; declaration. The renderer consumes this to populate a TextureBindingPlan.

Enums

TranslationError

Constants

KNOWN_CALL_NAMES 🔒
HLSL function names that the rewrite pipeline treats specially. When a preset writes name (args) with whitespace between the identifier and (, downstream substring matches ("lerp(", paren-balanced walkers) would skip the call. Collapse that whitespace once up-front so every downstream pass sees the no-space form.
LIFTED_FN_SENTINEL
Marker line that separates module-scope user functions (lifted by lift_user_functions) from the fragment body inside the translated output. The codegen wrapper splits on this marker: text before goes before fs_main, text after goes inside it.
MAX_USER_TEXTURE_SLOTS
Maximum simultaneously-bound user textures per preset. Sized to cover the in-the-wild distribution: a typical preset survey shows ≤ 4 distinct sampler sampler_X; declarations per shader; 8 leaves headroom without blowing past the comp pipeline's bind-group budget. Presets that exceed this cap are translated with the first 8 slots used and the rest falling through to the standard fallback path (sampler_main).

Statics

CONST_TYPE_REGEX 🔒
HLSL const TYPE NAME at start of a typed local declaration. We strip const only when followed by a recognised HLSL type token, which leaves WGSL module-level const (which doesn't appear in user shader bodies anyway) alone.
INLINE_DECL_SPLIT_REGEX 🔒
Normalise ;<space>TYPE to ;\nTYPE so the line-anchored LOCAL_DECL_REGEX sees each declaration on its own line. The Isosceles preset (and a handful of MD2 packs that compress kaleidoscope state) put multiple typed locals on a single source line:
LEADING_ZERO_REGEX 🔒
Strips leading zeros from integer literals (HLSL allows 02 for 2, WGSL rejects them with invalid numeric literal format). Targets only integer literals: 0.5, 0 and 100 are untouched because the pattern requires \b0+ followed by another decimal digit (so 0. and 0) never match).
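The leading-zero rule can be sketched without a regex. This is a hand-rolled illustration, not the crate's implementation: the function name strip_leading_zeros is hypothetical, and treating a preceding `.` as "not a literal start" is an extra assumption of the sketch so that the fractional part of a float such as 1.05 is left alone.

```rust
/// Hypothetical stand-in for the regex pass: drops redundant leading zeros
/// from integer literals such as `02` or `007`.
pub fn strip_leading_zeros(src: &str) -> String {
    let bytes = src.as_bytes();
    let mut out: Vec<u8> = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        // A literal can only start where the previous byte is not part of an
        // identifier or number. Excluding `.` is an assumption of this sketch:
        // it keeps the fractional part of `1.05` intact.
        let at_start = i == 0 || {
            let p = bytes[i - 1];
            !(p.is_ascii_alphanumeric() || p == b'_' || p == b'.')
        };
        if at_start {
            // Skip each `0` that is followed by another decimal digit.
            while i + 1 < bytes.len() && bytes[i] == b'0' && bytes[i + 1].is_ascii_digit() {
                i += 1;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    // Only ASCII `0` bytes were removed, so the result is still valid UTF-8.
    String::from_utf8(out).expect("removing ASCII bytes keeps UTF-8 valid")
}
```

Under these assumptions, 02 becomes 2 while 0.5, 0 and 100 pass through unchanged, matching the cases listed in the description.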
LOCAL_DECL_REGEX 🔒
HLSL typed local declaration, post type-substitution. The whole <TYPE> <decls>; statement is captured; the declarator list is then expanded into one WGSL var per name. Examples:
f32 gx1 = a; → var gx1: f32 = a;
vec2<f32> uv2; → var uv2: vec2<f32>;
vec3<f32> ret1, neu, crisp; → var ret1: vec3<f32>; var neu: vec3<f32>; var crisp: vec3<f32>;
POSTFIX_DEC_REGEX 🔒
POSTFIX_INC_REGEX 🔒
Postfix <ident>++ and <ident>-- (HLSL increment/decrement). WGSL has no postfix ++/-- expressions; we rewrite to the equivalent plain assignment <ident> = <ident> + 1 (or - 1 for --) only at statement boundaries (; or )) so expression-position uses like a[i++] don't get mangled. Real preset pattern: n++; at end of a per-iteration loop.
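A minimal sketch of the postfix rewrite described above, written as a byte scan rather than the crate's two regexes. The function name and the exact boundary handling are assumptions for illustration; only the rule itself (rewrite when the next non-whitespace character is `;` or `)`) comes from the description.

```rust
/// Hypothetical scan-based version of the postfix rewrite: `n++;` becomes
/// `n = n + 1;`, while `a[i++]` is left untouched because `]` follows.
pub fn rewrite_postfix_inc_dec(src: &str) -> String {
    let b = src.as_bytes();
    let mut out = String::with_capacity(src.len());
    let mut copied = 0; // number of source bytes already emitted
    let mut i = 0;
    while i + 1 < b.len() {
        let op = match (b[i], b[i + 1]) {
            (b'+', b'+') => '+',
            (b'-', b'-') => '-',
            _ => {
                i += 1;
                continue;
            }
        };
        // Walk left over the identifier that the operator is attached to.
        let mut s = i;
        while s > copied && (b[s - 1].is_ascii_alphanumeric() || b[s - 1] == b'_') {
            s -= 1;
        }
        // Find the next non-whitespace byte after the operator.
        let mut e = i + 2;
        while e < b.len() && b[e].is_ascii_whitespace() {
            e += 1;
        }
        let at_boundary = e < b.len() && (b[e] == b';' || b[e] == b')');
        let has_ident = s < i && !b[s].is_ascii_digit();
        if at_boundary && has_ident {
            let ident = &src[s..i];
            out.push_str(&src[copied..s]);
            out.push_str(&format!("{ident} = {ident} {op} 1"));
            copied = i + 2;
        }
        i += 2;
    }
    out.push_str(&src[copied..]);
    out
}
```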
PREPROC_REGEX 🔒
Preprocessor directives (#define, #include, #pragma, …). HLSL presets occasionally use them; WGSL has no preprocessor, so we strip the whole line.
SAMPLER_DECL_REGEX 🔒
HLSL sampler foo; / texture foo; declarations at module scope. The codegen wrapper provides the actual texture/sampler bindings, so user declarations are redundant, and they confuse the WGSL parser when they land inside fs_main. Stripped out wholesale.
SEMANTICS_REGEX 🔒
SHADER_BODY_OPEN 🔒
shader_body keyword optionally followed by whitespace/newlines and {. We then balance braces ourselves to recover the body, since a regex alone can't do nested-brace balancing reliably.
STORAGE_CLASS_REGEX 🔒
HLSL storage-class qualifiers, stripped when they appear as a leading word in a declaration. static const is a common HLSL pattern for "function-scope compile-time constant", but WGSL inside a function uses let (or const at module scope only). Stripping both lets rewrite_local_declarations turn the rest into a regular var.

Functions

body_assigns_to 🔒
Whether body (a { ... } WGSL block as text) contains an assignment to a bare identifier matching name. Recognises plain =, compound assignments (+=, -=, *=, /=, %=) and post-increment / decrement (++ / --); excludes the comparison operators ==, !=, <=, >=. Skips matches inside //-style line comments because the textual scan would otherwise see a commented-out assignment as a real write.
brace_up_single_statement_blocks 🔒
Wrap single-statement if/while/for bodies in { ... }. WGSL requires braces on every conditional/loop body; HLSL doesn't, and a lot of MD2 preset code uses the brace-less form (if (cond) ret.z -= 0.5;).
collect_global_var_types 🔒
Re-parse hlsl (the original input, before any rewriter touched it) and return a map of name → wgsl_type for top-level Item::GlobalVar items. Used by hoist_global_vars to decide which body-level var declarations belong at module scope.
comment_out_prose_lines 🔒
Comment out lines that look like prose (English) rather than HLSL/WGSL code. Real preset pattern: comp_30=`written by martin, an attribution typed without a // prefix, threaded into the shader body by the .milk parser as a literal line of code. Many presets failed with expected assignment or increment/decrement; found 'by' (or found 'rota', etc.) on lines of this shape.
dedup_var_declarations 🔒
Walk the source for var <NAME>: <TYPE> [= INIT]; declarations; the first time a NAME appears in the current scope, keep it; every later declaration of the same NAME in the same scope becomes a plain assignment (NAME = INIT;) or, if it had no initialiser, is dropped entirely.
expand_simple_defines 🔒
Scan for lines of the form #define IDENT IDENT (whitespace-separated single-token replacement) and substitute from → to everywhere else in the source. Operates as a single pass: defines are collected first, then applied to the rest of the source. Skips macros whose to looks like anything other than a bare identifier so we don't accidentally inline #define K 0.5 (where the rest of the source has plain K in arithmetic context; the existing fall-through preserves it as an undefined-but-untouched identifier the user can spot in the error).
hlsl_type_to_wgsl 🔒
Map an HLSL type name (float, float2, int3, float3x3, …) to its WGSL equivalent (f32, vec2<f32>, vec3<f32>, mat3x3<f32>, …). Returns None for types the hoist pass shouldn't touch (e.g. user struct names, samplers; these don't show up as GlobalVar anyway, but be defensive).
hoist_global_vars 🔒
Hoist top-level user globals to module scope.
is_function_signature 🔒
Detect "this decls capture is actually a function signature". decls is the text the local-decl regex captured between the type and the terminating ;. A function signature shape is <ident>(...) { ... return ... with the ; being the first statement-terminator inside the body, but the giveaway sits at the front: a ( appears before any =. Variable declarations never put ( ahead of the initializer assignment (var x = sin(0);, where ( follows =).
is_unary_context 🔒
is_wgsl_builtin_function_name 🔒
true when name collides with a WGSL builtin function. List covers the subset MD2 user shaders actually invoke; anything outside this set isn't worth worrying about (real authors don't shadow inverseSqrt).
keyword_at 🔒
Match if/while/for keywords on word boundaries; return their length.
lift_user_functions 🔒
Find HLSL-shaped function definitions (<TYPE> <name>(...) { ... }) at depth 0 in the translated body, rewrite each signature to WGSL shape (fn name(arg: TYPE, …) -> TYPE { ... }), and remove them from src in place. The lifted functions are returned as a single string concatenated in source order; wrap_user_comp_shader_with_plan places it before fs_main.
looks_like_prose 🔒
normalise_call_whitespace 🔒
Replace name<WS>( with name( for every entry in KNOWN_CALL_NAMES. Only fires when name is on a word boundary and <WS> is non-empty whitespace (so the no-whitespace form is left alone). Skips matches inside /* */ and // comments so commented-out code stays stable.
parse_hlsl_params 🔒
Parse an HLSL parameter list into (type, name) pairs in source order. Same shape as convert_hlsl_params_to_wgsl but returns the structured form so callers can decide per-param whether to rename / shadow.
parse_wgsl_function_return_type 🔒
Match one of the known WGSL types at byte position i. Returns the canonical type text and the byte position immediately after it. void is rejected: user comp shader functions always return a typed value in MD2.
rename_reserved_identifiers 🔒
Rename WGSL-reserved keywords used by MD2 preset authors as locals (mod, filter, sample). Every occurrence on a word boundary that isn't immediately followed by ( (a function call, already rewritten by rewrite_mod_balanced or rejected upstream) gets a trailing _.
rename_word_call 🔒
Rename <from>( → <to>( at every word boundary. Used to alias HLSL builtins that WGSL spells differently (sat → saturate, rsqrt → inverseSqrt). Differs from a plain replace: a preset local frsqrt = q1 won't pick up an unwanted frinverseSqrt = q1 rewrite because we require a non-identifier byte (or start of source) to the left of the match.
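The left-boundary check is what separates this from str::replace. A minimal sketch, assuming a (src, from, to) -> String signature (the real signature is not shown on this page):

```rust
/// Sketch of a word-boundary call rename. The signature is an assumption.
pub fn rename_word_call(src: &str, from: &str, to: &str) -> String {
    let needle = format!("{from}(");
    let mut out = String::with_capacity(src.len());
    let mut emitted = 0; // source bytes already copied to `out`
    let mut search = 0; // where the next search starts
    while let Some(rel) = src[search..].find(&needle) {
        let pos = search + rel;
        // Accept only if the byte to the left is not part of an identifier
        // (or the match sits at the very start of the source).
        let left_ok = pos == 0 || {
            let p = src.as_bytes()[pos - 1];
            !(p.is_ascii_alphanumeric() || p == b'_')
        };
        if left_ok {
            out.push_str(&src[emitted..pos]);
            out.push_str(to);
            out.push('(');
            emitted = pos + needle.len();
        }
        search = pos + needle.len();
    }
    out.push_str(&src[emitted..]);
    out
}
```

With this check, a call to rsqrt( is renamed while an identifier that merely ends in rsqrt, such as frsqrt(, is left as written.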
replace_functions 🔒
replace_semantics 🔒
replace_statement_commas 🔒
HLSL allows comma-as-statement-separator at the top of a function body:
replace_types 🔒
rewrite_binary_call_balanced 🔒
Generic paren-balanced rewriter for two-argument calls. Walks the source, matches <name>( on a word boundary, finds the top-level ,, and replaces the whole call with the closure's output.
rewrite_local_declarations 🔒
rewrite_mod_balanced 🔒
mod(a, b) → ((a) - floor((a) / (b)) * (b)). WGSL reserves mod as a keyword; HLSL uses it as the float-modulo helper. The expansion matches HLSL's semantics (and matches GLSL's mod) so behaviour stays identical.
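To make the expansion concrete, the sketch below builds the replacement text from two already-captured argument strings and evaluates the same formula numerically. Both function names are hypothetical; the paren-balanced argument capture itself is not shown.

```rust
/// Builds the documented replacement text from two captured arguments.
pub fn expand_mod(a: &str, b: &str) -> String {
    format!("(({a}) - floor(({a}) / ({b})) * ({b}))")
}

/// The same formula evaluated on numbers: a floor-based modulo.
pub fn floor_mod(a: f32, b: f32) -> f32 {
    a - (a / b).floor() * b
}
```

For a negative first argument the floor form differs from a truncating remainder: floor_mod(-1.0, 4.0) is 3.0, whereas Rust's -1.0 % 4.0 is -1.0.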
rewrite_mul_balanced 🔒
mul(a, b) → (a) * (b). Paren-balanced on both arguments, which is needed because real shaders write mul(rotation_matrix(theta), uv). The outermost , at depth 0 splits the two arguments.
rewrite_postfix_inc_dec 🔒
rewrite_tex2dbias 🔒
tex2Dbias(s, vec4(uv, mip, bias)) → tex2D(s, uv). Paren-balanced over both arguments. The bias component is dropped (real presets use it cosmetically at 0 or near-0, with no visual delta).
rewrite_tex3d_calls 🔒
Rewrite tex3D(<sampler>, <uvw>) to a real 3D textureSample against the noise-volume bindings.
rewrite_unary_call_balanced 🔒
Generic paren-balanced rewriter for <name>(<single-arg>) calls. Walks the source, finds <name> on a word boundary followed by (, balances to the matching ), and replaces the whole call with make_replacement applied to the captured argument text (verbatim, not trimmed).
scan_user_samplers
Extract every sampler sampler_X; declaration from a MilkDrop comp shader HLSL and return the logical name (the part after sampler_) for each occurrence that isn't already a built-in.
split_param 🔒
Split a single HLSL parameter declaration into (type, name). The type may contain <...> (vec3<f32>); we split on the last whitespace at angle-depth 0.
split_top_level_commas 🔒
Split a declarator list on top-level commas only: commas inside () or <> (e.g. vec3<f32>(0, 0, 0)) must not split the declarator.
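A depth counter is enough to express the top-level split. The sketch below assumes a &str -> Vec<&str> signature and trims each piece; both are illustrative choices, not taken from the crate. It also treats every < as an opening bracket, so a comparison such as a < b inside an initialiser would confuse it.

```rust
/// Sketch: split on commas that are not nested inside `()` or `<>`.
pub fn split_top_level_commas(decls: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let mut depth = 0i32;
    let mut start = 0;
    for (i, c) in decls.char_indices() {
        match c {
            '(' | '<' => depth += 1,
            ')' | '>' => depth -= 1,
            ',' if depth == 0 => {
                parts.push(decls[start..i].trim());
                start = i + 1;
            }
            _ => {}
        }
    }
    // Whatever follows the last top-level comma is the final declarator.
    parts.push(decls[start..].trim());
    parts
}
```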
strip_first_vec_component 🔒
For a string like vec4(uv, 0, 0.1) or float3(uv, 0), return uv, the slice up to the first top-level comma inside the constructor. Used by rewrite_tex2dbias to drop the mip-bias arguments.
strip_preprocessor 🔒
strip_sampler_declarations 🔒
strip_shader_body_wrapper 🔒
MD2 ships warp/comp shaders wrapped in a shader_body { ... } block. The codegen wrapper pastes the user code inside its own fs_main { ... }, so the outer wrapper has to come off first; otherwise WGSL sees a stray identifier (shader_body) followed by { and fails with expected assignment or increment/decrement, found "{".
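The unwrap step amounts to locating the keyword, expecting {, and counting braces until the matching }. A minimal sketch, assuming an Option<&str> return and ignoring braces inside comments (how the crate handles a missing wrapper or commented braces is not stated here):

```rust
/// Sketch: return the text between the braces of `shader_body { ... }`,
/// or `None` if the wrapper is missing or its braces never balance.
pub fn strip_shader_body(src: &str) -> Option<&str> {
    const KEYWORD: &str = "shader_body";
    let after_kw = src.find(KEYWORD)? + KEYWORD.len();
    let rest = &src[after_kw..];
    // Skip optional whitespace/newlines between the keyword and `{`.
    let open_rel = rest.find(|c: char| !c.is_whitespace())?;
    if !rest[open_rel..].starts_with('{') {
        return None;
    }
    let body_start = after_kw + open_rel + 1;
    let mut depth = 1usize;
    for (i, c) in src[body_start..].char_indices() {
        match c {
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                if depth == 0 {
                    return Some(&src[body_start..body_start + i]);
                }
            }
            _ => {}
        }
    }
    None
}
```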
strip_storage_class_qualifiers 🔒
strip_unary_plus 🔒
Strip HLSL unary + (a syntactic no-op WGSL doesn't accept) when it directly follows (, ,, =, +, -, *, /, <, >, ?, : after optional whitespace. Preserves byte positions of everything except the + itself.
translate_shader
Translate HLSL shader code to WGSL.
translate_shader_with_plan
Same as translate_shader, but routes unrecognised tex2D sampler names through the supplied TextureBindingPlan. Preset authors reference disk-loaded textures via sampler sampler_<NAME>; + tex2D(sampler_<NAME>, uv); the renderer scans the HLSL, builds a plan that resolves each name to a slot in the comp pipeline's user-texture binding array, and threads it through here so the emitted WGSL points at the right binding.
try_extract_user_function 🔒
Try to match a single HLSL-shaped function definition starting at byte position start (after leading whitespace). Returns the byte position just past the closing } and the rewritten WGSL function text. Returns None if no signature matches, in which case the caller advances by 1 byte.
user_texture_binding_name
WGSL binding name for user-texture slot slot. The codegen wrapper declares var sampler_user_<n>_texture: texture_2d<f32> for each slot 0..MAX_USER_TEXTURE_SLOTS, and this is what the translator emits in textureSample(...) calls for plan-routed samplers.
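The naming scheme and the slot cap can be sketched together. The usize parameter type, the String return type and the helper binding_for_sampler_index are assumptions for illustration; the value 8 and the sampler_user_<n>_texture pattern come from the descriptions on this page.

```rust
/// Slot cap as described for MAX_USER_TEXTURE_SLOTS.
pub const MAX_USER_TEXTURE_SLOTS: usize = 8;

/// Sketch of the binding-name format; the parameter type is an assumption.
pub fn user_texture_binding_name(slot: usize) -> String {
    format!("sampler_user_{slot}_texture")
}

/// Hypothetical helper: the n-th distinct user sampler gets a binding only
/// while slots remain; later samplers fall through to the fallback path.
pub fn binding_for_sampler_index(index: usize) -> Option<String> {
    (index < MAX_USER_TEXTURE_SLOTS).then(|| user_texture_binding_name(index))
}
```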

Type Aliases

Result