I mentioned this elsewhere, but I might as well mention it here too: there is no standard assembler that everyone uses. Each one may have a slightly different syntax, even for the same arch, and at least some C++ compilers allow you to customize the assembler used during compilation. Therefore, one would assume that inline assembly can't be uniform in general without picking a single assembler (or even a single assembler version) for each arch.
You're talking about the syntax of the assembly code itself. In practice, small variations between assemblers aren't much of a problem for inline assembly in the way they would be for standalone .s sources, because inline assembly rarely uses implementation-specific directives, macros, and such. It's not like the MASM vs NASM split.
This thread is about the compiler-specific syntax used to indicate the boundary between C and assembly and the ABI of the assembly block (register ins/outs/clobbers). Take a look at the documentation for MSVC vs GCC:
> This thread is about the compiler-specific syntax used to indicate the boundary between C and assembly and the ABI of the assembly block (register ins/outs/clobbers).
I see... Nevertheless, this is a really weird issue to get bent out of shape over. How many people are really writing so much inline assembly and also needing to support multiple compilers with incompatible syntax?
The biggest category of libraries that need inline assembly with compiler portability is compression/decompression codecs (like the linked article) -- think of images (PNG, JPEG), audio (MP3, Opus, FLAC), video (MPEG4, H.264, AV1).
Also important is cryptography, where inline assembly provides more deterministic performance than compiler-generated instructions.
Compiler intrinsics can get you pretty far, but sometimes dropping down to assembly is the only solution. In those times, inline assembly can be more ergonomic than separate .s source files.
> Currently, all supported targets follow the assembly code syntax used by LLVM’s internal assembler which usually corresponds to that of the GNU assembler (GAS)
Uniformity like that is a good thing when you need to ensure that your code compiles consistently in a supported manner forever. Swapping out assemblers isn’t helpful for inline assembly.
The quoted statement is weaker than what you're reading it as, I think. It's not a statement that emitted assembly code is guaranteed to conform to LLVM syntax, it's just noting that (1) at present, (2) for supported targets of the rustc implementation, the emitted assembly uses LLVM syntax.
Non-LLVM compilers like gccrs could support platforms that LLVM doesn't, which means the assembly syntax they emit would definitionally be non-LLVM. And even for platforms supported by both backends, gccrs might choose to emit GNU syntax.
Note also that using a non-builtin assembler is sometimes necessary for niche platforms, like if you've got a target CPU that is "MIPS plus custom SIMD instructions" or whatever.
I didn't follow the stabilization process very closely, but I believe you're wrong. What you're describing is what used to be asm! and is now llvm_asm!. The current stable asm! syntax actually parses its own assembly instead of passing it through to the backend unchanged. This was done explicitly to allow non-LLVM backends to work, and to let alternative front-ends be compatible. I saw multiple statements in this thread about alternative compilers or backends causing trouble here, and that's just not the case, given that the design was delayed for ages until those issues could be addressed.
Given that not all platforms supported by Rust currently support asm!, I believe your last paragraph does still apply.
> The exact assembly code syntax is target-specific and opaque to the compiler
> except for the way operands are substituted into the template string to form
> the code passed to the assembler.
You can verify that rustc doesn't validate the contents of asm!() by telling it to emit the raw LLVM IR:
That IR gets handed to the LLVM backend and possibly onward to an external assembler, which is where the actual validation of instruction mnemonics and assembler directives happens.
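For anyone who wants to try this, here is a minimal sketch of the kind of file you could feed to the compiler. The function name `passthrough` and the file name in the comments are made up for illustration, and the x86_64-only gating is my assumption. `--emit=llvm-ir` is a real rustc flag; the claim that a bogus mnemonic still makes it into the IR is the parent comment's, so treat it as something to check rather than a guarantee.

```rust
// Hypothetical file name: asm_demo.rs
//
// To inspect the IR (written to asm_demo.ll), something like:
//     rustc --crate-type=lib --emit=llvm-ir asm_demo.rs
//
// The experiment described above would be to swap "nop" for a made-up
// mnemonic and compare emitting IR against doing a full build.

/// Executes a single `nop` via inline assembly and returns `x` unchanged.
#[cfg(target_arch = "x86_64")]
pub fn passthrough(x: u32) -> u32 {
    unsafe {
        // The template string is the part the quoted reference text calls
        // "opaque to the compiler", apart from operand substitution.
        std::arch::asm!("nop", options(nomem, nostack, preserves_flags));
    }
    x
}

/// Fallback for other targets so the sketch still compiles; no assembly used.
#[cfg(not(target_arch = "x86_64"))]
pub fn passthrough(x: u32) -> u32 {
    x
}
```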
---
The difference between llvm_asm!() and asm!() is in the syntax of the stuff outside of the instructions/directives -- LLVM's "~{cc},~{memory}" is what llvm_asm!() accepts more-or-less directly, and asm!() generates from backend-independent syntax.
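To make "backend-independent syntax" concrete, here is a rough sketch of a stable asm!() block. The function name `add_u64` is hypothetical and the x86_64 gating is my choice. The point is that inputs, outputs, and side-effect information are written as operands and `options(...)` rather than as an LLVM constraint string like the one quoted above.

```rust
/// Adds two integers with an inline `add` instruction (wraps on overflow).
#[cfg(target_arch = "x86_64")]
pub fn add_u64(a: u64, b: u64) -> u64 {
    use std::arch::asm;

    let mut sum = a;
    unsafe {
        asm!(
            // Target-specific part: the instruction text (Intel operand order).
            "add {sum}, {rhs}",
            // Backend-independent part: operand directions and register classes.
            sum = inout(reg) sum,
            rhs = in(reg) b,
            // Side-effect info: no memory access, no stack use, and the result
            // depends only on the inputs. Flags are treated as clobbered
            // because `preserves_flags` is not specified.
            options(pure, nomem, nostack),
        );
    }
    sum
}

/// Plain-Rust fallback for other targets so the sketch still compiles.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_u64(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```

Calling `add_u64(40, 2)` should give the same result as `40u64.wrapping_add(2)` on either path.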
Assembly by definition is platform specific. The issue isn’t that it’s the same syntax on every platform but that it’s a single standardized syntax on each platform.
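A small sketch of what that looks like in practice, assuming x86_64 and aarch64 as the two example targets (the function name `increment` is made up). The instruction text differs per platform, but the operand syntax around it is the same on both.

```rust
/// x86_64: compute x + 1 with `lea`.
#[cfg(target_arch = "x86_64")]
pub fn increment(x: u64) -> u64 {
    let out: u64;
    unsafe {
        std::arch::asm!(
            "lea {dst}, [{src} + 1]",
            dst = out(reg) out,
            src = in(reg) x,
            options(pure, nomem, nostack),
        );
    }
    out
}

/// aarch64: compute x + 1 with a three-operand `add`.
#[cfg(target_arch = "aarch64")]
pub fn increment(x: u64) -> u64 {
    let out: u64;
    unsafe {
        std::arch::asm!(
            "add {dst}, {src}, #1",
            dst = out(reg) out,
            src = in(reg) x,
            options(pure, nomem, nostack),
        );
    }
    out
}

/// Any other target: plain Rust, no inline assembly.
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
pub fn increment(x: u64) -> u64 {
    x.wrapping_add(1)
}
```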
C/C++ doesn't have a standard syntax for inline assembly. Clang and GCC have extensions for it, with compiler-specific behavior and syntax.