In the beginning, the CPU was created. This has made a lot of people very angry and has been widely regarded as a bad move.
Though computers have only gained in popularity, in our interactions with them we have increasingly attempted to interact less with the CPU; and more and more with the abstractions that simply happen to be implemented atop CPUs. One of the earliest such abstractions was the instruction set architecture (ISA). The idea was this: instead of writing code for a specific CPU, you would design an abstract specification for one: a language, if you will, that could be spoken by multiple different CPUs. You would then write code according to that instruction set, in that language, and it could be interpreted by any CPU implementing that instruction set. This was desirable partly because it allowed code to be written once and run multiple places—the first inklings of portability, rising!—but mostly because it allowed you to think only of the specification while you wrote your code, and perhaps even forget for a moment that there was a CPU underlying it all.
Ahhh, the power of abstraction in the morning...
Another such abstraction was the mnemonic assembler. Before the assembler was created, CPUs and ISAs were instructed using numbers which had special meaning to the ISA in question. For example, consider the following number[0] (in octal form, for no particular reason):
030514
In the context of the ISA implemented by the CRAY-1[1], that number is understood to mean ‘add the contents of registers A1 and A4, and store the result in register A5’. The association between this numeric form and its meaning under the ISA is completely arbitrary, and serves only to remind our poor programmers that they are serving a CPU, whose native language is written with arbitrary numbers. Wouldn’t it be nice if, instead of meaningless 030514, our CPU could understand a form like this one?
A5 A1+A4
(This latter form, while no less arbitrary than the other, is arbitrary in a way which matches preexisting human arbitrarinesses. This makes humans feel warm and fuzzy inside, like they’ve conquered a part of the machine kingdom with their superior human intellect, and is called a ‘mnemonic’.)
Unfortunately, while CPUs are equally capable of parsing numbers like 030514 and strings of words like A5 A1+A4, they are much slower at the second. Thus, a compromise was born: a mnemonic assembler would be a program (like all programs, it would run on a CPU implementing an ISA), and it would translate mnemonic forms like A5 A1+A4 into machine forms like 030514. This allows humans to pretend that the CPUs understand their own native language, while still allowing the CPUs themselves to run at tolerable speeds.
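To make the translation concrete, here is a minimal Python sketch of such an assembler for exactly one instruction. It assumes the field layout described in the footnote on the CRAY-1 format (a 7-bit opcode followed by three 3-bit register fields); the names ADD_OPCODE and assemble_add are made up for illustration, and nothing here is taken from a real CRAY toolchain.

```python
import re

# Hypothetical toy assembler for one CRAY-1-style instruction: Ai <- Aj + Ak.
# Assumed layout (from the footnote): 7-bit opcode, then three 3-bit register fields.
ADD_OPCODE = 0o030

_ADD_FORM = re.compile(r"^\s*A([0-7])\s+A([0-7])\s*\+\s*A([0-7])\s*$")


def assemble_add(mnemonic: str) -> str:
    """Translate a mnemonic such as 'A5 A1+A4' into its six-digit octal form."""
    match = _ADD_FORM.match(mnemonic)
    if match is None:
        raise ValueError(f"not an address-register add: {mnemonic!r}")
    dest, src1, src2 = (int(group) for group in match.groups())
    if src1 == 0 or src2 == 0:
        # A zero in a source field is a special case for this instruction,
        # so this sketch refuses it rather than guess.
        raise ValueError("source field 0 has special meaning; not handled here")
    word = (ADD_OPCODE << 9) | (dest << 6) | (src1 << 3) | src2
    return format(word, "06o")


print(assemble_add("A5 A1+A4"))  # prints 030514
```

The whole job is a table lookup plus some bit-shifting, which is why the mnemonic form costs the CPU nothing at run time: the translation happens once, ahead of time.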
At this point, it is worth making a couple of things devastatingly clear.
First, an assembler that runs on a given ISA may produce machine code for that ISA, or machine code for a completely different ISA.
Second, there is almost always a 1:1 correspondence between the mnemonics inhaled by a given assembler and the instructions it spits out. (Where there is not, the situation is generally close enough to 1:1 as to make no odds.)
Third, mnemonics were not standardised the same way that ISAs were. A given set of mnemonics only works on one ISA, but there might be any number of other mnemonic-sets that also work on that ISA. (Despite this, a number of tropes tend to emerge which are shared by all mnemonic-sets targeting a given ISA.)
We are now ready to attack the problem at hand.
There is a popular group of ISAs variously known as:
This ill-fated ISA, which has more addressing forms than names and is so immensely complicated that you need to read a 5000-page tome[2] plus an additional 400 or so pages of unofficial reference before you can even begin to be qualified to understand the consequences of a single one of its instructions, will henceforth be referred to as ‘“SIB”’. I give it that name not so much to deliberately sow confusion (though that is definitely a contributing factor) as because that’s the only name that everybody seems to agree is associated with it. (This is also the reason its name is quoted: it’s the only word, indeed the only idea in this missive I didn’t make up from whole cloth, ripped cloth, cheese cloth, pins, pinheads, camembert[3], angels, のっぺらぼう.)
Because ‘SIB’ is, inexplicably, enduringly popular, a lot of assemblers have been made for it. As I mentioned, though different assemblers are not always compatible with each other, tropes and commonalities tend to emerge. In fact, there are two incompatible sets of tropes; they are generally called ‘AT&T syntax’ and ‘Intel syntax’, and you can read about the differences here.
...but you knew all of this already. If I’ve done my due diligence, you were incited by the title but are still looking for something to disagree (or agree) with strongly. Patience!
I contend that the AT&T syntax is harmful and bad, and should never be used, for any reason, under any circumstances, by anyone. Here’s why:
This is the single greatest sin perpetrated by the AT&T syntax. If not for this, it would be sufficient to say that Intel syntax were superior. If not for this, the use of AT&T syntax would at least be acceptable, at least be moral[4], even if still in supremely poor taste. If not for this, it would be possible to write correct programs using the AT&T syntax.
Now, to be clear, I don’t expect advanced safety features from an assembler. I don’t expect dependent types, or linear types, or even any types at all. But I would pretty damn well appreciate it if my assembler didn’t actively try to sabotage me!
Let me show you an example. Here’s an annotated snippet of Intel-style ‘SIB’ assembly:
mov eax, 28         ; (1) store the immediate value 28 in the EAX register
mov eax, dword [28] ; (2) load one ‘dword’ (4 bytes) from memory location 28 and store it in the EAX register
Here’s another snippet, AT&T syntax this time:
movl $28, %eax  ; (1) store the immediate value 28 in the EAX register
movl (28), %eax ; (2) load one ‘(l)ong’ (4 bytes) from memory location 28 and store it in the EAX register
Good so far? Reasonable? Good.
I present to you two more candidates, again in AT&T syntax:
movl 28, %eax    ; (3) ???
movl ($28), %eax ; (4) ???
Before you continue reading, I want you to imagine what it would make sense for those instructions to mean. Should they even be correct? If you happen to already know, try to forget (pray for oblivion from the horror...)
If you don’t know, make your best guess.
Ready? Here they are again, with annotations this time:
movl 28, %eax    ; (3) same as (2)
movl ($28), %eax ; (4) load one long from the memory location indicated by symbol ‘$28’ and store it in the EAX register
Go back and read that again and tell me in what world that could possibly be okay. Tell me in what world an assembler that silently accepts the above forms, that is almost certainly corrupting your meaning in a way you don’t intend, could possibly produce correct code. Tell me you would never forget to put a $ in front of an immediate, and that you would never accidentally put one in a displacement.
This doesn’t even have anything to do with Intel syntax. This isn’t a win for Intel over AT&T. This is just AT&T syntax being straight-up batshit fucking bonkers for no very good reason.
All it would take to redeem AT&T syntax—okay, maybe not redeem, but at least elevate from Cocytus to Phlegyas—would be to make syntaxes (3) and (4) illegal. That’s it. Want to help? Send a patch to your local AT&T-style assembler to make it warn for both of those forms. Give it a command-line flag to err instead of warning.
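Until such a patch exists, the check can be prototyped outside the assembler. The following Python sketch is a hypothetical linter, not part of any real toolchain: the names lint_att and split_operands are invented here, it follows this article's convention of ‘;’ starting a comment, and it assumes that a bare number after a branch-like mnemonic is an intended target and so leaves it alone.

```python
import re

# Hypothetical linter for the two suspicious AT&T forms discussed above.
BARE_NUMBER = re.compile(r"^-?(0[xX][0-9a-fA-F]+|\d+)$")
DOLLAR_IN_PARENS = re.compile(r"^\(\s*\$")
BRANCH_PREFIXES = ("j", "call", "loop")  # assumption: bare operands here are targets


def split_operands(text: str) -> list:
    """Split on commas that are not nested inside parentheses."""
    operands, depth, current = [], 0, ""
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            operands.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        operands.append(current.strip())
    return operands


def lint_att(line: str) -> list:
    """Return warnings for form (3), a bare number, and form (4), a $ in parentheses."""
    code = line.split(";", 1)[0].strip()  # the article's snippets use ';' for comments
    if not code:
        return []
    parts = code.split(None, 1)
    mnemonic = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    warnings = []
    for operand in split_operands(rest):
        if DOLLAR_IN_PARENS.match(operand):
            warnings.append(f"{operand}: '$' inside parentheses is read as part of a symbol name")
        elif BARE_NUMBER.match(operand) and not mnemonic.startswith(BRANCH_PREFIXES):
            warnings.append(f"{operand}: bare number is a memory address, not an immediate")
    return warnings


for example in ("movl $28, %eax", "movl (28), %eax", "movl 28, %eax", "movl ($28), %eax"):
    print(example, "->", lint_att(example))
```

Forms (1) and (2) pass silently, while (3) and (4) each produce one warning. A real implementation would live in the assembler's operand parser, where it could tell a deliberate absolute address from a forgotten $.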
Having eaten the proverbial elephant, it is time to start addressing the smaller fry.
Smaller fry: you have done an admirable job of thorning my side.
Reader: following are reasons why the AT&T operand order is backwards.
I’m sure you’re familiar with the issue, but here’s a quick refresher. Intel first:
mov eax, ebx ; (1) load one dword from the EBX register and store it in the EAX register
add eax, ebx ; (2) load a dword each from the EAX and EBX registers, add them together, and store the result (truncated to a dword) in the EAX register
And now AT&T:
mov %ebx, %eax ; (1) load one dword from the EBX register and store it in the EAX register
add %ebx, %eax ; (2) load a dword each from the EAX and EBX registers, add them together, and store the result (truncated to a dword) in the EAX register
In general, Intel instruction mnemonics take the form ‘<op> <destination>, <source>’, where AT&T uses ‘<op> <source>, <destination>’.
Before continuing, I want to establish a couple of things.
First, the consistency argument is bunk. The consistency argument in favour of Intel syntax is illustrated by the following example:
Putting the destination first is more consistent with other, higher-level languages like C. mov eax, ebx is analogous to eax = ebx, and add eax, ebx to eax += ebx. In general, op x, y is analogous to x op= y.
This is a classic logical fallacy: it’s an appeal to authority. It doesn’t say why the syntax is better, just asserts that somebody else (C) thinks it’s better. A slight variant on the argument says that making mnemonic assembler syntax reminiscent of the syntax of higher-level languages will make the language easier to learn for people who already know higher-level languages. This may be somewhat true, but the fact of the matter is that:
Second, the linguistic argument is bunk. The linguistic argument in favour of AT&T syntax is illustrated by the following example:
Putting the destination second is consistent with phrasal forms in English. mov %ebx, %eax is analogous to ‘move EBX to EAX’, and add %ebx, %eax to ‘add EBX to EAX’. In general, these correspond to the idiomatic English form ‘<verb> <object> to <indirect object>’. Where Intel syntax demands awkward productions like ‘move to EAX from EBX’.
All the arguments against the consistency argument apply; but it also bears mentioning that assembly and English are different languages. In particular, the way objects and verbs interact is different (and this hints at the real reason that AT&T syntax is backwards). It’s somewhat telling that assembly has no equivalent to the word ‘to’, and few enough production rules to count on both hands.
So, why does it make more sense to put the destination first?
Mutation! The bane of every programmer’s existence!
In higher-level languages, we frequently try to minimize mutation, or at least keep it under control. But guess what—every single instruction in assembly mutates. Even the venerable NOP effectively increments the instruction pointer.
Mutation is, in fairness, not a universal evil, but it does complicate a reader’s mental model of the code. Like the GOTO, cursed by Dijkstra, it adds path dependence to every line of code that follows it. Meaning that in order to understand the code you read, you need to have a handle on where the mutation happens. Putting the destination operand first emphasizes the site of mutation[5].
(Or, these sigils are making a din!)
AT&T syntax puts a % in front of register names and a $ in front of immediate values. Unsigiled words are always displacements; either a number or the name of a label. The $ we have already established is almost as bad as using cat /dev/urandom as your assembler. But let’s consider the %.
On its face, the idea seems to have some merit. Indeed, the GNU assembler seems to recognize this: if you pass it the flag -msyntax=intel (or use the .intel_syntax directive in source code), you get Intel syntax. Really. The whole shebang. Except that you still have to put %s on your registers. If you want to not have to do that, you have to additionally specify -mnaked-reg (or .intel_syntax noprefix).
Not only that, but if you look at some older assembly code of mine, it uses this mode: Intel syntax, but with %s littered all over the place. So I sympathize. Really, I do. I’m not going to make a pretense of presenting the arguments in favour, because I know the arguments in favour. But, ultimately, I don’t think the %s help.
The deal is this: humans are really good at context-sensitive parsing. Probably better than computers. The computer sees an identifier in its entirety, then compares it against its in-built list of registers. Only once it’s done that can it know if the identifier refers to a label or a register. But humans are different: when you, a human, see a branching instruction, you immediately know that the operand is going to be a label. 99% of the time you’re right. The remaining tiny fraction of the time, your internal branch-predictor does a little backtracking and sets you to rights.
Identifier operands to branching instructions are almost always labels, and identifier operands to non-branching instructions are almost always registers. This is the kind of context-sensitivity that humans are really good at, given a small amount of practice. The result is that, to the expert eye, a sea of % is mostly just noise.
Did somebody mention small fry? This kind of stuff is really not a big deal; but when you get down to it, it’s kind of astounding how AT&T syntax managed to get everything wrong, even the little things.
First order of business: elision. Every assembler I know of will accept the equivalent of this (Intel):
mov rax, [rdi]
or this (AT&T):
mov (%rdi), %rax
even if they would really prefer for you to write one of
mov rax, qword [rdi]
movq (%rdi), %rax
There are only a few cases where you really need to manually include size information; off the top of my head, I can only think of: sign- or zero-extended moves out of memory, and operations involving a memory and immediate operand. However, style conventions may still mandate size annotations. For both AT&T and Intel syntax. And this is quite reasonable, as these annotations can help to catch bugs; for example:
mov eax, qword [rdi] ; oops!
                     ; this was probably unintended
                     ; the assembler can err here
                     ; where it wouldn’t be able to if we just said ‘mov eax, [rdi]’
movq (%rdi), %eax    ; same story...
Here, again, Intel wins. It lets you express the constraints of your data, and the assembler will check that the requisite operation really can be performed. You express your intent at a higher level, without even having to complicate the implementation.
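The kind of check being described is small. As a rough Python sketch (the tables and the function check_load are hypothetical and cover only a handful of registers), an assembler can compare the size the programmer declared against the size the destination register implies, and fall back to inference when nothing was declared:

```python
# Hypothetical size tables; a real assembler would cover every register.
REGISTER_BITS = {"al": 8, "ax": 16, "eax": 32, "rax": 64}
INTEL_KEYWORD_BITS = {"byte": 8, "word": 16, "dword": 32, "qword": 64}
ATT_SUFFIX_BITS = {"b": 8, "w": 16, "l": 32, "q": 64}


def check_load(dest_register: str, declared_bits=None) -> int:
    """Return the operand size of a load into dest_register, in bits.

    If the programmer declared a size (keyword or suffix), it must agree with
    the register; if they declared nothing, the register alone decides.
    """
    register_bits = REGISTER_BITS[dest_register]
    if declared_bits is not None and declared_bits != register_bits:
        raise ValueError(
            f"declared {declared_bits}-bit operand, but {dest_register} is {register_bits}-bit"
        )
    return register_bits


print(check_load("rax", INTEL_KEYWORD_BITS["qword"]))  # mov rax, qword [rdi]: consistent
print(check_load("eax"))                                # mov eax, [rdi]: size inferred
try:
    check_load("eax", ATT_SUFFIX_BITS["q"])             # movq (%rdi), %eax: mismatch
except ValueError as error:
    print("error:", error)
```

The annotation is redundant in the good case and a tripwire in the bad one, which is the whole point of writing it.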
But this isn’t actually the problem with the suffixes. The problem is that they obscure the distinction between dissimilar operations. Consider (AT&T):
movq ......
movl ......
movsbl ......
movzbl ......
movb ......
movl ......
It’s not exactly easy to tell the difference between movsbl and movzbl here. The entire instruction column gets lost in a sea of mov... The ubiquitous b and l suffixes make the movq seem like the odd one out!
Admittedly, even on Intel, movzx and movsx are not that distinct from each other. But they are rare enough (compared with mov) that taking a little extra time on them is acceptable. The problem here is that you can’t even tell them apart from the movs!
When every mov has junk after it, it becomes harder to spot patterns. It’s hard to convince your visual cortex that a movq and a movl are the same type of thing, but that both are very different from a movsbl. (Sure, movq and movl are different. They have different operand sizes. A move to a register from another register is also different from a move to memory from an immediate; AT&T makes no distinction there. But, perhaps more to the point, the meaning of a mov may be different in different places, regardless of the operand size. This isn’t something that can usefully be encoded on the level of assembly. A strength of a good language is the versatility and modularity of its primitives.) Whereas it’s immediately obvious that mov is the same as mov, and that movzx is distinct.
Example time! Intel/AT&T; you know the drill:
mov eax, [edi + 8*ebx + 3] ; load 4 bytes from the memory location indicated by EDI + 8*EBX + 3 and put them in EAX.
                           ; Duh, self-explanatory.
mov 3(%edi,%ebx,8), %eax   ; load 4 bytes from the memory location indicated by 3(%edi,%ebx,8) and put them in %eax.
                           ; D-whaat?
All right, what’s going on here? I have a theory[6]: AT&T secretly wants ‘SIB’ to be a RISC architecture; in other words, an architecture where only the ‘load’ and ‘store’ instructions affect memory. They would really like you to write:
sibold 3, %edi, %ebx, 3, %eax ; the famous 5-operand load instruction; use (s)cale 2^3, (i)ndex %ebx, (b)ase %edi,
                              ; and (o)ffset 3 to get a memory address; load 4 bytes from it and put them in %eax
If ‘load’ and ‘store’ really were the only instructions that could affect memory, as on a RISC architecture, then that would be fine. But they’re not, and we need an unambiguous syntax for memory operands. So they went with this parenthesized nonsense.
It’s actually not that bad once you know it, and as I’ve mentioned before, being familiar to users of higher-level languages is not very important. But the syntax is not without its problems. The fact that the offset goes outside the parentheses throws off humans’ internal parse-tree branch predictors. (It makes you think you’re going to see an immediate operand.) And the parentheses cause ambiguity for macro assemblers (that is, pretty much all of them) where you can evaluate constant arithmetic expressions in-line and use parentheses to override precedence.
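For anyone who has to read the parenthesized form anyway, the mapping is mechanical: offset(base,index,scale) names the address base + index*scale + offset. The Python sketch below spells that out by rewriting an AT&T memory operand as its Intel equivalent. The function name att_to_intel_memory and the regular expression are illustrative assumptions; it handles only register-based operands, and it does not prettify negative offsets.

```python
import re

# Hypothetical converter for AT&T operands of the shape offset(base,index,scale).
ATT_MEMORY = re.compile(
    r"^\s*(?P<disp>[^()\s]*)\s*"
    r"\(\s*(?P<base>%\w+)?\s*"
    r"(?:,\s*(?P<index>%\w+)\s*(?:,\s*(?P<scale>[1248])\s*)?)?\)\s*$"
)


def att_to_intel_memory(operand: str) -> str:
    """Rewrite '3(%edi,%ebx,8)' as '[edi + 8*ebx + 3]'."""
    match = ATT_MEMORY.match(operand)
    if match is None:
        raise ValueError(f"not a register-based AT&T memory operand: {operand!r}")
    terms = []
    if match["base"]:
        terms.append(match["base"].lstrip("%"))
    if match["index"]:
        index = match["index"].lstrip("%")
        scale = match["scale"] or "1"  # an omitted scale means 1
        terms.append(index if scale == "1" else f"{scale}*{index}")
    if match["disp"]:
        terms.append(match["disp"])  # the offset that sits outside the parentheses
    return "[" + " + ".join(terms) + "]"


print(att_to_intel_memory("3(%edi,%ebx,8)"))  # prints [edi + 8*ebx + 3]
print(att_to_intel_memory("(%rdi)"))          # prints [rdi]
```

Note that the converter has to reorder the pieces: the offset comes first in AT&T and the scale comes last, whereas the Intel form reads as the arithmetic it describes.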
Now we’re getting to the really low-down, dirty, nasty, below-the-belt stuff. They say not to kick somebody while they’re down; but if you’ve already beaten them five times in a fair fight, there’s really nowhere to go next.
Everybody uses Intel! And I mean, everybody. Every assembler, every disassembler, every reverse-engineering tool, every debugger. Documentation from Intel and AMD’s official manuals, as well as most of the unofficial ones. Inline assemblers for the D, Rust, and Zig reference compilers, and Microsoft’s C compiler.
Everybody except for GCC and the GNU toolchain (and its clones, Clang and TCC), who just have to be different.[7] All the cool kids are doing it, why won’t you? Even assemblers for other architectures use syntax that looks a lot like Intel syntax. Even the Plan 9 assembler, made by many of the same AT&T employees who made Unix and the original AT&T assembler, walks back on some of AT&T’s horrible mistakes (though unfortunately none of the important ones).[8]
Be the trend!
(Is this really a reason to use Intel syntax? I think it is. AT&T is somewhat entrenched in the unix world, where it’s used in components of the kernels and system libraries of most unices. But it’s not un-noteworthy that pretty much all non-trivial applications written in assembly and targeting ‘SIB’ use the Intel syntax. Nor that all reverse-engineering tools, reverse-engineering being the area where readability of assembly is probably most important, use the Intel syntax. Nor that most assemblers, including newly-developed, open source ones like FASM, NASM, and YASM, choose Intel syntax, as do, as previously mentioned, the inline assemblers for D, Rust, and Zig.
Were all other things equal, it would be best to go with the standard solution. The fact that they are not only accentuates matters.)
Please, please, please stop using AT&T syntax.
For your own sake and the world’s.
(Or, non-reasons to use AT&T syntax.)
I sometimes see arguments to the effect that AT&T syntax is easier for machines to parse while Intel is easier for humans to parse; or that AT&T is easier for compilers to emit automatically while Intel is easier to write by hand; or that AT&T is closer to the actual instruction encoding; or that sentences with fewer words are easier for humans to understand. This is all nonsense.
It has a tiny kernel of truth: in the instruction encoding, size information really does go with the operation code, not the operands. (Technically, it goes in between.) That’s it.
Operand order is neither src,dst nor dst,src; it depends on the specific instruction. Whenever an instruction has both a register and memory operand, the register operand goes first. When an instruction has a memory and an immediate operand, the memory operand goes first. And register goes before immediate. When there are two register operands, they can usually go in either order; the assembler has a choice of encodings.
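The two-register case is easy to demonstrate. The Python sketch below builds both encodings of a 32-bit register-to-register mov, assuming 32-bit operand size with no prefixes; the helper names modrm and encode_mov_reg_reg are made up for illustration. Opcode 89 is the ‘MOV r/m32, r32’ form and opcode 8B is the ‘MOV r32, r/m32’ form; each is followed by a ModRM byte whose reg field precedes its r/m field.

```python
# Register numbers as used in the ModRM byte (32-bit general-purpose registers).
REGISTER_NUMBERS = {
    "eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
    "esp": 4, "ebp": 5, "esi": 6, "edi": 7,
}


def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the three ModRM fields: 2 bits of mod, 3 of reg, 3 of r/m."""
    return (mod << 6) | (reg << 3) | rm


def encode_mov_reg_reg(dest: str, src: str) -> tuple:
    """Return both valid encodings of 'mov dest, src' for 32-bit registers."""
    d, s = REGISTER_NUMBERS[dest], REGISTER_NUMBERS[src]
    # Opcode 89: the source sits in the reg field, the destination in r/m.
    store_form = bytes([0x89, modrm(0b11, s, d)])
    # Opcode 8B: the destination sits in the reg field, the source in r/m.
    load_form = bytes([0x8B, modrm(0b11, d, s)])
    return store_form, load_form


first, second = encode_mov_reg_reg("eax", "ebx")
print(first.hex(), second.hex())  # prints 89d8 8bc3
```

Both byte sequences do the same thing, and in one the destination comes first while in the other it comes second. Neither assembly syntax can claim to mirror ‘the’ encoding order, because there isn’t one.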
As noted, AT&T syntax is more ambiguous and has more special cases than Intel. Though it is also true that Intel needs to infer the context in which a label is used.
But you know what?
This is small fucking fry.
This is inconsequential.
This, in the grand scheme of things, is about on par with a water molecule asking Poseidon if it can have the weekend off to frolic in the clouds.
Poseidon doesn’t care about water molecules. He has a whole damn ocean to worry about!
Worrying about water molecules instead of fluid dynamics and materials while your boat is sinking is pretty much the same as thinking that computer programs care what kind of assembler they ingest or excrete. It’s completely trivial.
You know what’s hard? Supporting all the operand encodings ‘SIB’ allows is hard. Writing a good register allocator is hard. Supporting a thousand different object formats used by different OSes is hard. Making a good macro system is hard. Creating a high-quality, correct mapping from assembly to a higher-level representation is hard. These are tasks which are hard for assemblers and compilers and decompilers and other tools which interact with assembly. Intel? AT&T? Completely irrelevant.
Yes, this is not technically a number, but a sequence of bytes which under the ASCII interpretation forms a sequence of numerals which can be interpreted under the big-endian octal base-radix representation to produce a number equivalent to that formed by the binary base-radix interpretation of a series of electrical impulses on a number of CPUs produced by CRAY. I say this not because it’s interesting—though it is—but simply to deter the hair-splitters. In fact, it is not a sequence of bytes at all but a set of anomalies in the brightnesses of the LEDs in your monitor. (Unless you have printed it out, or photographed it, or transcribed it by hand.) What is interesting is the glyphs and their interpretation, not their representation.
            g                    h                 i                 j                 k
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|             Operation code              | Destination reg |  Source reg #1  |  Source reg #2  |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
(The opcode is broken into two sections: the ‘g’ section, which comprises the first 4 bits, and the ‘h’, which gets the following 3. I don’t know why.)
Though this form seems fairly simple to write by hand, there are special cases aplenty that justify the assembler (as well as more complicated instructions that go beyond basic arithmetic). For instance, the zero register has special meaning to many instructions. As mentioned earlier, 030514 represents something similar to A5 ← A1+A4. The leading 7 bits are 030, which is the opcode for ‘add’; 5 is the destination register; and 1 and 4 are the source registers. All this is fairly intuitive. But consider something like 030510; you might expect that to mean A5 ← A1+A0, but it in fact means A5 ← A1+1. (Not because A0 is some kind of ‘one register’, like ARM’s xzr or RISC-V’s x0; it’s just a special case for that instruction. 030501, for instance, means A5 ← A1.) An assembler lets you write the latter mnemonic form and get the correct result.
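Those special cases can be written down as a tiny decoder. This Python sketch covers only what the paragraph above states (the plain add, the ‘+1’ case when the last field is 0, and the plain copy when the middle field is 0); the function name describe_add is hypothetical, and the case where both source fields are 0 is left unhandled because the text does not say what it means.

```python
def describe_add(octal_word: str) -> str:
    """Describe a six-digit octal word with opcode 030, per the cases given above."""
    word = int(octal_word, 8)
    opcode = word >> 9
    dest, src1, src2 = (word >> 6) & 7, (word >> 3) & 7, word & 7
    if opcode != 0o030:
        raise ValueError(f"opcode {opcode:03o} is not the add described here")
    if src1 == 0 and src2 == 0:
        raise ValueError("both source fields are 0; the text does not cover this case")
    if src2 == 0:
        return f"A{dest} ← A{src1}+1"   # special case: last field 0 means 'plus one'
    if src1 == 0:
        return f"A{dest} ← A{src2}"     # special case: middle field 0 means plain copy
    return f"A{dest} ← A{src1}+A{src2}"


for word in ("030514", "030510", "030501"):
    print(word, "means", describe_add(word))
```

Three near-identical numbers, three different shapes of operation: exactly the sort of thing a human should not have to keep in their head.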