by vgel on 9/4/23, 7:17 PM with 165 comments
by brundolf on 9/4/23, 7:45 PM
IIRC, C was specifically designed to allow single-pass compilation, right? I.e. in many languages you don't know what needs to be output without parsing the full AST, but in C, syntax directly implies semantics. I think I remember hearing this was because early computers couldn't necessarily fit the AST for an entire source file in memory at once.
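To make the "no AST" idea concrete, here is a minimal sketch of single-pass code generation, assuming a toy grammar of integer arithmetic only. The function names (`compile_expr`, `factor`, `term`, `expr`) are hypothetical and this is not the compiler from the article; it only illustrates emitting each instruction the moment the parser recognizes a construct, so no tree is ever held in memory.

```python
import re


def compile_expr(source):
    """Compile an arithmetic expression to stack-machine code in one pass."""
    tokens = re.findall(r"\d+|[-+*/()]", source)
    pos = 0
    out = []  # instructions are appended as soon as they are known

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok == "(":
            expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
        elif tok.isdigit():
            out.append(f"i32.const {tok}")  # emitted immediately, no tree node
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        factor()
        while peek() in ("*", "/"):
            op = advance()
            factor()
            out.append("i32.mul" if op == "*" else "i32.div_s")

    def expr():
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            out.append("i32.add" if op == "+" else "i32.sub")

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return out


if __name__ == "__main__":
    print("\n".join(compile_expr("1 + 2 * 3")))
```

The operator is emitted after both operands have been emitted, which is why a stack machine suits this style: the parser never has to go back and patch earlier output.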
by mati365 on 9/4/23, 7:52 PM
by Joker_vD on 9/5/23, 11:53 AM
block
  ;; code for "i = 0"
  loop
    ;; code for "i < 5"
    i32.eqz
    br_if 1
    i32.const 1
    loop
      if
        ;; code for "i = i + 1"
        br 2
      else
      end
      ;; code for "j = j * 2 + 1"
      i32.const 0
    end
  end
end
It doesn't require cloning the lexer so probably would still fit in 500 lines? But yeah, in normal assembly it's way easier, even in one pass:
;; code for "i = 0"
.loop_test:
;; code for "i < 5"
jz .loop_end
jmp .loop_body
.loop_incr:
;; code for "i = i + 1"
jmp .loop_test
.loop_body:
;; code for "j = j * 2 + 1"
jmp .loop_incr
.loop_end:
Of course, normally you'd want to re-arrange things like so:
;; code for "i = 0"
jmp .loop_test
.loop_body:
;; code for "j = j * 2 + 1"
.loop_incr:
;; code for "i = i + 1"
.loop_test:
;; code for "i < 5"
jnz .loop_body
.loop_end:
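As a rough sketch of the difference between the two layouts above, the helpers below build each one from four lists of already-generated instructions. The function names are hypothetical, and the label and jump mnemonics are simply copied from the listings. The first layout emits the pieces in source order (init, test, increment, body), which is all a one-pass compiler can do; the second needs the test and increment code held back until the body has been emitted.

```python
def emit_for_one_pass(init, cond, incr, body):
    """Emit pieces in source order, using jumps to fix the execution order."""
    return (
        init
        + [".loop_test:"] + cond + ["jz .loop_end", "jmp .loop_body"]
        + [".loop_incr:"] + incr + ["jmp .loop_test"]
        + [".loop_body:"] + body + ["jmp .loop_incr"]
        + [".loop_end:"]
    )


def emit_for_rearranged(init, cond, incr, body):
    """Emit the rearranged layout; cond and incr are placed after body."""
    return (
        init + ["jmp .loop_test"]
        + [".loop_body:"] + body
        + [".loop_incr:"] + incr
        + [".loop_test:"] + cond + ["jnz .loop_body"]
        + [".loop_end:"]
    )


def count_jumps(code):
    """Count jump instructions (lines starting with 'j') in a listing."""
    return sum(1 for line in code if line.startswith("j"))


if __name__ == "__main__":
    parts = (
        [';; code for "i = 0"'],
        [';; code for "i < 5"'],
        [';; code for "i = i + 1"'],
        [';; code for "j = j * 2 + 1"'],
    )
    for emit in (emit_for_one_pass, emit_for_rearranged):
        code = emit(*parts)
        print(emit.__name__, "uses", count_jumps(code), "jump instructions")
        print("\n".join(code))
```

The rearranged form contains fewer jump instructions, but it only works if the compiler can buffer or re-read the test and increment, which is exactly what a strict one-pass design avoids.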
I propose a better loop syntax for languages with one-pass implementations, then: "for (i = 0) { j = j * 2 + 1; } (i = i + 1; i < 5);" :)
by tptacek on 9/4/23, 7:39 PM
https://www.blackhat.com/presentations/win-usa-04/bh-win-04-...
(minus directly emitting opcodes, and fitting into 500 lines, of course.)
by ak_111 on 9/5/23, 8:41 AM
I am wondering if this complexity exists due to historical reasons. In other words, if you were to invent C today, would you just define int as always being 32 bits and long as 64 bits, and provide much saner, well-defined rules on how the various data types relate to each other, without losing anything of what makes C a popular low-level language?
by kaycebasques on 9/4/23, 7:52 PM
by WalterBright on 9/4/23, 9:30 PM
http://www.trs-80.org/tiny-pascal/
I figured out the basics of how a compiler works by going through it line by line.
by marcodiego on 9/4/23, 7:52 PM
I wonder if this is a good path to becoming an extremely productive developer. If someone spends time developing projects like this, but for different areas... A kernel, a compressor, a renderer, a multimedia/network stack, AI/ML... Will that turn a good dev into a 0.1 Bellard?
by jll29 on 9/4/23, 11:42 PM
- demystifies compilers, interpreters, linkers/loaders and related systems software, which you now understand. This understanding will no doubt one day help in your debugging efforts;
- elevates you to become a higher level developer: you are now a tool smith who can make their own language if needed (e.g. to create domain specific languages embedded in larger systems you architect).
So congratulations, on top of other forms of abstraction, you have mastered meta-linguistic abstraction (see the latter part of Structure and Interpretation of Computer Programs, preferably the 1st or 2nd ed.).
by mananaysiempre on 9/5/23, 9:59 AM
It takes too much code in Python. (Not a phrase one gets to say often, but it’s generally true for tree processing code.) In, say, SML this sort of thing is wonderfully concise.
by meitham on 9/5/23, 12:30 PM
by nn3 on 9/4/23, 9:07 PM
C4x86 | 0.6K (very close)
small C (x86) | 3.1K
Ritchie's earliest struct compiler | 2.3K
v7 Unix C compiler | 10.2K
chibicc | 8.4K
Biederman's romcc | 25.0K
by Shocka1 on 9/7/23, 3:01 PM
by rcarmo on 9/4/23, 9:49 PM
by aldousd666 on 9/5/23, 1:49 AM
by varispeed on 9/5/23, 10:31 AM
by jokoon on 9/5/23, 10:28 AM
by MrYellowP on 9/5/23, 9:24 AM
This is more of a transpiler than an actual compiler.
Am I missing something?
by teddyh on 9/4/23, 8:06 PM
> Notably, it doesn't support:
> structs :-( would be possible with more code, the fundamentals were there, I just couldn't squeeze it in
> enums / unions
> preprocessor directives (this would probably be 500 lines by itself...)
> floating point. would also be possible, the wasm_type stuff is in, again just couldn't squeeze it in
> 8 byte types (long/long long or double)
> some other small things like pre/post cremements, in-place initialization, etc., which just didn't quite fit
> any sort of standard library or i/o that isn't returning an integer from main()
> casting expressions
by fan_of_yoinked on 9/4/23, 7:39 PM
by moomin on 9/5/23, 8:13 AM
by hamilyon2 on 9/5/23, 6:43 AM
by rhabarba on 9/4/23, 7:31 PM
by ForOldHack on 9/5/23, 8:20 AM
by Jake_K on 9/5/23, 10:04 AM
by golemarms on 9/5/23, 2:39 AM