Hacker News | shikaan's comments

Shameless plug. I built a tool[1] to manage KeePass archives in the terminal which might scratch some of the itches I am reading about here: it has a TUI, but can be piped into other commands too.

[1]: https://github.com/shikaan/keydex


Something similar, but you can play with the examples in the browser without any local setup https://shikaan.github.io/assembly/x86/guide/2024/09/08/x86-...

For full disclosure, I am the author - apologies for the shameless plug


It's cool. Do you sanitize the untrusted input? As far as I can see, it directly assembles with NASM and runs the binary.


It might be similar to Matt Godbolt's experience with his "Compiler Explorer". Most of your users are not trying to set fire to the free system, and when somebody does, on purpose or by accident, you focus on being able to reliably recover, not prevent it. So e.g. maybe Clara T Vandal "cleverly" seizes control of a random Compiler Explorer build box, well, that box is no longer marked OK because of her changes, it gets automatically torn down and replaced, no real problem. Did Clara do 0.001¢ of Bitcoin creation without paying for it? Yeah, maybe, and Clara probably cost Matt 0.1 cents for the data centre fees but it's not a big deal.


Looking at the source code of the code-editor [1], it seems to be embedding https://onecompiler.com via the iframe and delegating code compilation and execution to it. So I guess it's a question to onecompiler, whether they sanitize input or not. :)

[1]: https://github.com/shikaan/shikaan.github.io/blob/main/_incl...


Exactly this.

I have been planning on trying to glue up something with v86[1] as I did in OSle[2] but I did not get to it yet.

In that case, everything would run locally and sandboxed, so you would not have to care.

[1]: https://github.com/copy/v86

[2]: https://github.com/shikaan/osle


Thanks, I'll be using this.

What I don't understand is why assembly feels so hard to learn in the first place?

I mean, isn't it just a simple language with a few function calls (instructions) and types (operand sizes) and fixed number of variables (registers) and a small number of control flow operators, and that's it? Why does it feel so mysterious?
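
To make that mental model concrete, here is a toy sketch in Python. It is not real x86: the four registers, the instruction names, and the `run` helper are all made up for illustration. It only shows how far a few instructions, a fixed set of registers, and one conditional jump can go.

```python
def run(program, max_steps=10_000):
    """Interpret a toy instruction list; every name here is hypothetical."""
    regs = {"a": 0, "b": 0, "c": 0, "d": 0}  # the fixed set of "variables"
    pc = 0                                    # program counter
    steps = 0
    while pc < len(program):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step limit exceeded")
        op, *args = program[pc]
        pc += 1
        if op == "mov":      # mov reg, immediate
            regs[args[0]] = args[1]
        elif op == "add":    # add dst, src  (dst += src)
            regs[args[0]] += regs[args[1]]
        elif op == "dec":    # dec reg
            regs[args[0]] -= 1
        elif op == "jnz":    # jump to instruction index if reg != 0
            if regs[args[0]] != 0:
                pc = args[1]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


# Sum 5 + 4 + 3 + 2 + 1 with a loop built from a single conditional jump.
program = [
    ("mov", "a", 0),    # accumulator
    ("mov", "b", 5),    # counter
    ("add", "a", "b"),  # index 2: loop body
    ("dec", "b"),
    ("jnz", "b", 2),    # back to the loop body while b != 0
]
result = run(program)
```

Even in this toy, the loop is spelled as "decrement, then jump back if not zero", and the programmer has to remember which register holds what. That bookkeeping is absent from most higher-level code.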


Simple languages are not necessarily easy languages to understand. See: brainfuck, APL, K, etc.

I think the problem with assemblers in particular is that the canonical definition is the byte code, not the human readable text. x86 is particularly annoying because no one agrees on the syntax of that text, there are hundreds of mnemonics, things are constantly being updated, and the practicing assembly programmer cares deeply about the execution semantics of the microarchitecture more than the specific sequence of instructions. Some of the language is also completely foreign to higher level programmers, like instruction pipelines, uops, instruction latency, and so on.

Rarely do you sit down and write this assembler by hand, you compile some C code and poke at it with vtune/uprof to measure hot sections of the code, break those down, and implement faster versions. It's fundamentally an iterative, experimental process.


I can think of several reasons:

1. You are primed to think that it is mysterious because that’s all you usually hear about assembly. (“Roller Coaster Tycoon was written in 100% hand crafted assembly… what an absolute wizard!!”)

2. The language’s textual format is odd - columns vs nested indentation. Actually really nice once you get used to it, but it’s definitely alien at first.

3. Mnemonics and directives have short, cryptic spellings. x86 in particular has arbitrary looking register names as well. RV, AArch64, m68k etc do better here.

4. Mnemonics are inconsistently overloaded and encode lots of stuff. SIMD instructions tend to look like a cat sat on your keyboard.

5. Manually laying out memory is technically simpler than the abstractions provided by higher level languages (structs and classes, fancy generic types, pointer syntax), but it’s fiddly and you have to deal with alignment.

6. You have to do a lot of bookkeeping yourself. It’s like malloc/free turned to 11.

7. Register allocation is a hard problem for computers. It’s kinda tough for humans, too.

8. Lots of books and online stuff discuss assembly for use with high performance code, tight compute kernels, raw hardware access, and fiddly CPU configuration for OS startup and virtual memory configuration. This requires even more specialized registers, arcane instructions, and bit fiddling. This stuff - along with reverse engineering and security research/attacks - gets lumped into what people think of as “assembly language”. The resulting concept surface therefore looks much larger than it actually is.

I highly recommend making a non-trivial program entirely in assembly at least once. I need to do it occasionally professionally but even when I don’t I usually have a hobby project or two cooking at home.

Becoming as proficient in asm as - say - C or Python is quite the lovely expression of craft. You feel like a wizard (see point 1) while simultaneously learning what’s really going on.

For people with a certain geeky disposition it pays lots of aesthetic, psychological, and professional dividends.


It is simple until you need something complex - working on a team, call stack management, memory management (allocate, track, free), working on a team, event handling / non-synchronous interrupts, and working on a team.

For a little 8-bit microprocessor with program size < 8k it can be quite easy and even a joy. Anything else and your compiler will outperform you, better to inline hand-coded assembler as needed.


x86 assembly has many addressing schemes and a significant amount of historical baggage, even from its inception.

It's up to the compiler/programmer to handle calling conventions.

Modern programmers also don't regularly encounter "unstructured programming" in higher-level languages these days.

All of these, and more, make it feel overwhelming, as anything they examine through their disassembler will contain all of these elements.


Honestly, I made it up :)

I thought about what would be the minimum I have to build in order to run some userland software that does "something". That to me looked like: spawn guest applications, make them persist something.

With slightly more leeway, I would probably do memory management as the next thing (besides what I mentioned in another thread here)


I would have assumed the same, but I haven't managed. On the other hand, I did not tinker too much with all these toggles; it's such a small amount of shared code (which is also partially different in some cases) that it didn't particularly make sense to me.

If you know how to make it happen and/or want to contribute, hit me up (:


1. If you look through the commit history, you'll see that the first implementation was actually with Pascal strings.

Printing with Pascal strings is actually shorter (you skip the null test, basically), but constructing Pascal strings to pass as an argument when they are not constants yielded much more code to prepare for that call. Had I had more leeway, I would have used Pascal strings; it's much less of a headache.

2. Files in `/bin` all include from the SDK. You can pretty much do the same for utility functions.

The includes, at least in nasm, are very much like copy-pasted code (or includes in C for that matter), and then you can just jump/call to the label.

I did not do it because I haven't been able to get nasm to optimize away the code that I don't use, and I didn't want to bloat the binaries or make a file for a 5LOC function.

All in all not good reasons in general, but it made sense to me in this context.
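
A rough illustration of the trade-off in point 1, in Python rather than assembly. The helper names are made up, and the single length byte is an assumption (the classic Pascal layout); none of this is OSle's actual code.

```python
def to_c_string(text: str) -> bytes:
    """Null-terminated: the payload followed by a 0 byte."""
    return text.encode("ascii") + b"\x00"


def to_pascal_string(text: str) -> bytes:
    """Length-prefixed: one length byte, then the payload (at most 255 bytes)."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("too long for a one-byte length prefix")
    return bytes([len(data)]) + data


def read_c_string(buf: bytes) -> str:
    """The reader has to test every byte against the terminator."""
    out = []
    for byte in buf:
        if byte == 0:
            break
        out.append(chr(byte))
    return "".join(out)


def read_pascal_string(buf: bytes) -> str:
    """The length is known up front, so reading is a plain counted copy."""
    length = buf[0]
    return buf[1:1 + length].decode("ascii")
```

The reader side is simpler for the length-prefixed form, which matches the "skip the null test" remark. The cost shows up on the producer side: the length has to be known before the prefix can be written, while a null-terminated string can be built by appending bytes and terminating at the end.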


Thanks for answering my questions. Your project is really really interesting.

Two more questions if you find some spare time:

3. Why does it use tty for interrupts instead of directly calling int 10?

4. How does this even print to the screen or use a tty in the first place? Is it just something inherent in bios api?


Hey, thanks for your interest in this project!

3. The tty interrupt advances the cursor along with printing. So, once again, I do it to save on some instructions. In the first iterations I wanted to retain more control (by printing and moving as separate operations) so that I could reuse this across the board, but eventually I ran out of space.

4. I am relying heavily on BIOS interrupts, which are criminally underdocumented. The most reliable source is Ralf Brown's Interrupt List[1], which is very far from what I was expecting authoritative documentation to look like. Turns out this collection is really good and basically _the_ source of truth for BIOS interrupts.

To answer your question, yes, this is basically calling the BIOS API.

[1]: https://wiki.osdev.org/Ralf_Brown's_Interrupt_List


THIS is the bible for BIOS APIs:

https://bitsavers.trailing-edge.com/pdf/ibm/pc/ps2/PS2_and_P...

Complete with reference assembler source code.


Oh boy, this is amazing! Thanks for the reference


Hey, thanks for taking a look!

On the former, I have no idea how to estimate BIOS functions size. Maybe I could just peek into an image and get a sense for it...

On the latter, with a 16x increase in available space, I guess I would do a much more thorough job of putting guardrails in place.

The API currently comes with a couple of traps (e.g., file names can be duplicated, processes are cooperative, all file operations perform disk I/O...) and it essentially requires guest applications to know about BIOS services in order to function.

Another sticky point I wish I had the space to address better is calling conventions, which I had to get rid of almost immediately to save on instructions.

> Thanks for pointing me towards the bosh emulator.

You're welcome! Bochs is such a nice tool, which I also discovered only for this project. It was a no-brainer, since I had no way to debug 16-bit assembly from QEMU (unless you go off and fork it[1])

[1]: https://gist.github.com/Theldus/4e1efc07ec13fb84fa10c2f3d054...


I went back and forth about the file system and disk stuff a fair bunch, to be honest. Most of it, as you say, was mostly due to wrestling the space constraints.

If one day I give in and take the shell out or go multi-stage, I will definitely look at that.

Maybe it's worth blogging about the journey; it's been a few weeks of merciless trade-offs to reach a usable API. It can make for a fun read (:

Thanks for taking a look!


haha, well all the best! it's a cool project. i am happy i could forget about BIOS and go UEFI haha. i remember so many tedious nights trying to get an mbr to load an elf file and init x64 mode in one go :'). uefi (edk2) is a blessing if you come from BIOS land (tho maybe less fun in a way!)


The -le suffix is used in south of Germany for the small version of something. So OSle stands for small OS.

I'm not a native speaker, so maybe somebody else can paint a better picture. I used it just because part of my extended family comes from there (:

EDIT: s/prefix/suffix/


I live in Alsace, which is in France but has a German-like dialect (Alemannic)

https://en.m.wikipedia.org/wiki/Alsatian_dialect

-ele is used a lot to denote something small, cute, adorable; maybe think of it as kind of like ちび (chibi) or -ちゃん (-chan) in Japanese.

Mann (man) => Mannele https://cookingwithbrendagantt.net/mannele-st-nicholas-bread...

Katz (cat) => katzele (kitty)

The suffix can be liberally (ab)used with any - native or foreign - word or (sur)name to emphatic or comedic effect.

Here I kinda guessed the -le use was such but around here I would have said "OSele" (oh-ess-uh-luh)


Similar in English, the ie suffix is used to create a diminutive. Sweet -> sweetie. You can make cute cuter by saying cutie.


as seen also in Spätzle, Müsli, or, to pick something more relevant on HN, the words Brötli (or Zöpfli)

-li is a different version of the same ending


*suffix.

A prefix goes before something.


Indeed. Thanks for the correction; I edited the original message


Hey all,

As a follow-up to my relatively successful series on x86 Assembly from last year[1], I started making an OS that fits in a boot sector. I am purposefully not doing chain loading or multi-stage to see how much I can squeeze out of 510 bytes.
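
For anyone wondering where the 510 comes from: a legacy BIOS boot sector is 512 bytes, and the last two must be the signature 0x55 0xAA, which leaves 510 for code and data. Below is a minimal Python sketch of that layout; `make_boot_sector` is a hypothetical helper for illustration and not part of OSle's build.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # must sit at offsets 510 and 511
PAYLOAD_LIMIT = SECTOR_SIZE - len(BOOT_SIGNATURE)  # 510 bytes for code and data


def make_boot_sector(payload: bytes) -> bytes:
    """Pad the payload with zeros up to 510 bytes and append the signature."""
    if len(payload) > PAYLOAD_LIMIT:
        raise ValueError(
            f"payload is {len(payload)} bytes, limit is {PAYLOAD_LIMIT}"
        )
    padding = bytes(PAYLOAD_LIMIT - len(payload))
    return payload + padding + BOOT_SIGNATURE


# The smallest useful payload: EB FE encodes a short jump to itself,
# i.e. an infinite loop.
sector = make_boot_sector(b"\xeb\xfe")
```

Everything the OS does (file system, shell, process management) has to fit in what this sketch calls `payload`.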

It comes with a file system, a shell, and simple process management. Enough to write non-trivial guest applications, like a text editor and even some games. It's a lot of fun!

It comes with an SDK and you can play around with it in the browser to see what it looks like.

The aim is, as always, to make Assembly less scary and this time around also OS development.

[1] https://news.ycombinator.com/item?id=41571971


As a follow-up to my relatively successful series on x86 Assembly from last year[1], I started making an OS that fits in a bootloader.

I am purposefully not doing chain loading or multi-stage to see how much I can squeeze out of 510 bytes.

It comes with a file system, a shell, and simple process management. Enough to write non-trivial guest applications, like a text editor. It's a lot of fun!

Not quite done with it yet, but you can see the progress here https://github.com/shikaan/OSle and even test it out in the browser https://shikaan.github.io/OSle/

[1] https://shikaan.github.io/assembly/x86/guide/2024/09/08/x86-...

