
skitter · 2025-11-25T02:09:05 1764036545

I do the same in my toy JVM (to implement the reentrant mutex+condition variable that every Java object has), except I've got a rare deadlock somewhere because, as it turns out, writing complicated low level concurrency primitives is kinda hard :p

skitter · 2025-09-10T13:45:46 1757511946

They're great, compared to cars. But while they have a relatively fast and cheap setup, over the long term light rail and trams are a lot cheaper to run and can coexist with foot & bike traffic more easily since the rails make them very predictable.

mschuster91 · 2025-09-10T14:13:17 1757513597

I'd raise serious concerns about the "coexist with bike traffic" thing though. Tram rails are a massive danger if you're running anything smaller than these "fatbike" wheels and have to cross them for whatever reason.

skitter · 2025-09-05T08:59:13 1757062753

> for the sole purpose of not having to add them to function signatures all over the place.

I thought it was because you couldn't be fully generic over exceptions.

skitter · 2025-08-22T12:43:25 1755866605

The author obviously knows that too, otherwise they wouldn't have written about it. All of these issues are just how the language works, and that's the problem.

skitter · 2025-07-24T14:02:45 1753365765

I do not see evidence that they are against violence against women and girls, only that they are claiming to be.

skitter · 2025-06-03T13:05:13 1748955913

Fun post! An alternative to using futexes to store thread queues in kernel space is to store them yourself. E.g. the parking_lot[0] Rust crate, inspired by WebKit[1], uses only one byte to store the unlocked/locked/locked_contended state, and under contention uses the address of the byte to index into a global open-addressing hash table of thread queues. You look up the object's entry, lock said entry, add the thread to the queue, unlock it, and go to sleep. Because you know that there is at most one entry per thread, you can keep the load factor very low in order to keep the mutex fast and form the thread queue out of a linked list of thread-locals. Leaking the old hash table on resizing helps make resizing safe.

As a result, uncontended locks work the same as described in the blog post above; under contention, performance is similar to a futex too. But now your locks are only one byte in size, regardless of platform – while Windows allows 1-byte futexes, they're always 4 bytes on Linux and iirc Darwin doesn't quite have an equivalent api (but I might be wrong there). You also have more control over parked threads if you want to implement different fairness criteria, reliable timeouts or parking callbacks.

One drawback of this is that you can only easily use this within one process, while at least on Linux futexes can be shared between processes.
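To make the scheme above concrete, here is a minimal sketch using only the standard library. It is not parking_lot's actual implementation: `ByteLock`, `park`, `unpark_one`, `TABLE` and the fixed `BUCKETS` count are hypothetical names, the table never resizes, and each bucket is a `Mutex<VecDeque<_>>` rather than an intrusive list of thread-locals. The table lives in ordinary process memory, which is why the approach only works within one process.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicBool, AtomicU8, Ordering};
use std::sync::{Arc, Mutex};
use std::thread::{self, Thread};

// Hypothetical fixed bucket count; a real table would resize with the thread count.
const BUCKETS: usize = 64;

struct Waiter {
    key: usize,             // address of the lock byte this thread waits on
    thread: Thread,         // handle used to wake the sleeper
    woken: Arc<AtomicBool>, // distinguishes a real wakeup from a spurious one
}

type Queue = Mutex<VecDeque<Waiter>>;
const EMPTY: Queue = Mutex::new(VecDeque::new());
// Global table of thread queues, indexed by lock address.
static TABLE: [Queue; BUCKETS] = [EMPTY; BUCKETS];

/// Enqueue the current thread under `key` and sleep, but only if `validate`
/// still holds once the bucket is locked (this closes the lost-wakeup race).
fn park(key: usize, validate: impl FnOnce() -> bool) {
    let woken = Arc::new(AtomicBool::new(false));
    {
        let mut queue = TABLE[key % BUCKETS].lock().unwrap();
        if !validate() {
            return;
        }
        queue.push_back(Waiter { key, thread: thread::current(), woken: Arc::clone(&woken) });
    } // bucket is unlocked here, before going to sleep
    while !woken.load(Ordering::Acquire) {
        thread::park();
    }
}

/// Wake the longest-waiting thread parked on `key`, if any.
fn unpark_one(key: usize) {
    let waiter = {
        let mut queue = TABLE[key % BUCKETS].lock().unwrap();
        let index = queue.iter().position(|w| w.key == key);
        let removed = index.and_then(|i| queue.remove(i));
        removed
    };
    if let Some(w) = waiter {
        w.woken.store(true, Ordering::Release);
        w.thread.unpark();
    }
}

const UNLOCKED: u8 = 0;
const LOCKED: u8 = 1;
const CONTENDED: u8 = 2;

/// One-byte lock: all queueing state lives in TABLE, not in the lock itself.
pub struct ByteLock {
    state: AtomicU8,
}

impl ByteLock {
    pub const fn new() -> Self {
        ByteLock { state: AtomicU8::new(UNLOCKED) }
    }

    fn key(&self) -> usize {
        &self.state as *const AtomicU8 as usize
    }

    pub fn lock(&self) {
        // Fast path: an uncontended acquire is a single compare-and-swap.
        if self.state.compare_exchange(UNLOCKED, LOCKED, Ordering::Acquire, Ordering::Relaxed).is_ok() {
            return;
        }
        // Slow path: mark the lock contended and park until it is released.
        while self.state.swap(CONTENDED, Ordering::Acquire) != UNLOCKED {
            park(self.key(), || self.state.load(Ordering::Relaxed) == CONTENDED);
        }
    }

    pub fn unlock(&self) {
        // Only touch the table if some thread may be parked.
        if self.state.swap(UNLOCKED, Ordering::Release) == CONTENDED {
            unpark_one(self.key());
        }
    }
}
```

Here `park` and `unpark_one` play the roles that futex wait and wake play in the kernel-backed version: the `validate` closure runs while the bucket is locked, so an unlocker that has already reset the byte cannot slip past a thread that is about to sleep.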

I've written a blog post[2] about using futexes to implement monitors (reëntrant mutexes with an associated condvar) in a compact way for my toy Java Virtual Machine, though I've since switched to a parking-lot-like approach.

[0]: https://github.com/amanieu/parking_lot [1]: https://webkit.org/blog/6161/locking-in-webkit [2]: https://specificprotagonist.net/jvm-futex.html

jcranmer · 2025-06-03T13:37:57 1748957877

> But now your locks are only one byte in size,

That's not a very useful property, though. Because inter-core memory works on cache-line granularities, packing more than one lock in a cache line is a Bad Idea™. Potentially it allows you to pack more data being protected by a lock with that data... but alignment rules mean that you're going to invariably end up spending 4 or 8 bytes (via a regular integer or a pointer) on that lock anyways.

vlovich123 · 2025-06-03T15:51:18 1748965878

In Rust the compiler will auto-pack everything so your 1 byte mutex would be placed after any multibyte data to avoid padding.

scottlamb · 2025-06-03T18:26:07 1748975167

That's typically not true due to the `Mutex<T>` design: the `T` gets padded to its alignment, then placed into the `struct Mutex` along with the signaling byte, and that struct is padded again before being put into the outer struct.

You can avoid this with a `parking_lot::Mutex<()>` or `parking_lot::RawMutex` guarding other contents, but then you need to use `unsafe` because the borrow checker doesn't understand what you're doing.

I coincidentally was discussing this elsewhere recently: https://www.reddit.com/r/rust/comments/1ky5gva/comment/mv3kp...
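The padding effect described in this comment can be checked with `size_of`. The sketch below uses a hypothetical `ByteMutex<T>` (a state byte plus the protected value) as a stand-in for a one-byte-state mutex; it is not parking_lot's actual type, and the sizes mentioned afterwards assume a target where `u32` has 4-byte alignment.

```rust
use std::cell::UnsafeCell;
use std::mem::size_of;
use std::sync::atomic::AtomicU8;

// Hypothetical mutex shaped like `Mutex<T>`: one state byte plus the value.
pub struct ByteMutex<T> {
    pub state: AtomicU8,
    pub value: UnsafeCell<T>,
}

// Nested: `ByteMutex<u32>` is padded to its own alignment first, and the
// outer struct cannot reuse that padding for `flag`.
pub struct Nested {
    pub counter: ByteMutex<u32>,
    pub flag: u8,
}

// Flat: the state byte and `flag` sit side by side after the `u32`.
pub struct Flat {
    pub counter: UnsafeCell<u32>,
    pub state: AtomicU8,
    pub flag: u8,
}

/// Sizes of (inner mutex, nested layout, flat layout) in bytes.
pub fn sizes() -> (usize, usize, usize) {
    (size_of::<ByteMutex<u32>>(), size_of::<Nested>(), size_of::<Flat>())
}
```

On common targets `sizes()` should come out as 8, 12 and 8 bytes: the nested form pays for padding twice, while the flat form fits in 8 bytes but gives up the type-level link between the lock and the data it guards.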

zozbot234 · 2025-06-03T17:01:00 1748970060

You could use CAS loops throughout to make your locks "less than one byte" in size, i.e. one byte, or perhaps one machine word, but using the free bits in that byte/word to store arbitrary data. (This is because a CAS loop can implement any read-modify-write operation on atomically sized data. But CAS will be somewhat slower than special-cased hardware atomics, so this is a bad idea for locks that are performance-sensitive.)
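As a rough illustration of the CAS-loop idea, the sketch below keeps a lock bit and 7 bits of arbitrary payload in a single `AtomicU8`. `PackedLock` and its methods are hypothetical names, and the lock is a bare try-lock with no waiting or fairness.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

const LOCK_BIT: u8 = 0b0000_0001;

/// Bit 0 is the lock; bits 1..=7 hold a 7-bit payload.
pub struct PackedLock {
    state: AtomicU8,
}

impl PackedLock {
    pub const fn new(payload: u8) -> Self {
        PackedLock { state: AtomicU8::new((payload & 0x7F) << 1) }
    }

    /// Try to set the lock bit without disturbing the payload bits.
    pub fn try_lock(&self) -> bool {
        let mut current = self.state.load(Ordering::Relaxed);
        loop {
            if current & LOCK_BIT != 0 {
                return false; // already held
            }
            match self.state.compare_exchange_weak(
                current,
                current | LOCK_BIT,
                Ordering::Acquire,
                Ordering::Relaxed,
            ) {
                Ok(_) => return true,
                Err(actual) => current = actual, // payload changed or spurious failure: retry
            }
        }
    }

    /// Clearing one bit is a plain hardware read-modify-write, no loop needed.
    pub fn unlock(&self) {
        self.state.fetch_and(!LOCK_BIT, Ordering::Release);
    }

    pub fn payload(&self) -> u8 {
        self.state.load(Ordering::Relaxed) >> 1
    }

    /// Replace the payload while preserving whatever the lock bit is right now.
    pub fn set_payload(&self, payload: u8) {
        let mut current = self.state.load(Ordering::Relaxed);
        loop {
            let next = ((payload & 0x7F) << 1) | (current & LOCK_BIT);
            match self.state.compare_exchange_weak(current, next, Ordering::Relaxed, Ordering::Relaxed) {
                Ok(_) => return,
                Err(actual) => current = actual,
            }
        }
    }
}
```

`set_payload` is the kind of operation that has no dedicated atomic instruction, which is where the CAS loop is needed; `unlock` shows the cheaper special-cased path.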

gpderetta · 2025-06-03T17:09:14 1748970554

Single bit spin locks to protect things like linked list nodes are not unheard of.

gpderetta · 2025-06-03T14:08:16 1748959696

Enough to be able to pack a mutex and a pointer together for example. If you are carefully packing your structs a one byte mutex is great.

skitter · 2025-06-03T14:21:07 1748960467

Yup, that's what I'm doing - storing the two bits needed for an object's monitor in the same word as its compressed class pointer. The pointer doesn't change over the lock's lifetime.

mandarax8 · 2025-06-03T20:05:56 1748981156

But you can embed this 1 byte lock into other bigger objects (eg. high bytes of a pointer).

With 4 byte locks you run into the exact same false sharing issues.

gmokki · 2025-06-03T16:56:50 1748969810

Doesn't the futex2 syscall allow 1 byte futexes on recent kernels?

Double checks. Nope. The API is there and the patch to implement them has been posted multiple times: https://lore.kernel.org/lkml/20241025093944.707639534@infrad...

But the small futex2 patch will not go forward until some users say they want/need the feature.

skitter · 2025-03-29T18:41:03 1743273663

If you're interested in how the mountains and rivers are generated, it's mostly based on the paper "Large Scale Terrain Generation from Tectonic Uplift and Fluvial Erosion": Each chunk rises (at a noise-based, constant rate) while erosion is applied based on the chunk's slope and the size of its catchment area.
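For a feel of the uplift-versus-erosion balance, here is a hedged sketch of one explicit time step in the stream-power form that, as I understand it, the paper builds on: height change = uplift − k · area^m · slope^n. The `Erosion` struct, the parameter names and the explicit Euler step are illustrative assumptions, not the project's actual code.

```rust
/// Hypothetical erosion parameters for the stream-power form.
pub struct Erosion {
    pub k: f64, // erodibility constant
    pub m: f64, // exponent on catchment (drainage) area
    pub n: f64, // exponent on slope
}

/// Advance one chunk's height by `dt`: uplift raises it, erosion lowers it
/// in proportion to catchment area and slope.
pub fn step_height(
    height: f64,
    uplift: f64,
    catchment_area: f64,
    slope: f64,
    params: &Erosion,
    dt: f64,
) -> f64 {
    let erosion = params.k * catchment_area.powf(params.m) * slope.powf(params.n);
    height + dt * (uplift - erosion)
}
```

A flat chunk (slope 0) simply rises at the uplift rate, while a steep chunk with a large catchment erodes faster, which is what carves river valleys into the rising terrain.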

The result is a river network as well as the central height of each chunk; based on this roads, caves and structures are laid out. The actual voxels are only determined when a player loads the area and are (usually) not persisted.

Also, for some technologies not related to worldgen: Rendering is done via wgpu, models are built in MagicaVoxel, and both client and server use an ECS (specs).

skitter · 2025-03-29T16:59:15 1743267555

They're rather different: In Rust types only exist at compile time; dyn Any is a normal trait object, so you can only call the trait's methods. With C#'s dynamic, you can call arbitrary methods and access any fields with type checking of those accesses being delayed until runtime, which works because types exist at runtime too.

Rust's dyn Any corresponds better to C#'s Object; dynamic exists to interface with dynamic languages and is rarely used.
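A small sketch of what that means on the Rust side: with `dyn Any` the only way to reach the underlying value is to name a concrete type and downcast to it. `describe` is a hypothetical helper, not part of any library.

```rust
use std::any::Any;

/// Inspect a type-erased value by trying the concrete types we know about.
pub fn describe(value: &dyn Any) -> String {
    if let Some(n) = value.downcast_ref::<i32>() {
        format!("i32: {n}")
    } else if let Some(s) = value.downcast_ref::<String>() {
        format!("String: {s}")
    } else {
        // No way to call methods or read fields of a type we cannot name.
        "unknown type".to_string()
    }
}
```

Calling `describe(&5i32)` takes the first branch, while a value such as `1.5f64` falls through to the last one; there is no equivalent of C#'s `dynamic` member lookup.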

skitter · 2025-03-27T07:56:14 1743062174

Why limit yourself to one language, when you can have 23 of them in the same Datei?

https://github.com/charyan/unirust

andai · 2025-03-27T08:56:20 1743065780

They chose the right license!

skitter · 2025-02-10T18:28:56 1739212136

…what? Why bring "AI" into this?

notagoodidea · 2025-02-10T19:42:48 1739216568

I guess this is due to the tag line of the company. I am not familiar with the compiler/LLVM space so unsure how the different branches (compiler maintenance and AI tool infrastructure for example) are covered by the PhD internships, etc.

pdimitar · 2025-02-11T04:17:57 1739247477

Oops, I severely misread LLVM as LLM!