
Wonder if it would be better to auto-translate to broken Rust, i.e. forcing the user to fix the memory issues. I imagine that would lead to pretty big refactors in some cases though.


No. What comes out of C2Rust is awful; it reads like compiler output. Basically, they have a library of unsafe Rust functions that emulate C semantics. Put in C that crashes, get Rust that crashes in the same way. I tried that on a JPEG 2000 decoder.


Just noting that 4000 vCPUs usually means 2000 cores, 4000 threads


It doesn't mean that here. Epdsv6 is 1 core = 1 vCPU.


I stand corrected…


DoS is a performance problem: if your server were infinitely fast with infinite storage, it wouldn't be an issue.


> DoS is a performance problem

Not really. Running out of computational resources to fulfill requests is not a performance issue. Think of things such as exhausting a connection pool. More often than not, some components of a system can't scale horizontally.


It is actually a financial problem too. Servers stop working when the bill goes unpaid. Sad but true.


If my grandma had wheels, she would be a car.


This is fine assuming the popular request types don't change. But if both new versions of the matching are sufficiently fast, I would arguably prefer Ken's in the long term, as the other could become slow again if the distribution of request types shifts.


As a counterpoint: what fraction of the future engineers who touch the project will be able to competently edit the finite-automata-based version without introducing bugs, and what fraction will be able to competently edit the if statement that checks the particular policy?


A further question mark is whether any of this has sufficient instrumentation to be able to notice and act on a change if and when it occurs.


Nonsense. The pre-check can literally be one line (if common_case {fast_path()} else {slow_path()}), and thus enabling or disabling it is dead simple and obvious if the problem changes in the future.

Lines of thinking like that are part of the reason most modern software is so sloooow :)


This situation where two paths produce the same output but one is optimized is the easiest case in property-based testing, as the property is just:

  normal(x) == optimized(x)
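
For example, with the Hypothesis library in Python it's just a few lines (a minimal sketch; normal and optimized here are made-up stand-ins for the two implementations):

  from hypothesis import given, strategies as st

  def normal(xs):             # brute-force reference: obviously correct
      total = 0
      for x in xs:
          total += x
      return total

  def optimized(xs):          # the fast version being validated against it
      return sum(xs)

  @given(st.lists(st.integers()))
  def test_optimized_matches_normal(xs):
      assert optimized(xs) == normal(xs)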


I have sometimes done just this. First I write the simplest possible brute force code, something that any competent programmer can look at and say, "Yes, this may be slow, but it is straightforward and obvious that it will handle all cases."

Then I write the optimized code and use a test like yours to compare the results between the simple code and the optimized code.

One time I needed to write a function to search for a specific pattern in a binary file and change it. So I wrote the brute force code as a first step, the same code that anyone would probably write as a simple solution. It worked the first time, and a couple of people reviewed the code and said "yep, even if it's slow, it is correct."
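
The brute-force version of something like that might look roughly like this (a hypothetical sketch, not the actual code):

  def patch_pattern(data: bytes, pattern: bytes, replacement: bytes) -> bytes:
      # Scan every offset, compare byte-for-byte, splice in the replacement.
      for i in range(len(data) - len(pattern) + 1):
          if data[i:i + len(pattern)] == pattern:
              return data[:i] + replacement + data[i + len(pattern):]
      return data  # pattern not found; leave the data untouched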

But this code took more than a second to run!

Of course I thought about optimizing it with Boyer-Moore or the like. Then I went, "Hold on to your horses. This isn't something like a web page load where one second matters. It's part of a build process that only runs a few times a day and already takes several minutes to run. One extra second is nothing!"

In the wise words of Kenny Rogers in The Gambler:

  You got to know when to hold 'em,
  know when to fold 'em
  Know when to walk away
  and know when to run


Knuth's famous quote about premature optimization captures this perfectly.

"Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."


Also true for deciding whether to write code at all! About 15 years ago I was working with a junior who'd spent 3 hours trying to automate some data editing until I said "mate, you can just edit it all by hand in about an hour!"


I had a similar one where a billing mistake had happened in a system and the other devs were trying to figure out how to identify all of the bad values. I asked if they had simply looked at the top N that could fit on their screen, sorted by value. The incorrect values were the obvious outliers, only about 4 of them. They had been blocked on identifying them for about an hour.


Funnily enough, with LLMs this trade-off may well have flipped. For simple tasks like "given a string like X, format it like Y", they work amazingly well.


Absolutely. Even back then, trying to script it would sometimes be fine. I've had plenty of situations where I'd try to write a script, give it 30 minutes, then say "fuck it, I'm hand-fixing it." The same probably applies to LLMs: you just need to set a cut-off point before it stops being worthwhile.


Sure, but I have to obtain my dopamine somehow.


Very true. I would have gone with

  Every coder knows
  the secret to optimizin’
  is knowing what to throw away
  and knowing what to keep


Absolutely! Proptesting is a great but underutilised technique. I use it a lot whenever I'm in a language that makes it easy (and has a mature library for it), like Haskell, OCaml, or Rust.


The simple cases are great demos, and then the complex cases melt your brain! But still worth it a lot of the time.


Hyperoptimizing for the fast path today and ignoring that hardware and usage patterns change is the reason modern software is so slooow :)

A more robust strategy would at least be to check whether the rule is the same as the previous one (or to keep a small hash table) so that the system is self-healing.
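
Roughly, instead of a hard-coded range check, something like this hypothetical 1-entry cache (compile_rule standing in for the general matcher):

  def compile_rule(rule):
      # Stand-in for the general machinery (e.g. building an automaton).
      return lambda value: value in rule

  _last_rule, _last_matcher = None, None

  def match(rule, value):
      global _last_rule, _last_matcher
      if rule != _last_rule:  # miss: rebuild and remember the matcher
          _last_rule, _last_matcher = rule, compile_rule(rule)
      return _last_matcher(value)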

Ken's solution is at least robust, and by that property I would prefer it: it's just as fast but doesn't have any weird tail latencies, since requests outside your cached distribution are as fast as the ones inside it.


You were shown an example of exactly why this thinking is incorrect but you still insist...

Also, it's trivial to keep Ken's implementation as the slow path. If request patterns change, work out the new fast path and let the old one fall back to Ken's slow-path code. Most of the performance will still come from the initial `if`.


It's ungenerous to assume I would be against the if statement + Ken's. But Ken's approach is critically important, and the "if" statement should just be a 1-entry cache instead of being hard-coded. Getting this stuff right in a future-proof, durable way is actually quite hard, even when you notice the opportunity.


Nobody is hyperoptimizing the fast path today.

Ken's solution was stated to have been slower than the alternative optimization.


Ken's solution optimized the general case, basically everything that doesn't match the if-statement.


You can even track the request statistics live and disable the fast path if the distribution of requests changes significantly.
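
Something along these lines, for example (a hypothetical sketch; all the names are made up):

  MOST_COMMON_RANGE = range(0, 100)    # assumed common case

  def take_fast_path(value):           # stand-in for the optimized branch
      return value

  def use_kens_solution(value):        # stand-in for the general matcher
      return value

  stats = {"fast": 0, "total": 0}

  def handle(value):
      stats["total"] += 1
      hit_rate = stats["fast"] / stats["total"]
      # Keep the fast path enabled only while it still covers most requests.
      fast_path_on = stats["total"] < 1000 or hit_rate > 0.5
      if fast_path_on and value in MOST_COMMON_RANGE:
          stats["fast"] += 1
          return take_fast_path(value)
      return use_kens_solution(value)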


I think you missed the point. Ken's version wasn't removed; it was simply prepended with something like:

  if value in most_common_range:
    take_fast_path()
  else:
    use_kens_solution()


Could you give some examples of where you're using it?


My YAML loader[0] is where I first broke through the wall. It's still languishing in a relatively proof-of-concept state but does exhibit the basic design principles.

There's also a Metamath verifier that does parallel proof verification on the GPU. It's unpublished right now because the whole thing is still just handwritten code in my notebook. Hoping to get it out this month, actually.

A DOOM port is also bouncing around in my notes, as a way to explore asynchronous APL.

I'm also helping Aaron Hsu with his APL compiler[1] on stuff adjacent to my professional work, which I can't comment on much, unfortunately.

And all that sort of thing.

[0]:https://github.com/xelxebar/dayaml

[1]:https://github.com/Co-dfns/Co-dfns


A port of Doom in APL would be something to see. I keep meaning to get more proficient with the language, but it's hard to prioritize given how challenging it would be to use in my pretty conservative industry.


Web browsers could have 1/10th of the features, basically enough for markdown, forms, forums, displaying media, and minimal styling.


Yes, and that would be enough to make a useful internet. Unfortunately, out-of-control capitalistic tendencies push for maximum features with no real benefit at all.


> but do I really know without performing a benchmark?

Not really. But that's one of Rob Pike's rules [1]; I think the intention is to write whatever is simplest and optimize later. The programmer doesn't need to remember 100 rules about how memory is allocated in different situations.

[1] https://users.ece.utexas.edu/~adnan/pike.html


I mean, it's a great idea, and I fully agree that I do not want to worry about memory allocation. So then why is `make` a thing? And why is `new` a thing? And why can't I take the address of a primitive/literal, yet I can take the address of a struct initialization? And why can't I take the address of anything returned by a function?



I used generics once; they were kinda useful, but definitely avoidable. The only feature I could see myself using is something LINQ-esque for slices and maps. Otherwise I'm content.


I personally found the Polars API much clunkier, especially for rapid prototyping. I use it only for cemented processes where I could do with a speed-up or memory reduction.

Is there anything specific you prefer after moving from the pandas API to Polars?


Not OP, but the ability to natively express complex groupby logic is a huge plus for me at least.

Say you want to take an aggregation like "the mean of all values over the 75th percentile" alongside a few other aggregations. In pandas, this means jumping through a bunch of hoops because you can't express it via the API. Polars' API lets you express it directly without having to implement any kind of workaround.

Nice article on it here: https://labs.quansight.org/blog/dataframe-group-by
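
For example, that aggregation fits directly inside the group-by (a rough sketch with made-up column names, assuming a recent Polars version):

  import polars as pl

  df = pl.DataFrame({
      "group": ["a"] * 5 + ["b"] * 5,
      "value": [1.0, 2.0, 3.0, 4.0, 10.0, 5.0, 6.0, 7.0, 8.0, 100.0],
  })

  out = df.group_by("group").agg(
      pl.col("value").mean().alias("mean"),
      # mean of the values above each group's 75th percentile,
      # expressed directly alongside the other aggregation
      pl.col("value")
        .filter(pl.col("value") > pl.col("value").quantile(0.75))
        .mean()
        .alias("mean_above_p75"),
  )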


Markets are what you described. Participants that regularly beat the market are rewarded with more money (and confidence), which lets them bet larger size and have more impact on the market.

Uninformed bets should wash out as noise, and informed bettors should reverse uninformed moves so long as they are profitable.


The distinction that I am drawing is the ability to use the betting market to generate forecasts with greater accuracy than the market itself.

The stock market analogy would be the predictions you could make as an individual if you knew the internal limits and assessments of the best trading firms, and not just current market prices.

If I could pay the NYSE for real-time trading info on the buy/sell limits of Warren Buffett and other whales, I would.

