Hacker News

> But if the OP is actually trying to benchmark raw directory enumeration speed vs git ls-files, they should make sure they're benchmarking against something that's not making per-file stat calls at all.

I think OP is trying to benchmark which tool is fastest/most efficient for their workflow. If one of the tools has bugs (or intentional but unnecessary behavior) that slow it down, it's great if those get fixed, but that doesn't help while they're not.

I do think it's going to be pretty hard for a directory walker to do better than a pre-made index listing of the files, though, even with a warm filesystem cache.
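On the per-file stat point from the quoted comment: a walker can usually avoid a stat per entry by using the type information readdir already returns. A minimal sketch using Python's `os.scandir` (the function name and structure here are my own illustration, not anything from the article):

```python
import os

def walk_no_stat(root):
    """Yield regular-file paths under root.

    DirEntry.is_dir(follow_symlinks=False) can typically answer from the
    d_type field readdir returned, so no extra stat syscall per entry is
    needed on most filesystems.
    """
    stack = [root]
    while stack:
        path = stack.pop()
        with os.scandir(path) as it:
            for entry in it:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                else:
                    yield entry.path
```

Even a stat-free walker like this still has to open and read every directory, which is the work a pre-built index skips entirely.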



That's true and a fair point, but the article under discussion even has the subtitle: "Git ls-files is 5 times faster than fd or find, but why?"

My answer is "at least partly because `fd` and `find` are both slow, for different reasons". You are never going to do better than reading a saved answer, but I only see a 1.8x hit for not having an index { which needs maintenance, as almost everyone has pointed out :-) }. `walk`, linked elsewhere, is less of a strawman comparison, but it could probably be optimized a bit more.
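A rough way to reproduce this kind of comparison on a tree of your own. This is a sketch, not the article's benchmark: it assumes `git` is installed, uses the shell's `time` rather than a proper benchmarking harness, and the 100-file repo is invented for illustration:

```shell
# Compare "read a saved answer" (the git index) against a live directory walk.
dir=$(mktemp -d)
git init -q "$dir/repo"
cd "$dir/repo"
for i in $(seq 1 100); do touch "file$i"; done
git add .
git -c user.name=bench -c user.email=bench@example.com commit -qm init

time git ls-files > /dev/null                                  # reads .git/index
time find . -path ./.git -prune -o -type f -print > /dev/null  # walks the tree
```

For real numbers you'd want a repo large enough to dominate process startup, warmed caches, and many repetitions (e.g. with a tool like hyperfine); a 100-file toy repo mostly measures exec overhead.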




