This seems to just complain that cons is hard to understand or something, without actually saying what makes it hard or unintuitive.
--------
Cons is fundamentally a linked list. The "pair" is a (value+pointer), and sometimes "value" can itself be a pointer. So when you have (pointer, pointer), you now can create trees.
As such, "Cons" is a powerful abstraction for trees, pairs, and linked lists. That's all it is. If you need "more" than this (e.g. graphs), then perhaps it's insufficient.
---------
"Cons" itself is a malloc(sizeof(pair)) operation, while car/cdr mean whatever your program's design makes them mean.
If I were to criticize "Cons", it's that high-performance modern programming needs to use arrays, because the relative cost of pointers has grown. Memory latency has barely improved in ~20 years while compute has kept scaling, so a pointer chase is relatively slower on modern machines and you need to consolidate more members together.
But in terms of a "simple" data-structure, "cons" cells and list-programming is a thing for a reason.
> If I were to criticize "Cons", its that high performance modern programming needs to use arrays
This has been true of linked lists since the advent of memory caches; it’s not just the last 20 years. They’re simple to implement, but they’re usually very bad for performance; you really want better locality of reference in a primitive data structure. You almost always should be using an array of some kind instead.
Arrays are much worse at insertions, especially at the middle or beginning.
There's a reason why the Linux kernel continues to use red-black trees today for a huge number of its data-structures. When you need reliable performance on all operations (insert, delete, reorderings, reallocations), it's hard to beat a tree (or "cons", for that matter).
If you can use an array, then use an array. But making the tree a "good default" seems like the right move. Lisp does that with cons/cdr/car and (a . b), or (a b c d e) syntax.
Turning (a b c d e) into (a b b1 c d e), especially at larger sizes, will likely be faster with lists than with arrays, since inserting into an array innately needs to memcpy ~half the array on average.
Why does there need to be a single consistent way to implement something with cons? That's an odd complaint. What language would the OP be happy with if they want one singular, canonical way to implement everything?
To be clear, I'm steelmanning his argument, as I'm not sure I agree with it. But I think the point is best summarized with an example: "Here's an API that takes a red-black tree as an argument." Okaaay, what format is the red-black tree? How do I construct one?
In Lisp you'd pretty much have to ship your module with its own tree-construction logic, or use some common (but importantly, not technically standard) implementation, and document that's what you did, so people know how to structure the data they pass to you.
In C++, you simply take a std::map, which implements a red-black tree for you, and being part of the standard library it is accepted and commonly used nearly everywhere.
I think OP wishes Lisp had something like std::map.
Cons cells aren't the only data structures in Lisp. And there are libraries, packages, that you can use if you want a common structure across multiple projects.
> To be clear, I'm steelmanning his argument, as I'm not sure I agree with it. But I think the point is best summarized with an example: "Here's an API that takes a red-black tree as an argument." Okaaay, what format is the red-black tree? How do I construct one?
If the API calls for taking a red-black tree as an argument, it will also provide either the specific red-black tree implementation itself or it will refer you to an existing package to include as a dependency. This is not dissimilar from C++. In C++ if something takes a std::map, you use a std::map. If it takes another data structure that's not in the STL, then it will provide it or tell you how to obtain it as a dependency.
The ergonomics of the language and ecosystem are so much better when there is a standard container implementation. See for example the annoyance of using a hashmap in C++ for the first few decades of its existence.
I mean, in Lisp your go-to structure will likely be unbalanced trees.
Like quicksort (nearly the same analysis, btw), they have exceptional average-case performance (even faster than RB trees on average!) but the downside of horrific O(n^2) performance in the worst case, e.g. sorted insertion order.
RB trees and AVL trees both help with those worst cases (AVL, amusingly, shines on exactly the sorted-input case that is the unbalanced tree's worst case lol).
In any case, there's a lot of tree flavors. Lisp is certainly sufficient and simple when it's exploring these different details.
Unbalanced trees are, in any case, actually a reasonably sane default. An RB tree or AVL tree is probably a better default, but don't knock the unbalanced tree before you try it!
------------
I've heard of highly optimized lisps where the typical list is turned into an array or at least a rope in the average case btw.
>In any case, there's a lot of tree flavors. Lisp is certainly sufficient and simple when it's exploring these different details.
I've become a fan of scapegoat trees¹ lately for (immutable) balanced trees. There are certainly a ton of other options besides RB and AVL trees that are fun to explore (in any language)
>I've heard of highly optimized lisps where the typical list is turned into an array...
Back in the lisp machine days there was CDR coding² to flatten out a list into a contiguous block, but I'm not sure if any modern common lisp or scheme implementations use it.
When some random tech person says "modern", the dividing line is usually the date when they started programming. Everything which was already available at that time is "modern", and none of it existed the year before that, since it has no associations with their pre-programming life.
I think that the author's complaint is that cons cells are too low-level a construct to be exposed so prominently to the user. I think he would prefer that they were abstracted over with some higher-level interface, the way a lot of other languages do it. He mentions Clojure as a Lisp that gets it right in his opinion, and that's exactly what Clojure does: cons cells are still available as a construct, but the default method for interacting with lists is a set of higher-level functions that also work on arrays and other sequences and don't require that you think as much about the implementation of the lists.
> The "pair" is a (value+pointer), and sometimes "value" can itself be a pointer. So when you have (pointer, pointer), you now can create trees.
Pointers don't always imply trees. Any Lisp object that won't directly fit in half a cons cell gets allocated separately and a pointer to it is placed in that half of the cons cell. Examples: Strings, symbols, bignums, etc.
> If you need "more" than this (e.g. graphs), then perhaps it's insufficient.
You can certainly represent cyclic graphs with cons cells. The *print-circle* mechanism in Common Lisp handles printing them, for example.
I mean are we talking about like optimized C here? Or are we talking about a fast-ish language with more like the flexibility of JavaScript but a bit better performance?
I'm pretty sure Lisp is more the latter, about flexibility and elegance. But not the fastest.
We are talking about hardware. How recent is this supposed "modern" time when it became faster to go to the next cell of an array rather than chase a pointer (in the best hand-written machine code for either)?
Linux continues to use RB trees for a huge amount of its processing, because it's faster to keep a tree sorted than to repeatedly sort arrays.
You talk as if linked data-structures are inferior, rather than just different. Now yes, linked data-structures get relatively worse when latency stays flat while CPU clocks get faster, but the reverse is true when cache sizes get larger. L3 caches are now 192 MB on some processors, while Intel has consistently been pushing 1 MB+ L2 caches with full load/store bandwidth per core.
There is a time and place for all techniques. Including linked data-structures / cons / list-processing.
On the contrary, I'm simply questioning the idea that we should all be switching to vectors and abandoning list structures, because of "modern" machines.
Every other tech bro blowhard likes to repeat this.
So, first of all, I want to know what they mean by "modern": how recently was it supposedly okay to chase pointers rather than just offsets?