Beyond Procedure Calls as Component Glue: Connectors Deserve Metaclass Status

jauntywundrkind · on Oct 18, 2024

I really like the general notion. Yes we should start to improve glue! So much of what coders do today is glue code, and it's so rarely given form or distinction even though it's so omnipresent.

Think of something like Yahoo Pipes or If This Then That: almost purely glue code systems. And these programs that compose systems seem so worthy of being their own things.

Marcel Weiher is lead author. https://news.ycombinator.com/user?id=mpweiher here, and lots of great posting on https://news.ycombinator.com/from?site=metaobject.com

causal · on Oct 18, 2024

Interesting premise. Can't read the entire paper right now but going on the abstract alone I'm not able to think of an illuminating example of the need for metaclass connectors

jerf · on Oct 18, 2024

It isn't so much that there's no need for such things but that they are so pervasive and normal you can't hardly even "see" them.

This is related to what killed the original OO conception as "your OO program should match your physical model", so, you have an Engine class that is directly related to the Drivetrain class (ignore "inheritance" versus "composition" and all such questions for this post), which is directly related to the Axle class, which is directly related to the Wheel class... what kills this model is that in practice you need stuff in between them. This model of OO is almost useless for anything except its original specification for simulations. Real-world OO however continues alive and well precisely in these "metaclass connectors", like, iterators, network call managers, and even the dreaded SomethingFactoryFactory. You need a WheelIterator and a CarVisitor and all kinds of other structural classes that are super useful to programmers but have no physical interpretation in OO, even if you are dealing with physical objects at all, which you often aren't.

The paper points out that if we sometimes communicate over HTTP, and sometimes over a "Remote Procedure Call", and sometimes over some pipe or messaging system, to which I'd also add, sometimes via database communications or passing text files around and lots of other things, it could/would be helpful if our languages could interact with these things in ways other than procedure calls, which have a very particular abstraction they provide that is a poor match to these sorts of interactions [1].

On my personal list of "things I'd like to see new languages try" one of them is to make the default function call less capable, so that the language might more pervasively deal with "this function call failed" in a way modern languages can't very well... not, "this function ran and returned an error/exception" but this function call itself failed; couldn't find the code, couldn't reach the server, etc. Things like, every "function call" in the language being able to set a timeout rather than only certain special things.
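To make that distinction concrete, here is a minimal Python sketch. It assumes a thread pool standing in for whatever a language runtime would really do, and the names `call` and `CallFailed` are hypothetical, not taken from the paper or from any library. The point is only that "the call never completed" surfaces as a different failure than "the function ran and raised".

```python
import concurrent.futures
import time


class CallFailed(Exception):
    """The call itself failed (here: no answer in time), as opposed to the
    callee running and reporting an error of its own."""


# One shared worker pool; a hypothetical runtime would own something like this.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def call(fn, *args, timeout=1.0, **kwargs):
    """Invoke fn, treating 'no result within timeout' as a call failure.

    Exceptions raised by fn itself propagate unchanged, so the caller can
    tell the two kinds of failure apart.
    """
    future = _pool.submit(fn, *args, **kwargs)
    done, _ = concurrent.futures.wait([future], timeout=timeout)
    if not done:
        future.cancel()  # best effort; an already-running thread keeps going
        raise CallFailed(f"{fn.__name__} did not answer within {timeout}s")
    return future.result()  # re-raises fn's own exception, if any


def slow_add(a, b):
    time.sleep(0.2)  # stands in for a slow or unreachable peer
    return a + b


def broken_add(a, b):
    raise ValueError("bad operands")  # the callee ran and reported an error
```

With this, `call(slow_add, 1, 2, timeout=0.05)` raises `CallFailed`, while `call(broken_add, 1, 2)` raises the original `ValueError`. A library wrapper like this is still opt-in at each call site, which is exactly the limitation the comment is pointing at.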

The purpose of this approach is for the foreignness of these various other methodologies to be something that no longer breaks function calls so thoroughly and causes these remote calls to be treated specially everywhere, where you have to specially wrap them in timeouts, specially handle a lot of failures you don't have to handle with normal functions, etc. As a side effect the language would potentially be more able to swap in "remote calls" for local calls more transparently than in modern languages, where local vs. remote calls act a lot like a "color" as referenced in the sync/async conversation, except it's even worse. Sync code wants to call sync code and async code wants to call async code, and if you need to cross the barrier you need an expensive and potentially lossy conversion process, but in this case, local code wants to call local code but there is no "remote code mode", the expensive and potentially lossy conversion is mandatory. Even when you stuff all that conversion code behind a single function call and sweep all the complexity somewhere where you can't see it at the call site, it's still there... it's just hidden better in some languages and frameworks and less well in others, and the question of the exact level of exposure that is best is one that can be debated for quite a while (it is definitely not clear that you should completely sweep away the differences even if you can, and there is a history of techs that take the sweeping too far [1]. (Yes, the same link twice.))

This paper and the accompanying language appear to take the opposite approach to what I was proposing, which was to loosen what a "function call" was until later abstractions could potentially harmonize the various approaches, and instead build an all-singing, all-dancing "connector" concept that envelops and abstracts all the various approaches to foreign communication, thus providing much more machinery than a normal function call.

I consider it an interesting idea worth exploring, however, as someone who has worked with a lot of these "connectors" over the years (as many of us have, I am not claiming anything unique here), I observe that they have wildly varying characteristics and I question the ability of an all-singing, all-dancing abstraction to accommodate all or even a good subset of the possible options without becoming a poor lowest-common-denominator of all of them. For instance, even within two members of the same class, a message bus that is 0-or-1 has significantly different characteristics than 1-or-many delivery, and by "significantly" I don't mean the normal casual definition of "larger than average", but in this case, relevant to my design. It doesn't take much scale before any attempt to abstract around that isn't going to work. I've got cases where I speak to HTTP services and I may or may not add a client certificate and may or may not apply certain checks to my server's certificate, depending on how I route over the network to this service. It's hard to abstract that. And abstracting "this code may connect with either an HTTP JSON API or a transactional message bus", with all the accompanying differences in how much context the answer comes back with. HTTP has an associated socket, so even if you're "async" or even if you do something exotic like make the request and hand off the reply to a completely different thread, it's got much more context than a message coming in over the bus that initially is just "some kind of message" to your system and it has to laboriously work it down into a reply to what, how do I route that reply, etc., which is all much more complicated than the HTTP request, to say nothing of, what if I want to send the message from one thread, but I want the reply to not only come back to that thread but also go somewhere else in my program, like a high-level metric tracking how many of each kind of reply I get, that is not inline with the original request/response cycle?
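A toy Python sketch of the 0-or-1 versus 1-or-many point, assuming deterministic fault injection in place of a real broker. `AtMostOnceBus`, `AtLeastOnceBus` and `total_received` are made-up names for illustration only. The same consumer gives three different answers depending on the delivery guarantee and on whether it deduplicates, which is the kind of difference a shared abstraction would have to either expose or paper over.

```python
class AtMostOnceBus:
    """0-or-1 delivery: a message may be dropped, but is never repeated."""

    def __init__(self, drop_every=3):
        self.drop_every = drop_every
        self._sent = 0

    def publish(self, msg, handler):
        self._sent += 1
        if self._sent % self.drop_every == 0:
            return  # silently lost
        handler(msg)


class AtLeastOnceBus:
    """1-or-many delivery: never lost, but may be redelivered."""

    def __init__(self, duplicate_every=3):
        self.duplicate_every = duplicate_every
        self._sent = 0

    def publish(self, msg, handler):
        self._sent += 1
        handler(msg)
        if self._sent % self.duplicate_every == 0:
            handler(msg)  # redelivery, e.g. after a lost acknowledgement


def total_received(bus, dedupe):
    """Send six messages worth 10 each and add up what the consumer sees."""
    seen, total = set(), 0

    def handler(msg):
        nonlocal total
        msg_id, amount = msg
        if dedupe:
            if msg_id in seen:
                return  # idempotent consumer: ignore repeats
            seen.add(msg_id)
        total += amount

    for msg_id in range(6):
        bus.publish((msg_id, 10), handler)
    return total


print(total_received(AtMostOnceBus(), dedupe=False))   # 40: two messages lost
print(total_received(AtLeastOnceBus(), dedupe=False))  # 80: two counted twice
print(total_received(AtLeastOnceBus(), dedupe=True))   # 60: correct
```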

I do not question that a lowest-common-denominator library could be constructed. Such things have been done. I would question whether it could meet real-world needs. For that someone would have to study their language directly for more time than I have for an HN post, because to really know if they succeeded you'd need to do more than just read the docs, but actually start building real programs and seeing how they do in the field. (Although reading the docs could be enough to tell they have not succeeded.) If they do succeed, then I give them significant props because I have just explained why I think it's hard.

I have learned to be very skeptical of this sort of all-encompassing abstraction over things that in reality have such huge and significant differences between them despite superficial similarities that in the end I end up questioning if there was any "true" commonality to abstract over in the first place. I have been burned both in the using of such systems, and the occasional error I've made trying to build them. The ones that do work take massive, sustained usage from a large user base over years of development. While it may not be a connector in the sense this paper means, an ORM that tries to go beyond a simple table mapping and expose the underlying capabilities of your database, for multiple databases, is an example of this. Ignoring the "are ORMs even good at all?" question for a moment, as that is well-trod ground, the ones that are good and usable and functional are huge and take years upon years of bashing out in the real world before they get there. I would suspect that getting something like what is described in this paper to the point that even "medium sized" cloud usage (to say nothing of Facebook or Google) could consider it a reliable technology would be a similarly sized task... the initial cut of the library would seem superficially appealing and seem like they're 90% of the way there, but upon examination it would turn out to be more like 10%. The real world is fairly harsh.

[1]: https://news.ycombinator.com/item?id=30831309

CoastalCoder · on Oct 18, 2024

FYI, my personal experience has been that few HN readers will engage with such a long comment.

What's worked for me is to make my points / elaborate incrementally.

pests · on Oct 19, 2024

Nah, I had most of it read before I even got curious on its length.

Comment sounded knowledgeable and well written, and that's what we all come here for right?

ValentinA23 · on Oct 18, 2024

I come for this kind of comment

carapace · on Oct 18, 2024

Um...

    user: jerf
    created: October 13, 2008
    karma: 85530

chubot · on Oct 20, 2024

Wrapping network requests in naive procedure/function calls is definitely not the way to go, as many people have learned

(I've personally seen a good-but-inexperienced co-worker make that specific design error, on a failing project)

But you can express things other than naive calls within existing languages, e.g. by passing context around, which deals with deadlines, cancellation, and perhaps idempotence/retries

https://pkg.go.dev/context
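For readers who haven't used that package, here is a rough Python sketch of the idea of passing a context explicitly. It is not a port of Go's API: the `Context` class, its method names and the two exception types are assumptions made up for this example. Every function that might cross a machine boundary takes the context as its first argument and checks it.

```python
import threading
import time
from dataclasses import dataclass, field


class DeadlineExceeded(Exception):
    pass


class Cancelled(Exception):
    pass


@dataclass
class Context:
    deadline: float  # absolute value on the time.monotonic() clock
    _cancelled: threading.Event = field(default_factory=threading.Event)

    @classmethod
    def with_timeout(cls, seconds):
        return cls(deadline=time.monotonic() + seconds)

    def cancel(self):
        self._cancelled.set()

    def remaining(self):
        return self.deadline - time.monotonic()

    def check(self):
        """Raise if the caller gave up or the time budget is spent."""
        if self._cancelled.is_set():
            raise Cancelled()
        if self.remaining() <= 0:
            raise DeadlineExceeded()


def fetch_profile(ctx, user_id):
    ctx.check()
    # A real client would pass ctx.remaining() as its socket timeout here.
    return {"id": user_id}


def render_page(ctx, user_id):
    profile = fetch_profile(ctx, user_id)  # the same budget flows downward
    ctx.check()
    return f"<h1>user {profile['id']}</h1>"
```

Calling `render_page(Context.with_timeout(5), 7)` succeeds, while a context that is already expired or cancelled makes the first `check()` raise before any work is attempted.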

I also think that "connectors" are primarily an OS feature, or platform feature, not really a language feature.

And I'm not convinced that most systems need to abstract connectors, i.e. to provide a way for the system to work over multiple connectors. I think the machine boundaries can usually be pretty firmly defined, and you don't change them that much

---

Although I was a little surprised at the #1 FAQ here -- not that it's the wrong answer, but that it is implicitly assuming you need to have this choice for all components/apps

https://serviceweaver.dev/docs.html#faq

mpweiher · on Oct 18, 2024

I would also be skeptical of an "all-dancing" connector.

Fortunately, this isn't one of those.

The concept of "connector" is not something I came up with, it's a core concept of the field of software architecture and has been for at least 30 years.

This is the paper that the title riffs off:

Procedure Calls Are the Assembly Language of Software Interconnection: Connectors Deserve First-Class Status

https://insights.sei.cmu.edu/library/procedure-calls-are-the...

So if you're going to argue that the concept of connector is ill-founded, well good luck with that.

The taxonomy is also not mine:

Towards a taxonomy of software connectors

https://dl.acm.org/doi/10.1145/337180.337201

282 citations so far, so probably not total BS. So basically these connectors exist in real software, and though the list may not be complete, it is pretty comprehensive. And there's the taxonomy.

My key insight is that "first class" is the wrong layer for connectors. They belong at the "meta class layer", basically in the language.

Which is where they actually have been all along. Procedure call is a connector (type), and general-purpose languages tend to have support for some form of procedure call. Data access is also a connector (type), and most languages have some way to access data.

But they tend to be monomorphic, to have support for exactly one way of accessing data, for exactly one kind of procedure call. If you want something different: good luck!

Another option would be what you describe, some really general mechanism, but with that approach you get into the problem you describe of "one size fits all" connectors: they just don't work.

So instead you define a metaobject protocol for each of the various connector types and allow many different implementations of the basic connector types that were identified by the taxonomists.

At the same time, architectural concepts really help out metaprogramming:

"I would suggest that more progress could be made if the smart and talented Squeak list would think more about what the next step in metaprogramming should be -- how can we get great power, parsimony, AND security of meaning?"

https://lists.squeakfoundation.org/pipermail/squeak-dev/1998...

Things that implement one of the metaobject protocols can then be used (and sometimes defined) using a convenient and common syntax. So I don't just "loosen the definition" of a function call, though I do that as well, I also topple the dominant position of function calling. It's a lot looser than what you propose. Which makes it sometimes difficult to get your head around.

But then you get to model systems with any of the supported connectors (and that is open-ended) using appropriate linguistic tools.

If something is appropriately modeled as a dataflow system, you can conveniently express a dataflow system, without having to shoehorn that into the syntax and semantics of function calling.

If something is appropriately modeled as data access, you can conveniently express a data access system, without having to shoehorn that into the syntax and semantics of function calling.

You can combine these.

The common metaobject protocol means that any abstraction that you can make fit one of the provided protocols gets to participate, gets the convenient syntax.

Steve Vinoski wrote: "Does developer convenience really trump correctness, scalability, performance, separation of concerns, extensibility, and accidental complexity?"

https://darkcoding.net/research/IEEE-Convenience_Over_Correc...

Turns out that in many if not most cases, it empirically does. How about we don't have to choose?

In addition, when you adhere to these metaobject protocols, you also get syntactic composability. The interfaces are compatible, so you can plug them together. It does not guarantee semantic compatibility, just like the fact that you can plug any Unix filters together does not guarantee that they do something useful together. But you can plug them together without additional glue code (just the pipe operator | )
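The Unix-filter analogy can be sketched in a few lines of Python. This is only an illustration of syntactic versus semantic compatibility, not how Objective-S implements it; the `Filter` class and the three example filters are hypothetical. Anything wrapped in `Filter` can be joined with `|`, whether or not the result makes sense.

```python
class Filter:
    """A component with one input and one output, like a Unix filter."""

    def __init__(self, fn):
        self.fn = fn  # maps an iterable of items to an iterable of items

    def __or__(self, other):
        # a | b feeds a's output into b; no glue code at the use site
        return Filter(lambda items: other.fn(self.fn(items)))

    def __call__(self, items):
        return list(self.fn(items))


grep_error = Filter(lambda lines: (l for l in lines if "ERROR" in l))
upper = Filter(lambda lines: (l.upper() for l in lines))
count = Filter(lambda lines: [sum(1 for _ in lines)])

logs = ["ok", "ERROR disk full", "error: lowercase"]
print((grep_error | upper)(logs))  # ['ERROR DISK FULL']
print((grep_error | count)(logs))  # [1]

# Plugs together just as easily, but is semantically wrong:
# count emits an int, and upper expects strings.
bad = count | upper
```

Building `bad` works; running `bad(logs)` fails with an `AttributeError`, which is the "plugging together is not the same as doing something useful" caveat from the paragraph above.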

> start building real programs and seeing how they do in the field.

Been there, done that, bought the T-Shirt. That's why this has been a very long time coming: you need the meta-abstraction, with which you build abstractions, with which you build actual frameworks/libraries/systems.

And you don't know what the right meta-abstraction is at the start.

This has been a LOT of work.

jerf · on Oct 18, 2024

As I said, if you pull it off, you'll be celebrated by me all the more for the fact that it was hard. I just don't have time to verify whether it actually is.

(I feel the same way about quantum computing, for instance; I'm skeptical, but if someone builds one, I'm not going to insist it doesn't exist, I'm going to give them more credit for doing something hard.)

chubot · on Oct 20, 2024

Hm why is there only a single syntax -> for a connector?

If different connectors have different semantics, as local and remote communication fundamentally do, then it seems like you want different syntax too

With a single syntax, it does feel like trying to be a "universal" connector

mpweiher · on Oct 21, 2024

Thanks for reading!

And yes, the → can be unexpected.

If you don't think about it, it just does what you expect it to do. And if you look at the details, it also makes sense. This is explained in section 3.

It turns out that → is not the actual connector. In software architecture, the connector does not directly connect components. It goes via the ports on the component(s), which are "connected" (different kind of "connected") to the roles on the connector.

Nominally, the → does this hooking up, so in theory it goes something like this:

Component1.port1 → connector.roleA .. connector.roleB → Component2.port2

So the → "really" hooks up components to connectors via their respective ports and roles, it's not the connector and does not represent the connector. So there are no different semantics, it's always "please hook up this port to this role".
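As a rough Python sketch of that fully spelled-out form: ports belong to components, roles belong to the connector, and a single operation hooks one to the other. `Port`, `Role`, `PipeConnector` and `attach` are names invented for this illustration and are not Objective-S API.

```python
class Role:
    """An attachment point on a connector."""

    def __init__(self, connector, name):
        self.connector, self.name = connector, name
        self.port = None


class Port:
    """An attachment point on a component."""

    def __init__(self, name, on_receive=None):
        self.name = name
        self.on_receive = on_receive
        self.role = None

    def send(self, value):
        self.role.connector.carry(self.role, value)


class PipeConnector:
    """A one-way connector with a 'source' role and a 'sink' role."""

    def __init__(self):
        self.source = Role(self, "source")
        self.sink = Role(self, "sink")

    def carry(self, from_role, value):
        if from_role is not self.source:
            raise ValueError("pipes only carry data from source to sink")
        self.sink.port.on_receive(value)


def attach(port, role):
    """What the arrow is described as doing: hook this port to this role."""
    port.role = role
    role.port = port


received = []
out_port = Port("stdout")
in_port = Port("stdin", on_receive=received.append)
pipe = PipeConnector()

attach(out_port, pipe.source)  # Component1.port1 → connector.roleA
attach(in_port, pipe.sink)     # connector.roleB → Component2.port2
out_port.send("hello")
print(received)  # ['hello']
```

Note that `attach` has the same meaning in both lines; the direction and semantics live in `PipeConnector`, not in the hook-up operation.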

For the vast majority of cases, this much detail is way too cumbersome, which is probably one reason architecture description languages did not catch on. Fortunately it is also unnecessary in the vast majority of cases.

Objective-S takes advantage of this by making → polymorphic. So you can leave out the ports, the roles and often even the specific connector, and → figures it out anyway, filling in the blanks. This has been working extremely well so far.

If it becomes necessary, either because there is ambiguity or because you want to document what's going on, you can specify all these intermediate elements that → figures out for you.

I have thought about a syntax to make connectors explicit and still look like connectors, something along the lines of

     filterA  -pipe->  filterB

So you put the name of the connector inside the arrow. But it seems a bit wonky and so far just hasn't proved necessary.

knome · on Oct 18, 2024

Unless there are two languages called Objective-S, here's a site for the language: https://objective.st/

I'll have to look closer at this later, but it doesn't give me a good feeling on first skim.

mpweiher · on Oct 18, 2024

That's the one. Alas, the website precedes my so far best attempt at explaining what I've done, which is this paper, and I haven't had time to properly update the site since. And probably won't for some time: first give the presentation at Onward! in Pasadena next Wednesday, then finish writing the PhD thesis, then update the site.

It is what it is ¯\_(ツ)_/¯

carapace · on Oct 18, 2024

I think you're on to something. My only advice would be something like, don't neglect the debugging, uh, "stories".

mpweiher · on Oct 21, 2024

Do you mean the stories of debugging the steaming pile of ... I mean my wonderful implementation of the language?

Or how to debug these kinds of systems?

For the latter, I am actually very optimistic about making it possible to debug systems at a much higher and more meaningful level than currently, because you are making the high-level structure visible in the code instead of hand-compiling it down to some procedural implementation, with that hand-compiled procedural representation of your system being the thing that the debugger sees.

With both the polymorphic write streams and the storage combinators being so highly composable, it becomes fairly easy to insert logging/instrumentation/debugging elements between the components of your system.
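A small Python sketch of the storage side of that idea, assuming a bare `get`/`put` interface. `DictStore`, `LoggingStore` and `CachingStore` are illustrative names, not the Objective-S classes. Because every store exposes the same interface, a logging element can be dropped between any two of them without either side changing.

```python
class DictStore:
    """In-memory store."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


class LoggingStore:
    """Same interface; forwards to an inner store and records each access."""

    def __init__(self, inner, log):
        self.inner, self.log = inner, log

    def get(self, key):
        value = self.inner.get(key)
        self.log.append(("get", key, value))
        return value

    def put(self, key, value):
        self.log.append(("put", key, value))
        self.inner.put(key, value)


class CachingStore:
    """Same interface; consults a cache before falling back to the source."""

    def __init__(self, cache, source):
        self.cache, self.source = cache, source

    def get(self, key):
        value = self.cache.get(key)
        if value is None:
            value = self.source.get(key)
            if value is not None:
                self.cache.put(key, value)
        return value

    def put(self, key, value):
        self.cache.put(key, value)
        self.source.put(key, value)


log = []
source = DictStore()
source.put("a", 1)

# Instrumentation inserted between the cache and its source.
store = CachingStore(DictStore(), LoggingStore(source, log))
store.get("a")
store.get("a")
print(log)  # [('get', 'a', 1)]: the second read never reached the source
```

The log here shows traffic at the level of "which component asked which store for what", rather than a stack of procedure calls.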

carapace · on Oct 22, 2024

The latter. ;)

Sounds good.

Joker_vD · on Oct 18, 2024

I think any of those openapi.yaml files in the wild is a good example that we need better metaclass connectors.

jauntywundrkind · on Oct 18, 2024

What are you asserting? Why?

rwmj · on Oct 18, 2024

Strong CORBA vibes with this one.

jsrcout · on Oct 18, 2024

I dunno, this sounds like it might actually be useful. (Haven't read the paper yet).

shermantanktop · on Oct 18, 2024

In that era we had a whole class of professional architect types who spoke at conferences in these abstractions. I should know, I was adjacent to that role and learned to speak a pidgin form of it.

Those people are mostly gone - not sure where they all went. This paper looks like the soldier in the jungle who never heard the war was over.

kstrauser · on Oct 18, 2024

All we really need is a simple object access protocol to link it all together.