
Pull not being a good model of two-way communication is going to be the major blocker here, in my opinion. It's going to mean that people are only going to see comments and reactions on their posts or comments from people they already subscribe to, because their RSS reader would have no possible way of knowing if anyone outside of that list commented, since you can't get notified of content sources you don't already know about, only poll ones you already know. That's already bad enough (one of the big negative things people with large followings on the fediverse talk about is how they can't see what people are saying in the replies to their posts a lot of the time if the servers those people are on are blocked by their server, which means hate and harassment and one-sided conversations can fester, and often many commenters can't even see each others' comments, leading to people saying the same things over and over exhaustingly). This also means that people who don't have any followers will essentially be muted by default: no one will see their comments or interactions, because no one polls their feed yet, which means that it's basically pointless for them to interact at all, which sounds dispiriting and would probably lead to no one wanting to use this type of social media. Moreover, it also creates a catch-22 problem, because a major way to get followers in the first place is to directly interact with other people and bigger blog posts, to make people aware of you and maybe get some of them interested in hearing more of what you have to say, yet in this model, you can't really interact until you have a following already, so your main means of getting a following is gated behind needing a following to work!


In the olden days when bloggers walked the earth, emitting lengthy posts over RSS, they solved this problem in two ways:

Firstly, by appending forms to the end of the post where someone could type out a reply that was more likely to be a few sentences or a paragraph, rather than a full-blown essay.

Secondly, by inventing "TrackBack", a standardized way for someone else's blog software to say "hey I wrote some stuff on my blog in response to this post of yours".

Both of these would get appended to the end of the blog post's page as "comments".
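A TrackBack ping, for the curious, was nothing more exotic than a form-encoded POST to a "ping URL" the target post advertised. A rough sketch of one in Python; the ping URL and field values here are invented:

    import requests

    # Hypothetical TrackBack ping URL advertised by the post being responded to.
    PING_URL = "https://example.com/blog/trackback/123"

    # The standard TrackBack form fields: url is required, the rest are optional.
    payload = {
        "url": "https://myblog.example.net/2024/05/my-response",
        "title": "My response to your post",
        "excerpt": "I wrote some stuff on my blog in response...",
        "blog_name": "My Blog",
    }

    resp = requests.post(PING_URL, data=payload, timeout=10)
    # The receiving blog answers with a tiny XML document; <error>0</error> means success.
    print(resp.status_code, resp.text)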

This very quickly created the new problem of "trackback/comment spam"; the enduring solution in the world of blogs to that has been "WordPress's Akismet plugin", which is a very centralized piece of the otherwise mostly-distributed infrastructure of RSS-based blogs. I think it's like $15 a year on top of the $60 or so I pay for my WordPress site on cheap hosting.
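For a sense of how centralized that is: every comment check is an HTTP POST from your blog to Akismet's service. Roughly, as I understand the Akismet REST API (the key, URLs, and values here are made up):

    import requests

    API_KEY = "your-akismet-api-key"   # hypothetical; issued per account
    ENDPOINT = f"https://{API_KEY}.rest.akismet.com/1.1/comment-check"

    resp = requests.post(ENDPOINT, data={
        "blog": "https://myblog.example.net/",
        "user_ip": "203.0.113.7",
        "comment_type": "comment",
        "comment_author": "Anonymous",
        "comment_content": "Buy cheap pills...",
    }, timeout=10)

    # The response body is literally "true" (spam) or "false" (ham).
    print(resp.text)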


Thirdly there was the Salmon protocol, although with the Plussing of Google Buzz it never got traction:

https://en.wikipedia.org/wiki/Salmon_(protocol)


Hmmmm. Now I'm wondering if Akismet are selling every blog comment they scan to OpenAI (and Palantir)?


Akismet is completely owned by Automattic (the corporate entity attached to the main author of WordPress). And Wikipedia says this of Automattic:

In February of 2024, Automattic announced that it would begin selling user data from Tumblr and WordPress.com to Midjourney and OpenAI.[27]

27: https://www.404media.co/tumblr-and-wordpress-to-sell-users-d...

"Announced" may be too strong; the link is to an internal email leak discussing this possibility, but "Automattic may be experimenting with selling data to midjourney/openai" is still pretty close to "Automattic's selling user data". Hell, I can see the positive spin blog post announcing this: "They're also giving Automattic a bunch of free cycles on improving Askimet's spam/ham filters, in exchange for a look at every other comment anyone is sending through Askimet ever. We're aware this is a thorny ethical issue; click HERE for the archives of our mailing list dedicated to this project."

As for Palantir, my inner paranoid says that if the FBI/NSA/etc wants this data, they have some way of getting it, whether or not a deal with a public front like Palantir is involved.


I think it's safe to assume at this point that all public content and most private content on the internet is being fed to both AI and American intelligence services.


And China. And Mossad...


Note RSS is an ill-defined polling protocol. The server emits an RSS file which has the top N pieces of content.

All you can do is poll it at a greater or lesser frequency and hope you don't underpoll or overpoll. (I can easily fetch the RSS feed for an independent blog 1000 times for every time I fetch an HTML page, but should I? What if I wanted to follow 1000 independent blogs?)
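(Conditional GETs take some of the sting out of frequent polling, since an unchanged feed costs the server almost nothing, though they do nothing for the underlying discovery problem. A rough sketch with the feedparser library, assuming the blog's server honours ETag/Last-Modified; the feed URL and interval are arbitrary:)

    import time
    import feedparser

    FEED_URL = "https://example.com/feed.xml"   # hypothetical independent blog
    etag = modified = None

    while True:
        # Conditional GET: send back the validators from the previous response so
        # the server can reply 304 Not Modified instead of re-sending the feed.
        d = feedparser.parse(FEED_URL, etag=etag, modified=modified)
        if getattr(d, "status", None) == 304:
            pass   # nothing new since the last poll
        else:
            etag = getattr(d, "etag", None)
            modified = getattr(d, "modified", None)
            for entry in d.entries:
                print(entry.get("title"))
        time.sleep(15 * 60)   # the interval is still a guess; that's the real problem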

With ActivityPub on the other hand you can ask for all updates since the last time you checked so there is a well-defined strategy to keep synced.
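To be concrete about what "since the last time you checked" can look like: an actor's outbox is an OrderedCollection you can page through, newest first, until you hit the last activity you already saw. A rough sketch (hypothetical URLs, glossing over auth and the ways different servers shape their pages):

    import requests

    HEADERS = {"Accept": "application/activity+json"}

    def fetch(url):
        return requests.get(url, headers=HEADERS, timeout=10).json()

    def new_activities(outbox_url, last_seen_id):
        # Page through an outbox OrderedCollection and stop at the last
        # activity we already processed.
        page = fetch(outbox_url).get("first")
        fresh = []
        while page:
            if isinstance(page, str):   # servers may link or embed the page
                page = fetch(page)
            for item in page.get("orderedItems", []):
                if item.get("id") == last_seen_id:
                    return fresh
                fresh.append(item)
            page = page.get("next")
        return fresh

    # e.g. new_activities("https://social.example/users/alice/outbox",
    #                     "https://social.example/users/alice/activities/42")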


Oh wow, RSS really doesn't support pagination? I didn't know that.

WebSub can help with solving the poll rate issue, but that presumably wouldn't solve the problem for consumers that are offline for a while.
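For what it's worth, the subscription half of WebSub is tiny: you POST a subscribe request to the hub the feed advertises, the hub verifies your callback, and from then on it pushes updates to that callback. A sketch with made-up hub/topic/callback URLs:

    import requests

    # The hub URL is normally discovered from a rel="hub" link in the feed itself.
    HUB = "https://hub.example.org/"

    resp = requests.post(HUB, data={
        "hub.mode": "subscribe",
        "hub.topic": "https://example.com/feed.xml",
        "hub.callback": "https://my-reader.example.net/websub/callback",
        "hub.lease_seconds": "86400",
    }, timeout=10)
    print(resp.status_code)   # typically 202 Accepted while the hub verifies

    # The hub then GETs the callback with a hub.challenge to confirm, and later
    # POSTs new content to it whenever the topic updates; a consumer that's
    # offline simply misses those pushes, which is the limitation above.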


There is RFC 5005 - Feed Paging and Archiving, but sadly the world of RSS tools has never been very specification-forward, mostly because the publishers of RSS feeds are even more disinterested.

https://www.rfc-editor.org/rfc/rfc5005.html
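When a feed does implement it, catching up is just a matter of walking the rel="prev-archive" links backwards through the archives. A sketch using feedparser (the feed URL is made up, and in practice very few feeds publish these links):

    import feedparser

    def all_entries(feed_url):
        # Walk an RFC 5005 archived feed backwards via rel="prev-archive" links.
        entries, url = [], feed_url
        while url:
            d = feedparser.parse(url)
            entries.extend(d.entries)
            url = next((link.get("href")
                        for link in d.feed.get("links", [])
                        if link.get("rel") == "prev-archive"), None)
        return entries

    # entries = all_entries("https://example.com/feed.xml")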


ActivityStreams could be seen as a viable extension of RSS (aside from ActivityPub being based on it already), and it does support some simple pagination via its "Collection" vocabulary. Since ActivityStreams is ultimately based on JSON-LD, one could also add seamless querying support to an ActivityStreams endpoint based on SPARQL, for more advanced uses.


Pro tip: Split your gigantic (and certainly thoughtful!) comment[s] into paragraphs.

Makes it a ton easier to parse. Cheers!


Thanks for the advice and compliment! :D I usually write them out and then read them and use the edit function to insert paragraph breaks after the fact, but I forgot to do that this time lol


Yes, it is a big hurdle. However, I think content discovery is generally a big part of any content platform, way broader than discovering "who has reacted to my content". Now if you want to solve the problem of content discovery in a broader sense, then you have already fixed this particular shortcoming of the pull model as well. If a service can inform you about new posts with a particular hashtag, it most probably can also tell you about reactions to a particular post.

And yes, I do realise that such services will tend to not be really decentralised (similar to the relationship between websites and search engines). But that means the downside is not that you don't get such discovery, but that you'll be reliant on more centralised services for it, whereas in the fediverse you would be less reliant on such services for finding out who has commented on your post (though that will, as you've mentioned, still not be enough).


> Yes, it is a big hurdle. However, I think content discovery is generally a big part of any content platform, way broader than discovering "who has reacted to my content". Now if you want to solve the problem of content discovery in a broader sense, then you have already fixed this particular shortcoming of the pull model as well.

Right, but I don't think that finding all RSS feeds on the internet that satisfy certain criteria, like publishing a hashtag or responding to a particular post, is a problem that can actually be solved in a principled way in the general case, because a fundamental limitation of the pull methodology is that you have to know the list of places you are checking beforehand; you can't get content from somewhere you didn't previously know about. The only way to solve this would be to have some kind of crawling and indexing system that regularly crawls the entire internet looking for these expanded RSS feeds and then categorizes them according to various criteria in order to poll them. And that is both a very large technical investment and has a lot of limitations of its own. So in the end it seems like you haven't really distributed the work of a social media system more equally after all, you've just inverted who is doing the work, going from a federated set of servers that do all the work pushing content everywhere to a federated set of servers that do all the work pulling content from places.


I do recognise the fact that such "aggregators" would be hugely centralised (if not outright monopolised, like the search engine space). However, maybe I'm wrong, but I don't see the federated model succeeding without such services either, so I think of "need for centralised content discovery" as an independent problem, honestly.


I see your negatives as mostly positive. Engagement and virality are inevitably cancerous to any social network, and comment velocity needs to be suppressed and controlled to reduce entropy and limit the degree to which that network can be used by people primarily interested in "getting a following." Any feature (or anti-feature) that makes a platform unattractive to capitalists and influencers and shitposting trolls is a good thing. Discouraging people from posting and commenting is a good thing. Making it difficult to network is a good thing. None of these things need to be impossible, but I do believe there needs to be enough friction to make low hanging fruit and opportunism not worth the effort.

Otherwise everything gets taken over by AI and bots and psychopaths and propagandists and turns to shit.



