
Pull not being a good model of two-way communication is going to be the major blocker here, in my opinion. It's going to mean that people are only going to see comments and reactions on their posts or comments from people they already subscribe to, because their RSS reader would have no possible way of knowing if anyone outside of that list commented, since you can't get notified of content sources you don't already know about, only poll ones you already know. That's already bad enough (one of the big negative things people with large followings on the fediverse talk about is how they can't see what people are saying in the replies to their posts a lot of the time if the servers those people are on are blocked by their server, which means hate and harassment and one-sided conversations can fester, and often many commenters can't even see each others' comments, leading to people saying the same things over and over exhaustingly). This also means that people who don't have any followers will essentially be muted by default: no one will see their comments or interactions, because no one polls their feed yet, which means that it's basically pointless for them to interact at all, which sounds dispiriting and would probably lead to no one wanting to use this type of social media. Moreover, it also creates a catch-22 problem, because a major way to get followers in the first place is to directly interact with other people and bigger blog posts, to make people aware of you and maybe get some of them interested in hearing more of what you have to say, yet in this model, you can't really interact until you have a following already, so your main means of getting a following is gated behind needing a following to work!


In the olden days when bloggers walked the earth, emitting lengthy posts over RSS, they solved this problem in two ways:

Firstly, by appending forms to the end of the post where someone could type out a reply that was more likely to be a few sentences or a paragraph, rather than a full-blown essay.

Secondly, by inventing "TrackBack", a standardized way for someone else's blog software to say "hey I wrote some stuff on my blog in response to this post of yours".

Both of these would get appended to the end of the blog post's page as "comments".
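A TrackBack ping, for the curious, was nothing more exotic than a form-encoded POST to a "ping URL" the target post advertised. A rough sketch of one in Python; the ping URL and field values here are invented:

    import requests

    # Hypothetical TrackBack ping URL advertised by the post being responded to.
    PING_URL = "https://example.com/blog/trackback/123"

    # The standard TrackBack form fields: url is required, the rest are optional.
    payload = {
        "url": "https://myblog.example.net/2024/05/my-response",
        "title": "My response to your post",
        "excerpt": "I wrote some stuff on my blog in response...",
        "blog_name": "My Blog",
    }

    resp = requests.post(PING_URL, data=payload, timeout=10)
    # The receiving blog answers with a tiny XML document; <error>0</error> means success.
    print(resp.status_code, resp.text)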

This very quickly created the new problem of "trackback/comment spam"; the enduring solution in the world of blogs to that has been "WordPress's Akismet plugin", which is a very centralized piece of the otherwise mostly-distributed infrastructure of RSS-based blogs. I think it's like $15 a year on top of the $60 or so I pay for my WordPress site on cheap hosting.
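For a sense of how centralized that is: every comment check is an HTTP POST from your blog to Akismet's service. Roughly, as I understand the Akismet REST API (the key, URLs, and values here are made up):

    import requests

    API_KEY = "your-akismet-api-key"   # hypothetical; issued per account
    ENDPOINT = f"https://{API_KEY}.rest.akismet.com/1.1/comment-check"

    resp = requests.post(ENDPOINT, data={
        "blog": "https://myblog.example.net/",
        "user_ip": "203.0.113.7",
        "comment_type": "comment",
        "comment_author": "Anonymous",
        "comment_content": "Buy cheap pills...",
    }, timeout=10)

    # The response body is literally "true" (spam) or "false" (ham).
    print(resp.text)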


Thirdly there was the Salmon protocol, although with the Plussing of Google Buzz it never got traction:

https://en.wikipedia.org/wiki/Salmon_(protocol)


Hmmmm. Now I'm wondering if Akismet are selling every blog comment they scan to OpenAI (and Palantir)?


Akismet is completely owned by Automattic (the corporate entity attached to the main author of WordPress). And Wikipedia says this of Automattic:

In February of 2024, Automattic announced that it would begin selling user data from Tumblr and WordPress.com to Midjourney and OpenAI.[27]

27: https://www.404media.co/tumblr-and-wordpress-to-sell-users-d...

"Announced" may be too strong; the link is to an internal email leak discussing this possibility, but "Automattic may be experimenting with selling data to midjourney/openai" is still pretty close to "Automattic's selling user data". Hell, I can see the positive spin blog post announcing this: "They're also giving Automattic a bunch of free cycles on improving Askimet's spam/ham filters, in exchange for a look at every other comment anyone is sending through Askimet ever. We're aware this is a thorny ethical issue; click HERE for the archives of our mailing list dedicated to this project."

As for Palantir, my inner paranoid says that if the FBI/NSA/etc wants this data, they have some way of getting it, whether or not a deal with a public front like Palantir is involved.


I think it's safe to assume at this point that all public content and most private content on the internet is being fed to both AI and American intelligence services.


And China. And Mossad...


Note RSS is an ill-defined polling protocol. The server emits an RSS file which has the top N pieces of content.

All you can do is poll it at a greater or lesser frequency and hope you don't underpoll or overpoll. (I can easily fetch the RSS feed for an independent blog 1000 times for every time I fetch an HTML page, but should I? What if I wanted to follow 1000 independent blogs?)
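(Conditional GETs take some of the sting out of frequent polling, since an unchanged feed costs the server almost nothing, though they do nothing for the underlying discovery problem. A rough sketch with the feedparser library, assuming the blog's server honours ETag/Last-Modified; the feed URL and interval are arbitrary:)

    import time
    import feedparser

    FEED_URL = "https://example.com/feed.xml"   # hypothetical independent blog
    etag = modified = None

    while True:
        # Conditional GET: send back the validators from the previous response so
        # the server can reply 304 Not Modified instead of re-sending the feed.
        d = feedparser.parse(FEED_URL, etag=etag, modified=modified)
        if getattr(d, "status", None) == 304:
            pass   # nothing new since the last poll
        else:
            etag = getattr(d, "etag", None)
            modified = getattr(d, "modified", None)
            for entry in d.entries:
                print(entry.get("title"))
        time.sleep(15 * 60)   # the interval is still a guess; that's the real problem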

With ActivityPub on the other hand you can ask for all updates since the last time you checked so there is a well-defined strategy to keep synced.
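To be concrete about what "since the last time you checked" can look like: an actor's outbox is an OrderedCollection you can page through, newest first, until you hit the last activity you already saw. A rough sketch (hypothetical URLs, glossing over auth and the ways different servers shape their pages):

    import requests

    HEADERS = {"Accept": "application/activity+json"}

    def fetch(url):
        return requests.get(url, headers=HEADERS, timeout=10).json()

    def new_activities(outbox_url, last_seen_id):
        # Page through an outbox OrderedCollection and stop at the last
        # activity we already processed.
        page = fetch(outbox_url).get("first")
        fresh = []
        while page:
            if isinstance(page, str):   # servers may link or embed the page
                page = fetch(page)
            for item in page.get("orderedItems", []):
                if item.get("id") == last_seen_id:
                    return fresh
                fresh.append(item)
            page = page.get("next")
        return fresh

    # e.g. new_activities("https://social.example/users/alice/outbox",
    #                     "https://social.example/users/alice/activities/42")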


Oh wow, RSS really doesn't support pagination? I didn't know that.

WebSub can help with solving the poll rate issue, but that presumably wouldn't solve the problem for consumers that are offline for a while.
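For what it's worth, the subscription half of WebSub is tiny: you POST a subscribe request to the hub the feed advertises, the hub verifies your callback, and from then on it pushes updates to that callback. A sketch with made-up hub/topic/callback URLs:

    import requests

    # The hub URL is normally discovered from a rel="hub" link in the feed itself.
    HUB = "https://hub.example.org/"

    resp = requests.post(HUB, data={
        "hub.mode": "subscribe",
        "hub.topic": "https://example.com/feed.xml",
        "hub.callback": "https://my-reader.example.net/websub/callback",
        "hub.lease_seconds": "86400",
    }, timeout=10)
    print(resp.status_code)   # typically 202 Accepted while the hub verifies

    # The hub then GETs the callback with a hub.challenge to confirm, and later
    # POSTs new content to it whenever the topic updates; a consumer that's
    # offline simply misses those pushes, which is the limitation above.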


There is RFC 5005 - Feed Paging and Archiving, but sadly the world of RSS tools has never been very specification-forward, mostly because the publishers of RSS feeds are even more disinterested.

https://www.rfc-editor.org/rfc/rfc5005.html
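When a feed does implement it, catching up is just a matter of walking the rel="prev-archive" links backwards through the archives. A sketch using feedparser (the feed URL is made up, and in practice very few feeds publish these links):

    import feedparser

    def all_entries(feed_url):
        # Walk an RFC 5005 archived feed backwards via rel="prev-archive" links.
        entries, url = [], feed_url
        while url:
            d = feedparser.parse(url)
            entries.extend(d.entries)
            url = next((link.get("href")
                        for link in d.feed.get("links", [])
                        if link.get("rel") == "prev-archive"), None)
        return entries

    # entries = all_entries("https://example.com/feed.xml")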


ActivityStreams could be seen as a viable extension of RSS (aside from ActivityPub being based on it already), and it does support some simple pagination via its "Collection" vocabulary. Since ActivityStreams is ultimately based on JSON-LD, one could also add seamless querying support to an ActivityStreams endpoint based on SPARQL, for more advanced uses.


Pro tip: Split your gigantic (and certainly thoughtful!) comment[s] into paragraphs.

Makes it a ton easier to parse. Cheers!


Thanks for the advice and compliment! :D I usually write them out and then read them and use the edit function to insert paragraph breaks after the fact, but I forgot to do that this time lol


Yes, it is a big hurdle. However, I think content discovery is generally a big part of any content platform, way broader than discovering "who has reacted to my content". Now if you want to solve the problem of content discovery in a broader sense, then you have already fixed this particular shortcoming of the pull model as well. If a service can inform you about new posts with a particular hashtag, it most probably can also tell you about reactions to a particular post.

And yes, I do realise that such services will tend to not be really decentralised (similar to the relationship between websites and search engines). But that means the downside is not that you don't get such discovery, but that you'll be reliant on more centralised services for it, whereas in the fediverse you would be less reliant on such services for finding out who has commented on your post (though that will, as you've mentioned, still not be enough).


> Yes, it is a big hurdle. However, I think content discovery is generally a big part of any content platform, way broader than discovering "who has reacted to my content". Now if you want to solve the problem of content discovery in a broader sense, then you have already fixed this particular shortcoming of the pull model as well.

Right, but I don't think that finding all RSS feeds on the internet that satisfy certain criteria, like publishing a hashtag or responding to a particular post, is a problem that can actually be solved in a principled way in the general case, because a fundamental limitation of the pull methodology is that you have to know the list of places you are checking beforehand; you can't get content from somewhere you didn't previously know about. The only way to solve this would be to have some kind of crawling and indexing system that regularly crawls the entire internet looking for these expanded RSS feeds and then categorizes them according to various criteria in order to poll them. And that is both a very large technical investment and has a lot of limitations of its own. So in the end it seems like you haven't really distributed the work of a social media system more equally after all, you've just inverted who is doing the work, going from a federated set of servers that do all the work pushing content everywhere to a federated set of servers that do all the work pulling content from places.


I do recognise the fact that such "aggregators" would be hugely centralised (if not outright monopolised, like the search engine space). However, maybe I'm wrong, but I don't see the federated model succeeding without such services either, so I think of "need for centralised content discovery" as an independent problem, honestly.


I see your negatives as mostly positive. Engagement and virality are inevitably cancerous to any social network, and comment velocity needs to be suppressed and controlled to reduce entropy and limit the degree to which that network can be used by people primarily interested in "getting a following." Any feature (or anti-feature) that makes a platform unattractive to capitalists and influencers and shitposting trolls is a good thing. Discouraging people from posting and commenting is a good thing. Making it difficult to network is a good thing. None of these things need to be impossible, but I do believe there needs to be enough friction to make low hanging fruit and opportunism not worth the effort.

Otherwise everything gets taken over by AI and bots and psychopaths and propagandists and turns to shit.



