Ask HN: What are the best links that we've missed?

paraschopra · on Jan 18, 2010

I was actually thinking of making a predictive model which would rate the new stories automatically. There are a lot of features that can be used in learning the prediction model: text length, content, etc. I already have the infrastructure more or less ready: http://www.wingify.com/contextsense/

In my opinion automatically rating posts will provide a useful filter to find stories that the community may have missed due to timezone/title issues.

What do you think about the idea?

astroguy · on Jan 18, 2010

Nice idea. I am curious to know how contextsense gives ratings. Could you please describe it in more detail? Thank you.

paraschopra · on Jan 18, 2010

It doesn't give out ratings. Rather, it generates tags and categories which represent a submission, such as: erlang, database, module, in-memory, etc. You can also generate other features such as word count, presence of images, domain name, poster's name, etc.

What you then do is go through all (or as many as you can) of the stories that have been on the HN front page and calculate features for them. Those features are used to learn a prediction model, which you can apply to new stories to predict ratings.
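A minimal sketch of that workflow, under stated assumptions: the story fields ("tags", "word_count", "has_image"), the sample data, and the choice of a plain logistic regression are all hypothetical and are not how contextsense works. The label is 1 if a story reached the front page and 0 otherwise.

```python
import math

def extract_features(story, vocabulary):
    """Turn a story dict into a numeric feature vector (fields are hypothetical)."""
    tags = set(story["tags"])
    features = [1.0]                                  # bias term
    features.append(story["word_count"] / 1000.0)     # rough scaling
    features.append(1.0 if story["has_image"] else 0.0)
    features.extend(1.0 if tag in tags else 0.0 for tag in vocabulary)
    return features

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    """Estimated probability that a story would make the front page."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))

def train(rows, labels, epochs=500, learning_rate=0.5):
    """Fit logistic regression weights with stochastic gradient ascent."""
    weights = [0.0] * len(rows[0])
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = y - predict(weights, x)
            for i, xi in enumerate(x):
                weights[i] += learning_rate * error * xi
    return weights

# Made-up training data, for illustration only.
vocabulary = ["erlang", "database", "in-memory", "celebrity", "gossip"]
past_stories = [
    ({"tags": ["erlang", "database"], "word_count": 800, "has_image": False}, 1),
    ({"tags": ["database", "in-memory"], "word_count": 1200, "has_image": False}, 1),
    ({"tags": ["celebrity"], "word_count": 300, "has_image": True}, 0),
    ({"tags": ["celebrity", "gossip"], "word_count": 200, "has_image": True}, 0),
]
rows = [extract_features(story, vocabulary) for story, _ in past_stories]
labels = [label for _, label in past_stories]
weights = train(rows, labels)

# Score two new, unseen stories.
good = extract_features(
    {"tags": ["erlang", "in-memory"], "word_count": 900, "has_image": False}, vocabulary)
bad = extract_features(
    {"tags": ["gossip"], "word_count": 250, "has_image": True}, vocabulary)
print(predict(weights, good), predict(weights, bad))
```

With real data you would want far more stories and a held-out set to check that the predictions mean anything.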

whatusername · on Jan 18, 2010

I like the general concept. The only issue I really see is with promoting groupthink. Doesn't that approach create a positive feedback loop? (To be specific: if those predictions were exposed in a meaningful way, people would be more likely to view those stories, thus they would be upvoted, thus similar stories would be predicted to be successful.)

The second issue is that stuff like this (one of mine, but I think it was a fascinating article) would probably fail any relevancy test: http://news.ycombinator.com/item?id=1046378 Concepts: Charm, Bed and Breakfast, Localities, Lodging, Ohio

paraschopra · on Jan 18, 2010

I think HN community is composed of mostly independent thinkers, so what it will more likely do is expose interesting stories to them; it is their choice in the end whether to actually upvote or not. (That is precisely the problem we are trying to solve: interesting stories not getting noticed.)

Yes, the algorithm won't be perfect, but it could factor in historical data such as how many upvotes the submitter's other stories got or how many votes stories from that website got. On both those counts, your post will score high.
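Those two historical signals could be computed roughly like this. It is only a sketch: the field names ("submitter", "url", "points"), the usernames, and the sample stories are invented for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse

def build_history(past_stories):
    """Group past points by submitter and by the domain of the submitted URL."""
    by_submitter = defaultdict(list)
    by_domain = defaultdict(list)
    for story in past_stories:
        by_submitter[story["submitter"]].append(story["points"])
        by_domain[urlparse(story["url"]).netloc].append(story["points"])
    return by_submitter, by_domain

def mean(values, default=0.0):
    """Average of a list, or a default for submitters/domains never seen before."""
    return sum(values) / len(values) if values else default

def history_features(story, by_submitter, by_domain):
    """Average past points for this story's submitter and for its domain."""
    domain = urlparse(story["url"]).netloc
    return {
        "submitter_avg_points": mean(by_submitter.get(story["submitter"], [])),
        "domain_avg_points": mean(by_domain.get(domain, [])),
    }

# Made-up history, for illustration only.
past = [
    {"submitter": "alice", "url": "http://example.com/a", "points": 40},
    {"submitter": "alice", "url": "http://example.org/b", "points": 60},
    {"submitter": "bob", "url": "http://example.com/c", "points": 2},
]
by_submitter, by_domain = build_history(past)
new_story = {"submitter": "alice", "url": "http://example.com/d"}
features = history_features(new_story, by_submitter, by_domain)
print(features)
```

These two numbers would simply be appended to whatever other features the model already uses.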

RiderOfGiraffes · on Jan 18, 2010

I like the idea - I've been toying on and off with similar thoughts, but it appears you're more advanced in your thinking than I.

However, you say:

  > I think HN community is composed of mostly independent thinkers ...

That's certainly been true, and is probably still mostly true, but the evidence suggests it's becoming less true as HN becomes more popular and well-known. I think that's inevitable, and is recognised by PG, which is why he's testing and experimenting with the way votes and karma work.

I think you need to plan for the quality and subjects to drift significantly in the future, starting now. If you can solve (or anticipate) that, you may have a winner.

cperciva · on Jan 18, 2010

I think my 'Dissecting SimpleDB BoxUsage' post deserved more attention than it received -- AFAIK it is the first (and perhaps only?) time someone published exactly what SimpleDB requests cost.

http://news.ycombinator.com/item?id=227327

PStamatiou · on Jan 18, 2010

http://hnweekly.chibidesign.com/

csuper · on Jan 18, 2010

That's really cool! I sometimes go on streaks of actual work and always knew I was missing some great stuff. Thanks.

Alex3917 · on Jan 18, 2010

From my last twenty submissions...

Pretending objects: http://www.boingboing.net/2009/11/16/pretending-and-games.ht...

Superstitious beliefs cemented before birth: http://dsc.discovery.com/news/2009/10/30/paranormal-supersti...

Sacrificial virgins of the Mississippi: http://www.salon.com/books/review/2009/08/06/cahokia/index.h...

Also, anything that isn't text rarely makes the front page regardless of how good it is. On that note, I think the new technology developed to create the movie Oceans is pretty cool:

http://www.youtube.com/watch?v=UfjEydlUdT8

jayliew · on Jan 18, 2010

I use PostRank (what used to be AideRSS) to sift through my piles of hundreds of RSS feeds. Basically some feeds are so high volume that I can say, "give me the top 30% most popular" only and just look at those. There are buckets like "good", "great", etc.

It does various things to determine the interesting-ness of a blog post.

http://www.postrank.com/
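The "top 30%" style of filter can be sketched generically as below. This is not PostRank's algorithm or API; the "score" field and the sample posts are assumptions made for illustration.

```python
def top_percent(posts, percent=30):
    """Keep the highest-scoring `percent` of posts, always at least one."""
    if not posts:
        return []
    keep = max(1, (len(posts) * percent + 99) // 100)  # integer ceiling
    ranked = sorted(posts, key=lambda post: post["score"], reverse=True)
    return ranked[:keep]

# Made-up feed of ten posts with scores 1 through 10.
feed = [{"title": f"post {n}", "score": n} for n in range(1, 11)]
best = top_percent(feed, percent=30)
print([post["score"] for post in best])
```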

jamesbritt · on Jan 18, 2010

Idle thought: A script running on HN that, on slow news days, looks back on submissions on high news days and finds those that (by some criteria) got pushed off the radar too quickly.

It then selects some number of these (again, by whatever criteria, TBA, etc.) and re-submits them.

That way, slow news days go away, and decent links get a second chance.
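One way the script might look, as a rough sketch only. Since the criteria are still TBA, everything here is a placeholder assumption: a day counts as slow or busy by its submission count, a story was "pushed off too quickly" if it was visible for at most `max_minutes_visible`, and candidates are ranked by points per minute of exposure. The field names and sample data are hypothetical.

```python
def pick_resubmissions(history, todays_count, slow_max=50, busy_min=150,
                       max_minutes_visible=30, limit=3):
    """Pick stories from busy days to resubmit on a slow day.

    history maps a date string to that day's list of story dicts.
    """
    if todays_count > slow_max:
        return []  # today is not a slow news day
    candidates = []
    for stories in history.values():
        if len(stories) < busy_min:
            continue  # only look back at busy days
        for story in stories:
            if story["minutes_visible"] <= max_minutes_visible:
                candidates.append(story)
    # Placeholder ranking: points earned per minute of visibility.
    candidates.sort(
        key=lambda s: s["points"] / max(s["minutes_visible"], 1),
        reverse=True,
    )
    return candidates[:limit]

# Made-up history with tiny thresholds so the example is easy to follow.
history = {
    "2010-01-15": [
        {"id": "A", "points": 3, "minutes_visible": 10},
        {"id": "B", "points": 1, "minutes_visible": 20},
        {"id": "C", "points": 40, "minutes_visible": 600},
    ],
    "2010-01-16": [
        {"id": "D", "points": 5, "minutes_visible": 5},
    ],
}
picked = pick_resubmissions(history, todays_count=1, slow_max=2, busy_min=3, limit=1)
print([story["id"] for story in picked])
```

Story D is skipped because its day was not busy, and C is skipped because it already had plenty of exposure.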

yungchin · on Jan 18, 2010

A couple of days ago I was really looking for everyone's insights on this patent application by Google: http://news.ycombinator.com/item?id=1042128 ... but the original title I put on it didn't make clear what it was (I think it said "Google News related"). If it could get a second chance, I'm still quite curious what people think.

kf · on Jan 18, 2010

This one didn't make it because of tweet down-weighting.

http://news.ycombinator.com/item?id=1049016 Still no confirmation...

akkartik · on Jan 18, 2010

Resubmitting the runt of my recent submissions: the myth of college football http://news.ycombinator.com/item?id=1059489