"Hundreds of thousands of people" are paying $20/mo for it, according to the CEO.[0] That seems like a very respectable place for such an early product to be.

It is extremely far from "no one".

[0]: https://www.youtube.com/watch?v=FWPmu_rKxJo&t=185s


Even if we say 1M people pay $20/month, that's only $240M/year. That's not enough to continue to grow and support the free users sans ads.


There's a lot we don't know about Perplexity's costs, like the cost of supporting free users, the difference in usage between free and paying users, and whether these ads fully offset the cost of free users.

From what I've seen, it's typical for startups at this stage to be generating minimal revenue. Perplexity has raised close to $1 billion, so they likely aren't under pressure to be profitable just yet. Bringing in tens of millions in revenue annually would actually be quite a strong start.

It may sometimes seem like “no one pays for anything”, but clearly a lot of people are paying for Perplexity Pro. It’s their business, so they can run ads if they think that will help them grow, but it’s not because no one paid for the Pro tier.


Why would they need to grow? If the revenues exceed the costs, that's a viable business model right there.


Maybe they have debt to pay off.


I wonder if, cross-examined on that claim, he would clarify that they have hundreds of thousands of Pro customers. Something like: it’s easy to get lost in the weeds of how much individual subscribers paid, but that is an accurate characterization of the number of Perplexity Pro subscribers.

The point is, I’ve had a Perplexity Pro subscription for a year, because my ISP was offering year-long access for free. I think it is pretty terrible. The answers when I select Claude 3.5 Sonnet as the model in Perplexity always seem incredibly stupid compared to the answers when I use Claude 3.5 Sonnet from Anthropic’s site (which I think is really good).

I like the idea of Perplexity supplying citations, but it seems more like it is parallel construction than citing how the model came up with a particular answer. And, it seems like it is tossing out superior results whenever it cannot pretend to show its work.


> I like the idea of Perplexity supplying citations, but it seems more like it is parallel construction than citing how the model came up with a particular answer.

It's an LLM, throwing out whatever text the model predicts you'll like best. It's amazing that they can hack it to provide links, but if you are expecting it to have a "thought-line" with "traceable research sources", you are just falling for the hype.


OP has a point. Why is this being downvoted? In my neck of the woods (UK and Ireland), everyone I know using Perplexity Pro got a free annual subscription with Revolut. Like OP, I tried it for a week and moved on.

Its citations are often bunkum, and its code generation abilities are very limited compared to GPT-4o or Claude 3.5 Sonnet; heck, even tiny 4-bit quantized 14B local LLMs like Qwen 2.5 Coder are way better at code generation.


Does template_plural actually work well / offer any benefits?


It does; we use it as a default. Some possible benefits are that 1) it saves input tokens, and 2) it in theory allows for different variations on a theme, whereas with two separate prompts you run the risk of repeating one topic.
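A toy illustration of the point (made-up prompts; this is a sketch of the idea, not the actual template_plural mechanics):

    # One "plural" prompt: the shared instructions are sent (and paid for)
    # once, and the model can deliberately make the variations distinct.
    plural = "Write 3 taglines for a coffee shop, each taking a distinct angle."

    # Two separate prompts: the shared instructions are paid for twice, and
    # nothing stops the independent calls from converging on the same idea.
    separate = [
        "Write a tagline for a coffee shop.",
        "Write a tagline for a coffee shop.",
    ]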


One problem from the benchmark:

      "prompt_id": "river_crossing_easy",
      "category": "Logic Puzzle",
      "title": "Easy river crossing",
      "prompt": "A farmer is on one side of a river with a wolf, a goat, and a cabbage. When he is crossing the river in a boat, he can only take one item with him at a time. The wolf will eat the goat if left alone together, and the goat will eat the cabbage if left alone together. How can the farmer transport the goat across the river without it being eaten?",
      "expected_behavior": [
        "Answer concludes that they simply get in the boat and cross together in one trip"
      ],
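(As an aside, here's a minimal sketch of how an entry like this might be scored; ask_model and judge are hypothetical stand-ins, since the excerpt doesn't show the benchmark's actual harness:)

    # Hypothetical harness for one benchmark entry like the JSON above.
    def ask_model(prompt: str) -> str:
        ...  # call the model under test

    def judge(answer: str, expected: str) -> bool:
        ...  # e.g. a grader model checks whether the answer shows the behavior

    def score(entry: dict) -> bool:
        answer = ask_model(entry["prompt"])
        # Pass only if every expected behavior shows up in the answer.
        return all(judge(answer, b) for b in entry["expected_behavior"])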
EDIT: removing most of my commentary on this problem. As a human, I was tricked by the problem too. I would love to see how a random selection of humans would do on this one… but it just doesn’t feel like a great test to me.


No. Simply plug the prompt into ChatGPT and see what happens.

The llm isn't getting confused by the meaning of "item". It's recognizing a common problem and not picking up on the fact that the farmer just needs to transport the goat and nothing else.

Instead, it gives the standard answer for how to transport everything across.


I'll admit that, as a fallible human, I didn't pick up on it, but I was focused on the wrong thing, because I've been using the "and the boat can take everything" variation, and GPT-3 just could not get that variation in one shot.

GPT-3 is old hat, though. Later versions of GPT-4 manage to get it with a bunch of coaching, and o1 manages to solve it with less coaching.


>This is twisting the English language to assume that "item" only refers to non-living things.

Not really. Unless I'm not reading correctly, most of the problem is irrelevant, as you're only required to take the goat across; you don't care about the cabbage. The difficulty lies in the assumption that you need to take everything across, due to the resemblance to the bigger problem.


You’re reading it correctly. I read it again after your comment and realized I, too, pattern-matched to the typical logic puzzle before reading it carefully and exactly. I imagine the test here is designed for this very purpose: to see whether the model is pattern matching or reasoning.


The problem only asks the farmer to transport the goat. So the farmer indeed just gets in the boat with the goat. The unstated gotcha is that the farmer is willing to abandon the wolf and the cabbage. A heavily pattern-matching LLM (or human) would immediately assume that the farmer needs to transport all three.


Yep, and that gotcha got me, as a perfectly non-silicon human. My bad everyone.


Wow, this seems ridiculous. The expected answer is basically finding a loophole in the problem. I can imagine how worthless all of these models would be if they behaved that way.


It's not a loophole, the question is "how can he get the goat across?". The answer is he just takes it across.


If you revise this prompt to satisfy your pedantry, (at least) 4o still gets it wrong.


For compressing short (<100 bytes), repetitive strings, you could potentially train a zstd dictionary on your dataset, and then use that same dictionary for all rows. Of course, you’d want to disable several zstd defaults, like outputting the zstd header, since every single byte counts for short string compression.
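A minimal sketch with the python-zstandard bindings (the sample strings here are made up; in practice you'd train on thousands of your actual rows):

    import zstandard

    # Train a shared dictionary on samples of the real rows.
    samples = [b"user:12345:profile", b"user:67890:profile",
               b"user:12345:settings"] * 500
    dictionary = zstandard.train_dictionary(1024, samples)

    # Turn off optional frame fields: every byte counts below 100 bytes.
    # (The 4-byte magic number can also be dropped via the library's
    # "magicless" format option, for a few more bytes of savings.)
    compressor = zstandard.ZstdCompressor(
        dict_data=dictionary,
        write_checksum=False,
        write_content_size=False,
        write_dict_id=False,
    )
    compressed = compressor.compress(b"user:55555:profile")

    # Decompression needs the same dictionary, plus an output size bound,
    # since the content size is no longer stored in the frame.
    decompressor = zstandard.ZstdDecompressor(dict_data=dictionary)
    original = decompressor.decompress(compressed, max_output_size=100)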


> Qwen team just dropped the Apache 2.0 licensed QvQ-72B-Preview

When they dropped it, the Hugging Face license metadata said it was under the Qwen license, but the actual LICENSE file was Apache 2.0. Now, they have "corrected" the mistake and put the Qwen LICENSE file in place.

I'm not a lawyer. If the Hugging Face metadata had also agreed it was Apache 2.0, then I would be pretty confident it was Apache 2.0 forever, whether Alibaba wanted that or not, but the ambiguity makes me less certain how things would shake out in a court of law.


Thinking about what feels different here, there are a couple of things that could be fun to implement:

- On iOS, opening and closing an app also scales and blurs/unblurs the wallpaper at the same time that it’s animating the app entering/exiting the foreground.

- Also, years ago, Apple added a very subtle 3D effect to the Home Screen. Essentially, when you’re looking at the Home Screen, as you tilt the phone, the icons and widgets move a few pixels in the direction of the tilt, which makes it feel like they’re popping out of the screen a little. To study the effect in detail, you can just look at the edge of an icon or the text below an icon and tilt the screen around and notice how it moves relative to the background image. It’s meant to be a very subtle effect, not some garishly dramatic effect.


I did that parallax background effect on a web site many years ago. Unfortunately (but understandably) accelerometer data is now behind a permission prompt on the web. Displaying a garish modal permission prompt so that you can do a subtle background transition doesn't make sense as a tradeoff.


I'm a bit confused, what permissions would this animation require?


You don’t want every website in the world to have unfettered access to your accelerometer - way too much sensitive info can be grokked from that.


Accelerometer data from the phone in order to determine the position for the parallax effect.


I actually implemented the parallax effect in my web version of iOS 7 back in 2013. It was meant for people to check the new design out on their phones before upgrading. You could even open/zoom into folders. Spent rather too much time on it, but it was quite novel back then. https://streamable.com/pki7ux


The parallax background effect has actually been broken for a few years now:

https://forums.macrumors.com/threads/home-screen-parallax-ef...


That linked discussion doesn't say it is broken all the time (which your comment strongly implies to anyone who doesn't read the link)... and I had verified the parallax effect was working fine earlier today when I made my comment.

I hadn't rebooted my phone in quite a while, so I'm not sure what the conditions are for that bug, but I think that discussion is wrong that rebooting is the only way to get it working again. I had surely accessed the App Library at some point since the last reboot, and yet parallax was working fine. But I can confirm that it does break (at least temporarily) after accessing the App Library.


It does have the blur thing when you slide over the widget screen.


That's disappointing if it's removed now, I used to really like that on my old phone!


I didn’t say it was gone! Just that Apple added it years ago.

The website this post is about doesn’t implement the feature.


You realize that quite a few people have disabled that 'subtle 3D effect' because they found it annoying?


Some countries consider the 1st floor to be the ground floor; others consider the 1st floor to be the floor above the ground floor, which the former countries consider the 2nd floor… I think 0/1-based indexing is more subjective than simply being a “relic of C” or a “horrible kludge” :P


I've been in the US for over a decade, and it still occasionally makes me do a double-take when a room numbered 1xx is on the ground floor.


It's cheaper and faster to train a small model, which is better for a research team to iterate on, right? If Google decides that a particular small model is really good, why wouldn't they go ahead and release it while they work on scaling up that work to train the larger versions of the model?


On the other hand, once you have a large model, you can use various distillation techniques to train smaller models faster and with better results. Meta seems to be very successful doing this with Llama, in particular.
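For the curious, "distillation" here usually means something like the classic soft-target setup; a minimal sketch (not necessarily Meta's actual recipe):

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # Soften both distributions so the teacher's "dark knowledge"
        # (relative probabilities of wrong answers) survives.
        soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
        log_student = F.log_softmax(student_logits / temperature, dim=-1)
        # KL divergence, scaled by T^2 to keep gradient magnitudes
        # comparable across temperatures (Hinton et al., 2015).
        return F.kl_div(log_student, soft_teacher,
                        reduction="batchmean") * temperature ** 2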


> Meta seems to be very successful doing this with Llama, in particular.

Kind of sort of: https://news.ycombinator.com/item?id=42391096


I have no knowledge of Google's specific case, but in many teams, smaller models are trained on bigger frontier models through distillation. So the frontier models come first, then the smaller models later.


Training a "frontier model" without testing the architecture is very risky.

Meta trained the smaller Llama 3 models first, and then trained the 405B model on the same architecture once it had been validated on the smaller ones. Later, they went back and used that 405B model to improve the smaller models for the Llama 3.1 release. Mistral started with a number of small models before scaling up to larger models.

I feel like this is a fairly common pattern.

If Google had a bigger version of Gemini 2.0 ready to go, I feel confident they would have mentioned it, and it would be difficult to distill it down to a small model if it wasn't ready to go.


Location: Birmingham, AL

Remote: Preferred

Willing to relocate: Yes

Technologies: Rust, Go, TypeScript, Postgres, Redis, Kafka, AWS, GCP, Remix, Prisma, etc.

Resume/CV: https://drive.google.com/file/d/16LlexfHCAUge8blRTsu5J_R5g1p...

Email: my username @ gmail.com

I'm a very technically skilled software engineer with a broad range of experience, but primarily focused on developing robust backend systems. I have a preferred set of languages (Rust, Go, TypeScript) that I find covers a lot of problems, but I know others if there is a good reason to use something else.

I would like to find a role at a company that is actually having a meaningful impact.


To be specific, a single WSE-3 has the same die area as about 57 H100s. It's a big chip.
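(Cerebras quotes roughly 46,225 mm² for the WSE-3, versus ~814 mm² for an H100 die: 46,225 / 814 ≈ 57.)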


It is worth splitting out the stacked memory silicon layers on both, too (if Cerebras is set up with external DRAM). HBM stacks are over 10 layers now, so the total die area is a good bit more than the chip footprint, but different process nodes are involved.


Amazing!

