
programjames · 2025-10-22T01:30:21 1761096621

Hmm, I think a mixture of beta distributions could work just as well as categorical here. I'm going to train it for PixelRNN, but it's going to take hours or days to train (it's a very inefficient and unparallelizable architecture). I'll report back tomorrow.

programjames · 2025-10-24T03:10:41 1761275441

Update 2:

After another 24 hours of training and around 100 epochs, we get down to 4.4 bits/dim and colors are starting to emerge[1]. However, a friend pointed out an issue: training a beta distribution with log-likelihood weights values near 0 and 1 much more heavily:

     log(Beta likelihood) = (alpha - 1) * log(x) + (beta - 1) * log(1 - x) - log B(alpha, beta)
                                           ^
                      log(x) --> -oo as x --> 0, so this term --> +oo when alpha < 1

This means we should see most outputs be pure colors: black, white, red, blue, green, cyan, magenta, or yellow. 3.6% of the channels are 0 or 255, up from 1.4% after 50 epochs[2]. Apparently, an earth-mover loss might be better:

    E_{x ~ output distribution}[|correct - x|]
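
A rough way to compare the two losses above in plain Python. This is only a sketch: the function names are made up for illustration, the log-density is written with the usual -1 exponents and normalizer, and the earth-mover expectation is estimated by Monte Carlo sampling, which is not necessarily how it was computed in the actual training run.

```python
import math
import random

def beta_log_pdf(x, alpha, beta):
    """Log-density of Beta(alpha, beta) at x in (0, 1)."""
    log_norm = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return (alpha - 1) * math.log(x) + (beta - 1) * math.log1p(-x) - log_norm

def earth_mover_loss(correct, alpha, beta, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E_{x ~ Beta(alpha, beta)}[|correct - x|]."""
    rng = random.Random(seed)
    total = sum(abs(correct - rng.betavariate(alpha, beta)) for _ in range(n_samples))
    return total / n_samples

# With alpha < 1 the log-likelihood grows without bound as x -> 0 ...
for x in (1e-2, 1e-6, 1e-12):
    print(x, beta_log_pdf(x, 0.5, 0.5))

# ... while the earth-mover loss always stays within [0, 1].
print(earth_mover_loss(0.0, 0.5, 0.5))
```

The log-likelihood can be pushed arbitrarily high by concentrating mass at the endpoints, whereas the earth-mover loss is bounded, so it has no such incentive.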

I could retrain this for another day or two, but PixelRNN is really slow, and I want to use my GPU for other things. Instead, I trained a 50x faster PixelCNN for 50 epochs with this new loss and... it just went to the average pixel value (0.5). There's probably a way to train a mixture of betas, but I haven't figured it out yet.

[1]: https://imgur.com/kGbERDg [2]: https://imgur.com/iJYwHr0

programjames · 2025-10-23T01:25:32 1761182732

Update 1: After ~12 hours of training and 45 epochs on CIFAR, I'm starting to see textures.

https://imgur.com/MzKUKhH

programjames · 2025-10-25T02:55:01 1761360901

Update 3:

Okay, so my PixelCNN masking was wrong... which is why it went to the mean. The earth-mover did get better results than negative log-likelihood, but I found a better solution!

The issue with negative log-likelihood was the neural network could optimize solely around zero and one because there are poles there. The key insight is that the color value in the image is not zero or one. If we are given #00, all we really know is the image from the real world had a brightness between #00 and #01, so we should be integrating the probability density function from 0 to 1/256 to get the likelihood.
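
A toy illustration of the point above. It assumes a Beta(alpha, 1) distribution purely because its CDF has the closed form x^alpha (the general case needs the incomplete beta function), and the helper names are hypothetical.

```python
import math

def density(x, alpha):
    """PDF of Beta(alpha, 1): alpha * x**(alpha - 1)."""
    return alpha * x ** (alpha - 1)

def cdf(x, alpha):
    """CDF of Beta(alpha, 1): x**alpha."""
    return x ** alpha

def bin_log_likelihood(value, alpha, levels=256):
    """Log-probability that the true brightness fell in the bin for `value`.

    An 8-bit value k is treated as 'somewhere in [k/256, (k+1)/256)'.
    """
    lo, hi = value / levels, (value + 1) / levels
    return math.log(cdf(hi, alpha) - cdf(lo, alpha))

alpha = 0.5
print(density(1e-9, alpha))          # point density near the pole: very large
print(bin_log_likelihood(0, alpha))  # bin likelihood for #00: finite, at most 0
```

Because each bin's probability is at most 1, the log-likelihood per channel can never exceed 0, so the pole no longer offers unbounded reward.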

It turns out PyTorch does not have a good implementation of Beta.cdf(), so I had to roll my own. Realistically, I just asked the chatbots to tell me what good algorithms there were and to write me code. I ended up with two:

(1) There's a known continued fraction form for the CDF, so combined with Lentz' algorithm it can be computed.

(2) Apparently there's a pretty good closed-form approximation as well (Temme [1]).
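
For reference, option (1) can be sketched in plain Python along the lines of the classic textbook formulation (continued fraction for the regularized incomplete beta function, evaluated with the modified Lentz method). This is not the code used in the experiment: it is scalar, not differentiable, and the names `beta_cdf` and `_beta_continued_fraction` are made up here.

```python
import math

def _beta_continued_fraction(a, b, x, max_iter=300, eps=1e-14, tiny=1e-300):
    """Continued fraction for I_x(a, b), evaluated with the modified Lentz method."""
    def lentz_step(coef, c, d):
        # One Lentz update; `tiny` guards against division by zero.
        d = 1.0 + coef * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + coef / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        return c, d, c * d

    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-indexed coefficient.
        coef = m * (b - m) * x / ((a + m2 - 1.0) * (a + m2))
        c, d, delta = lentz_step(coef, c, d)
        h *= delta
        # Odd-indexed coefficient.
        coef = -(a + m) * (a + b + m) * x / ((a + m2) * (a + m2 + 1.0))
        c, d, delta = lentz_step(coef, c, d)
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def beta_cdf(x, a, b):
    """Regularized incomplete beta function I_x(a, b), i.e. the Beta(a, b) CDF."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # The fraction converges quickly for x < (a + 1) / (a + b + 2);
    # otherwise use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a).
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b

# Likelihood of the 8-bit value #00 under Beta(0.5, 0.5): mass of [0, 1/256).
print(beta_cdf(1 / 256, 0.5, 0.5) - beta_cdf(0.0, 0.5, 0.5))
```

A training version would need the same recurrence written with tensor operations so gradients can flow through alpha and beta, which may be where the instability mentioned below comes from.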

The first one was a little unstable in training, but worked well enough (output: [2], color hist: [3]). The second was a little more stable in training, but had issues with NaNs near zero and one, so I had to clamp things there, which makes it a little less accurate (output: [4], color hist: [5]).

The bits/dim gets down to ~3.5 for both of these, which isn't terrible, but there's probably something that can be done better to get it below 3.0. I don't have any clean code to upload, but I'll probably do that tomorrow and edit (or reply to) this comment. But, that's it for the experiments!

Anyway, I ran this experiment because this sentence was really bothering me:

> But categorical distributions are better for modelling.

And when I investigated why you said that, it turns out the PixelRNN authors used a mixture of Gaussians, and even said they're probably losing some bits because Gaussians go out of bounds and need to be clipped! So, I really wanted to say, "seems like a skill issue, just use Beta distributions," but then I had to go check if that really did work. My hypothesis was Betas should work even better than a categorical distribution because the categorical model would have to learn nearby outputs are indeed nearby while this is baked into the Beta model. We see the issue show up in the PixelRNN paper, where their outputs are very noisy compared to mine (histogram for a random pixel: [6]).

[1]: https://ir.cwi.nl/pub/2294/2294D.pdf [2]: https://imgur.com/e8xbcfu [3]: https://imgur.com/z0wnqu3 [4]: https://imgur.com/Z2Tcoue [5]: https://imgur.com/p7sW4r9 [6]: https://imgur.com/P4ZV9n4

programjames · 2025-10-20T20:46:58 1760993218

I would prefer you refer to it as "courtesy" or "consideration" rather than "freedom".

programjames · 2025-10-18T05:40:31 1760766031

And paper money is a Chinese invention. Doesn't mean it's worthwhile to spend two weeks in an anthropology class talking about how much awesomer they are.

programjames · 2025-10-17T19:44:44 1760730284

That isn't a comparison to the state of the art, just a naive quantum clock.

programjames · 2025-10-14T20:17:09 1760473029

I believe the issue is that measures of 'achievement' and 'excellence' typically use the 20th and 50th percentiles, respectively. For example, most education studies focus on what improves graduation rates, passing rates, or the bottom 10–20% on standardized assessments. Most school districts' financial incentives rely on these metrics as well. Occasionally you'll get a study that looks at the median (and, even more rarely, the 90th percentile), but no standardized assessment even captures the tail ends of the distribution. This is why you get claims like this

> High-achieving kids are doing roughly as well as they always have, while those at the bottom are seeing rapid losses

in the article, when they're actually doing much worse as well, which can be seen from the median (or top) scores in STEM competitions. All the financial incentives are for schools and teachers to focus on their bottom 20% of students, and even if they cared about their top students (culturally, most education departments do not), it's not even feasible to collect that data, short of signing everyone up for the AMC 12. So, naturally, the percent of elementary schools offering gifted programs has declined by fifteen (twenty?) percent in the last twenty years, and the general standards and curricula have been lowered to teach to the 20th percentile. This also creates a cultural issue where many students recognize they're bored, and that schools are not trying to teach them, so they become disaffected and stop caring. My friends in high school joked that school was for socializing, and frankly, what other purpose does it serve anymore for most kids?

jltsiren · 2025-10-14T20:34:40 1760474080

> My friends in high school joked that school was for socializing, and frankly, what other purpose does it serve anymore for most kids?

Schools have always been like that, at least if you ask the kids. The difference is whether the parents value education and expect their kids to work hard and study, or if they only see the school as a daycare center that allows them to work full time. And whether the wider society values education.

programjames · 2025-10-18T19:39:15 1760816355

I don't think schools have always been like that. I especially don't think schools have been like that for the top 10% of students until very recently (<20 years).

programjames · 2025-10-07T18:30:51 1759861851

It's a little easier to see what's happening if you fully write out the central flow:

    -1/η * dw/dt = ∇L - ∇S * ⟨∇L, ∇S⟩/‖∇S‖²

We're projecting the loss gradient onto the sharpness gradient, and subtracting it off. If you didn't read the article, the sharpness S is the sum of the eigenvalues of the Hessian of the loss that are larger than 2/η, a measure of how unstable the learning dynamics are.
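
A small numeric sketch of that projection, assuming the two gradients are already available as plain lists (computing ∇S itself requires Hessian eigenvalues and is not shown). The function names are made up for illustration, and ∇S is assumed to be nonzero.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def central_flow_direction(grad_L, grad_S, eta):
    """Return dw/dt = -eta * (grad_L - grad_S * <grad_L, grad_S> / ||grad_S||^2)."""
    scale = dot(grad_L, grad_S) / dot(grad_S, grad_S)
    return [-eta * (gl - scale * gs) for gl, gs in zip(grad_L, grad_S)]

grad_L = [3.0, 4.0]
grad_S = [1.0, 0.0]
dw_dt = central_flow_direction(grad_L, grad_S, eta=0.1)
print(dw_dt)               # the component along grad_S has been removed
print(dot(dw_dt, grad_S))  # zero up to rounding: S is unchanged to first order
```

The resulting step is orthogonal to ∇S, so to first order it lowers the loss without changing the sharpness.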

This is almost Sobolev preconditioning:

    -1/η * dw/dt = ∇L - ∇S = ∇(I - Δ)L

where this time S is the sum of all the eigenvalues (so, the Laplacian of L).

lcnielsen · 2025-10-07T18:56:03 1759863363

Yeah, I did a lot of traditional optimization problems during my Ph.D., and this type of expression pops up all the time with higher-order gradient-based methods. You rescale or otherwise adjust the gradient based on some system-characteristic eigenvalues to promote convergence without overshooting too much.

d3m0t3p · 2025-10-07T20:11:18 1759867878

This sounds a lot like what the Muon / Shampoo optimizers do.

d3m0t3p · 2025-10-07T21:42:29 1759873349

Would you have some literature about that?

lcnielsen · 2025-10-08T19:20:39 1759951239

There's a ton but it's pretty scattered. Yurii Nesterov's a big name, for example.

programjames · 2025-10-07T02:01:38 1759802498

I think the take is: If 100k people watch the episode, spending $200 more for higher quality subtitling comes out to... a whopping 0.2 cents per person (per episode). Let's just say that would cost an extra $1/month per person. Are they price sensitive enough that they won't go to a competitor that's a few dollars more expensive per month if it has better subtitles? I don't know, but maybe some manager believed they were, and thus it was worth it to make the subscription a little cheaper.

varenc · 2025-10-07T03:43:10 1759808590

> Are they price sensitive enough that they won't go to a competitor that's a few dollars more expensive per month if it has better subtitles

Outside of Asia, Crunchyroll is a de-facto monopoly on legal anime. From the article, 70% of new releases are exclusive to Crunchyroll. They're not losing customers to platforms with better subs, because customers have no alternative.

(Besides pirating, but I assume the golden age of Tier 1 fan subs is over)

viraptor · 2025-10-07T06:29:03 1759818543

> I assume the golden age of Tier 1 fan subs is over

That's just because the legal options were easily available, right? Kind of like people stopped pirating as much when Netflix was actually decent. But now the tides are turning again, so maybe the fan subs will start coming back as well.

Sophira · 2025-10-07T08:13:21 1759824801

There used to be an unwritten rule in fansubbing that you should only fansub anime that didn't have a licensed release - but of course that was during the time when barely any anime got licensed.

Still, though, I wonder if that mindset is still going to be around.

joseda-hg · 2025-10-07T13:09:58 1759842598

Less so now. The bar is higher because there's a baseline good-enough product: in the past you'd have done it anyway, with more care, but now, unless the official sub is bad enough, why would you bother?

I remember watching (I think) the Netflix release of Komi Can't Communicate and noticing a lot of things being missed. Komi's literal main means of communication (a notebook she writes in) went untranslated for some episodes, and there were a lot of things I had to fill others in on that would normally at least have been a T/N in a fansub.

It was bad enough that I went looking elsewhere to see if I had missed more than I realized, and the fansub did have everything covered

Hamuko · 2025-10-07T09:09:48 1759828188

At the moment the threshold for a fansub getting made or not is whether or not the licensed releases are "good enough". If the official releases are terrible, expect someone to step up and at least fix the typesetting even if they use the script from the license.

ineedasername · 2025-10-07T16:53:36 1759856016

Also, it’s trivial to stand up a minimal-quality STT + translation workflow in something like ComfyUI, using all freely available models, and run it on a modest consumer GPU; a ~3050 is just fine. If you’re handy with this tech you can do a lot better. If Crunchyroll is only going to have slightly better quality, then it can be appealing even to moderate fans who wouldn’t spend the time doing things manually.

hnfong · 2025-10-07T08:57:50 1759827470

> Crunchyroll is a de-facto monopoly

Here's the answer right here...

snickerbockers · 2025-10-07T14:11:44 1759846304

I don't think that's accurate to the current market. Ten years ago it was true but in 2025 they have several competitors and not nearly as many exclusives. I can name several counterarguments.

* Shonen anime, which are consistently the most popular ones, are also on netflix and probably several other services. Eg, demon slayer, dandadan, etc.

* there are still shows that are japan-exclusive because nobody bothers to license them. Roboshinkalion is an entire franchise that nobody cares to import! We actually had to wait two extra years for gridman universe because nobody bothered to license it for English localization!

* just this year they failed to obtain the rights to Mobile Suit Gundam G-Quuuuuux and Panty and Stocking With Garterbelt because amazon outbid them. These are both new entries in well-established brands and they're both made by studios with large fan followings (khara for g-quuuuuux and trigger for panty and stocking).

* somehow Hulu ended up breaking harmony gold's 45-year blockade around the macross franchise and won exclusive streaming rights.

* netflix has a lot of exclusives these days, including Jojo stone ocean and the upcoming steelball run.

sleepybrett · 2025-10-07T16:26:30 1759854390

I've been hearing a lot about HiDive recently

jeron · 2025-10-07T05:41:47 1759815707

I was going to ask, crunchyroll has competitors for legal anime streaming?

Tor3 · 2025-10-07T07:11:53 1759821113

At least in Europe, if CR has licensed a show or a season, then nobody else can license the same show or season. There's always exactly one place to watch one particular show or season. So, no competition - licensing goes to one, and only one place. Likewise, if Netflix has licensed something then CR isn't getting that license (e.g. Komi Can't Communicate - it's on Netflix, therefore not available on CR)

Erikun · 2025-10-07T07:28:33 1759822113

This may be true for current seasons but previous seasons and finished series are often available on other services. At least crunchyroll and Netflix have an overlap (in Sweden). Frieren is available on both as an example.

Tor3 · 2025-10-07T08:01:03 1759824063

Look at Railgun, for example.. they're all old, and each season is either on CR or on Netflix, never on both. Same with Index, and some others.

Erikun · 2025-10-07T12:44:00 1759841040

I’ll definitely give you that there are weird licensing shenanigans going on.

MSFT_Edging · 2025-10-07T12:31:16 1759840276

Who ever heard of an anime fan pirating media.....

Nicook · 2025-10-07T14:55:42 1759848942

By tier 1, we refer to commiesubs right?

slackfan · 2025-10-07T17:54:20 1759859660

[gg] of course.

chii · 2025-10-07T03:25:54 1759807554

I think you've counted it in a way that makes it sound cheap, when in reality it isn't.

$100k per month is extra revenue, if they do a half-assed job. A customer actually has no competitor to move to - crunchyroll has a defacto monopoly (barring piracy).

The price of the subscription is already adjusted to be the maximum of what the market would bear for maximum revenue - presumably raising that price higher would lead to lower subscribers and revenue.

praxulus · 2025-10-07T04:27:09 1759811229

>A customer actually has no competitor to move to - crunchyroll has a defacto monopoly (barring piracy).

When fansubs were good, Crunchyroll was forced to compete with them on quality. It's hard to convince people to pay when the alternative is both free and much higher quality.

Now that they've driven fansubs groups "out of business", they no longer face the same degree of competitive pressure to deliver a quality product.

jcranmer · 2025-10-07T04:57:02 1759813022

My recollection is that, by the early days of Crunchyroll, fansubs weren't really competing on quality so much as speed. And with the legitimate licensors having access to the scripts slightly in advance of the Japanese release, the fansubs could never catch up to them in release speed.

extraduder_ire · 2025-10-07T08:41:19 1759826479

In the very early days they were a piracy site that hosted fansubs.

brigandish · 2025-10-07T03:35:28 1759808128

> barring piracy

That’s the key right there.

bl4ckneon · 2025-10-07T03:41:28 1759808488

The ironic part is that a large majority of the piracy is just Crunchyroll rips or subtitles from Crunchyroll.

gh02t · 2025-10-07T17:21:33 1759857693

Also worth remembering that CR itself started as a pirate site.

FooBarWidget · 2025-10-07T04:52:41 1759812761

Why is the $1 added to the subscription cost? They don't redo the subs every month. You develop the subs once and then enjoy the benefits forever. It should be a cost that's amortized over something.

thaumasiotes · 2025-10-07T07:04:37 1759820677

Well, it's not completely crazy. They don't redo the subs for an old show every month. But they do create new subs for new shows every month. They have constant, ongoing costs of subtitle development, and if they permanently increase those costs, they will be spending additional money (compared to the alternative) every month forever.

Loocid · 2025-10-07T03:40:16 1759808416

$1/month extra cost on $16/month of revenue is very significant though.

Retric · 2025-10-07T12:07:57 1759838877

$1/month was just wildly off.

They have 17 million paying subscribers. If they subtitled 1,000 episodes of content a month at $200 each, that's $200k / 17 million ≈ 1 cent per subscriber per month. Actual cost per subscriber is well below that.
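
A quick check of the per-person figures quoted in this thread. The inputs (100k viewers, $200 per episode, 1,000 episodes a month, 17 million subscribers) are the commenters' own hypotheticals, and the helper name is made up.

```python
def cents_per_person(extra_cost_usd, people):
    """Spread an extra cost in dollars over `people`, returning cents each."""
    return 100.0 * extra_cost_usd / people

# $200 extra on one episode watched by 100k people.
print(cents_per_person(200, 100_000))

# 1,000 episodes a month at $200 each, spread over 17 million subscribers.
print(cents_per_person(1_000 * 200, 17_000_000))
```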

matheusmoreira · 2025-10-07T03:43:28 1759808608

> Are they price sensitive enough that they won't go to a competitor that's a few dollars more expensive per month if it has better subtitles?

They should probably consider that this competitor is actually mpv playing the DRM-free blu-ray quality fully subtitled mkv files obtained for a grand total of zero dollars from organized groups of people who simply care about anime to an absurd degree.

"Paying customer" is a synonym for "fool" in this context. Paying for inferior products is just foolish. It is damaging to one's self-respect. It is even more damaging for the reputation of the corporation. A bunch of fans regularly put them to shame by releasing better products on a daily basis. That's just pathetic.

what · 2025-10-07T05:15:40 1759814140

Without those “fools” paying, you would have no content to pirate.

matheusmoreira · 2025-10-07T05:36:20 1759815380

I'm actually one of the fools who tries to support creators by "buying" (licensing with 0 rights) their things. Why do you think I'm so angry at the shit quality of the products I receive in return? Anger doesn't even begin to describe what I feel when I pay for streaming services and get video so poorly encoded they have artifacts in black frames.

entropi · 2025-10-07T07:33:10 1759822390

I am also paying for crunchyroll and trying to support the creators in various ways.

But still, I often find myself watching anime from fansub groups even though I have a legitimate, official way of watching them. Paying for a streaming service that is objectively, significantly worse than even the shittier pirate offerings does make me feel like a fool.

DecoySalamander · 2025-10-07T07:31:36 1759822296

Anime will not disappear if CR implodes. It will still be funded by the Japanese market and other streamers. There will probably be fewer shows per season for a while, but that's not necessarily a bad thing.

GreenWatermelon · 2025-10-07T09:22:26 1759828946

Anime existed before Crunchyroll and will continue to exist after Crunchyroll.

Dylan16807 · 2025-10-07T10:08:11 1759831691

And sometimes it's more fun when there's no central source. Snarky chapter titles and leaving in a commercial for Morning Rescue when editing down the TV rip? Sure, why not.

promano · 2025-10-07T10:28:58 1759832938

Correlation is not causation and all that, but anime was better without Crunchyroll.

fhd2 · 2025-10-07T08:32:10 1759825930

I don't believe managers can operate with that kind of precision. I don't know how they'd execute the "let's spend $200 more" idea. You're usually either in a quality mindset or in a cost-reduction mindset, and these are _really_ difficult to mix for management. I know, I've tried :) When you even bring up how long something takes, that can already have adverse effects on quality without you actually decreeing anything.

bonecrusher2102 · 2025-10-07T13:09:37 1759842577

Well, they can, and at least did. I know because I was one of them! The P&L that I rolled up to our execs was dead simple as well. I think everyone had a pretty clear picture of what was going on, down to the fraction of the hour.

programjames · 2025-10-06T17:10:04 1759770604

As someone in the Gen Z age bracket, I believe that they really do have a worse economy than their parents/grandparents due to no fault of their own, and that they're also much lazier than their parents' and grandparents' generations.

1. Social Security and housing have turned into a huge transfer of wealth from Gen Z to the Boomer generation. I don't believe I will ever receive Social Security benefits, and yet nearly 15% of my salary goes into Social Security. Also, people who own houses vote to keep their housing prices high (through zoning laws), which is... not how "investments" are supposed to work. It'd be like the US government hoarding all the gold when prices get too high. If a house is a place to live (in which case, zoning laws are perfectly fine), property taxes should be high enough that renting out a house is a much worse investment than say, government bonds. If a house is an investment, the protection racket needs to stop.

2. On the other hand, I've seen so many of my peers just... not study. Kids who easily got a 36 on the ACT, but would do the bare minimum in and outside of school. Now they're working pretty normal, white-collar jobs that pay about a median starting salary, but I know they could have easily made $150k+/year in tech if they'd just studied at any point between middle school and college. They probably won't struggle financially, but they won't ever really "make it" ($10m?) either. And, if this is the top of the class, you can imagine what it's like for those without the same natural ability.

So, on the one hand, I absolutely agree that Gen Z should have an easier time with housing, and shouldn't have to pay for their ancestors' unwise debts, but on the other hand, part of the reason they're struggling so much now is because they didn't spend the first twenty years of their life doing the only thing they were asked to do.

programjames · 2025-10-06T16:39:11 1759768751

I find it funny how people say GPT-5 "bombed". I noticed a significant improvement in maths and coding with GPT-5. To quantify where I've found the models useful:

- GPT 3.5: Good for finding reference terms. I could not trust anything it said, but it could help me find some general terms in fields I was unfamiliar with.

- GPT 4: Good for cached, obscure knowledge. I generally could trust the stuff it said to be true, but none of its logic or conclusions.

- GPT 4.5: Good for reference proofs/code. I cannot trust its proofs or code, but I can get a decent outline for writing my own.

- GPT 5: Good for directed thinking. I cannot trust it to come up with the best solution on its own, but if I tell it what I'm working on, it's pretty decent at using all the tricks in its repertoire (across many fields) to get me a correct solution. I can trust its proofs or code to be about as correct as my own. My main issues are I cannot trust it to point out confusion or ask me, "is this actually the problem we should be solving here?" My guess is this is mostly a byproduct of shallow human feedback, rather than an actual issue with intelligence (as it will often ask me at the end of spending a bunch of computation if I want to try something mildly different).

For me, GPT 5 is way more useful than the previous models, because I don't have a lot of paper-pushing problems I'm trying to solve. My guess is the wider public may disagree because it's hard to tell the difference between something better at the task than you, and something much better.

N70Phone · 2025-10-06T18:57:17 1759777037

> I find it funny how people say GPT-5 "bombed".

I used scare quotes for a reason. It didn't "bomb" in the sense of failing [insert metric], it bombed in the sense that OpenAI needed it to generate exponentially more hype and it just didn't. (And on a lesser level, GPT-5 was supposed to cut OpenAI's costs but has failed to do so)

> I can trust its proofs or code to be about as correct as my own.

I have little to say about this, as I find such claims to be broadly irreplicable. GPT-5 scores better on the metrics, but still has the same "classes" of faults.

CuriouslyC · 2025-10-06T17:12:00 1759770720

Gemini 2.5 was the first breakthrough model; people didn't know how to use it, but it's incredibly powerful. GPT5 is the second true breakthrough model; its ability to deal with math/logic/etc complexity and its depth of knowledge in engineering/science is amazing. Every time I talk to someone who stans Claude and is down on GPT5 I know they're building derivative CRUD apps with simple business logic in Python/Typescript.

programjames · 2025-10-05T18:03:51 1759687431

Shouldn't you expect Zoomers to be more than double millennials? They're 20 years younger, which means money is 4x cheaper.

lotsofpulp · 2025-10-06T10:48:16 1759747696

Yes, I am not seeing any problem with the $10M figure, assuming one wants to insure against old age health expenses, insure against loss of income between age 50 and 70, and be able to travel here and there between the age of 60 and 80.

For a current 20 to 30 year old, they should expect SS benefits to come at a higher age, probably 72 or even 75, and whatever benefit amount they get will buy less than what it buys today. And they will have to shell out for better quality healthcare (i.e. a couple decades ago, they might have seen a doctor, but today, they see an NP/PA, and in a few more decades, they might not even get that).

The big expense is insuring against loss of income between age 50 and whatever age government benefits start. Most people will never make their way up the income ladder again, so they need to make sure they have adequate savings by then, otherwise they are cooked, and those are the ages that healthcare expenses start adding up.