stuffoverflow's comments

This seems like a massive improvement for openly available local ASR. Even the 300M model outperforms whisper-large-v3 according to the paper's benchmarks.

Not sure about that. I recorded 3 seconds of voice (a single sentence) and the HF demo misrecognized about half of the words.

This model is actually expected to be bad for popular languages. Just like the previous MMS, it is not accurate at all; it wins by supporting rare languages at all, but it never had good ASR accuracy even for Swedish, etc. It is more a research thing than a real tool, unlike Whisper.

Moreover, you cannot tune those models for practical applications. The model was originally trained on very clean data, so the lower layers are not very stable for diverse inputs either. To fine-tune, you have to update the whole model, not just the upper layers.

In section 5.7.5, they fine-tune for "11 low-resource languages, with between 5-10 hours of training data and at least 1 hour of validation splits." "CTC fine-tuning takes ≈1 hour of walltime on 32 GPUs for the 300M scale." If that's too expensive, you also have the option of supplying additional context for the LLM-based model (section 5.5).

As for "very clean data," see section 5.7.4: "Omnilingual + OMSF ASR was intentionally curated to represent naturalistic (i.e., often noisy) audio conditions, diverse speaker identities, and spontaneous, expressive speech."


I can't tell if Anthropic is serious about "model welfare" or if it's just a marketing ploy. I mean, isn't it responding negatively because it has been trained that way? If they were serious, wouldn't the ethical thing be to train the model to respond neutrally to "harmful" queries?


"Protection against malicious use" isn't as cool as "model welfare". I'm renaming my authentication function to "examineCrest()".


VibeVoice-Large is the first local TTS that can produce convincing Finnish speech with little to no accent. I tinkered with it yesterday and was pleasantly surprised at how good the voice cloning is and how it "clones" the emotion in the speech as well.


Academic Torrents has monthly dumps of all Reddit submissions and comments, even after the API restrictions.



Interesting. You don’t have to be an academic to access these I guess?


They have magnet links and torrent files right there on the pages, so no.


ArchiveTeam did a full site crawl[1] when AnandTech announced they were stopping. You can browse the .warc.gz files like a regular web page using https://replayweb.page

Alternatively, you could use SolrWayback[2] to index and browse the WARC files.

1: https://archive.fart.website/archivebot/viewer/job/202409012...

2: https://github.com/netarchivesuite/solrwayback
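If you'd rather poke at the .warc.gz files directly instead of going through a viewer, here is a minimal sketch using the warcio library (my own choice, not something the tools above require); the filename is a placeholder for one of the files from the ArchiveBot job.

    from warcio.archiveiterator import ArchiveIterator  # pip install warcio

    # Placeholder filename: use one of the .warc.gz files from the crawl.
    WARC_PATH = "anandtech-crawl.warc.gz"

    with open(WARC_PATH, "rb") as stream:
        for record in ArchiveIterator(stream):
            # 'response' records hold the captured pages; 'request' and
            # 'metadata' records describe the crawl itself.
            if record.rec_type == "response":
                url = record.rec_headers.get_header("WARC-Target-URI")
                ctype = record.http_headers.get_header("Content-Type", "")
                if "text/html" in ctype:
                    body = record.content_stream().read()
                    print(url, len(body), "bytes")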


Also Kiwix[1] is an excellent app for browsing websites offline. You can use warc2zim[2] to convert the WARC files to ZIM files for use with Kiwix.

I was pleasantly surprised to find that the DWDS (digital dictionary of the German language) app is actually Kiwix!

[1]: https://www.kiwix.org/

[2]: https://github.com/openzim/warc2zim


> Kiwix

... I haven't heard this name in probably 15 years. Back then you could bring Wikipedia offline on a laptop; it was only around 20-25 GB.


You can still bring Wikipedia offline on a laptop (and on a mobile phone, for some of the larger ones); it's just that you'd need around 100 GB instead. There is even a library[0] you can use to build your own Wikipedia viewer.

[0] https://github.com/openzim/libzim
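For the "your own viewer" part, a minimal reading sketch, assuming the python-libzim bindings for that library; the ZIM filename and the article path are placeholders, and entry paths differ between ZIM files.

    from libzim.reader import Archive  # pip install libzim

    # Placeholder filename: any Wikipedia ZIM downloaded for Kiwix will do.
    zim = Archive("wikipedia_en_all_nopic.zim")

    # The main entry is always resolvable; concrete article paths vary by ZIM.
    main_item = zim.main_entry.get_item()
    print("main page:", main_item.path, main_item.mimetype)

    article_path = "A/Earth"  # hypothetical path, so check before reading
    if zim.has_entry_by_path(article_path):
        item = zim.get_entry_by_path(article_path).get_item()
        html = bytes(item.content).decode("utf-8", errors="replace")
        print(html[:200])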


Yes, and much less than 100 GB if you can do without images.


I really like having the mobile version for fast searches; it's often faster than online. Useful, for example, while hiking or in other out-of-network places. Even some big stores have zero signal inside, and sometimes I want to look things up. You can also get almost any Stack Exchange site.

If you live in a low, but not zero, bandwidth environment... since the rise of LLMs it's now cheaper to have the models do your dirty work. Before, you might have had to search through pages of results and load MBs of data, and still not find the answer. Offloading that to a data center and getting a few hundred kB back is convenient. Couple that with Kiwix and you can do quite a lot with a lousy internet connection.


This is a bit tangential, but is there a good way to archive Discourse forums and turn them into regular websites? Anyone have experience to share?


That 27 °C limit seems to have been due to the 2022 energy crisis, and the restrictions were lifted in 2023.

The last source you cited is AI slop and is not even related to your message.


This whole case is even more stupid when you take into account how the NYT's paywall is implemented. Anyone can bypass it just by refreshing the page and stopping the load immediately after the contents become visible.

I don't know what ChatGPT uses to browse the web, but it wouldn't surprise me if it repeated stuff from those paid articles because it uses wget or something similar that doesn't support JS, so the paywalls simply weren't effective.
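A minimal sketch of the kind of check this describes: if the paywall is only applied client-side in JavaScript, the full article text tends to be sitting in the raw HTML that a non-JS fetcher sees. The URL and the marker sentence are placeholders, not a real endpoint.

    import urllib.request

    # Placeholder URL and marker text; fetch the page the way a non-JS client
    # (wget, a simple crawler) would, without executing any paywall script.
    URL = "https://example.com/some-paywalled-article"
    MARKER = "a sentence you expect to appear mid-article"

    req = urllib.request.Request(URL, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        raw_html = resp.read().decode("utf-8", errors="replace")

    # If the marker shows up here, the paywall only hides content client-side,
    # and any crawler that skips JavaScript gets the full text.
    print("full text in raw HTML:", MARKER.lower() in raw_html.lower())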


Didn't this stop working about six months ago, when they added a delay around four paragraphs in, saying it was checking your IP address before sending the rest of the article?


The paywalled news sites want to have their cake and eat it too -- they want web crawlers, like Google, to read the full contents of the article while hiding it from site visitors.

If they simply put all their content behind a paywall entirely and effectively then this would be a non-issue.

If ChatGPT is getting this content it's literally because they allow it.


>The paywalled news sites want to have their cake and eat it too -- they want web crawlers, like Google, to read the full contents of the article while hiding it from site visitors.

There's nothing contradictory about this? Plenty of companies give free access to journalists/reviewers/influencers, with the hope that they'll draw in paying customers. Wanting to give free access only to certain people isn't "wanting to have their cake and eat it too"; it's standard business practice and well within the rights of publishers/rights holders.


Yes, there is. They don't want ChatGPT to have access, but they don't prevent access by ChatGPT. Technically, they're giving everyone free access. If they actually, legitimately prevented access, they would completely mitigate this problem.


> They don't want ChatGPT to have access but they don't prevent access by ChatGPT.

They don't want ChatGPT to do certain things after accessing it and they don't prevent access by ChatGPT. They don't mind if ChatGPT accesses it.


I mean you can also walk out of a store without paying before they can stop you. Doesn't change the nature of the offence.


The point is that there's no need to resort to something like ChatGPT to avoid the paywall, so most people who want to avoid it wouldn't bother using ChatGPT to do so.


I assume it will be implemented at the DNS level, and yeah, it is possible to use a different DNS server.


Google AI Studio, ChatGPT and Claude all support this. Google AI Studio is the only one that lets you branch to a separate chat, though. For ChatGPT and Claude you just edit the message you want to branch from.


Feels like a semi-simple UX fix could make this a lot more natural. Git-style forks but for chats.


Support: Yes. But the UX is not optimized for this.

Imagine trying to find a specific output/input that was good in the conversation tree.


Yes, it would be nice if you could at least bookmark a particular branch.
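A minimal sketch of what a git-style, bookmarkable conversation tree could look like; everything here is a hypothetical in-memory model, not any vendor's actual API.

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class Node:
        # One message; editing an earlier message = adding a new child there.
        role: str                       # "user" or "assistant"
        text: str
        parent: Optional["Node"] = None
        children: list["Node"] = field(default_factory=list)

    class ChatTree:
        def __init__(self) -> None:
            self.root = Node("system", "conversation start")
            self.bookmarks: dict[str, Node] = {}   # named pointers to branches

        def reply(self, parent: Node, role: str, text: str) -> Node:
            node = Node(role, text, parent)
            parent.children.append(node)
            return node

        def bookmark(self, name: str, node: Node) -> None:
            self.bookmarks[name] = node

        def history(self, node: Node) -> list[tuple[str, str]]:
            # Walk back to the root to reconstruct the branch shown in the UI.
            out = []
            while node.parent is not None:
                out.append((node.role, node.text))
                node = node.parent
            return list(reversed(out))

    tree = ChatTree()
    q = tree.reply(tree.root, "user", "Summarize this paper")
    good = tree.reply(q, "assistant", "First answer, on the main branch")
    alt = tree.reply(q, "assistant", "Second answer, on a separate branch")
    tree.bookmark("good-answer", good)
    print(tree.history(tree.bookmarks["good-answer"]))

The bookmarks dict is the piece being asked for here; the tree itself is what "edit the message you want to branch from" already implies.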


I have yet to see concrete evidence that disabling Windows Update and Windows Defender would elevate the risk of having the system compromised in any meaningful way.

I installed Windows 10 2016 LTSC on a VM at the end of last year out of curiosity to test that. I disabled Windows Update and Defender before letting it access the internet, so it was basically 8 years behind on updates. I tried browsing all kinds of sketchy sites with Firefox and Chrome, clicking ads, etc., but wasn't able to get the system infected.

I would guess that keeping your browser updated is more important.


Correct! The browser is now the key vector because it's the most promiscuous and lascivious-for-code-and-data software on most devices.

Browser zero-days are why I factored out a way to distribute "web RPA agent creation" on any device, with no download, into its own product layer for browser isolation. It's a legitimate defense layer, but the main barrier to adoption is operational friction, even though it makes the task of hackers who want to compromise your network with browser 0-days much harder.

Because of that, the RBI (remote browser isolation) aspect is not as popular as the use cases where you need a really locked-down browser, with policies preventing upload/download, even copy and paste, etc., for DLP (data loss prevention) in regulated enterprises.

Even so I think the potential applications of this tech layer are just starting.


Just the other day I went to a website to flash new firmware onto a Zigbee dongle, straight from a Chrome tab. Wild!

Then it hit me: the only thing keeping a rogue website from sweeping your entire life is a browser's permissions popup.


Crazy, right? On the whole I think it's great and wonderful that the web platform has grown into the gorgeous monster that it is. I mean, what better than a unified technology to serve us all the world's information, from any device, in a basically sandboxed environment? I'm even all for the beautiful way the platform has rapidly added capabilities and the way the languages (JavaScript, HTML and CSS) have evolved. I think all that is wonderful, and I really enjoyed the ride.

But all of that growth and integration comes with these vulnerabilities, and so the cyber and DLP control aspect of web browsers is a very important one.

If this resonates with you, I invite you to check out my company's project, BrowserBox, on GitHub.


> I have yet to see concrete evidence that disabling Windows Update and Windows Defender would elevate the risk of having the system compromised in any meaningful way.

It's much less likely than it was 20 years ago. A lot of attack vectors have already been fixed. But hypothetically, a bug in the network stack could still leave an internet-connected machine vulnerable.


Do not connect it directly - use a dedicated router device.


You benefit from the fact that most machines are patched. If a lot more people used 2016 builds and didn’t patch you’d see a lot more exploits.


I use stock Win7 SP1 with just a couple of updates (recently TLS and SHA-512, but only 27 hotfixes in total), and the only way to break something would be if I deliberately ran unverified executables manually downloaded from untrusted sources. And since I don't do that, my machine is still running the same installation that I did on December 24th, 2014.



> browsing all kinds of sketchy sites with Firefox and Chrome

How did you install those? Downloaded via another system? Because with a system that old, you are missing SSL certificates (Firefox and Chrome bring their own).


Maybe, but with good old Windows PKI you’re bound to still have a working chain of trust with Mozilla/Google.

…either that or the machine cheated and updated root CAs in the background (which isn’t Windows Update-controlled anymore).


How do you know your system wasn't infected in that experiment?

