xipix's comments | Hacker News

Absolutely you can. With WebAssembly SIMD you get near-native DSP performance (sketch below). Downsides from my experience [1]:

- You are at the mercy of the browser. If browser engineers mess up the audio thread or garbage collection, even the most resilient web audio app breaks. It happens.

- Security mitigations prevent or restrict use of some useful APIs. For example, SharedArrayBuffer and high resolution clocks.
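
For a flavour of the DSP side, here's a minimal sketch of a SIMD gain loop (my own illustrative code, not from the demo; built with Emscripten and -msimd128):

    // Apply a gain to float samples, four lanes at a time, with Wasm SIMD.
    #include <wasm_simd128.h>
    #include <cstddef>

    void apply_gain(float* samples, std::size_t count, float gain) {
        const v128_t g = wasm_f32x4_splat(gain);
        std::size_t i = 0;
        for (; i + 4 <= count; i += 4) {
            v128_t x = wasm_v128_load(samples + i);
            wasm_v128_store(samples + i, wasm_f32x4_mul(x, g));
        }
        for (; i < count; ++i)   // scalar tail for leftover samples
            samples[i] *= gain;
    }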

[1] https://bungee.parabolaresearch.com/bungee-web-demo


Intel, AMD and others also have chips for training that perform close to, or sometimes better than, Nvidia's. These are already on the market. Two problems: the CUDA moat, and "no one gets fired for buying green".


Rent one?


Have you rented a car recently? I'd rather drive my own car for three hours. It also makes it easier to securely bring a whole bunch of stuff, which I sometimes need to do.


Getting negative vibes from the name. Cadmium's a nasty substance. I mean, you wouldn't call an augmented reality app ARsenic, would you? Constructive, I hope, criticism.


I suspect naming it after a substance that's poisonous to humans isn't an issue for most average users. And for the users who are even aware of what cadmium is, it might just improve recall when they try to think of the name again in the future.

Personally, I think it's a catchy name.


You could add even more punniness to the AR name by calling it ARscenic. I don’t think anyone wouldn’t use the software just because it’s named after a toxic substance.


sure, why not?


A great pun overrides all other concerns.


A proportional font really helps ergonomics too.


This is probably a snarky reply, but here is the serious answer: proportional fonts, with appropriate kerning, are a lot more legible than monospaced fonts. There is a reason the press moved in that direction once it was technically feasible. But the same people who bring up books as an example of why an 80-character line length should be enforced would gag at the notion of using proportional fonts for development. It just goes to show that none of these things actually matters; they're legacy patterns that remain in place through sheer inertia, with very little relevance today.


<snark> I'm glad you cleared this all up for us. </snark>

Other people disagree with you and it's best to not assume they are idiots.


So far, you are the only one making a fool of yourself.


Code uses much more punctuation than prose, and punctuation is hard to discern in a proportional font.


I agree. I've tried coding in C-like languages with proportional fonts a few times, and punctuation ends up feeling cramped, hurting legibility. We need more proportional fonts for programming where punctuation gets the same size and spacing as in monospaced fonts.


Depends on which language you are writing in. Historically, Smalltalk UIs have used proportional fonts, and they work just fine.


#1 will lead to #2

#2 will solve #3


How is local more private? Whether AI runs on my phone or in a data center, I still have to trust third parties to respect my data. That leaves only latency and connectivity as possible reasons to wish for endpoint AI.


If you can run AI in airplane mode, you are not trusting any third party, at least until you reconnect to the Internet. Even if the model was malware, it wouldn’t be able to exfiltrate any data prior to reconnecting.

You’re trusting the third party at training time, to build the model. But you’re not trusting it at inference time (or at least, you don’t have to, since you can airgap inference).


Why is it that every audio waveform renderer looks super aliased? I also recently made a fast audio waveform renderer using the best bits of Wasm and WebGL to create an animatable waveform that looks a bit less 1990s: https://play.google.com/store/apps/details?id=com.parabolare...

Happy to open/share the code, just haven't had time.

PS: yours didn't work for me in Safari.


Since the number of pixels on the x-axis of the canvas is typically much smaller than the number of samples, you have to reduce the number of samples. This is very prone to aliasing if not done right, especially if you end up just drawing a line between the points. I found a good way to avoid aliasing by taking the min/max values for each chunk of samples and then filling the area in between rather than drawing lines between points. If you zoom in to a point where the window is only a few values, this will converge to the same result as just drawing lines between the samples. You can test it out by uploading audio files to our audio-to-midi web demo: https://samplab.com/audio-to-midi
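
Not the exact code we use, but a minimal C++ sketch of that min/max reduction (assuming at least one sample per pixel column; the one-sample overlap between chunks keeps adjacent spans connected):

    // For each pixel column, record the min and max sample of its chunk;
    // the renderer then fills the vertical span between the two.
    #include <algorithm>
    #include <cstddef>
    #include <vector>

    struct MinMax { float lo, hi; };

    std::vector<MinMax> reduce(const float* samples, std::size_t n, int widthPx) {
        std::vector<MinMax> columns(widthPx);
        const double perPx = static_cast<double>(n) / widthPx;
        for (int x = 0; x < widthPx; ++x) {
            std::size_t begin = static_cast<std::size_t>(x * perPx);
            std::size_t end =
                std::min(n, static_cast<std::size_t>((x + 1) * perPx) + 1);
            auto mm = std::minmax_element(samples + begin, samples + end);
            columns[x] = { *mm.first, *mm.second };
        }
        return columns;   // one vertical lo..hi span per pixel column
    }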


My point is that everyone seems to draw waveforms using only two colours. Inevitably, this results in aliasing.

In the extreme we have the "Spotify" visualisations of vertical bars with gaps in between. I believe this is popular because it looks slightly better than a solid waveform lump with an aliased edge.

To avoid aliasing you need to use more than two pixel colours.


> To avoid aliasing you need to use more than two pixel colours.

The aliasing in question is in the audio, not the pixels, so no, more colours do not help.


Agreed, there is aliasing of the audio. But also aliasing in the way that the waveform is rendered.

Consider: would you draw a line, or a circle using only a foreground and a background pixel colour?


> would you draw a line, or a circle using only a foreground and a background pixel colour?

That's 99% simple pixelation, not aliasing. And far less of a problem than the true aliasing in question.


Mathematically it's aliasing. And it's fixed by antialiasing. I'll bet everything on the screen you're looking at right now is antialiased. Would be nice if audio waveform visualisations were too.


> Mathematically it is aliasing.

Not by e.g. Wikipedia.

https://en.m.wikipedia.org/wiki/Aliasing


Sure it is; it's the first thing said just after the title and widgets:

> This article is about aliasing in signal processing, including computer graphics.

In computer graphics, the relevant aliasing is spatial aliasing, in fact mentioned in the article: the signal is the fundamental shape (such as a font glyph or a triangle mesh or whatever), and the samples are the pixels.

In the specific application of a waveform, a typical "CD quality" audio file has 44.1 thousand samples per second and, say, 16 bits per sample. If we want to display the waveform of one second of audio horizontally across an entire standard low-density full HD computer screen, we have 1920 horizontal samples (pixels) into which to fit our 1 second of audio data, and 1080 vertical levels with which to paint the amplitude.

Putting it into signal processing terms, the signal frequency here is 44.1 kHz and the sampling frequency is 1.92 kHz. Do you see how aliasing applies now? We want to represent f_Signal / f_Sample = 22.96875 samples of audio with 1 sample.

In practice you get an even worse ratio, because we usually want more than 1 second of waveform to be visible on a region that isn't the entire screen.


> the signal is the fundamental shape (such as a font glyph or a triangle mesh or whatever)

No. The signal components being aliased are frequencies, e.g. repeating patterns.

"aliasing is the overlapping of frequency components resulting from a sample rate below the Nyquist rate."

That is why the example is a brick wall and the result is moiré banding. Nothing like your shapes and jaggies.

What you've mistaken for aliasing is simply pixellation.


These are the same thing. A shape with a solid boundary is a signal with a discontinuous step: if you Fourier it, it has infinitely many nonzero terms, therefore you can't represent it exactly with any finite number of frequencies, and therefore with any finite number of samples.

In the case of Moiré patterns in pictures, we have lines in the real world that have to fit into pixels sampled at below the Nyquist rate of those lines. The Moiré effect in pictures is just the interference pattern caused by this aliasing.

If you look at just a column of the image, and imagine the signal as being the brightness varying over the Y coordinates, you can imagine the mortar being an occasional regular pulse, and when your sampling rate (the pixel density) isn't enough, you get aliasing: you skip over, or overrepresent, the mortar to brick ratio, variably along the signal.

https://imgur.com/a/BiZcxG5

Now if you look at the graph in that picture, doesn't that look awfully similar to what happens if you try to sample an audio file at an inferior rate for display purposes?

In fact, try it right now: download Audacity, go to Generate > Tone, click OK (whatever settings are fine), press Shift+Z to go down to sample-level zoom, then start zooming out. Eventually, you'll see some interesting patterns, which are exactly the sort of aliasing caused by resampling I'm talking about:

https://i.imgur.com/bX2IFp8.png


> you'll see some interesting patterns, which are exactly the sort of aliasing caused by resampling I'm talking about

I see this and agree. This is true aliasing. I believe this is the OP's "super aliased" look.

However I disagree that this is "the same thing" as the jaggies that can be avoided by more colours.

This cannot be avoided by more colours. It can be avoided only by increased resampling rate.


How do we add more colours (besides just picking a random colour, which wouldn't be helpful)?

By sampling the signal more often ("multi-sample anti-aliasing"), also known as increasing the resampling rate, and then representing that with a wider bit depth (not just a 1-bit "yes/no", but multiple bits forming a colour/opacity), since we already have more than 1 bit per pixel to use.
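
As a toy illustration of that (my own sketch, not any particular renderer's code): rasterise the waveform as 1-bit at N× resolution, then box-filter down so each final pixel becomes a grey level.

    // Downsample an N×-resolution 1-bit image (values 0 or 255) to grey
    // levels: each output pixel averages an N×N block of input pixels.
    #include <cstdint>
    #include <vector>

    std::vector<std::uint8_t> box_filter(const std::vector<std::uint8_t>& hi,
                                         int w, int h, int N) {
        std::vector<std::uint8_t> out((w / N) * (h / N));
        for (int y = 0; y < h / N; ++y)
            for (int x = 0; x < w / N; ++x) {
                int sum = 0;
                for (int dy = 0; dy < N; ++dy)
                    for (int dx = 0; dx < N; ++dx)
                        sum += hi[(y * N + dy) * w + (x * N + dx)];
                out[y * (w / N) + x] = static_cast<std::uint8_t>(sum / (N * N));
            }
        return out;
    }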

I'll give it to you that this is "anti aliasing", not "not having aliasing in the first place", but the Fourier argument above is the reason why in computer graphics we practically always have to "settle for" AA instead.


It's much easier to read the information with the aliasing.


> Since the number of pixels on the x-axis of the canvas is typically much smaller than the number of samples

That's not the cause. The cause is simply that too few of the samples are plotted.

Your min/max solution succeeds by ensuring all significant samples are plotted.


If your canvas has, say, 2000 pixels on the x-axis and you're trying to plot one second of 44.1 kHz audio, you'll end up with more than 20 samples per pixel. You can then either reduce the number of samples or draw multiple lines within that pixel. Both approaches can result in aliasing. OP's approach seems to just draw lines between every sample, i.e. the second option. If you change the "Scale" in the example, you can clearly see how peaks appear/disappear due to aliasing (especially between 200 and 400 frames/px).


> You can then either reduce the number of samples or draw multiple lines within that pixel. Both approaches can result in aliasing.

I would disagree that the latter can result in the "super aliased" look in question.

Drawing one line per sample leaves very little aliasing.


> Drawing one line per sample leaves very little aliasing.

It's definitely gonna look better than just skipping samples, but you're also gonna draw a lot of unnecessary lines.


Unnecessary lines are trivially avoided by the proposed min/max optimisation.


The min/max solution doesn't prevent aliasing. Consider that if you do manage to avoid aliasing, you'll be rendering something visually akin to a downscaled version of a fully-detailed, high-resolution plot of the waveform.


It renders a waveform that looks very similar to what it would look like if you drew a line between all points. If you have 100 samples per pixel and you draw lines between all of them, you'll essentially end up with a filled area that goes from the lowest sample to the highest. So it's practically the same thing as just taking the min and max and filling everything in between. The advantage is that you avoid all those lines in between. If you zoom in, the signal changes very smoothly, without those aliasing effects where peaks appear and disappear. The web demo currently doesn't allow zooming in, so you can't test it there, but if you download the whole Samplab app (https://samplab.com/download), you can try out what it looks like when zooming in.


A great explanation.


Min/max does prevent the "super aliased" result in question. I'd agree it leaves a little aliasing.


Why not low-pass filter / decimate the waveform to remove aliasing?


You have to downsample by a lot (easily 10-100x), which would remove much of the information in the signal. If you low-pass filter, you remove all the peaks and the signal looks much "quieter" than it actually is.
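
A toy illustration with made-up numbers (my own sketch): average a chunk of a loud high-frequency tone and you get roughly zero, i.e. "silence", while min/max keeps the peaks.

    // Averaging (a crude low-pass) vs min/max over one chunk of samples.
    #include <algorithm>
    #include <cmath>
    #include <cstdio>

    int main() {
        float sum = 0.0f, lo = 1.0f, hi = -1.0f;
        for (int i = 0; i < 1000; ++i) {
            float s = std::sin(i * 3.0f);   // full-scale, fast oscillation
            sum += s;
            lo = std::min(lo, s);
            hi = std::max(hi, s);
        }
        // Prints a mean near 0 but min/max near -1/+1: the low-passed view
        // looks silent, while the min/max view keeps the true peak level.
        std::printf("mean %+.3f  min %+.3f  max %+.3f\n", sum / 1000, lo, hi);
    }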


Agreed. In the worst case, the audio is simply a high-frequency tone - which the LPF removes, leaving the plot showing silence.


Ahhh, didn't see that one. Makes perfect sense! Thx for explaining.


Oh this looks great, I'd love to have a look if you have time to share it. In my case, I'm just not very experienced with graphics. I'm sure with some tweaking it could look better, but I thought this was good enough for a v1.

Re Safari: Unfortunately WebGPU support is still WIP. I'll add a notice on the site haha. https://caniuse.com/webgpu


Thanks. I picked a fixed number of horizontal pixels per second for an internally cached image. I think it was 30 pixels/second so that scrolling at 1x playback rate would be smooth on 60/90/120Hz screens. So it can't zoom into the waveform like yours, but I don't need it to zoom.

There are two parts. First, the C++/wasm that analyses the audio and generates a bitmap with width 30 × duration (in seconds) and a modest height. It effectively draws a line from each audio sample to the next, with alpha, but this can be optimised like crazy, to the point where the time it takes is unnoticeable.
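
Roughly like this (a loose sketch with invented names, not the actual code): each sample-to-sample segment deposits one unit of coverage, spread over the pixel rows it crosses, and accumulated coverage becomes alpha.

    // Rasterise samples (range -1..1) into an 8-bit alpha bitmap at a
    // fixed 30 px/s; soft edges come from fractional coverage, not 1-bit.
    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    std::vector<std::uint8_t> rasterise(const float* s, std::size_t n,
                                        double sampleRate, int height) {
        const double pxPerSec = 30.0;
        const int width = static_cast<int>(std::ceil(n / sampleRate * pxPerSec));
        std::vector<float> cover(static_cast<std::size_t>(width) * height, 0.0f);
        for (std::size_t i = 0; i + 1 < n; ++i) {
            int col = std::min(width - 1,
                               static_cast<int>(i / sampleRate * pxPerSec));
            // Map the two sample values to pixel rows (+1 maps to the top).
            float a = (1.0f - s[i])     * 0.5f * (height - 1);
            float b = (1.0f - s[i + 1]) * 0.5f * (height - 1);
            int y0 = static_cast<int>(std::floor(std::min(a, b)));
            int y1 = static_cast<int>(std::ceil(std::max(a, b)));
            float w = 1.0f / (y1 - y0 + 1);   // one unit of coverage per segment
            for (int y = std::max(0, y0); y <= std::min(height - 1, y1); ++y)
                cover[static_cast<std::size_t>(y) * width + col] += w;
        }
        std::vector<std::uint8_t> bitmap(cover.size());
        for (std::size_t p = 0; p < cover.size(); ++p)
            bitmap[p] = static_cast<std::uint8_t>(std::min(1.0f, cover[p]) * 255.0f);
        return bitmap;
    }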

The second part is the WebGL that takes the bitmap (aka texture) and renders it to the canvas. In principle this could be rotated, spun in 3D, mapped to a sphere, or given any other crazy GPU manipulation. WebGL isn't too painful and works everywhere I tried, including the Android WebView in my Bungee Timewarp app.


In Safari, you can enable it under Develop > Feature flags.


This is beautiful! I'd love to learn from the code.


WebGPU only works in Chromium-based browsers.


That's data size, not code. There's no fundamental reason that a program that can smoothly render Unicode at 4K needs a GB download when kB could suffice.


We tried that in the Windows 9x days. We called that "DLL hell".

The idea was that programs would share libraries, and so why have a dozen identical frameworks on the same system? Install your libraries into system32. If it's already there but an earlier version, deploy your packaged one on top.

Turns out that nobody writes good installers, binary-level dependencies require too much discipline, and dependencies are a pain for users to deal with.

So shove the entire thing into a huge package, and things actually work at a cost of some disk space and memory.


> and things actually work at a cost of some disk space and memory.

I have ~10,000 .exe files on this machine; if none of them shared code and/or data (or if they were written in a "modern" language with 50+ MB hello worlds), they would not fit on my 1 TB disk.


You can improve things significantly with a bit of coordination. That's how package managers work!


True, but I personally discovered this has limits.

What if you're working on something reasonably novel, like, say, open-source VR? Well, it turns out you may want a quite eclectic mix of dependencies. For some you need the latest version, because it's state-of-the-art stuff. Some are old because the new version is too incompatible. Some are dead.

Getting our work into a Linux distro is on my list, but even if dealing with all the dependencies works out, there's the issue that we sometimes need to make protocol changes and upgrade on our own schedule, rather than whenever the new distro is released.

Distros are great for things that are supposed to integrate all together. They're less ideal for when you're working on something that is its own, separate thing like a game.

So for the time being, shoving it all into an AppImage it is.


You presume one option when the other option is a bundled but smaller renderer. The TrueType renderer my terminal uses is about 700 lines of code. The C it's a translation of is about 1,500. There's a sweet spot that might well be a bit higher, to e.g. handle ligatures, but the payoff from going from that to some huge monstrosity is very small.


As somebody who actually works on a pretty large program, no, I'm absolutely not going to use your 700 LOC TTF renderer. I'm going to use the 128K LOC FreeType.

Why? Well, because it's the one everyone else uses. It's what comes with everyone's Linux distro. Therefore, if there's something wrong with it, it's pretty much guaranteed it'll break other stuff and somebody else is going to have to fix that. Also it probably supports everything anyone might ever want.

If your 700 LOC TTF renderer doesn't perform as it should, it might become my problem to figure out why, and I don't really want that.


I'm not suggesting you should. I'm pointing out that these things can be done with a whole lot less code, and a lot of the time with so much less code that it is less of a liability to learn the smaller option. Put another way, I've had to dig into large font renderers to figure out problems before, because they didn't work as expected and it became my problem, and I'd much prefer that to be the case with 700 LOC I can be intimately familiar with than with a large project. (I'm old enough to have had to figure out why Adobe's Type 1 font renderer was an awful bloated mess, and in retrospect I should have just rewritten it from scratch, because it was shit; that it was used by others did not help us at all.)

I ended up with this one in large part because it took less time to rewrite libschrift (the C option I mentioned) and trim it down for my use than to figure out how to make FreeType work for me. I now have a codebase that's trivially understandable in an hour or two of reading. That's what compact code buys you.

No, it won't do everything. That's fine. If I need FreeType for something where it actually saves me effort, I'll use FreeType. It's not about blindly rewriting things for the sake of it, but about not lazily defaulting to big, complex options whether or not they're the appropriate choice.

A lot of the time people pick the complex option because they assume their problem is complex, or because it's "the default", not on the merits.

There are tradeoffs, and plenty of times where the large, complex component is right, but far too often it is picked out of laziness and becomes a huge liability.


> We tried that in the Windows 9x days.

You say that as if it were some kind of failed one-off experiment of the 90s. We tried it in the Multics days; it caught on, and the design philosophy is still popular to this day. It works quite well in systems with centrally managed software repositories, even if it doesn't in a system where software is typically distributed on a 3rd-party shareware collection CD or via download.com.


Quite simply: it lets you build portable apps with near-native performance. Same source for both your desktop browser and your phone. I really love using it to make audio apps like this one: https://bungee.parabolaresearch.com/bungee-web-demo. It's also great for other real-time rendering / visualisation in the browser.

