> 1. Qualcomm develops a chip that is competitive in performance with ARM
Virtually all high performance processors these days operate on their own internal “instructions”. The instruction decoder at the very front of the pipeline that actually sees ARM or RISC-V or whatever is a relatively small piece of logic.
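To make that concrete, here's a toy C sketch (all names invented, not modeled on any real design) of the idea: each ISA gets its own thin decoder, and both lower to the same internal micro-op that the rest of the machine consumes. Each decoder handles just one instruction, register-register ADD, to show the shape:

```c
#include <stdint.h>
#include <stdio.h>

/* One ISA-neutral internal operation; the back-end only ever sees these. */
typedef struct {
    enum { UOP_ADD } kind;
    uint8_t dst, src1, src2;   /* architectural register numbers */
} uop;

/* AArch64 ADD Xd, Xn, Xm: Rd = bits 4:0, Rn = bits 9:5, Rm = bits 20:16 */
static uop decode_a64_add(uint32_t insn) {
    return (uop){ UOP_ADD, insn & 31, (insn >> 5) & 31, (insn >> 16) & 31 };
}

/* RISC-V ADD rd, rs1, rs2: rd = bits 11:7, rs1 = bits 19:15, rs2 = bits 24:20 */
static uop decode_rv_add(uint32_t insn) {
    return (uop){ UOP_ADD, (insn >> 7) & 31, (insn >> 15) & 31, (insn >> 20) & 31 };
}

int main(void) {
    /* "add 3, 1, 2" in each ISA's encoding; both lower to the same uop. */
    uop a = decode_a64_add(0x8B020023u);   /* AArch64: ADD X3, X1, X2 */
    uop b = decode_rv_add(0x002081B3u);    /* RISC-V:  ADD x3, x1, x2 */
    printf("a64: add r%d, r%d, r%d\n", a.dst, a.src1, a.src2);
    printf("rv:  add r%d, r%d, r%d\n", b.dst, b.src1, b.src2);
    return 0;
}
```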
If Qualcomm were motivated, I believe they could swap ISAs relatively easily on their flagship processors, and the rest of the core would be the same level of performance that everyone is used to from Qualcomm.
This isn’t the old days when the processor core was deeply tied to the ISA. Certainly, there are things you can optimize for the ISA to eke out a little better performance, but I don’t think this is some major obstacle like you indicate it is.
> 2. The entire software world is ready to recompile everything for RISC-V
#2 is the only sticking point. That is ARM’s only moat as far as Qualcomm is concerned.
Many Android apps don’t depend directly on “native” code, and those could potentially work on day 1. With an ARM emulation layer, those with a native dependency could likely start working too, although a native RISC-V port would improve performance.
If Qualcomm stopped making ARM processors, what alternatives are you proposing? Everyone is switching to Samsung or MediaTek processors?
If Qualcomm were switching to RISC-V, that would be a sea change that would actually move the needle. Samsung and MediaTek would probably be eager to sign on! I doubt they love paying ARM licensing fees either.
But, all of this is a very big “if”. I think ARM is bluffing here. They need Qualcomm.
> Everyone is switching to Samsung or MediaTek processors?
Why not? MediaTek is very competitive these days.
It would certainly perform better than a RISC-V decoder slapped onto a core designed for ARM, running emulation for games (which is pretty much the main reason you need a lot of performance on a phone).
Adopting RISC-V is also a risk for the phone producers like Samsung. How much of their internal tooling (e.g. diagnostics, build pipelines, testing infrastructure) is built for ARM? How much will performance suffer, and how much will customers care? Why take that risk (in the short/medium term) instead of just using their own CPUs (they did it in some generations) or using MediaTek (many producers have experience with them already)?
Phone producers will be happy to jump to RISC-V over the long term given the right incentives, but I seriously doubt they will be eager to transition quickly. All risks, no benefits.
> Virtually all high performance processors these days operate on their own internal “instructions”. The instruction decoder at the very front of the pipeline that actually sees ARM or RISC-V or whatever is a relatively small piece of logic.
You're talking essentially about microcode; this has been the case for decades, and isn't some new development. However, as others have pointed out, it's not _as_ simple as just swapping out the decoder (especially if you've mixed up a lot of decode logic with the rest of the pipeline). That said, it's happened before and isn't _impossible_.
On a higher level, if you listen to Keller, he'll say that the ISA is not that interesting - it's just an interface. The more interesting things are the architecture, the micro-architecture and, as you say, the microcode.
It's possible to build a core with comparable performance - it'll vary a bit here and there, but it's not that much more difficult than building an ARM core for that matter. But it takes _years_ of development to build an out-of-order core (even an in-order takes a few years).
Currently, I'd say that in-order RISC-V cores have reached parity. Out-of-order is a work in progress at several companies and labs. But the chicken-and-egg issue here is that in-order RISC-V cores have ready-made markets (embedded, etc.) and out-of-order ones (mostly used only in datacenters, desktops and mobile) are kind of locked in for the time being.
> Many Android apps don’t depend directly on “native” code, and those could potentially work on day 1.
That's actually true, but porting Android is a nightmare (not because it's hard, but because the documentation on it sucks). Work has started, so let's see.
> With an ARM emulation layer, those with a native dependency could likely start working too, although a native RISC-V port would improve performance.
I wonder what the percentage here is... Again, I don't think recompiling for a new target is necessarily the worst problem here.
> > Virtually all high performance processors these days operate on their own internal “instructions”. The instruction decoder at the very front of the pipeline that actually sees ARM or RISC-V or whatever is a relatively small piece of logic.
> You're talking essentially about microcode; this has been the case for decades, and isn't some new development.
Microcode is much less used nowadays than in the past. For instance, several common desktop processors have only a single instruction decoder capable of running microcode, with the rest of the instruction decoders capable only of decoding simpler non-microcode instructions. Most instructions in typical programs are decoded directly, without going through the microcode.
> However, as others have pointed out, it's not _as_ simple as just swapping out the decoder
Many details of an ISA extend beyond the instruction decoder. For instance, the RISC-V ISA mandates specific behavior for its integer division instruction, which has to return a specific value on division by zero, unlike most other ISAs which trap on division by zero; and the NaN-boxing scheme it uses for single-precision floating point in double-precision registers can be found AFAIK nowhere else. The x86 ISA is infamous for having a stronger memory ordering than other common ISAs. Many ISAs have a flags register, which can be set by most arithmetic (and some non-arithmetic) instructions. And that's all for the least-privileged mode; the supervisor or hypervisor modes expose many more details which differ greatly depending on the ISA.
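To make the division example concrete, here's a small C sketch (the function name is mine) of what the RISC-V spec mandates for a 64-bit signed DIV, written as the explicit checks that a trapping ISA would otherwise leave to hardware:

```c
#include <stdint.h>

/* RISC-V DIV semantics: never traps. Divide-by-zero returns -1 (all
 * bits set), and the lone overflow case (INT64_MIN / -1) returns the
 * dividend. x86's IDIV raises a #DE exception in both situations. */
int64_t rv_div(int64_t dividend, int64_t divisor) {
    if (divisor == 0)
        return -1;
    if (dividend == INT64_MIN && divisor == -1)
        return INT64_MIN;
    return dividend / divisor;
}
```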
> Many details of an ISA extend beyond the instruction decoder. For instance, the RISC-V ISA mandates specific behavior for its integer division instruction, which has to return a specific value on division by zero, unlike most other ISAs which trap on division by zero; and the NaN-boxing scheme it uses for single-precision floating point in double-precision registers can be found AFAIK nowhere else. The x86 ISA is infamous for having a stronger memory ordering than other common ISAs. Many ISAs have a flags register, which can be set by most arithmetic (and some non-arithmetic) instructions. And that's all for the least-privileged mode; the supervisor or hypervisor modes expose many more details which differ greatly depending on the ISA.
All quite true, and to that, add things like cache hints and other hairy bits in an actual processor.
1. That doesn't mean you can just slap a RISC-V decoder on an ARM chip and it will magically work though. The semantics of the instructions and all the CSRs are different. It's going to be way more work than you're implying.
But Qualcomm have already been working on RISC-V for ages so I wouldn't be too surprised if they already have high performance designs in progress.
That is a good comment, and I agree things like CSR differences could be annoying, but compared to the engineering challenges of designing the Oryon cores from scratch… I still think the scope of work would be relatively small. I just don’t think Qualcomm seriously wants to invest in RISC-V unless ARM forces them to.
> I just don’t think Qualcomm seriously wants to invest in RISC-V unless ARM forces them to.
That makes a lot of sense. RISC-V is really not at all close to parity with ARM. ARM has existed for a long time, and we are only now seeing it enter the server space and the Microsoft ecosystem. These things take a lot of time.
> I still think the scope of work would be relatively small
I'm not so sure about this. Remember that an ISA is not just a set of instructions: it defines how virtual memory works, what the memory model is like, how security works, etc. Changes in those things percolate through the entire design.
Also, I'm going to go out on a limb and claim that verification of a very high-powered RISC-V core that is going to be manufactured in high-volume is probably much more expensive and time-consuming than the case for an ARM design.
edit: I also forgot about the case with Qualcomm's failed attempt to get code size extensions. Using RVC to approach parity on code density is expensive, and you're going to make the front-end of the machine more complicated. Going out on another limb: this is probably not unrelated to the reason why THUMB is missing from AArch64.
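To illustrate the RVC point: RISC-V's length rule means every 16-bit parcel's low bits determine whether it starts a 16-bit or a 32-bit instruction, so a wide front-end has to resolve instruction boundaries across the whole fetch group before it can decode in parallel. A minimal C sketch of the rule:

```c
#include <stdint.h>

/* Length in bytes of the instruction starting at this 16-bit parcel,
 * per the RISC-V spec's encoding-length rule. (Longer-than-32-bit
 * formats are reserved and ignored in this sketch.) */
int rv_insn_len(uint16_t parcel) {
    if ((parcel & 0x3) != 0x3)
        return 2;    /* low bits != 0b11: compressed (RVC) instruction */
    return 4;        /* low bits == 0b11: standard 32-bit instruction */
}
```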
> verification of a very high-powered RISC-V core that is going to be manufactured in high-volume is probably much more expensive and time-consuming than the case for an ARM design.
That's not really comparable. Raspberry Pi added entirely separate RISC-V cores to the chip, they didn't convert an ARM core design to run RISC-V instructions.
What is being discussed is taking an ARM design and modifying it to run RISC-V, which is not the same thing as what Raspberry Pi has done and is not as simple as people are implying here.
I am a fan of the Jeff Geerling YouTube series in which he is trying to make GPUs (AMD/Nvidia) run on the Raspberry Pi. It is not easy - and they have the Linux kernel source code available to modify. Now imagine all Qualcomm clients having to do similar work with their third-party hardware, possibly with no access to the drivers' source code, then spending three years debugging and fixing all the bugs that pop up in the wild. What a nightmare.
Apple at least has full control of the hardware stack (Qualcomm does not, as they only sell chips to others).
Hardware drivers certainly can be annoying, but a hobbyist struggling to bring big GPUs’ hardware drivers to a random platform is not at all indicative of how hard it would be for a company with teams of engineers. If NVidia wanted their GPUs to work on Raspberry Pi, then it would already be done. It wouldn’t be an issue. But NVidia doesn’t care, because that’s not a real market for their GPUs.
Most OEMs don’t have much hardware secret sauce besides maybe cameras these days. The biggest OEMs probably have more hardware secret sauce, but they also should have correspondingly more software engineers who know how to write hardware drivers.
If Qualcomm moved their processors to RISC-V, then Qualcomm would certainly provide RISC-V drivers for their GPUs, their cellular modems, their image signal processors, etc. There would only be a little work required from Qualcomm’s clients (the phone OEMs) like making sure their fingerprint sensor has a RISC-V driver. And again, if Qualcomm were moving… it would be a sea change. Those fingerprint sensor manufacturers would absolutely ensure that they have a RISC-V driver available to the OEMs.
> If NVidia wanted their GPUs to work on Raspberry Pi, then it would already be done. It wouldn’t be an issue. But NVidia doesn’t care, because that’s not a real market for their GPUs.
It's weird af that Geerling ignores Nvidia. They have a line of ARM-based SBCs with GPUs from Maxwell to Ampere. They have full software support for OpenGL, CUDA, etc. For the price of an RPi 5 + discrete GPU, you can get a Jetson Orin Nano (8 GB RAM, 6 A78 ARM cores, 1024 Ampere cores), all in a much better form factor than a Pi + PCIe hat and graphics card.
I get the fun of doing projects, but if what you're interested in is a working ARM based system with some level of GPU, it can be had right now without being "in the shop" twice a week with a science fair project.
“With the PCI Express slot ready to go, you need to choose a card to go into it. After a few years of testing various cards, our little group has settled on Polaris generation AMD graphics cards.
Why? Because they're new enough to use the open source amdgpu driver in the Linux kernel, and old enough the drivers and card details are pretty well known.
We had some success with older cards using the radeon driver, but that driver is older and the hardware is a bit outdated for any practical use with a Pi.
Nvidia hardware is right out, since outside of community nouveau drivers, Nvidia provides little in the way of open source code for the parts of their drivers we need to fix any quirks with the card on the Pi's PCI Express bus.”
I mean in terms of his quest for GPU + ARM. He's been futzing around with Pis and external GPUs while, the entire time, you've been able to buy a variety of SBCs from Nvidia with first-class software support.
Naively, it would seem as simple as updating Android Studio and recompiling your app, and you would be good to go? There must be fewer than 1 in 1,000 (probably fewer than 1 in 10,000) apps that do their own ARM-specific optimizations.
Without any ARM-specific optimizations, most apps wouldn't even have to recompile and resubmit. Android apps are uploaded as bytecode, which is then AOT compiled by Google's cloud service for the different architectures, from what I understand. Google would just have to decide to support another target, and Google has already signaled their intent to support RISC-V with Android.
I remember when Intel was shipping x86 mobile CPUs for Android phones. I had one pretty soon after their release. The vast majority of Android apps I used at the time just worked without any issues. There were some apps that wouldn't appear in the store but the vast majority worked pretty much day one when those phones came out.
I'm not sure how well it fits the timeline (i.e. x86 images for the Android emulator becoming popular due to better performance than the ARM images vs. actual x86 devices being available), but at least these days a lot of apps shipping native code probably maintain an x86/x64 version purely for the emulator.
Maybe that was the case back then, too, and helped with software availability?
> Android apps are uploaded as bytecode, which is then AOT compiled by Google’s cloud service for the different architectures, from what I understand.
No, Android apps ship the original bytecode which then gets compiled (if at all) on the local device. Though that doesn't change the result regarding compatibility.
However, a surprising number of apps do ship native code, too. Of course especially games, but also any other media-related app (video players, music players, photo editors, even my e-book reading app) and miscellaneous other apps as well. There, only the original app developer can recompile the native code for a new CPU architecture.
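For a taste of why this isn't always a clean recompile, here's a hedged C sketch (the function name is invented) of the kind of hand-written SIMD that shows up in media apps: the NEON path simply doesn't exist on RISC-V until someone rewrites it against the RVV intrinsics or accepts the scalar fallback.

```c
/* Scale an array of floats by k. */
#if defined(__ARM_NEON)
#include <arm_neon.h>

void scale_f32(float *dst, const float *src, float k, int n) {
    float32x4_t vk = vdupq_n_f32(k);      /* broadcast k into 4 lanes */
    int i = 0;
    for (; i + 4 <= n; i += 4)            /* 4 floats per iteration */
        vst1q_f32(dst + i, vmulq_f32(vld1q_f32(src + i), vk));
    for (; i < n; i++)                    /* scalar tail */
        dst[i] = src[i] * k;
}
#else
/* Portable fallback; a serious RISC-V port would add an RVV path here. */
void scale_f32(float *dst, const float *src, float k, int n) {
    for (int i = 0; i < n; i++)
        dst[i] = src[i] * k;
}
#endif
```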
> No, Android apps ship the original bytecode which then gets compiled (if at all) on the local device.
Google Play Cloud Profiles is what I was thinking of, but I see it only starts “working” a few days after the app starts being distributed. And maybe this is merely a default PGO profile, and not a form of AOT in the cloud. The document isn’t clear to me.
> Virtually all high performance processors these days operate on their own internal “instructions”. The instruction decoder at the very front of the pipeline that actually sees ARM or RISC-V or whatever is a relatively small piece of logic.
If that's true, then what is ARM licensing to Qualcomm? Just the instruction set, or are they licensing full chips?
Qualcomm has historically licensed both the instruction set and off the shelf core designs from ARM. Obviously, there is no chance the license for the off the shelf core designs would ever allow Qualcomm to use that IP with a competing instruction set.
In the past, Qualcomm designed their own CPU cores (called Kryo) for smartphone processors, and just made sure they were fully compliant with ARM’s instruction set, which requires an Architecture License, as opposed to the simpler Technology License for a predesigned off the shelf core. Over time, Kryo became “semi-custom”, where they borrowed from the off the shelf designs, and made their own changes, instead of being fully custom.
These days, their smartphone processors have been entirely based on off the shelf designs from ARM, but their new Snapdragon X Elite processors for laptops include fully custom Oryon ARM cores, which is the flagship IP that I was originally referencing. In the past day or two, they announced the Snapdragon 8 Elite, which will bring Oryon to smartphones.
A well-designed instruction set (designed by Apple [1], by analyzing millions of popular applications and what they do). One where there are reg+reg/reg+shifted_reg addressing modes, only one instruction length, and sane, useful instructions like SBFX/UBFX, BFC, BFI, and TBZ. All of that is much better than promises of a magical core that can fuse 3-4 instructions into one.
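To make the SBFX/UBFX point concrete, here's a small C sketch (the function name is mine): compilers turn this pattern into a single UBFX on AArch64, while base RV64 needs a shift pair (slli then srli).

```c
#include <stdint.h>

/* Extract `width` bits starting at bit `lsb`, zero-extended.
 * Preconditions: width >= 1 and lsb + width <= 64.
 * AArch64: one UBFX instruction. Base RV64: two shifts. */
uint64_t extract_bits(uint64_t x, unsigned lsb, unsigned width) {
    return (x << (64 - lsb - width)) >> (64 - width);
}
```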
I get that. I just work quite distantly from chips and find it interesting.
That said, licensing an instruction set seems strange. With very different internal implementations, you'd expect instructions and instruction patterns in a licensed instruction set to have pretty different performance characteristics on different chips, leading to a very difficult environment to program in.
>Many Android apps don’t depend directly on “native” code, and those could potentially work on day 1. With an ARM emulation layer, those with a native dependency could likely start working too, although a native RISC-V port would improve performance.
This is only true if the application is written purely in Java/Kotlin with no native code. Unfortunately, many apps do use native code. In a CppCon talk, Microsoft identified that more than 70% of the top 100 apps on Google Play used native code.
>I think ARM is bluffing here. They need Qualcomm.
Qualcomm's survival is dependent on ARM. Qualcomm's entire revenue stream evaporates without ARM IP. They may still be able to license their modem IP to OEMs, but not if their modem also used ARM IP. It's only a matter of time before Qualcomm capitulates and signs a proper licensing agreement with ARM. The fact that Qualcomm's lawyers didn't do their due diligence to ensure that Nuvia's ARM Architecture licenses were transferable is negligent on their part.
The ARM port already did the hard work. Once you've ported your app to ARM, you've no doubt made sure all the ISA-specific bits are isolated while the rest is generic and portable. This means you already know where to go and what to change, and hopefully already have testing in place to make sure your changes work correctly.
Aside from the philosophy, lots of practical work has been done and is ongoing. At the systems level, there has already been massive effort: Alibaba, for example, ported the entirety of Android to RISC-V and then handed it off to Google. Lots of other big companies have tons of coders working on porting all kinds of libraries to RISC-V, and progress has been quite rapid.
And of course, it is worth pointing out that an overwhelming majority of day-to-day software is written in managed languages on runtimes that have already been ported to RISC-V.
The thing about RISC-V is that they indirectly have the R&D coffers of the Chinese government backing them for strategic reasons. They are the hardware equivalent of Uber's scale-first, make-money-later strategy. This is not a competition that ARM can win purely by relying on their existing market dominance.
Everyone that doesn't want to write Java/Kotlin is using the NDK.
Although from Google's point of view the NDK's only purpose is to enable writing native methods, reuse of C and C++ libraries, games, and real-time audio, from the point of view of others, it is how they sneak Cordova, React Native, Flutter, Xamarin, ... into Android.
That's what's magical about Apple. It was a decade-long transition. All the 32-bit code that was removed from macOS back in 2017 was in preparation for the move in 2019.
Apple has done it multiple times now and has it down to a science.
68k -> PPC -> x86 -> ARM, with the 64 bit transition you mixed in there for good measure (twice!).
Has any other consumer company pulled off a full architecture switch? Companies pulled off leaving Alpha and SPARC, but that was servers, which have a different software landscape.
I don't believe any major company has done it. Even Intel failed numerous times to move away from x86 with iAPX432, i960, i860, and Itanium all failing to gain traction.
For Apple it was do or die the first few times. Until x86, if they didn’t move they’d just be left in the dust and their market would disappear.
The ARM transition wasn’t strictly necessary like the last ones. It had huge benefits for them, so it makes sense, but they also knew what they were doing by then.
In your examples (which are great) Intel wasn’t going to die. They had backups, and many of those seem guided more by business goals than a do-or-die situation.
In a way that's also true for the x86->ARM transition, isn't it? I had a MacBook Air 2018. And... "it was crap" is putting it very, very mildly. Yes, it was still better than any Windows laptop I got since, and much less of a hassle than any Linux laptop that I'm aware of in my circle. But the gap was really, really small and it cost twice as much.
But the most important part for the transition to work is probably that, in any of these cases, the typical end user didn't even notice. Yes, a lot of Hacker News-like people noticed, as they had to recompile some of their programs. But most people :tm: didn't. They either use App Store apps, which were fixed ~immediately, or Rosetta made everything runnable, even if performance suffered.
But that's pretty much the requirement you have: you need to be able to transition ~all users to the new platform with ~no user work, and even without most vendors doing anything. Intel could never provide that, nor even aim for it. So they basically have to either a) rip their market to pieces or b) support the "deprecated" ISA forever.
> Rosetta made everything runnable, even if performance suffered.
I think a very important part was that even with the Rosetta overhead, most x86 programs were faster on the M1 than on the machines it would have been replacing. It wasn't just that you could continue using your existing software with a perf hit; your new laptop actually felt like a meaningful upgrade even before any of your third-party software got updated.
> I do think the ARM transition wasn’t strictly good
That’s a total typo I didn’t catch in time. I’m not sure what I tried to type, but I thought the transition was good. They didn’t have to but I’m glad they did.
But ARM and RISC-V are relatively similar, and it's easy to add custom instructions to RISC-V to make them even more similar if you want, so you could definitely do something like Rosetta.
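As a minimal sketch of what a Rosetta-like translator does (everything here is illustrative): statically rewrite each guest instruction into a host equivalent. This toy handles exactly one case, AArch64 register-register ADD with no shift; a real translator would also have to materialize NZCV flags, bridge memory-ordering differences, and cover thousands of encodings.

```c
#include <stdint.h>

/* Translate AArch64 "ADD Xd, Xn, Xm" (64-bit, no shift) into the
 * RISC-V "ADD rd, rs1, rs2" encoding. Returns 0 for anything this
 * toy doesn't handle. */
uint32_t a64_add_to_rv(uint32_t a64) {
    if ((a64 & 0xFFE0FC00u) != 0x8B000000u)   /* opcode + zero shift */
        return 0;
    uint32_t rd = a64 & 31, rn = (a64 >> 5) & 31, rm = (a64 >> 16) & 31;
    /* Naive 1:1 register map; a real mapping must special-case ARM's
     * XZR/SP conventions and RISC-V's hardwired-zero x0. */
    return 0x33u | (rd << 7) | (rn << 15) | (rm << 20);
}
```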
Switches like that are major, but they get easier every year, and are easier today than they were yesterday, as everyone's tools at all levels up and down both the hardware and software stacks get more powerful all the time.
It's an investment with a cost and a payoff like any other investment.
Qualcomm's migration would be much easier than Apple's.
Most of the Android ecosystem already runs on a VM, Dalvik or whatever it's called now. I'm sure Android RISC-V already runs somewhere and I don't see why it would run any worse than on ARM as long as CPUs have equal horsepower.
Yeah, but Qualcomm doesn’t control Android or any of the phone makers. It’s hard for large corps to achieve the internal coordination necessary for a successful ISA change (something literally only Apple has ever accomplished), but trying to coordinate with multiple other large corps? Seems insane. You’re betting your future on the fact that none of the careerists at Google or Samsung get cold feet and decide to just stick with what works.
It's not about whether they can, it's whether they will. History has proven that well-resourced teams don't like doing this very much and will drag their feet if given the chance.
It's not about that; it's about running the apps whose makers are out of business or who just find it easier to tell their customers to buy different phones.
I think "long term" is doing a lot of heavy lifting here. How long until:
1. Qualcomm develops a chip that is competitive in performance with ARM
2. The entire software world is ready to recompile everything for RISC-V
Unless you are Apple, I see such a transition easily taking a decade.