Microsoft: invests 10 billion in company. Also Microsoft: here's the tools you need to DIY one of the premium features the company we just invested 10 billion in for free.

Not that reproducing GPT-4 is going to be easy with this, but it'll definitely remove some major hurdles. I read a report about the difficulties HuggingFace had producing their Bloom model, and a lot of it was the sort of straightforward systems engineering that goes into tooling like this.

Is the Bloom model considered a failure by the community? Its introduction says it was supposed to include improvements over GPT-3, but it performs much worse, I guess because of lower-quality training data? I wonder what sort of company would have data of high enough quality that they could use this project to fine-tune a public model to the point where it beats plain GPT-4 in some scenario. Especially when you can just inject extra info into the GPT-4 prompt, like phind does, for example. What even is the use of fine-tuning given that GPT-4 exists?
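(For reference, the prompt-injection approach is roughly the sketch below. search_docs is a hypothetical stand-in for whatever retrieval layer you have — I have no idea what phind actually uses — and the rest is just a standard OpenAI chat completions call, pre-1.0 Python client style.)

    import openai

    def answer_with_context(question: str) -> str:
        # search_docs() is a hypothetical stand-in for your retrieval
        # layer (search index, vector DB, etc.).
        snippets = search_docs(question)
        context = "\n\n".join(snippets[:5])  # stay under the context limit
        resp = openai.ChatCompletion.create(
            model="gpt-4",
            messages=[
                {"role": "system",
                 "content": "Answer using only the provided context."},
                {"role": "user",
                 "content": f"Context:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return resp["choices"][0]["message"]["content"]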



> Microsoft: invests 10 billion in company. Also Microsoft: here's the tools you need to DIY one of the premium features the company we just invested 10 billion in for free.

In my mind, MSFT spent that money to acquire a head start on getting LLM-style capabilities into MS’s profitable product portfolio. This is money well spent:

1. MSFT can and will make money on these capabilities.

2. If MSFT didn’t do this, they would take the substantial risk of someone else pulling it off and attacking their moat.

I can’t really imagine today’s Google pulling this off with Google Docs. Adobe doesn’t target MS’s market directly enough to be an immediate risk. Apple doesn’t seem interested in competing with MS. Meta is doing its own thing in the corner. But someone really could attack MS with something amazing and make short-term sales, which could turn into a long-term loss for MS. (Salesforce? They don’t seem able to make things that normal people want to use.) But MS is now ahead of the curve, and they didn’t really spend that much money to get there.

Keep in mind that LibreOffice is not vastly less capable than Office 365, and Teams is not exactly an insurmountably strong piece of technology.


> In my mind, MSFT spent that money to acquire a head start on getting LLM-style capabilities into MS’s profitable product portfolio. This is money well spent:

Personally, I think in 10 years people will joke about the machine-generated boilerplate the same way they joke about Clippy today.


Also, they will definitely capture some upside on the OpenAI stock…


My point is that MS’s investment in OpenAI may be a good deal for MS regardless of what happens to the valuation of OpenAI-the-company.

The LLM space is moving fast. OpenAI may stay on top for a long time, or it may not. But I expect Microsoft’s use of LLMs to be valuable for MS and likely market-leading in the office AI space for quite some time regardless of what happens to OpenAI.


Look at the HuggingFace codebase and you'll understand why Bloom is so subpar. Shame that the funding money wasn't given to the EleutherAI team instead.


Please elaborate? I’m not familiar with this.


Can you elaborate or give a source?


> Microsoft: invests 10 billion in company. Also Microsoft: here's the tools you need to DIY one of the premium features the company we just invested 10 billion in for free.

The idea is to get the people who aren't willing to pay hooked on what they offer. Once you're used to a system, you'll probably want the same thing at your workplace, where they can charge a premium. The same thing was done with Windows in Asia.


> Microsoft: invests 10 billion in company. Also Microsoft: here's the tools you need to DIY one of the premium features the company we just invested 10 billion in for free.

This seems like more evidence that under the "commoditize your complement" framework, all intellectual property is the complement, and the only thing actually worth selling for Microsoft is subscriptions and server time.


Fine-tuning an existing model won’t get you to GPT-4 level quality. You need a larger base model and lots of fine-tuning to have a chance.
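(To be clear, the mechanics of fine-tuning are the easy part. A minimal sketch with Hugging Face transformers — the model name and corpus file are placeholders, and a serious run needs far more data, compute, and evaluation than this implies:)

    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    model_name = "EleutherAI/gpt-neo-1.3B"  # assumption: any open base model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token  # GPT-style tokenizers lack a pad token
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # Placeholder corpus: one example per line of plain text.
    data = load_dataset("text", data_files={"train": "my_corpus.txt"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = data.map(tokenize, batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="out", num_train_epochs=1,
                               per_device_train_batch_size=1),
        train_dataset=tokenized["train"],
        # mlm=False gives the plain next-token (causal LM) objective.
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()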


Yeah. Most of what is valuable to me about GPT-4 is its reasoning ability, not fact recall or writing quality. Fact recall has been mostly solved by Google search cards for years, and writing quality isn't the most important thing now that I'm no longer a freelance writer; GPT-3.5 and some of the good open-source models like Koala produce okay writing anyway.

What nothing else can provide is something that will reason intelligently over the data you give it, at similar or better quality than paying for something like MTurk, much cheaper and with nearly instant delivery. That reasoning ability comes from the model size and training-data quality, and in real applications using CoT, LangChain, etc., a lot of it comes from the context length. 8k is better than anything else I've tried at real use cases, and I very much want to try 32k, because that opens up a lot of space to do new things (e.g. dump in a textbook on the domain you want the model to reason about).

I want even longer context lengths than that too, but we'll have to see how it develops. From what I understand, context length (block size) is a pretty direct function of the compute and memory they're willing to devote during training. RWKV's architectural changes may shake that up a bit; we'll see when Stability releases it.
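(To put numbers on the context-length point, a back-of-the-envelope sketch using tiktoken; the token budgets here are my own assumptions, not anything OpenAI specifies:)

    import tiktoken

    enc = tiktoken.encoding_for_model("gpt-4")

    def fit_to_context(document: str, context_limit: int = 8192,
                       reserve_for_answer: int = 1024) -> str:
        # reserve_for_answer is a made-up budget for the model's reply.
        budget = context_limit - reserve_for_answer
        tokens = enc.encode(document)
        # Naive truncation; real pipelines chunk and retrieve instead.
        return enc.decode(tokens[:budget])

At 8k you fit roughly 7k tokens (very roughly 5k words) of that textbook; at 32k, about four times that before you need retrieval tricks at all.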


Microsoft also has a partnership with Databricks, which is doing Dolly.

Databricks wants people to use its compute to run other LLMs, and Microsoft wants that compute to be Azure.


> Microsoft: invests 10 billion in company. Also Microsoft: here's the tools you need to DIY one of the premium features the company we just invested 10 billion in for free.

your point?



