The cost depends on the GPU type, the serving system, and the traffic pattern. Check out the throughput comparisons in vLLM's blog post: https://vllm.ai/
If you serve a 7B model on cost-optimized GPUs (A10G/L4) and keep them busy, it can be a lot cheaper than GPT-3.5 Turbo. Though it's not a fair comparison, as 3.5's quality is still far better.
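To make the "keep it busy" point concrete, here is a rough back-of-the-envelope sketch of self-hosted cost per million tokens. The function name, parameters, and the numbers in the usage lines are hypothetical placeholders, not measured figures; plug in your own GPU price and the throughput you actually observe.

```python
def cost_per_million_tokens(gpu_hourly_usd: float,
                            tokens_per_second: float,
                            utilization: float = 1.0) -> float:
    """Estimate USD per 1M generated tokens for a self-hosted model.

    gpu_hourly_usd:    what you pay for the GPU per hour (assumed input)
    tokens_per_second: sustained throughput while the GPU is busy (assumed input)
    utilization:       fraction of each hour the GPU is actually serving traffic
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")

    # You pay for the full hour, but only produce tokens while busy.
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return gpu_hourly_usd / tokens_per_hour * 1_000_000


# Placeholder numbers only: a $1/hour GPU at 1000 tokens/s.
busy = cost_per_million_tokens(1.0, 1000.0, utilization=1.0)
half_idle = cost_per_million_tokens(1.0, 1000.0, utilization=0.5)
print(f"fully busy: ${busy:.3f} per 1M tokens")
print(f"50% idle:   ${half_idle:.3f} per 1M tokens")
```

The cost scales inversely with utilization, so a GPU that sits idle half the time costs twice as much per token. That is why the traffic pattern matters as much as the GPU type.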
Just want to add a point about hosting your own LLM vs. using ChatGPT. Cost is definitely something to consider, but it also depends on whether it's OK to share your product's requests with OpenAI.
Also, something you cannot do with ChatGPT is customize it with your own data, such as internal documents. As shown in the blog, a model we trained ourselves can easily learn its own identity.
> Can you name all of the soccer players whose names contain a Z
> Here are all the soccer players whose names contain a "Z" that I could find:
Zlatan Ibrahimovic
Zinedine Zidane
Zlatko Zahovic
Zeljko Kalac
Zoran Banovic
Zlatko Arsovic
Zoran Savic
Zoran Jovanovic
Zoran Djurdjic
Zoran Radosavljevic
Zoran Zekic
Zoran Mirkovic
Zoran Sretenovic
Zoran Sosic
Zoran Luburic
Zoran Vulin
Zlatan Ljubijankic
Zoran Lekovic
Zoran Culibrk
Zoran Bojovic
Please note that this list may not be exhaustive, and there may be other soccer players with names containing "Z" that are not included.
The Zoran repetition looked suspicious to me, and from some quick googling of a few names in the middle of the list, I can't find any results for "Zoran Vulin football" (or soccer) or "Luburic Zoran football" (or soccer). So it's less blatant, but it still seems to have lost the plot.