AI LLM Costs: Not for the faint-hearted!

Deep learning comes with deep pockets, but there are easier alternatives: you most probably don't need deep-learning algorithms to solve your infrastructure problems through AIOps.

Zielbox AI Team

6/21/2023 · 2 min read

The Zielbox AI Consulting team places great emphasis on addressing the rising costs associated with IT operations management. We strive to provide comprehensive insights from various perspectives so that our customers can make well-informed decisions before diving into this competitive landscape. The trend is reminiscent of the transition from on-premises solutions to the cloud, where smaller companies, facing budget constraints, felt the strain of investing heavily in cloud infrastructure. As a result, many organizations are now shifting towards a hybrid model to strike a balance between on-premises and cloud solutions.

The costs below are indirect estimates for LLM training (a back-of-the-envelope sketch of how such figures can be derived follows the list):
---------
• $2.5k - $50k (110 million parameter model)
• $10k - $200k (340 million parameter model)
• $80k - $1.6m (1.5 billion parameter model)
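For readers who want a feel for where such estimates come from, here is a minimal back-of-the-envelope sketch in Python. It assumes the commonly cited rule of thumb of roughly 6 FLOPs per parameter per training token, plus guessed values for GPU throughput, utilization, and hourly price; none of these figures come from any vendor, and a single clean training run is usually only part of the quoted ranges, which also absorb repeated runs, hyperparameter search, and failed experiments.

```python
# Back-of-the-envelope LLM training cost estimate.
# Assumptions (illustrative only, not vendor figures):
#   * ~6 FLOPs per parameter per training token (common rule of thumb)
#   * GPU throughput, utilization, and hourly price are guesses

def estimate_training_cost(
    n_params: float,                 # model size, e.g. 1.5e9
    n_tokens: float,                 # training tokens, e.g. 30e9
    gpu_tflops: float = 150,         # assumed peak TFLOP/s per GPU
    utilization: float = 0.35,       # assumed fraction of peak achieved
    usd_per_gpu_hour: float = 2.50,  # assumed cloud price per GPU-hour
) -> float:
    total_flops = 6.0 * n_params * n_tokens
    effective_flops_per_s = gpu_tflops * 1e12 * utilization
    gpu_hours = total_flops / effective_flops_per_s / 3600
    return gpu_hours * usd_per_gpu_hour

if __name__ == "__main__":
    # Hypothetical token budgets for the model sizes quoted above.
    for params, tokens in [(110e6, 3e9), (340e6, 7e9), (1.5e9, 30e9)]:
        cost = estimate_training_cost(params, tokens)
        print(f"{params/1e6:>7.0f}M params, {tokens/1e9:>4.0f}B tokens -> ~${cost:,.0f}")
```

Changing any one assumption (GPU generation, utilization, number of full runs) easily moves the result by an order of magnitude, which is why published estimates are given as wide ranges.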

There are 3 types of models:

---------
1. LLMs (Large Language Models / deep learning), general purpose
2. Fine-tuned models
3. Edge models

NOTE:
----
-> Other business costs such as employee salaries, management, and office space are excluded from the numbers above.
-> GPU and TPU efficiency keeps improving, and companies are bringing costs down through parallelism and other novel programming techniques, but training at this scale is still far out of reach for small companies.

Key Terms:
------
-> A parameter is a learned weight on a "feature" that helps predict the probability of the next word (illustrated in the sketch below).
-> Features are things like n-gram frequencies, word frequencies, part-of-speech tags, and syntactic dependencies.
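As a toy illustration of "parameters as learned feature weights", the sketch below uses a single, hypothetical feature family (bigram counts) and turns the counts into next-word probabilities; each (previous word, next word) pair contributes one parameter. Real LLMs learn billions of dense weights by gradient descent rather than by counting, so this is only meant to make the terminology concrete.

```python
from collections import Counter, defaultdict

# Toy illustration: the "parameters" here are the learned weights
# (probabilities) attached to one feature family -- bigram counts.
corpus = "the server is down the server is up the network is down".split()

bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

# Each (previous word, next word) pair gets one parameter: P(next | prev).
params = {
    prev: {nxt: c / sum(counts.values()) for nxt, c in counts.items()}
    for prev, counts in bigram_counts.items()
}

n_params = sum(len(d) for d in params.values())
print(f"toy model has {n_params} parameters")
print("P(next | 'is') =", params["is"])  # roughly {'down': 0.67, 'up': 0.33}
```

A billion-parameter LLM is, loosely speaking, this same idea scaled up: far richer features, far more weights, and training that costs the figures quoted at the top of this post.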

Parameters Supported by top 3 Models:
------------
# GPT-3 supports 175 billion parameters
# GPT-4's parameter count has not been officially disclosed (the widely quoted 100 trillion figure is an unconfirmed rumor)
# Google's PaLM supports 540 billion parameters

Key Facts:
-------
-> OpenAI didn't disclose the cost of model training; the numbers above were estimated through research and inputs from other industry players.

-> Microsoft, in collaboration with NVIDIA, announced Megatron-Turing NLG, which pushes the LLM to 530 billion parameters.

-> Google has chosen to keep the large language models it develops in house and under wraps. For example, Google recently detailed, but declined to release, a 540-billion-parameter model called PaLM that the company claims achieves state-of-the-art performance across language tasks.

-> In general, an LLM can be characterized by four attributes: model size, training-dataset size, training cost, and performance after training.

-> IT infrastructure teams thinking of leveraging this technology will most likely end up using either fine-tuned or edge models, given the cost of training an LLM.

-> Edge-computing models such as Google Assistant and Apple's Siri are examples of NLP being leveraged on the endpoint without incurring heavy training costs, and that might be one of the areas where companies start exploring this amazing tech (see the sketch after this list).

-> Making a sustainable business model around LLMs is going to be a big challenge for deep-learning companies; many have already wound up their businesses due to rising costs and funding pressure.

-> Funding, business model, and sustainability were behind OpenAI's decision to move from a non-profit to a commercial business, and that is why its published pricing plans are based on the tokens used during prompt interaction.

-> Re-training is separate from the costs calculated above.

-> Model decay is another issue: an already trained model on which a company spent millions of dollars will gradually "forget" its learning as data drifts, and the company will need to retrain it.
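To make the fine-tuned/edge option mentioned above concrete, here is a minimal sketch that loads a small, already fine-tuned open model with the Hugging Face transformers library and classifies an alert line locally instead of training anything from scratch. The model name is just one publicly available example chosen for illustration (an assumption, not a recommendation); in practice you would pick or distill a model suited to your own infrastructure data.

```python
# Minimal sketch: run a small, already fine-tuned model on an "edge" box
# instead of training an LLM from scratch.
# Requires: pip install transformers torch
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    # One publicly available distilled model, used here as an assumption.
    model="typeform/distilbert-base-uncased-mnli",
)

alert = "Disk usage on db-node-3 has exceeded 90% for the last 30 minutes"
labels = ["capacity", "network", "security", "application bug"]

# Score the alert against candidate incident categories locally.
result = classifier(alert, candidate_labels=labels)
print(result["labels"][0], round(result["scores"][0], 2))
```

The point is that inference on a small pre-trained model needs only a few hundred megabytes and commodity hardware, not the training budgets listed at the top of this post.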

NOTE -> Stay positive about the future, as there is a good possibility that costs will come down.