Meta’s Llama 3.1 Release: Changing the Competitive Dynamics of GenAI


Recent developments suggest that the race for AI market share is driving innovation that will bring about a greater range of practical and affordable applications for business computing.

The release of Meta’s Llama 3.1 open-source large language model puts it roughly on par with proprietary models. The open-source computing model is cost-efficient and fosters rapid innovation through collaboration, and over time its risks, such as security and intellectual property issues, have proven manageable. A major advantage of Llama 3.1 is that it permits users to train and fine-tune models of varying sizes on their own data without exposing that data to third parties. Moreover, continually training models on an enterprise’s real data guards against “model collapse,” a degenerative process that occurs when a model indiscriminately learns from data produced by other models. In essence, accurate modeling is forever in need of a reality check.
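To make the point concrete, here is a minimal sketch of fine-tuning the 8B Llama 3.1 checkpoint on internal data that never leaves the enterprise, using low-rank adaptation (LoRA). It assumes the Hugging Face transformers, peft and datasets libraries; the file finance_docs.jsonl is a hypothetical in-house dataset, and the hyperparameters are illustrative rather than recommendations.

```python
# Sketch: LoRA fine-tuning of Llama 3.1 entirely on-premises, so the
# training data is never exposed to a third party.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

MODEL_ID = "meta-llama/Meta-Llama-3.1-8B"  # 8B variant; 70B and 405B also exist

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
tokenizer.pad_token = tokenizer.eos_token  # Llama ships without a pad token
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# Wrap the base model with low-rank adapters; only a small fraction
# of the parameters is trained, which keeps cost modest.
lora = LoraConfig(r=16, lora_alpha=32,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# "finance_docs.jsonl" is a hypothetical file of in-house text records.
dataset = load_dataset("json", data_files="finance_docs.jsonl")["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

dataset = dataset.map(tokenize, batched=True,
                      remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama31-ft",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8,
                           num_train_epochs=1,
                           learning_rate=2e-4,
                           logging_steps=10),
    train_dataset=dataset,
    # mlm=False makes the collator build causal-LM labels from input_ids.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("llama31-ft-adapter")  # adapter weights stay on-premises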

The ability to economically and efficiently train and fine-tune models of various sizes for a specific use case is essential for business applications. Size matters because, in business software, some tasks in a process need a large model while others can be supported more cost-effectively with a narrow language model or an orchestration of narrow models. For example, a set of narrow models might be the best approach to automating the “reading” of a digital invoice attached to an email, summarizing elements of the email, and recording and accounting for the transaction. This automation not only boosts productivity at the front end of the process; it can also substantially reduce data entry errors, preserve important context that is otherwise lost and cut down on the need for internal audit and other quality control efforts.
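As an illustration of what such an orchestration might look like, the sketch below chains two narrow models: one that “reads” the invoice image and one that summarizes the covering email. It assumes Hugging Face pipelines; the specific checkpoints (impira/layoutlm-document-qa, which also needs an OCR backend such as pytesseract, and a distilled BART summarizer) are illustrative choices, and the final posting to an accounting system is a hypothetical call.

```python
# Sketch: orchestrating narrow, task-specific models instead of one
# large LLM to process an emailed invoice end to end.
from dataclasses import dataclass
from transformers import pipeline

# Small, specialized models: a document-QA model extracts fields from
# the invoice image; a compact summarizer condenses the email body.
invoice_reader = pipeline("document-question-answering",
                          model="impira/layoutlm-document-qa")
summarizer = pipeline("summarization",
                      model="sshleifer/distilbart-cnn-12-6")

@dataclass
class Transaction:
    vendor: str
    amount: str
    due_date: str
    context: str  # summary of the email, retained for auditability

def process_email(email_body: str, invoice_image_path: str) -> Transaction:
    # Step 1: "read" the attached invoice by asking targeted questions.
    questions = ("Who is the vendor?",
                 "What is the total amount?",
                 "What is the due date?")
    fields = {q: invoice_reader(image=invoice_image_path, question=q)[0]["answer"]
              for q in questions}
    # Step 2: summarize the covering email so context is not lost.
    summary = summarizer(email_body,
                         max_length=60, min_length=15)[0]["summary_text"]
    return Transaction(vendor=fields[questions[0]],
                       amount=fields[questions[1]],
                       due_date=fields[questions[2]],
                       context=summary)

# Step 3 (hypothetical): record the transaction in the accounting system.
# erp_client.post_journal_entry(process_email(body, "invoice.png"))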

Competition in the market is also having an impact on pricing. Following Llama 3.1’s release, OpenAI made it free to fine-tune GPT-4o mini, a smaller version of the company’s flagship GPT-4o model. Open-source computing also creates pressure to find efficiencies that drive down provider costs and to structure pricing along the value chain in ways that better reflect market demand. Rather than being sold as a one-price bundle, services and products are disaggregated: some become free of charge while others are priced to reflect the value provided to the customer. These changes reduce pricing inefficiencies and accelerate adoption and consumption.

Until now, distilling models has generally been prohibited by service providers’ terms of use. Distillation is a technique whereby smaller, simpler “student” models are trained to mimic a larger and more complex “teacher” model. Because student models consume fewer compute resources, they are less expensive to operate, and distillation can make training them faster and cheaper than pre-training from scratch. Llama 3.1 permits distillation with limited restrictions, which will likely force others to follow suit.
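For readers who want the mechanics, here is a minimal PyTorch sketch of distillation using the standard soft-target formulation (Hinton et al., 2015): the student is trained against a blend of the teacher’s temperature-softened output distribution and the true labels. The teacher, student and dataloader are placeholders, not anything specific to Llama 3.1.

```python
# Sketch: knowledge distillation loss. The student mimics the teacher's
# softened outputs while still learning from the ground-truth labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # rescale gradients per the original formulation
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Training step (placeholder models/dataloader): the larger teacher
# runs in inference mode only; just the student's weights are updated.
# teacher.eval()
# for inputs, labels in dataloader:
#     with torch.no_grad():
#         t_logits = teacher(inputs)
#     s_logits = student(inputs)
#     loss = distillation_loss(s_logits, t_logits, labels)
#     loss.backward(); optimizer.step(); optimizer.zero_grad()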


About the author

Robert Kugel

Rob leads and manages the business software research and advisory team focusing on the intersection of information technology and applications across the front- and back-office areas of enterprises. Rob leads the Office of Finance practice and the AI for Business efforts and is a book author and thought leader on integrated business planning (IBP). Prior to ISG and two decades at Ventana Research, he was an equity research analyst at several firms including Credit Suisse, Morgan Stanley and Drexel Burnham, and a consultant with McKinsey and Company. Rob was an Institutional Investor All-American Team member and on the Wall Street Journal All-Star list. Rob earned his BA in Economics/Finance at Hampshire College, an MBA in Finance/Accounting at Columbia University and is a CFA charter holder.