The need for developers skilled in large language models (LLMs) is a multifaceted issue influenced by a range of factors.
As a relatively novel technology, LLMs are still in the early stages of adoption, which means there is a limited pool of developers with experience in this area.
The total cost of training large language models is substantial and increases as the model grows.
Even though recent advances in computational techniques and data infrastructure have reduced the cost of training LLMs, training still requires vast computational resources, and the costs can become prohibitively high.
"In effect, only a handful of companies have the infrastructure and the resources in place to be able to train their LLMs," Davit Buniatyan, CEO of Activeloop, explained. "As a result, only a few developers could be involved in these projects at work."
The trend, however, is turning around, with LLM training techniques becoming more efficient and developers being able to train and run smaller LLMs with far more modest computational resources.
Themos Stafylakis, head of machine learning and voice biometrics at Omilia, said the release of ChatGPT sparked a worldwide boom, with every business now talking about AI and looking to understand and integrate it in some way.
LLM developers require training in several areas of machine learning, including tokenization, token embeddings, transformers, encoder-decoder and decoder-only architectures, autoregressive models, adapters, and policy-based reinforcement learning, he said.
In addition, they need to be able to train very large architectures across multiple GPUs using parallelism methods, and to create datasets that reflect the way the LLM will be used.
"Lastly, they need to be able to perform prompt engineering," he said. "Developers who combine all these skills are hard to come by."
LLMs Driving Generative AI Tools
Large language models are the engines that are driving the generative AI tools being put out into the market, according to Seth Robinson, vice president for industry research at CompTIA.
"Software developers would be trained in certain algorithms around how these large language models work using technology called transformers," he said. "To the extent that an organization might be looking for someone that would develop a large language model, those skills are probably in relatively short supply since this is such a new technology."
On the other hand, he said, the demand for those developers might not be as drastic as some organizations think it is.
"You might have demand that would come from the business side of the house, saying, 'We need to build a large language model,'" Robinson said. "That's probably not the way that a lot of organizations are going to approach it — where they're going to be investing in their own company-specific large language model."
Buniatyan added that academia has focused on educating individuals in data science, computer vision, and natural language processing.
"However, universities beyond dedicated research labs have recently focused on training LLMs," he said.
To illustrate, according to Cornell University's arXiv service, 45.51% of all large language model papers (out of a total of 10,729) were published in the last 12 months.
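As a quick back-of-the-envelope check, the percentage can be converted into an approximate paper count. The sketch below uses only the two figures quoted above; the variable names are made up for this example, and the result is a rounded estimate rather than a number reported by arXiv.

```python
# Figures quoted in the article (arXiv, as cited by the source).
total_llm_papers = 10_729
share_last_12_months = 0.4551

# Implied number of papers from the last 12 months, rounded to whole papers.
recent_papers = round(total_llm_papers * share_last_12_months)
older_papers = total_llm_papers - recent_papers
print(recent_papers, older_papers)  # prints: 4883 5846
```

In other words, roughly 4,900 of the papers appeared in a single year, nearly as many as in all prior years combined.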
"We see universities trying to adapt quickly by releasing brief courses on large language models," Buniatyan said. "Still, the educational efforts are primarily spearheaded by the industry."
These include initiatives such as the Foundational Model Certification program, released in collaboration with Activeloop, TowardsAI, and the Intel Disruptor Initiative.
"Building applications with LLMs is not straightforward, and specific challenges are related to prompt engineering and handling the ambiguity of natural language processing," Buniatyan said.
Substantial Effort, Investment Needed
While developing a "shiny demo" with LLMs is easy, it takes substantial effort and resources to make it production-ready.
Universities are beginning to focus on providing their students with the skills needed to take LLMs into production, but these efforts are still very nascent, according to Buniatyan.
"Developing a GPT-3 type model can easily reach $5 million in computing costs each week, with smaller versions still costing millions to train," he said.
As such, universities will aim to train their top talent to be familiar with efficient, high-performance computing (HPC).
"This starts with learning about data structures and various HPC programming techniques in colleges," Buniatyan explained.
The University of California, Berkeley, recently announced its new College of Computing, Data Science, and Society, whose focus is to prepare its students for a world powered by AI.
Robinson noted that at every institution with some kind of software development curriculum, most of that curriculum remains relevant.
"For someone specializing in large language models, that's going to be on the tail end or maybe on an advanced level after you've gone through an awful lot of software development, training, and curriculum," he said.
Universities and institutions today offer machine learning and data science/engineering courses and are gradually adapting them to incorporate LLMs and transformers, Stafylakis added.
"Though the AI journey started a long time ago for some, the boom happened very quickly, so I anticipate these universities will work to keep pace and increase their offerings in this space," he said.
An Evolving Job Market for LLM Developers
As the field of LLMs continues to advance, the job market for LLM developers is expected to evolve.
"Given the increasing adoption of LLMs across various applications, the demand for LLM developers will likely increase," Buniatyan said. "In addition, as LLMs become more specialized and are used for more complex tasks, the skillset required for LLM developers will also evolve."
In the short term, he expects demand for prompt engineers, noting that companies like Anthropic are hiring prompt engineers with a base annual salary of $280,000 to $375,000.
Short-term demand is also expected for reinforcement learning from human feedback (RLHF) specialists across dedicated domains (legal, medical, etc.).
"These domain experts would be engaged in providing feedback to the LLM output to make it better," Buniatyan said.
However, while the field of LLMs is proliferating, several challenges may impact the future job market, he added.
"The high computational costs associated with training these models and the relative scarcity of developers with the necessary expertise could potentially restrict the number of available jobs in this field," he cautioned.
Luring Top LLM Talent Comes at a Price
Stafylakis said he sees four ways organizations can lure top LLM developer talent today.
"Having a coherent and challenging vision about LLMs is top of the list for the important role it plays in exciting the developer talent," he explained.
Additionally, companies need to be able to provide good, clean data or have the means to collect it. Without this, the job of the LLM developer becomes mostly about working backward to move forward.
"If you're serious about advancing in the AI space, you need to have a strong MLOps team to support these LLM developers — no one wants to come in as a one-man show," Stafylakis said.
Lastly, although the economy is challenging, the role of an LLM developer requires a hefty list of skills, so it's important to recognize this and offer a competitive salary.
"My outlook is that the job market for LLM developers will continue thriving for several years, as companies realize that developing and serving their own LLMs is the way to get ahead and stay ahead in this competitive market," Stafylakis said.
About the author
Nathan Eddy is a freelance writer for ITPro Today. He has written for Popular Mechanics, Sales & Marketing Management Magazine, FierceMarkets, and CRN, among others. In 2012 he made his first documentary film, The Absent Column. He currently lives in Berlin.