Since 2015, Google has been using its own internally developed Tensor Processing Unit (TPU) chips to run machine-learning applications inside its data centers at speeds 15 to 30 times faster than traditional CPUs and GPUs.
Though the company disclosed the existence of the TPU chips back in May 2016, it is only now revealing more of their technical details. Google developed the chips to sustain and advance the compute-intensive machine learning processes it has been using in its products for more than 15 years.
"The need for TPUs really emerged about six years ago, when we started using computationally expensive deep learning models in more and more places throughout our products," Norm Jouppi, a Distinguished Hardware Engineer at Google, wrote in an April 5 post on the Google Cloud Platform Blog. "The computational expense of using these models had us worried. If we considered a scenario where people use Google voice search for just three minutes a day and we ran deep neural nets for our speech recognition system on the processing units we were using, we would have had to double the number of Google data centers."
That need led to the creation of TPU chips to help increase compute power without requiring an expansion of data centers, wrote Jouppi.
And speed is what the TPU chips bring to the process, he wrote, with performance 15 to 30 times that of typical CPUs and GPUs on production artificial intelligence workloads, along with vastly increased energy efficiency. The TPU chips offer 30 to 80 times higher energy efficiency than standard chips on the TOPS/Watt measure (tera-operations [trillion or 10^12 operations] of computation per Watt of energy consumed), according to Jouppi.
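To make the TOPS/Watt measure concrete, here is a minimal Python sketch of the arithmetic. Because a watt is a unit of power, the figure reads as trillions of operations per second for each watt drawn. The function name and every number below are hypothetical, chosen only to keep the math easy to follow; they are not figures from Google's study.

```python
def tops_per_watt(ops_per_second: float, watts: float) -> float:
    """Return tera-operations per second delivered per watt of power drawn."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return ops_per_second / 1e12 / watts


# Hypothetical chips, not measurements from any real hardware.
accelerator = tops_per_watt(ops_per_second=90e12, watts=45.0)
baseline = tops_per_watt(ops_per_second=3e12, watts=60.0)

# How many times more work per watt the first chip does than the second.
efficiency_ratio = accelerator / baseline

print(f"accelerator: {accelerator:.2f} TOPS/Watt")
print(f"baseline:    {baseline:.2f} TOPS/Watt")
print(f"ratio:       {efficiency_ratio:.1f}x")
```

A ratio computed this way is the kind of number behind a claim such as "30 to 80 times higher energy efficiency": it compares work done per unit of power, not raw speed.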
"TPUs allow us to make predictions very quickly, and enable products that respond in fractions of a second," he wrote. "TPUs are behind every search query; they power accurate vision models that underlie products like Google Image Search, Google Photos and the Google Cloud Vision API; they underpin the groundbreaking quality improvements that Google Translate rolled out last year; and they were instrumental in Google DeepMind's victory over Lee Sedol, the first instance of a computer defeating a world champion in the ancient game of Go."
Google presented the expanded details of its TPU development at a National Academy of Engineering meeting at the Computer History Museum in Silicon Valley, where it also released a study about the inner workings of the custom TPU chips, wrote Jouppi. More than 70 authors contributed to the report, which details the intricacies, design and thought processes that went into the development of the chips. "It really does take a village to design, verify, implement and deploy the hardware and software of a system like this," he wrote.
A TPU is an application-specific integrated circuit (ASIC) that was built specifically for machine learning. It is designed to use the TensorFlow open source software library, which was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization to conduct research on machine learning and deep neural networks, according to Google.
TPU chips are tailored to machine learning applications, which makes them more tolerant of reduced computational precision and means they require fewer transistors per operation, according to Google. Because of that inherent efficiency, the chips can squeeze more operations per second into the silicon, run more sophisticated and powerful machine learning models and deliver results more rapidly.
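The idea of tolerating reduced precision can be illustrated with a short Python sketch. It rounds floating-point values to small signed integers, does the multiply-and-add work on those integers, and converts back at the end. The 8-bit width, the single shared scale factor and the function names are assumptions made for illustration; the article does not describe how the TPU itself handles precision.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers using one shared scale factor."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 when num_bits is 8
    largest = max(abs(v) for v in values)
    scale = largest / qmax if largest > 0 else 1.0
    # Round to the nearest integer and clamp into the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def quantized_dot(a, b, num_bits=8):
    """Dot product computed on integers, rescaled to a float at the end."""
    qa, scale_a = quantize(a, num_bits)
    qb, scale_b = quantize(b, num_bits)
    integer_sum = sum(x * y for x, y in zip(qa, qb))
    return integer_sum * scale_a * scale_b


# Made-up example values standing in for model weights and inputs.
weights = [0.12, -0.5, 0.33, 0.9]
inputs = [1.0, 0.25, -0.75, 0.5]

exact = sum(w * x for w, x in zip(weights, inputs))
approx = quantized_dot(weights, inputs)
error = abs(exact - approx)

print(f"full precision:    {exact:.4f}")
print(f"reduced precision: {approx:.4f}")
print(f"absolute error:    {error:.4f}")
```

The reduced-precision result differs slightly from the full-precision one. Many machine learning models can absorb that kind of small error, which is what allows simpler arithmetic circuits to be used for each operation.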
Each TPU chip can be installed in a data center rack on a board that fits into a hard disk drive slot.