ELT with Pig: Managing Data Transformations in "Load First" Environments

Speaker: Joshua Fennessy

One of the key distinctions of so-called data lake (or enterprise data hub) scenarios is that data is first loaded into the environment, then transformed into useful information. This is often the opposite of traditional EDW systems, where the data must be transformed before it's loaded. In this Hadoop-focused session, we'll use an Apache project called Pig to load unformatted data and wrangle it into a form that can be used by data architects who might be
