Why do we need to profile data?

Before we can use a data source, it is imperative that we first check the data for completeness and consistency. This means looking at each column we wish to use and checking it for things like missing values and inconsistent data. If the data is not profiled first, valuable time can be wasted down the track reworking loads that failed on problems nobody knew about. Data profiling can protect the ETL team from unforeseen dirty data. In the words of Kimball:

Do the data profiling up front!
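
To make the column-by-column check described above concrete, here is a minimal sketch in Python. It assumes the source has already been read into a list of dictionaries (one per row) holding simple scalar values. The function name `profile_columns`, the `MISSING` set and the sample column names are all hypothetical and chosen only for illustration; what counts as "missing" will differ from source to source.

```python
from collections import Counter

# Values treated as "missing". This set is an assumption; adjust it per source.
MISSING = {None, "", "NULL", "N/A"}


def profile_columns(rows):
    """Return, for each column, the row count, missing count,
    distinct count and a tally of the Python types seen."""
    # Collect every column name that appears in any row.
    columns = {key for row in rows for key in row}
    profile = {}
    for col in sorted(columns):
        # A row that lacks the column entirely counts as missing (None).
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in MISSING]
        profile[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            # More than one type in a column is a hint of inconsistent data.
            "types": Counter(type(v).__name__ for v in present),
        }
    return profile


if __name__ == "__main__":
    # Hypothetical sample rows, not taken from any real source.
    sample_rows = [
        {"customer_id": 1, "country": "AU", "signup_date": "2020-01-05"},
        {"customer_id": 2, "country": "Australia", "signup_date": ""},
        {"customer_id": "3", "country": None, "signup_date": "05/01/2020"},
    ]
    for column, stats in profile_columns(sample_rows).items():
        print(column, stats)
```

Even a small report like this surfaces the kinds of problems mentioned above: a column with a high missing count, an identifier column holding both integers and strings, or a low-cardinality column with more distinct values than expected (for example "AU" and "Australia" for the same country).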