The Evolution of the Cloud Data Warehouse

Teradata's Data Warehouse

Data Warehousing:
The Analytic Foundation

The origins of the data warehouse

The data warehouse concept started in 1988 when Barry Devlin and Paul Murphy published their groundbreaking paper in the IBM Systems Journal. Their vision sparked a need for more specific definitions of database implementations, which Bill Inmon and Ralph Kimball provided in the early 1990s – and Gartner further clarified definitions in 2005. Now any discussion on data warehousing also includes how or where a data warehouse is implemented, such as within the cloud, or spanning on-premises and cloud in a hybrid manner.

A data warehouse isn’t a collection of tables or measured in terabytes. It’s a design pattern, a data architecture with many characteristics:


Reflects business entities and processes that the organization works with daily. The level of detail in the subject area is what is important: if detailed data is there, it is a data warehouse. If summary or only limited data is there, it is a data mart.

Integrated, Consistent

Data formats and values are standardized across all tables to ensure complete, accurate data that users can understand. It must also have integrity: e.g., it cannot have purchasing transactions without a corresponding customer record.

Nonvolatile History

A warehouse captures data changes and tracks data changes over time. All data is kept and does not change with transactional updates. Whether traditional, hybrid, or cloud, a data warehouse is effectively the “corporate memory” of its most meaningful data.

How Does Data Get into the Data Warehouse?

Data Sources

It’s common to have hundreds of applications sending data to the warehouse, which consolidates it all into subject areas. The warehouse gets input from applications such as enterprise resource planning (ERP), customer relationship management (CRM), supply chain management (SCM), internet of things (IoT) sensor data, and website clickstreams.

Data Integration

Before the data goes into the data warehouse database, it passes through the data integration step, a complex process that rationalizes data from multiple sources into a single result. Originally this was called extract, transform, and load (ETL) because the data had to be pulled from the source, refined, then loaded into data warehouse relational tables.

Data Cleansing

Modern integration processes include data cleansing, which involves detecting and correcting corrupt or inaccurate records. Errors occur due to faulty inputs, hardware corruption, or simple human error. The data integration task combines the best, most accurate and most complete data from multiple applications into a clean, reliable “golden record” in the warehouse.

Data Loading

Data is loaded into the warehouse in a continuous process – typically all day long. Data loading leads to the business purpose of the warehouse: the foundation for finding answers to questions. Data scientists apply advanced mathematics to find patterns and anomalies, while business analysts use reports and dashboards with visualization.

The Cloud Data Warehouse and Teradata Vantage

Both Teradata and the industry at large evolved to incorporate the benefits of cloud deployment and scalability. Teradata Vantage, the company’s flagship offering, builds on the strong foundation of Teradata Database and incorporates advanced analytic capabilities acquired with Aster Data in 2011.

Vantage is available for Amazon Web Services (AWS), Microsoft Azure, Google Cloud, Teradata infrastructure (Teradata Cloud or Customer Cloud), and commodity hardware running VMware virtualization software.

AWS and Vantage Data Warehouse Solutions
Amazon Web Services Use AWS infrastructure with Teradata Vantage
Azure and Vantage Data Warehouse Solutions
Microsoft Azure Combine Azure resources with Teradata Vantage
Google Cloud and Vantage Data Warehouse Solutions
Google Cloud Leverage Google Cloud with Teradata Vantage
Get started now with cloud analytics

Take your analytics into the cloud

Download White Paper