The amazing adventures of Doug Hughes

This post will be the first in a short series of blog posts showing an overview of Data Warehousing with a slant on how we as ColdFusion developers may encounter it. In these articles we will use a simple database example.

The first major stepping stone in understanding Data Warehousing is to grasp the concepts and differences between the two overall database categories. The type most of us are used to dealing with is the On Line Transactional Processing (OLTP) category. I go into greater detail below. The other major category is On Line Analytical Processing (OLAP), in this design we are flattening out the data and the examples below should help to clarify this.

Accessing data to produce required reports is a crucial need in most organizations; in this example we will take data contained in an OLTP format and create two notional OLAP modeled tables from the OLTP format of the data. The OLAP format is how we might hold that same data if our goal was to produce an efficient version of the same data for reporting on large numbers of records. This need for large reports with lots of data can sometimes conflict with the needs of storing data in the first place as the database structures needed to accept, store and easily manipulate data efficiently can be markedly different from the structures needed to produce effective reports from that same data. This is clearly illustrated if we look at the differences between two basic kinds of database categories:


This is what we would characterize as the ongoing day-to-day functional copy of the database. It is where data is added and updated but never overwritten or deleted. The main needs of the OLTP operational database being easily controlled insertion and updating of data with efficient access to data manipulation and viewing mechanisms. Typically only single record or small record-sets should be manipulated in a single operation in an OLTP designed database. The main thrust here is to avoid having the same data in different tables. This basic tenet of Relational Database modeling is known as normalizing data.

Here is an example of a simple OLTP normalized database. (By the way I use Visio to create this and it is a bit quirky in accurately assigning Foreign Keys (FK)). The main thing to note is the number of tables and the relationships. This is really a simple database with in essence, people, companies and related items. In order to pull meaningful reports we have to join tables at the query level. In larger databases that are not well designed this can be a very challenging and inefficient process.



OLAP is a broad term that also encompasses data warehousing. In this model data is stored in a format which enables the efficient creation of data mining/reports. OLAP design should accommodate reporting on very large record sets with little degradation in operational efficiency. The overall term used to describe taking data structures in an OLTP format and holding the same data in an OLAP format is Dimensional Modeling It is the primary building block of Data Warehousing and we will go into greater detail on what that is and how it is used in a later blog post. Data in an OLAP database should always be read-only.

Here is an example of how some of the data shown above might be stored in an OLAP format for the purpose of producing large reports of a particular extract of the OLTP data format above. Here we created a dimensional model of two types of user reports. One for details of where users live the other of where they work, note that the same data is now in a de-normalized state. The OLTP and OLAP data could be kept synchronized by using Transactional Replication. It is important to note that all day-to-day manipulation of the data is carried out in the OLTP format database; the OLAP version being a replicated, read-only copy.


We have two tables here that contain data from the OLTP database but in this case it is in a much more flattened state. In this case we could and probably would have lots of repeated data particularly in the ReportPeopleWork table. In an OLAP model repetition is not necessarily a bad thing.

In this first article we have demonstrated some differences between OLTP and OLAP database types.

Firstly all transactions on the data are carried out in the OLTP, transactional, copy of the database. The normalized structure is best for ensuring that little or no data is repeated. Basic user information is contained in the People table, the home and work address for the user is maintained in the Location table and the home address work address for each person is determined by the Locationtype. These different tables are related via Link tables. All adding and updating of data is done in the OLTP database.

Secondly data which needs to be used in large recordsets in a read-only format, such as for reporting, is best held in an OLAP format. Here each table can be thought of in the same way as we might think of an object in OO programming paradigms, with each row in the table being a single complete object. There still may be reasons for linking tables but in the OLAP model this should be kept to a bare minimum. An excessive need for linking tables in the OLAP database probably denotes a structural weakness in how the OLAP database was modeled.

Comments on: "Data Warehousing Part 1 OLAP and OLTP" (11)

  1. doug boude said:

    Doug, very intriguing and clarifying post. The natural question that began nagging me after the third paragraph though is, what are some common methodologies for migrating your data from the OLTP structure to the OLAP warehouse and keeping that current?? Perhaps it is already in your list of topics to cover with this series…forgive my anticipation! The storing and mining of large amounts of data is a subject near and dear to my heart recently. 🙂


  2. Nathan Mische said:

    Interesting post. While I’ve built a few data warehouses in the past, I’ve never taken the approach outlined here. All of the data warehouses I’ve built have used a dimensional data model with central fact tables supported by many lookup tables. Do you see any performance issues using a flat model with large amounts of data?

    @ Doug – A dimensional data model is different than the relational model used in an OLTP database, so you still have to do some sort of extract, transform, and load (ETL) process to get data from your OLTP database to your OLAP warehouse. It depends on your needs, but I’ve generally done this via some sort of nightly process.


  3. Mike Brunt said:

    Doug, yes this series will go into setting up transactional replication to create a read-only copy of the data. The next posting will cover Dimensional Modeling which touches on how to design the OLAP version of the OLTP database.


  4. Mike Brunt said:

    Nathan, thanks for your post and of course Dimensional Modeling or an overview of it will be part of this series. This first post was really just to introduce the differences between OLAP and OLTP data models rather than a detailed article on all involved in Data Warehousing which of course is a very dense topic. Thanks very much for your comments.


  5. Nathan Mische said:

    @Mike – After I read your post I started to write my own post about dimensional modeling, but now that I know you plan to cover it I may wait to finish it.

    Anyway, I’m glad to hear you are going to touch on the topic and I’m looking forward to your next post!


  6. please write more samples about normalization,i am working on a project of stock management.and i want to know more about normalization, i have got almost 25 tables so far.


  7. more data need


  8. i want to sent comparsation on oltp , olap
    thank u


  9. normalization said:

    plz give more information about olap&oltp and also normalization


  10. Gr8 post.Thanks for the information.


  11. aiting for NEWS? Welcome! See about Download, Kazaalite free download, Download free softwear, Download media player free


Comments are closed.

Tag Cloud

%d bloggers like this: