SQL


The most common data access method is structure query language (SQL). This language is designed to allow easy access to relational data. If you are familiar when set theory or Venn diagrams SQL will seem very intuitive. SQL allows you to quickly and simply combine relational data. Below is an example of a database containing two tables, customers and purchase order.  The Customer table has customerid (the unique identifier for the customer), name (The customer’s name) and age (the customer’s age).  The PurchaseOrder table has POID (the unique id for the purchase order), customerid (which refers back to the customer who made the purchase) and Purchase (what the purchase was).

Example:
Customer
CustomerID Name Age
1 Joe 23
2 Mika 45
3 Lin 34
4 Sara 56
5 Susan 18
PurchaseOrder
POID CustomerID Purchase
1 3 Fiction
2 1 Biography
3 1 Fiction
4 2 Biography
5 3 Fiction
6 4 Fiction
 
     
SELECT
SELECT * FROM MyDataBase.Customer

CustomerID Name Age
1 Joe 23
2 Mika 45
3 Lin 34
4 Sara 56
5 Susan 18
ORDER BY
SELECT * FROM MyDataBase.Customer ORDER BY by age
CustomerID Name Age
5 Susan 18
1 Joe 23
3 Lin 34
2 Mika 45
4 Sara 56
WHERE
SELECT * FROM MyDataBase.Customer WHERE age = 18
CustomerID Name Age
5 Susan 18
INNER JOIN
SELECT * FROM MyDataBase.Customer INNER JOIN MyDataBase.PurchaseOrder ON Customer.CustomerID = PurchaseOrder.CustomerID
CustomerID Name Age POID Purchase
1 Joe 23 2 Biography
1 Joe 23 3 Fiction
2 Mika 45 4 Biography
3 Lin 34 1 Fiction
3 Lin 34 5 Fiction
4 Sara 56 6 Fiction
LEFT OUTER JOIN
SELECT * FROM MyDataBase.Customer LEFT OUTER JOIN MyDataBase.PurchaseOrder ON Customer.CustomerID = PurchaseOrder.CustomerID
CustomerID Name Age POID Purchase
1 Joe 23 2 Biography
1 Joe 23 3 Fiction
2 Mika 45 4 Biography
3 Lin 34 1 Fiction
3 Lin 34 5 Fiction
4 Sara 56 6 Fiction
5 Susan 18 NULL NULL
GROUP BY
SELECT Name, count(*) as Orders FROM Customer INNER JOIN PurchaseOrder ON Customer.CustomerID = PurchaseOrder.CustomerID GROUP BY Name
Name Age Orders
Joe 23 1
Mika 45 2
Lin 34 2
Sara 56 1
UPDATE
UPDATE Customer SET Age = 26 WHERE CustomerID = 1
Customer
CustomerID Name Age
1 Joe 26
2 Mika 45
3 Lin 34
4 Sara 56
5 Susan 18
INSERT
INSERT INTO Customer Values (6, Terry , 50)
Customer
CustomerID Name Age
1 Joe 23
2 Mika 45
3 Lin 34
4 Sara 56
5 Susan 18
6 Terry 50
DELETE
DELETE FROM Customer WHERE CustomerID = 1
Customer
CustomerID Name Age
2 Mika 45
3 Lin 34
4 Sara 56
5 Susan 18

OLAP

1. Intro

Online Analytical processing (OLAP) allows rapid ad hoc analysis of large datasets via a drag and drop programming. OLAP has two main components, a back end which stores multi-dimensional data and a front end which allows users to drill through the data. A pivot chart is an example of a front end.  The data can also be accessed via MDX, a multi-dimensional query language rooted in SQL. Microsoft, Cognos and Hyperion are the leaders in this market.  An OLAP cube can be considered a data warehouse.OLAP cubes can have two to many dimensions.  The maximum number of dimensions if determined software used but I seldom see a cube with more than 6 dimensions. The diagram below is of a 3 dimensional OLAP cube.

2. For Modeling

OLAP cubes allow users to quickly run complex queries using a graphical interface.  They do not find relationships in the data the user must drill through the data to find those relationships.  A good use for OLAP Cubes is verifying heuristic rules uncovered by other techniques.

3. Example

Table: Customer

Fields : Name, State, Age, Gender, Total Purchases, Total Expenditures, Total Fiction, Total Nonfiction
Dimensions

A dimension is any field of collection of fields you may want to group by. In the above example good candidates for dimensions are State, Age and Gender.

Facts

Fields you wish to report on are considered facts. In the above example good candidates for dimensions are Total Nonfiction, Total Fiction, Total Purchases and Total Expenditure.  Some field may be considered both facts and dimensions.  If a common question is, “what is the profile of are biggest customers?” then purchases and Expenditure should be added as Dimensions and State, Age and Gender added as Facts.   In practice most if not all Dimensions will also be included as Facts.  Notice Name is excluded as both a Dimension and Fact.  Name is the lowest level of the data so should not be included unless required.