**1. Frequencies\Charts\Plots**

Simple frequencies and plots can quickly tell you whether a relationship exists between two or more variables. However, relying solely on graphs as a diagnostic or research tool, as with any single technique, can blind you to the true underlying relationship.

**Example code (R)**

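    library(tcltk)
    library(car)  # provides scatter3d(), which also needs the rgl package
    data(longley)
    # Histogram with the bin count chosen by Sturges' rule
    hist(longley$Unemployed, breaks = "Sturges", col = "darkgray")
    boxplot(longley$Unemployed, ylab = "Unemployed")
    # 3D scatter plot with a fitted linear surface (the background argument is bg.col in car)
    scatter3d(longley$Unemployed, longley$GNP, longley$Year,
              fit = "linear", bg.col = "white", grid = TRUE)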
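The plots above cover the charting side. For the frequency side, a minimal sketch using only base R might look like the following. The choice of four equal-width bins and the names `unemp_bins` and `freq` are assumptions made for illustration, not part of the original example.

```r
# Frequency table sketch using the built-in longley data.
data(longley)

# Bin the continuous Unemployed series into 4 equal-width intervals.
# (4 bins is an arbitrary, illustrative choice.)
unemp_bins <- cut(longley$Unemployed, breaks = 4)

# Counts per bin, then the same counts expressed as proportions.
freq <- table(unemp_bins)
print(freq)
print(round(prop.table(freq), 2))
```

A table like this is the numeric counterpart of the histogram: the same binning, read as counts rather than bars.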

**2. Correlations**

Correlations measure the strength and direction of the linear association between two variables. The coefficient ranges from -1 to 1: 1 indicates a perfect positive correlation, -1 a perfect negative correlation, and 0 no linear relationship. Remember that correlations do not show causality, only whether the variations in two or more variables are related. Also, non-linearities and interactions can obscure the relationship.

**Example Code (R)**

    data(swiss)
    results.Corr
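As a minimal sketch of a correlation analysis on the `swiss` data, assuming the goal is a pairwise correlation matrix plus a significance test for one pair, something like the following would work. The names `corr_matrix` and `ct` and the choice of the Fertility/Education pair are illustrative assumptions.

```r
# Correlation sketch using the built-in swiss data (base R only).
data(swiss)

# Pairwise Pearson correlations between all numeric columns.
corr_matrix <- cor(swiss)
print(round(corr_matrix, 2))

# Test whether the correlation for a single pair differs from zero.
# (The Fertility/Education pair is an arbitrary, illustrative choice.)
ct <- cor.test(swiss$Fertility, swiss$Education)
print(ct$estimate)  # sample correlation coefficient
print(ct$p.value)   # p-value for the null hypothesis of zero correlation
```

Passing `method = "spearman"` to `cor()` gives a rank-based alternative that is less sensitive to the non-linearities mentioned above.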

**3. ANOVA**

Analysis of variance (ANOVA) is a powerful tool for showing whether two or more variables are related, by testing how much of the variation in a response is explained by one or more explanatory variables. While it may not lead directly to a forecast model, it can help a researcher gain knowledge of the relationships between the data elements. It is also useful for seeing how the variables in a system are related to one another.

**Example Code (R)**

    data(Seatbelts)
    # Seatbelts is a multivariate time series, so convert it to a data frame for lm()
    anova(lm(DriversKilled ~ PetrolPrice, data = as.data.frame(Seatbelts)))
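ANOVA is most often used to compare a response across groups. As a hedged sketch of that use, the same `Seatbelts` data contains a 0/1 `law` column marking the months when the seat belt law was in force. The names `belts`, `fit`, and `group_means` and the labels "before" and "after" are assumptions introduced here for illustration.

```r
# One-way ANOVA sketch: do mean monthly driver deaths differ by law period?
data(Seatbelts)
belts <- as.data.frame(Seatbelts)

# Turn the 0/1 indicator into a labelled factor (labels are illustrative).
belts$law <- factor(belts$law, labels = c("before", "after"))

# Fit the one-way model and print the ANOVA table.
fit <- aov(DriversKilled ~ law, data = belts)
print(summary(fit))

# Group means show the direction of any difference.
group_means <- tapply(belts$DriversKilled, belts$law, mean)
print(group_means)
```

A significant F statistic here says only that the group means differ, not why: the law period also coincides with other changes over time, which is the causality caveat again.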

**4. Cluster Analysis**

Cluster analysis is often confused with principal components analysis (factor analysis). Both are powerful unsupervised data reduction tools. While principal components analysis is concerned with grouping the columns of a dataset together, cluster analysis is concerned with grouping the rows together. It can be a powerful tool for building rules and dummy variables. For example, if a strong group of young males emerges from a cluster analysis, it would be prudent to test this subgroup, either by splitting the data or by adding a dummy variable based on it.

**Example code (R)**

    data(swiss)
    library(DCluster)
    library(cluster)
    hc
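A minimal sketch of grouping the rows of `swiss` and turning the result into a dummy variable is shown below. It uses base R's `hclust()` rather than the packages loaded above, and the Ward linkage, the choice of three clusters, and the names `hc_sketch`, `groups`, and `swiss_clustered` are all assumptions made for illustration.

```r
# Hierarchical clustering sketch on the built-in swiss data (base R only).
data(swiss)

# Standardize the columns so no variable dominates the distances.
scaled <- scale(swiss)

# Euclidean distances between rows, then agglomerative clustering.
# (Ward linkage is an illustrative choice, not the only reasonable one.)
hc_sketch <- hclust(dist(scaled), method = "ward.D2")

# Cut the tree into 3 groups (an arbitrary choice) and count members.
groups <- cutree(hc_sketch, k = 3)
print(table(groups))

# Attach the cluster label as a factor, usable as a dummy variable in a model.
swiss_clustered <- cbind(swiss, cluster = factor(groups))
head(swiss_clustered)
```

Calling `plot(hc_sketch)` draws the dendrogram, which helps in judging how many groups the data supports before fixing `k`.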

**5. OLAP Cubes\Pivot Charts**

Online Analytical Processing (OLAP) is a powerful data mining tool. It allows users to run ad hoc queries on a database quickly, with little understanding of data access languages such as SQL. The end results are frequencies. OLAP requires an intelligent machine (preferably a statistician) to wield it and will not uncover relationships by itself.

Most OLAP tools come with a graphical user interface (GUI). OLAP can be thought of as a substitute for SAS or SQL: it allows users to build complex queries through an intuitive drag-and-drop interface.
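The code below is not OLAP itself. It is a rough sketch of the kind of frequency output a pivot table produces, written as a cross-tabulation in base R. The `mtcars` dataset and the names `pivot` and `avg_mpg` are assumptions chosen for illustration.

```r
# Pivot-table style sketch in base R (a stand-in for an OLAP cube slice).
data(mtcars)

# Count cars by number of cylinders (rows) and gears (columns).
pivot <- xtabs(~ cyl + gear, data = mtcars)
print(pivot)

# Add row and column totals, as a pivot table would.
print(addmargins(pivot))

# Aggregate a measure over the same two dimensions: mean mpg per cell.
avg_mpg <- aggregate(mpg ~ cyl + gear, data = mtcars, FUN = mean)
print(avg_mpg)
```

Dragging a field onto the rows or columns of a pivot chart corresponds to adding a term to the formula here, and swapping `FUN` changes the measure from a mean to a sum or a count.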