The S-Language


S is a statistical programming language developed at Bell Laboratories specifically for statistical modeling. There are two versions of S. One was developed by Insightful under the name S-Plus. The other is an open-source initiative called R. S lets you create objects, is highly extensible, and has powerful graphing capabilities.

Tips
Tip 1

Set Memory Size

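# Windows only: reports the maximum memory obtained from the OS so far.
# To change the limit, older versions of R provide memory.limit(size = ...).
# In recent versions of R (4.2.0 and later) both functions are non-functional stubs.
memory.size(max = TRUE)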
Tip 2

Today’s Date

Today <- format(Sys.Date(), "%d %b %Y")
Tip 3

Set Working Directory

setwd("C:/")
Tip 4

Load In Data

ExampleData.path <- file.path(getwd(), "USDemographics.CSV")
ExampleData.FullSet <- read.table(ExampleData.path, header = TRUE, sep = ",", na.strings = "NA", dec = ".", strip.white = TRUE)
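
To see what the read.table arguments above do, here is a minimal sketch that writes a small file to a temporary folder and reads it back. The file name demo.csv and its contents are invented for illustration and are not part of the original data.

```r
# Hypothetical three-line CSV written to a temporary location
tmp_path <- file.path(tempdir(), "demo.csv")
writeLines(c("State,Population,Savings",
             "AA, 100 ,NA",
             "BB, 250 ,12.5"), tmp_path)

# Same arguments as the tip above: comma separator, "NA" marks missing
# values, "." is the decimal mark, surrounding white space is stripped
demo <- read.table(tmp_path, header = TRUE, sep = ",", na.strings = "NA",
                   dec = ".", strip.white = TRUE)

str(demo)                 # column types chosen by read.table
is.na(demo$Savings[1])    # the "NA" entry was read as a missing value
```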
Tip 5

Split Data

ExampleData.Nrows <- nrow(ExampleData.FullSet)
ExampleData.NCol <- ncol(ExampleData.FullSet)
# Use half of the rows, rounded down so the sample size is a whole number
ExampleData.SampleSize <- floor(ExampleData.Nrows / 2)
ExampleData.Sample <- sample(nrow(ExampleData.FullSet), size = ExampleData.SampleSize, replace = FALSE, prob = NULL)
# Column 5 is placed first, followed by all columns (so column 5 appears twice)
ExampleData.HoldBack <- ExampleData.FullSet[ExampleData.Sample, c(5, 1:ExampleData.NCol)]
ExampleData.Run <- ExampleData.FullSet[-ExampleData.Sample, c(5, 1:ExampleData.NCol)]
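
The same split can be tried on a built-in data set. This is only a sketch: swiss stands in for ExampleData.FullSet, the seed value is arbitrary, and the variable names are made up for the example.

```r
set.seed(42)   # arbitrary seed, only so the split is repeatable

data(swiss)    # built-in data set, standing in for ExampleData.FullSet
n_rows      <- nrow(swiss)
sample_size <- floor(n_rows / 2)
picked      <- sample(n_rows, size = sample_size, replace = FALSE)

hold_back <- swiss[picked, ]    # sampled rows
run_set   <- swiss[-picked, ]   # all remaining rows

# The two pieces do not overlap and together cover every row
nrow(hold_back) + nrow(run_set) == n_rows
length(intersect(rownames(hold_back), rownames(run_set))) == 0
```

Setting a seed before sample() is optional, but without it every run produces a different split.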
Tip 6

Create Function

Confusion <- function(a, b) {
  tbl <- table(a, b)
  mis <- 1 - sum(diag(tbl)) / sum(tbl)
  list(table = tbl, misclass.prob = mis)
}
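
A short usage sketch for the function above. The two label vectors are invented for illustration. Both are factors with the same levels, which keeps the table square so that diag() picks out the matching cases.

```r
# Same function as in the tip, repeated so the example runs on its own
Confusion <- function(a, b) {
  tbl <- table(a, b)
  mis <- 1 - sum(diag(tbl)) / sum(tbl)
  list(table = tbl, misclass.prob = mis)
}

# Hypothetical actual and predicted labels
actual    <- factor(c("yes", "yes", "no", "no", "yes"), levels = c("no", "yes"))
predicted <- factor(c("yes", "no",  "no", "no", "yes"), levels = c("no", "yes"))

result <- Confusion(actual, predicted)
print(result$table)
print(result$misclass.prob)   # 1 of the 5 cases disagrees, so this is 0.2
```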
Tip 7

Recode Fields

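# recode() comes from the car package
library(car)
# Inspect the original field
ExampleData.FullSet$Savings
# Older versions of car call the last argument as.factor.result
ExampleData.FullSet$SavingsCat <- recode(ExampleData.FullSet$Savings,
  "-40000.00:-100.00 = 'HighNeg'; -100.00:-50.00 = 'MedNeg'; -50.00:10.00 = 'LowNeg'; 10.00:50.00 = 'Low'; 50.00:100.00 = 'Med'; 100.00:1000.00 = 'High'", as.factor = TRUE)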
Tip 8

Summarize Data

summary(ExampleData.FullSet)
Tip 9

Save output

save.image(file = "c:/test.RData", version = NULL, ascii = FALSE, compress = FALSE, safe = TRUE)
Tip 10

Subset

MyData.SubSample <- subset(MyData.Full, MyField == 0)
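
A self-contained sketch of the same call. The data frame below is hypothetical, and it includes a missing value to show that subset() drops rows where the condition evaluates to NA.

```r
# Hypothetical data, invented for illustration
MyData.Full <- data.frame(Id = 1:6, MyField = c(0, 1, 0, 2, NA, 0))

MyData.SubSample <- subset(MyData.Full, MyField == 0)

MyData.SubSample$Id       # 1 3 6
nrow(MyData.SubSample)    # 3; the row with the missing MyField is not kept
```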
Tip 11

Remove Object From Memory

remove(list = c("MyObject"))
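
One way to confirm that the object is really gone is to check with exists() before and after. This is a small sketch that assumes it runs at the top level of a session; MyObject is just a placeholder value.

```r
MyObject <- 1:10                                   # placeholder object

before <- exists("MyObject", inherits = FALSE)     # TRUE
remove(list = c("MyObject"))
after  <- exists("MyObject", inherits = FALSE)     # FALSE

before
after
```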
Tip 12

Create a Dataframe

TmpOuput <- data.frame(Fields = c("Field1", "Field2", "Field3"), Values = c(1, 2, 2))
Tip 13

Cut

data(swiss)
x <- swiss$Education
swiss$Educated <- cut(x, breaks = c(0, 11, 999), labels = c("0", "1"))
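
To make the boundary behaviour of cut() concrete, here is a sketch on a handful of made-up values using the same breaks. By default the intervals are closed on the right, so a value of exactly 11 falls into the first group.

```r
# Hypothetical values chosen to sit on both sides of the break at 11
x <- c(1, 5, 11, 12, 53)

educated <- cut(x, breaks = c(0, 11, 999), labels = c("0", "1"))

as.character(educated)   # "0" "0" "0" "1" "1"
table(educated)          # three values in group 0, two in group 1
```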
Tip 14

Create Directories

dir.create("c:/MyProjects")
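
dir.create() gives a warning if the folder already exists, so a script that may run more than once can check first. The sketch below uses a folder under tempdir() instead of c:/ so that it runs on any system; the reports subfolder is made up for the example.

```r
# Hypothetical project folder under the session's temporary directory
project_dir <- file.path(tempdir(), "MyProjects", "reports")

# recursive = TRUE also creates the parent folder MyProjects if needed
if (!dir.exists(project_dir)) {
  dir.create(project_dir, recursive = TRUE)
}

dir.exists(project_dir)
```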