Python Data

Python scripting language with a heavy emphasis on code reuse and simplicity. It is a popular language with a large and active user group.  Like Perl, it has a rich library set and is popular with projects like MapReduce with Hadoop and is rapidly becoming the default language in areas like computational intelligence.  One interesting feature of the Python language is, indentation acts as block delimiters.  So, when copying code do not change the indentation of the code will not operate as intended.  It is also a powerful language meaning it can do a lot with very little code.  Below is an example of a database containing two tables, customers, and purchase order.  The Customer table has customerid (the unique identifier for the customer), name (The customer’s name), and age (the customer’s age).  The PurchaseOrder table has POID (the unique id for the purchase order), customerid (which refers back to the customer who made the purchase), and Purchase (what the purchase was).

Example:
Customer
CustomerID Name Age
1 Joe 23
2 Mika 45
3 Lin 34
4 Sara 56
5 Susan 18
PurchaseOrder
POID CustomerID Purchase
1 3 Fiction
2 1 Biography
3 1 Fiction
4 2 Biography
5 3 Fiction
6 4 Fiction
 
     
SELECT
Create the file select.py with the following code:

import sys
import fileinput
# Loop through file and print lines
for line in fileinput.input(sys.argv[1]):
    print (line)

Then run the code:
python select.py customer.txt

CustomerID Name Age
1 Joe 23
2 Mika 45
3 Lin 34
4 Sara 56
5 Susan 18
ORDER BY
Create the file orderby.py with the following code:

import sys
import fileinput
import string
import array

# Initialize variables

listLines =[]
i =0

# Load file into an array

for line in fileinput.input(sys.argv[1]):
listLines.append(line)

# Create a custom sort based on second field

def getCustId(listLines):
return listLines.split( , , 2)[-1]

# Sort the array

listLines.sort(key=getCustId)

# Print arrys lines

for object in listLines:
    print (object)

Then run the code:
python orderby.py customer.txt

CustomerID Name Age
5 Susan 18
1 Joe 23
3 Lin 34
2 Mika 45
4 Sara 56
WHERE
Create the file select_by_id.py with the following code:
import sys

import fileinput
import string

# Loop through file

for line in fileinput.input(sys.argv[1]):

# Split line using a comma

tokens = string.split( line, , )

# If ID matches passed ID then print

if tokens[0] == sys.argv[2]:
print line,

Then run the code:
python select_by_id.py customer.txt 1

CustomerID Name Age
5 Susan 18
INNER JOIN
Create the file innerjoin.py with the following code:

import sys
import fileinput
import string
import array

# Initialize variables

listB =[]
i =0

# Load second file into an array to loop through

for lineB in fileinput.input(sys.argv[2]):
listB.append(lineB)

# Loop through first file

for lineA in fileinput.input(sys.argv[1]):

# Split line using a comma

tokensA = string.split( lineA, , )

# Loop through array

for object in listB:

# Split line using a comma

tokensB = string.split(object, , )

#If there is a match print

if tokensA[0] == tokensB[1]:

# Remove newline character with strip method

print lineA.strip() + , + object,

Then run the code:
python innerjoin.py customer.txt orders.txt

CustomerID Name Age POID Purchase
1 Joe 23 2 Biography
1 Joe 23 3 Fiction
2 Mika 45 4 Biography
3 Lin 34 1 Fiction
3 Lin 34 5 Fiction
4 Sara 56 6 Fiction
LEFT OUTER JOIN
Create the file leftouterjoin.py with the following code:

import sys
import fileinput
import string
import array

# Initialize variables

listB =[]
iFound =0

# Load second file into an array to loop through

for lineB in fileinput.input(sys.argv[2]):
listB.append(lineB)

# Loop through first file

for lineA in fileinput.input(sys.argv[1]):

# Split line using a comma

tokensA = string.split( lineA, , )
iFound =0

# Loop through array

for object in listB:

# Split line using a comma

tokensB = string.split(object, , )

#If there is a match print

if tokensA[0] == tokensB[1]:

# Remove newline character with strip method

print lineA.strip() + , + object,
iFound=1

#If there is no match print

if iFound ==0:
print lineA.strip() + , + , + , + , ,

Then run the code:
python leftouterjoin.py customer.txt orders.txt

CustomerID Name Age POID Purchase
1 Joe 23 2 Biography
1 Joe 23 3 Fiction
2 Mika 45 4 Biography
3 Lin 34 1 Fiction
3 Lin 34 5 Fiction
4 Sara 56 6 Fiction
5 Susan 18 NULL NULL
GROUP BY
Create the file groupby.py with the following code:

import sys
import fileinput
import string
import array

# Initialize variables

iCnt =1
iLoop =0

# Load and loop through file

for lineA in fileinput.input(sys.argv[1]):

# Split line using a comma

tokensA = string.split( lineA, , )

# Adjust with header and first line

if iLoop <2:
priorTokens = tokensA
iCnt=0

if tokensA[0] == priorTokens[0]:
iCnt =iCnt+1
else:
print priorTokens[1] + , + priorTokens[2].strip() + , +str(iCnt)
iCnt=1 iLoop = iLoop +1
priorTokens = tokensA

# print last line

print priorTokens[1] + , + priorTokens[2].strip() + ,

Then run the code:
python groupby.py customer.txt

Name Age Orders
Joe 23 1
Mika 45 2
Lin 34 2
Sara 56 1
UPDATE
Create the file update with the following code:

import sys
import fileinput
import string

# Loop through file

for line in fileinput.input(sys.argv[1]):

# Split line using a comma

tokens = string.split( line, , )

# If ID is not passed ID then print else replace age with passed parameter

if tokens[0] != sys.argv[2]:
print line,
else:
print tokens[0]+ , + tokens[1] + , + sys.argv[3]

Then run the code:
python groupby.py customer.txt 1 23

Customer
CustomerID Name Age
1 Joe 26
2 Mika 45
3 Lin 34
4 Sara 56
5 Susan 18
INSERT
Create the file insert.py with the following code:

import sys
import fileinput

# Loop through file and print lines

for line in fileinput.input(sys.argv[1]):
print line,

# Add new line from passed arguments

print sys.argv[2] + , + sys.argv[3] + , + sys.argv[4],

Then run the code:
python insert.py customer.txt 6 Joe 34

Customer
CustomerID Name Age
1 Joe 23
2 Mika 45
3 Lin 34
4 Sara 56
5 Susan 18
6 Terry 50
DELETE
Create the file delete.py with the following code:

import sys
import fileinput
import string

# Loop through file

for line in fileinput.input(sys.argv[1]):

# Split line using a comma

tokens = string.split( line, , )

# If ID is not passed ID then print

if tokens[0] != sys.argv[2]:
print line,

Then run the code:
python delete.py customer.txt 1

Customer
CustomerID Name Age
2 Mika 45
3 Lin 34
4 Sara 56
5 Susan 18