I gave a workshop/talk at SIGGRAPH geared toward introducing people to Python. There were ~25 people on PCs following along, and awkwardly enough, many more than that standing and watching. I prefaced my talk with the fact that I am self-taught and by no means an expert. That said, I have created many python tools people use every day at industry-leading companies.
Starting from zero, in the next hour I aimed to not only introduce them to Python, but get them doing cool, usable things like:
- Iterating through batches/lists
- Reading / writing data to excel files
- Wrangling data from one format to another in order to create a ‘tag cloud’
Many people have asked for the notes, and I only had rough notes. I love Python, and I work with this stuff every day, so I have had to really go back and flesh out some of what I talked about. This tutorial has a lot less of the general chit-chat and information. I apologize for that.
Installation / Environment Check
Let’s check to see that you have the tools properly installed. If you open the command prompt and type ‘python’ you should see this:
So Python is correctly installed, for the following you can either follow along in the cmd window (more difficult) or in IDLE, the IDE that python ships with (easier). This can be found by typing IDLE into the start menu:
Variables
Variables are pieces of information you store in memory, I will talk a bit about different types of variables.
Strings
Strings are pieces of text. I assume you know that, so let’s just go over some quick things:
string = 'this is a string'
print string
#>>this is a string
num = '3.1415'
print num
#>>3.1415 |
string = 'this is a string'
print string
#>>this is a string
num = '3.1415'
print num
#>>3.1415
One thing to keep in mind, the above is a string, not a number. You can see this by:
print num + 2
#>>Traceback (most recent call last):
#>> File "basics_variables.py", line 5, in
#>> print num + 2
#>>TypeError: cannot concatenate 'str' and 'int' objects |
print num + 2
#>>Traceback (most recent call last):
#>> File "basics_variables.py", line 5, in
#>> print num + 2
#>>TypeError: cannot concatenate 'str' and 'int' objects
Python is telling you that you cannot add a number to a string of text. It does not know that ‘3.1415’ is a number. So let’s convert it to a number, this is called ‘casting’, we will ‘cast’ the string into a float and back:
print float(num) + 2
#>>5.1415
print str(float(num) + 2) + ' addme'
#>>5.1415 addme |
print float(num) + 2
#>>5.1415
print str(float(num) + 2) + ' addme'
#>>5.1415 addme
Lists
Lists are the simplest ways to store pieces of data. Let’s make one by breaking up a string:
txt = 'jan tony senta michael brendon phillip jonathon mark'
names = txt.split(' ')
print names
#>>['jan', 'tony', 'senta', 'michael', 'brendon', 'phillip', 'jonathon', 'mark']
for item in names: print item
#>>jan
#>>tony
#>>senta
#>>michael
... |
txt = 'jan tony senta michael brendon phillip jonathon mark'
names = txt.split(' ')
print names
#>>['jan', 'tony', 'senta', 'michael', 'brendon', 'phillip', 'jonathon', 'mark']
for item in names: print item
#>>jan
#>>tony
#>>senta
#>>michael
...
Split breaks up a string into pieces. You tell it what to break on, above, I told it to break on spaces txt.split(‘ ‘). So all the people are stored in a List, which is like an Array or Collection in some other languages.
You can call up the item by it’s number starting with zero:
print names[0], names[5]
#>>jan phillip |
print names[0], names[5]
#>>jan phillip
TIP: [-1] index will return the last item in an array, here’s a quick way to get a file from a path:
path = 'D:\\data\\dx11_PC_(110)_05_09\\Tools\\CryMaxInstaller.exe'
print path.split('\\')[-1]
#>>CryMaxInstaller.exe |
path = 'D:\\data\\dx11_PC_(110)_05_09\\Tools\\CryMaxInstaller.exe'
print path.split('\\')[-1]
#>>CryMaxInstaller.exe
Dictionaries
These store keys, and the keys reference different values. Let’s make one:
dict = {'sascha':'tech artist', 'harry': 142.1, 'sean':False}
print dict['sean']
#>>False |
dict = {'sascha':'tech artist', 'harry': 142.1, 'sean':False}
print dict['sean']
#>>False
So this is good, but these are just the keys, we need to know the values. Here’s another way to do this, using .keys()
dict = {'sascha':'tech artist', 'harry': 142.1, 'sean':False}
for key in dict.keys(): print key, 'is', dict[key]
#>>sean is False
#>>sascha is tech artist
#>>harry is 142.1 |
dict = {'sascha':'tech artist', 'harry': 142.1, 'sean':False}
for key in dict.keys(): print key, 'is', dict[key]
#>>sean is False
#>>sascha is tech artist
#>>harry is 142.1
So, dictionaries are a good way to store simple relationships of key and value pairs. In case you hadn’t notices, I used some ‘floats’ and ‘ints’ above. A float is a number with a decimal, like 3.1415, and an ‘int’ is a whole number like 10.
Creating Methods (Functions)
A method or function is like a little tool that you make. These building blocks work together to make your program.
Let’s say that you have to do something many times, you want to re-use this code and not copy/paste it all over. Let’s use the example above of names, let’s make a function that takes a big string of names and returns an ordered list:
def myFunc(input):
people = input.split(' ')
people = sorted(people)
return people
txt = 'jan tony senta michael brendon phillip jonathon mark'
orderedList = myFunc(txt)
print orderedList
#>>['brendon', 'jan', 'jonathon', 'mark', 'michael', 'phillip', 'senta', 'tony'] |
def myFunc(input):
people = input.split(' ')
people = sorted(people)
return people
txt = 'jan tony senta michael brendon phillip jonathon mark'
orderedList = myFunc(txt)
print orderedList
#>>['brendon', 'jan', 'jonathon', 'mark', 'michael', 'phillip', 'senta', 'tony']
Basic Example: Create A Tag Cloud From an Excel Document
So we have an excel sheet, and we want to turn it into a hip ‘tag cloud’ to get people’s attention.
If we go to http://www.wordle.net/ you will see that in order to create a tag cloud, we need to feed it the sentences multiple times, and we need to put a tilde in between the words of the sentence. We can automate this with Python!
First, download the excel sheet from me here: [info.csv] The CSV filetype is a great way to read/write docs easily that you can give to others, they load in excel easily.
file = 'C:\\Users\\chris\\Desktop\\intro_to_python\\info.csv'
f = open(file, 'r')
lines = f.readlines()
f.close()
print lines
#>> ['always late to work,13\n', 'does not respect others,1\n', 'does not check work properly,5\n', 'does not plan properly,4\n', 'ignores standards/conventions,3\n'] |
file = 'C:\\Users\\chris\\Desktop\\intro_to_python\\info.csv'
f = open(file, 'r')
lines = f.readlines()
f.close()
print lines
#>> ['always late to work,13\n', 'does not respect others,1\n', 'does not check work properly,5\n', 'does not plan properly,4\n', 'ignores standards/conventions,3\n']
‘\n’ is a line break character, it means ‘new line’, we want to get rid of that, we also want to just store the items, and how many times they were listed.
file = 'C:\\Users\\chris\\Desktop\\intro_to_python\\info.csv'
f = open(file, 'r')
lines = f.readlines()
f.close()
dict = {}
for line in lines:
split = line.strip().replace(' ','~').split(',')
dict[split[0]] = int(split[1])
print dict
#>>{'ignores~standards/conventions': 3, 'does~not~respect~others': 1, 'does~not~plan~properly': 4, 'does~not~check~work~properly': 5, 'always~late~to~work': 13} |
file = 'C:\\Users\\chris\\Desktop\\intro_to_python\\info.csv'
f = open(file, 'r')
lines = f.readlines()
f.close()
dict = {}
for line in lines:
split = line.strip().replace(' ','~').split(',')
dict[split[0]] = int(split[1])
print dict
#>>{'ignores~standards/conventions': 3, 'does~not~respect~others': 1, 'does~not~plan~properly': 4, 'does~not~check~work~properly': 5, 'always~late~to~work': 13}
Now we have the data in memory in an easily readable way, let’s write it out to disk.
output = ''
for key in dict.keys():
for i in range(0,dict[key]): output += (key + '\n')
f = open('C:\\Users\\chris\\Desktop\\intro_to_python\\test.txt', 'w')
f.write(output)
f.close() |
output = ''
for key in dict.keys():
for i in range(0,dict[key]): output += (key + '\n')
f = open('C:\\Users\\chris\\Desktop\\intro_to_python\\test.txt', 'w')
f.write(output)
f.close()
There we go. In one hour you have learned to:
- Read and write excel files
- Iterate over data
- Convert data sets into new formats
- Write, read and alter ascii files
If you have any questions, or I left out any parts of the presentation you liked, reply here and I will get back to you.