Bi 1x 2015: Introduction to Python

This tutorial was generated from an IPython notebook. You can download the notebook here.

We have already installed Anaconda. Anaconda contains most of what we need to do scientific computing with Python. At the most basic level, it has Python 2.7. It contains other modules we will make heavy use of, the three most important ones being NumPy, matplotlib, and IPython. We will also make heavy use of SciPy and scikit-image throughout the course.

In this tutorial, we will first learn some of the basics of the Python programming language while exploring the properties of NumPy's very useful (and as we'll see, ubiquitous) ndarray data structure. Finally, we'll load some data and use matplotlib to generate plots.

Getting Started with Anaconda

To launch Anaconda, simply double-click on the Anaconda icon, and a window with four options will appear.

The second option is to launch an IPython notebook. IPython notebooks are great for creating tutorials - in fact the tutorial you're following right now is in an IPython notebook! The beauty of using an IPython notebook is that you can combine professional typesetting with individual sections of code. The code can be run section by section, or the whole document can be run at once.

For most of your programming, you will use the Integrated Development Environment (IDE) called Spyder. To launch Spyder, click on the last option.

Open Spyder and you will see that the window is divided into two sides. On the left is the Editor. This is what you will use to type up your code.

On the right is the console. Commands can be typed into the console and will be run immediately (when you hit enter). Conveniently, by default the console uses IPython, which is much easier to work with than a standard Python prompt.

It is possible to run your entire program by typing it line by line into the console, but this is strongly discouraged. While it is often convenient when you first start programming to type a command into the console to see whether it works, the commands you need are easily lost, and going back to rerun parts of the program requires that you type the commands in again or search through your command history. This is time-consuming and can lead to mistakes. Use the console as a resource to make sure individual lines of code are working correctly while building your code, but always build and run your code from the Editor. You can run your code in the console by using run my_code.py, where the file my_code.py contains your script.

SETTING THE PATH: This is important because it is the most common reason people's code fails when they first open Spyder. You have to set the path so that Spyder knows where to look for the files you want it to run.

To set the path, click on the large blue folder on the top right of Spyder and navigate to the directory where your files are located.

Getting Started Programming

To get started, we're going to learn some basic commands. These can be run within this IPython notebook, but I would suggest that you type them into the console if you can, as typing them will help you remember the commands.

Hello, world.

As a new programmer, your first task is to greet the world with your new set of skills! We'll start by printing Hello, world. In the console, type the following:

In [1]:
print('Hello, world.')
Hello, world.

Do you see "Hello, world." printed to the screen right below your command? Python is taking your input and printing it out in the console. If you were to type this into your editor and run the .py document, "Hello, world." would still be shown in the console—this is where Python will print anything you ask to see.

Here we see the syntax for function calls in Python. Function arguments are enclosed in parentheses. We also learned another important piece of Python syntax. Strings are enclosed in single (or double) quotes.

Now type the following into your console:

In [2]:
# This prints hello world to the console
print('Hello, world.')  # this is the command
Hello, world.

Notice that even though you added words to the line, only "Hello, world." is printed. The # starts a comment: anything after the # on that line will not be read or interpreted by Python. This is how you add notes to your program about what certain lines do. Including comments in your code is essential so that other people can read your code and understand what you are trying to accomplish.

Python 2 and Python 3

There are two versions of Python that are currently in wide use, Python 2 and Python 3. Python 3 is the future, but many (many, many) packages are still written in Python 2. We will use Python 2 for this course. However, Python 3 has two changes that are very useful and not backwards compatible. The most important is that division in Python 3 is different. We'll demonstrate first by dividing two numbers in Python 2.

In [3]:
5 / 3
Out[3]:
1

That's right, division of integers returns the floor. This is not the case in Python 3, and we will use Python 3's division operator. We haven't talked about modules yet (we will below), but to ensure that Python 3 division happens, you need to have the following at the beginning of every bit of code you write.

In [4]:
# Set up Python 2 so that it divides and prints like Python 3
from __future__ import division, print_function

We have also specified that we will use Python 3 style printing, which we have implicitly done already. This is cosmetic, so we won't talk about it more here.

Warning: Integer division is a very common source of bugs in Python. Keep this in mind when debugging code. Make sure you have imported division from "the __future__!"

Variable Assignment

To assign a variable in Python, we use an =, just like you would expect from any math class. Arithmetic operations are also as expected: +, -, *, /.

Try the following:

In [5]:
a = 3

b = a**3

c = (b + 2*a) / 2

d = (b + 2*a) // 2

print('a is', a)
print('b is', b)
print('c is', c)
print('d is', d)
a is 3
b is 27
c is 16.5
d is 16

In the assignment of the variable b, we used the ** operator. This means "raise to the power of." In the assignment of the variable d, we used the // operator. This does Python 2-style integer division, returning the floor of the result of the division.
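
As a small sketch of how these operators behave (the variable names are only for illustration), note that // rounds down toward negative infinity rather than toward zero, which matters for negative numbers. The snippet is written so that it gives the same results with or without the __future__ imports.

```python
# Floor division rounds down (toward negative infinity), not toward zero.
pos_floor = 7 // 2       # 3
neg_floor = -7 // 2      # -4, because -3.5 rounds down to -4

# The % operator gives the remainder that goes with floor division.
remainder = 7 % 2        # 1

# ** is exponentiation; it binds more tightly than a leading minus sign.
cube = 3**3              # 27
neg_square = -3**2       # -9, i.e., -(3**2)

print('7 // 2 = {0}'.format(pos_floor))
print('-7 // 2 = {0}'.format(neg_floor))
print('7 % 2 = {0}'.format(remainder))
print('3**3 = {0}'.format(cube))
print('-3**2 = {0}'.format(neg_square))
```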

Lists, tuples, slicing, and indexing

A list is a mutable array, meaning it can be edited. Let's explore below! Notice that a list is created using [].

In [6]:
my_list = [1,2,3,4]
print('my_list is', my_list)

my_list[0] = 3.14
print('my_list is now', my_list)
print('the last element in my_list is', my_list[-1])
my_list is [1, 2, 3, 4]
my_list is now [3.14, 2, 3, 4]
the last element in my_list is 4

Notice that indexing is done with brackets, []. Notice also that in Python, indexing starts at zero! Finally, notice that we can index the last element of a list with -1.
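
Here is a short sketch of zero-based and negative indexing. The list bases is a made-up example, not something used elsewhere in this tutorial.

```python
# A hypothetical list used only to illustrate indexing
bases = ['A', 'C', 'G', 'T']

first = bases[0]             # indexing starts at zero
last = bases[-1]             # -1 is the last element
second_to_last = bases[-2]   # -2 is the one before it

# len() gives the number of elements, so the last valid index is len - 1
same_as_last = bases[len(bases) - 1]

print('first: ' + first)
print('last: ' + last)
print('second to last: ' + second_to_last)
```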

A tuple is just like a list, but it's immutable, meaning that once created, it cannot be changed. Let's try the same thing as above, but with a tuple, created with (). But notice that the indexing of the tuple is still denoted with [ ] in the last line of the cell.

In [7]:
# Create a tuple and print it
my_tuple = (1,2,3,4)
print('tuple 1 is', my_tuple) 

# This will make Python scream at us because tuples are immutable.
my_tuple[0] = 3.14
tuple 1 is (1, 2, 3, 4)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-0b9158471d6a> in <module>()
      4 
      5 # This will make Python scream at us because tuples are immutable.
----> 6 my_tuple[0] = 3.14

TypeError: 'tuple' object does not support item assignment

What's the error? Python is objecting because it cannot replace the 1 in my_tuple[0] with 3.14; that operation is not supported. If you try printing out my_tuple again, you will see it hasn't been changed.
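
If you do need a "modified" tuple, one option is to build a new tuple from pieces of the old one. The sketch below also uses a try/except block, which this tutorial has not introduced; it is used here only so the snippet can run to completion instead of stopping at the error.

```python
my_tuple = (1, 2, 3, 4)

# Attempting item assignment raises a TypeError; we catch it here
# so that the rest of the snippet still runs.
try:
    my_tuple[0] = 3.14
except TypeError as err:
    error_message = str(err)
    print('Caught a TypeError: ' + error_message)

# The original tuple is unchanged. To get a "modified" version,
# build a new tuple from a one-element tuple plus a slice of the old one.
new_tuple = (3.14,) + my_tuple[1:]

print(my_tuple)
print(new_tuple)
```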

A string is just a bunch of letters, or letters and numbers, strung together. It can also be indexed, like we did above with the list and the tuple. Let's look at our favorite phrase.

In [9]:
my_string = 'Hello, world.'
print('The fifth letter in the phrase is', my_string[4])
print('The first four letters are', my_string[0:4])
The fifth letter in the phrase is o
The first four letters are Hell

IMPORTANT! Python interprets [0:4] as $[0,4)$, so be careful when pulling out strings of specific length. Pulling small strings out of our larger string is called slicing, and can also be done with lists and tuples. This can be very powerful, as we can even pull out pieces at regular intervals.

In [10]:
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

a = my_list[0:5]
print(a)

b = my_list[5:]
print(b)

c = my_list[0:10:2]
print(c)

d = my_list[1:10:3]
print(d)
[1, 2, 3, 4, 5]
[6, 7, 8, 9, 10]
[1, 3, 5, 7, 9]
[2, 5, 8]

Make sure you notice how we create lists c and d. For c, we select the entries from position 0 up to (but not including) position 10, taking every second entry. For d, we start at position 1 and take every third entry.
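
A few more slicing patterns are sketched below, using the same list. Omitting the start or stop means "from the beginning" or "to the end," and a negative step walks through the list backwards.

```python
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# [start:stop] is half-open: start is included, stop is not
middle = my_list[2:6]        # four entries, positions 2, 3, 4, 5

# Omitted start or stop default to the ends of the list
first_three = my_list[:3]
last_three = my_list[-3:]

# [start:stop:step] selects every step-th entry
every_third = my_list[1:10:3]

# A step of -1 gives a reversed copy
reversed_list = my_list[::-1]

print(middle)
print(first_three)
print(last_three)
print(every_third)
print(reversed_list)
```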

Objects, types, and methods

Python is object-oriented, and all values in a program are objects. An object consists of an identity (where it is stored in memory), a type (a definition of how the object is represented), and data (the value of the object). An object of a given type can have various methods that operate on the data of the object. How do we keep track of what our variables are? Fortunately, Python has a function for this, called type. Let's try it out. First, we'll create some new objects.

In [11]:
a = 4
b = 4.6

my_list = [1, 3.49, 'bi1x']

print('the type of a is', type(a))
print('the type of b is', type(b))
print('the type of my_list is', type(my_list))
print('the type of my_list[0] is', type(my_list[0]))
print('the type of my_list[1] is', type(my_list[1]))
print('the type of my_list[2] is', type(my_list[2]))
the type of a is <type 'int'>
the type of b is <type 'float'>
the type of my_list is <type 'list'>
the type of my_list[0] is <type 'int'>
the type of my_list[1] is <type 'float'>
the type of my_list[2] is <type 'str'>

What is most important to notice here is that my_list is a list, and that it can contain many different objects, from numbers to strings.

The data are very straightforward: they are the numbers and values that you associate with your variable.

Finally, objects have methods that can perform operations on the data. A method is called similarly to a function. This is best seen by example.

In [12]:
my_list = [1 , 5 , 4 , 13 , 3 , 5 , 19 , 31 , 3 , 1 , 17]

print('the number of 5\'s in the list is', my_list.count(5))
print('the number of 4\'s in the list is', my_list.count(4))

# Sort the list in place
my_list.sort()

print(my_list)

# Tack on a string to the end of the list
my_list.append('bi1x')
print(my_list)
the number of 5's in the list is 2
the number of 4's in the list is 1
[1, 1, 3, 3, 4, 5, 5, 13, 17, 19, 31]
[1, 1, 3, 3, 4, 5, 5, 13, 17, 19, 31, 'bi1x']

As you can see, an object has several methods including count and sort. They are called just like functions, with arguments in parentheses. The name of the method comes after the object name followed by a dot (.).

The count method takes a single argument and returns the number of times that argument appears in the list. The sort method takes no arguments (but still requires open and closed parentheses to be called), and sorts the list in place. Note that my_list has been changed and is now sorted. We also use the method append, which adds another element to the end of a list.
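
Because sort works in place, it does not hand back a sorted list. The sketch below contrasts it with the built-in function sorted, which is not used elsewhere in this tutorial and which leaves the original list alone and returns a new one.

```python
# list.sort() changes the list itself and returns None
my_list = [5, 1, 4]
result = my_list.sort()

# sorted() is a built-in function that returns a new, sorted list
original = [5, 1, 4]
new_list = sorted(original)

print('my_list after sort(): {0}'.format(my_list))
print('value returned by sort(): {0}'.format(result))
print('original after sorted(): {0}'.format(original))
print('list returned by sorted(): {0}'.format(new_list))
```

A common mistake is to write my_list = my_list.sort(), which replaces the list with None.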

IPython conveniently allows you to see what methods and data are available by entering the object name followed by a dot, and then pressing tab. Try it!

Modules

It is common that scientific software packages such as Matlab and Mathematica are optimized for a flavor of scientific computing (such as matrix computation in the case of Matlab) and are rather full-featured. On the other hand, Python is a programming language. It was not specifically designed to do scientific computing. So, plain old Python is very limited in scientific computing capability.

However, Python is very flexible and allows use of modules. A module contains classes, functions, attributes, data types, etc., beyond what is built in to Python. In order to use a module, you need to import it to make it available for use. So, as we begin working on data analysis and simulations in Bi 1x, we need to import modules we will use.

The first things we will import come from the __future__ module, which we have already seen. This is a special module that enables use of Python 3 standards while running Python 2. Having these in place in your code will help you in the future when you eventually migrate to Python 3. In addition to division and print_function, you can also import unicode_literals and absolute_import to make your code more fully Python 3 compliant. The latter two are not necessary for Bi 1x, though.

In [13]:
from __future__ import division, print_function, \
                                    absolute_import, unicode_literals

The construction from <module> import <attributes> puts <attributes> (the things you imported) in the namespace. This construction allows you to pick and choose what attributes you want to import from a given module.

It is important to note that until we imported the __future__ module, its capacities were not available. Keep that in mind: Plain old Python won't do much until you import a module.

Let's now import one of the major workhorses of our class, NumPy!

In [14]:
# Importing is done with the import statement
import numpy as np

# We now have access to some handy things, e.g., np.pi
print('circumference / diameter = ', np.pi)
print('cos(pi) = ', np.cos(np.pi))
circumference / diameter =  3.14159265359
cos(pi) =  -1.0

Notice that we used the import ... as construction. This enabled us to abbreviate the name of the module so we do not have to type numpy each time.

Also, notice that to access the (approximate) value of $\pi$ in the numpy module, we prefaced the name of the attribute (pi) with the module name followed by a dot (np.). This is generally how you access attributes in modules.
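
The three import styles we have seen can be compared side by side. The sketch below uses the standard library's math module rather than NumPy, purely to keep the example small; the abbreviation m is an arbitrary choice.

```python
# Style 1: import the module; attributes need the full module name
import math

# Style 2: import the module under an abbreviated name
import math as m

# Style 3: put selected attributes directly in the namespace
from math import pi, cos

value_1 = math.cos(math.pi)
value_2 = m.cos(m.pi)
value_3 = cos(pi)

print('math.cos(math.pi) = {0}'.format(value_1))
print('m.cos(m.pi) = {0}'.format(value_2))
print('cos(pi) = {0}'.format(value_3))
```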

We're already getting dangerous with Python. So dangerous, in fact, that we'll write our own module!

Writing your own module (and learning a bunch of syntax!)

Modules are stored in files ending in .py. As an example, we will create a module that finds the roots of the quadratic equation

\begin{align} ax^2 + bx + c = 0. \end{align}

Using the Spyder Editor, we'll create a file called quadratic.py containing the code below. The file should be saved in a directory on Python's module search path (which includes the directories in your PYTHONPATH environment variable and usually the present working directory) so that the interpreter can find it.

In [15]:
"""
*** This should be stored in a file quadratic.py. ***

Quadratic formula module
"""
from __future__ import division, print_function
import numpy as np


# ############
def discriminant(a, b, c):
    """
    Returns the discriminant of a quadratic polynomial
    a * x**2 + b * x + c = 0.    
    """
    return b**2 - 4.0 * a * c


# ############
def roots(a, b, c):
    """
    Returns the roots of the quadratic equation
    a * x**2 + b * x + c = 0.
    """ 
    delta = discriminant(a, b, c)
    root_1 = (-b + np.sqrt(delta)) / (2.0 * a)
    root_2 = (-b - np.sqrt(delta)) / (2.0 * a)
    
    return root_1, root_2

There is a whole bunch of syntax in there to point out.

  • Even though we may have already imported NumPy and items from __future__ in our Python session, we need to explicitly import them (and any other modules we need) in the .py file. This ensures that any time we call the function it has the operations it needs to run.
  • A function is defined within a module with the def statement. It has the function prototype, followed by a colon.
  • Indentation in Python matters! Everything indented after the def statement is part of the function. Once the indentation goes back to the level of the def statement, you are no longer in the function.
  • We can have multiple functions in a single module (in one .py file).
  • The return statement is used to return the result of a function. If multiple objects are returned, they are separated by commas.
  • The text within triple quotes is a doc string. Doc strings say what the function or module does. These are essential for people to know what your code is doing.
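
The points above can be seen in miniature in the sketch below. The function sum_and_product is a made-up example, not part of quadratic.py.

```python
def sum_and_product(x, y):
    """
    Returns the sum and the product of x and y.
    """
    # Everything indented under the def statement is part of the function
    return x + y, x * y


# Back at the original indentation level, we are outside the function.
# Multiple return values come back as a tuple...
result = sum_and_product(3, 4)

# ...which can be unpacked into separate variables
s, p = sum_and_product(3, 4)

# The doc string is stored with the function
doc = sum_and_product.__doc__

print('result: {0}'.format(result))
print('sum: {0}, product: {1}'.format(s, p))
print(doc)
```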

Now, let's test our new module out! Note that because this tutorial was prepared in an IPython notebook, we do not import the module because it was not created in a separate file. We have noted how the syntax changes in the comments.

In [16]:
# When you run from the console, you will need to import the module.
# Uncomment the line below.
# import quadratic as qd

# Python has nifty syntax for making multiple definitions on the same line
a, b, c = 3.0, -7.0, -6.0

# Call the function and print the result.  You will call qd.roots,
# since you imported the function from a module
root_1, root_2 = roots(a, b, c)
print('roots:', root_1, root_2)
roots: 3.0 -0.666666666667

Very nice! Now, let's try another example. This one might have a problem....

In [17]:
# Specify a, b, and c that will give imaginary roots
a, b, c = 1.0, -2.0, 2.0

# Call the function and print the result (again, call qd.roots)
root_1, root_2 = roots(a, b, c)
print('roots:', root_1, root_2)
roots: nan nan
/Users/Justin/anaconda/lib/python2.7/site-packages/IPython/kernel/__main__.py:26: RuntimeWarning: invalid value encountered in sqrt
/Users/Justin/anaconda/lib/python2.7/site-packages/IPython/kernel/__main__.py:27: RuntimeWarning: invalid value encountered in sqrt

Oh no! It gave us nan, which means "not a number," as our roots. It also gave some warning that it encountered invalid (negative) arguments for the np.sqrt function. The roots should be $1 \pm i$, where $i = \sqrt{-1}$. We will use this opportunity to introduce Python's control flow, starting with an if statement.
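
For the curious, here is a minimal sketch of how the complex roots could be computed. It is not part of quadratic.py. It assumes we are willing to work with complex numbers, and it relies on the fact that np.sqrt returns a complex result when given a complex argument, which we arrange by adding 0j to the discriminant.

```python
import numpy as np

# Coefficients that give a negative discriminant
a, b, c = 1.0, -2.0, 2.0
delta = b**2 - 4.0 * a * c     # -4.0

# Adding 0j makes the argument complex, so np.sqrt returns a
# complex number instead of nan
sqrt_delta = np.sqrt(delta + 0j)

root_1 = (-b + sqrt_delta) / (2.0 * a)
root_2 = (-b - sqrt_delta) / (2.0 * a)

print('discriminant: {0}'.format(delta))
print('roots: {0} {1}'.format(root_1, root_2))
```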

Control flow: the if statement

We will decree that our quadratic equation solver only handles real roots, so it will raise an exception if an imaginary root is encountered. So, we modify the contents of the file quadratic.py as follows.

In [18]:
"""
*** This should be stored in a file quadratic.py. ***

Quadratic formula module
"""
from __future__ import division, print_function
import numpy as np


# ############
def discriminant(a, b, c):
    """
    Returns the discriminant of a quadratic polynomial
    a * x**2 + b * x + c = 0.    
    """
    return b**2 - 4.0 * a * c


# ############
def roots(a, b, c):
    """
    Returns the roots of the quadratic equation
    a * x**2 + b * x + c = 0.
    """ 
    delta = discriminant(a, b, c)

    if delta < 0.0:
        raise ValueError('Imaginary roots!  We only do real roots!')
    else:
        root_1 = (-b + np.sqrt(delta)) / (2.0 * a)
        root_2 = (-b - np.sqrt(delta)) / (2.0 * a)
    
        return root_1, root_2

We have now exposed the syntax for a Python if statement. The conditional expression ends with a colon, just like the def statement. Note the indentation of blocks of code after the conditionals. (We actually did not need the else statement, because the program would just continue without the exception, but I left it there for illustrative purposes. It is actually preferred not to have the else statement.)

Now if we re-import the module (we can use the Python function reload in the console for this), the if statement will catch our imaginary roots and raise an exception. Note that you must reload (or start over again and import) the module before your changes take effect.

In [19]:
# Reload the quadratic module using the abbreviated name we already defined
# (uncomment the line below)
# reload(qd)

# Pass in parameters that will give imaginary roots (use qd.roots)
a, b, c = 1.0, -2.0, 2.0
root_1, root_2 = roots(a, b, c)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-19-1e1c4853cfbe> in <module>()
      5 # Pass in parameters that will give imaginary roots (use qd.roots)
      6 a, b, c = 1.0, -2.0, 2.0
----> 7 root_1, root_2 = roots(a, b, c)

<ipython-input-18-e9749de8054d> in roots(a, b, c)
     26 
     27     if delta < 0.0:
---> 28         raise ValueError('Imaginary roots!  We only do real roots!')
     29     else:
     30         root_1 = (-b + np.sqrt(delta)) / (2.0 * a)

ValueError: Imaginary roots!  We only do real roots!

This threw the appropriate exception.
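
An exception stops the program unless the calling code handles it. The sketch below shows one way a caller might do that with a try/except block, which has not been covered in this tutorial. The roots function here is a compact stand-in for the one in quadratic.py: it uses math.sqrt instead of np.sqrt to stay self-contained, and it omits the else, since a raise already leaves the function.

```python
import math


def roots(a, b, c):
    """
    Compact stand-in for quadratic.roots; handles real roots only.
    """
    delta = b**2 - 4.0 * a * c

    if delta < 0.0:
        raise ValueError('Imaginary roots!  We only do real roots!')

    # No else is needed: we only get here if no exception was raised
    root_1 = (-b + math.sqrt(delta)) / (2.0 * a)
    root_2 = (-b - math.sqrt(delta)) / (2.0 * a)

    return root_1, root_2


# Handle the exception instead of letting it stop the program
try:
    result = roots(1.0, -2.0, 2.0)
except ValueError as err:
    result = None
    print('Could not compute roots: ' + str(err))

# Real roots still work as before
real_result = roots(3.0, -7.0, -6.0)
print('real roots: {0} {1}'.format(real_result[0], real_result[1]))
```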

Congrats! You wrote a functioning module. But now is an important lesson....

Loops

If we want to do something many times, we can use a for loop. This is again best learned by example. Let’s make a function that counts the number of times a subsequence is present in a sequence of DNA. We create a new file called dna_counter.py.

In [24]:
def n_subseq(seq, subseq):
    """
    Given a sequence seq, returns the number of occurrences of subseq.
    """
    # Determine the lengths of the sequence and subsequence
    len_subseq = len(subseq)
    len_seq = len(seq)
    
    # First make sure that subseq is not longer than seq.
    if len_subseq > len_seq:
        return 0

    # We loop through the sequence to check for matches
    num_subseq = 0
    for i in range(0, len_seq - len_subseq + 1):
        if seq[i:i+len_subseq] == subseq:
            num_subseq += 1  # The += 1 increases the value of a variable by 1

    # We are done looping now, so return the number of subsequences
    return num_subseq

Let's see how it works!

In [23]:
# Uncomment line below
# import dna_counter as dnac

seq = 'ACTGTACGATCGAGCGATCGAGCGAGTCATTACGACTGAGATCC'

subseq = 'GAT'

# Call dnac.n_subseq
n_gat = n_subseq(seq, subseq)

# Print the result
print('There are %d GATs in the sequence.' % n_gat)
There are 3 GATs in the sequence.

Now that we know it works, let’s look at how the loop was constructed. In the statement at the beginning of the loop, we use the function range to define an iterator. Calling range(n) creates a list of integers from 0 to n-1. The for statement says that the variable i will successively take the values of the iterator.
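
To make the loop mechanics concrete, the sketch below shows the values that range produces. It then compares a condensed version of n_subseq with the string method count, which is not used elsewhere in this tutorial. The two can disagree because our sliding-window loop counts overlapping matches, while count does not.

```python
def n_subseq(seq, subseq):
    """
    Condensed version of n_subseq: counts occurrences of subseq
    in seq, including overlapping ones.
    """
    num_subseq = 0
    for i in range(len(seq) - len(subseq) + 1):
        if seq[i:i+len(subseq)] == subseq:
            num_subseq += 1
    return num_subseq


# range(4) steps through 0, 1, 2, 3; list() lets us look at the values
indices = list(range(4))

# The window starting at each of positions 0, 1, and 2 matches 'AA'
overlapping = n_subseq('AAAA', 'AA')

# str.count only counts non-overlapping matches
non_overlapping = 'AAAA'.count('AA')

print('indices: {0}'.format(indices))
print('sliding window count: {0}'.format(overlapping))
print('str.count: {0}'.format(non_overlapping))
```

Which behavior you want depends on the question you are asking about the sequence.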

Keyword arguments

Before concluding our quick trip through the very basics of Python and on to NumPy, I want to show a very handy tool in Python, keyword arguments. Before, when we defined a function, we specified its arguments as variable names separated by commas in the def statement. We can also specify keyword arguments. Here is an example from our quadratic equation solver.

In [ ]:
"""
Quadratic formula module
"""
from __future__ import division, print_function
import numpy as np


# ############
def discriminant(a, b, c):
    """
    Returns the discriminant of a quadratic polynomial
    a * x**2 + b * x + c = 0.    
    """
    return b**2 - 4.0 * a * c


# ############
def roots(a, b, c, print_discriminant=False,
          message_to_the_world='Bi 1x rules'):
    """
    Returns the roots of the quadratic equation
    a * x**2 + b * x + c = 0.
    
    If print_discriminant is True, prints discriminant to screen
    """ 
    delta = discriminant(a, b, c)
    if print_discriminant: 
        print('discriminant =', delta)
        
    if message_to_the_world is not None: 
        print('\n' + '*'*len(message_to_the_world))
        print(message_to_the_world)
        print('*'*len(message_to_the_world) + '\n')

    if delta < 0.0:
        raise ValueError('Imaginary roots!  We only do real roots!')
    else:
        root_1 = (-b + np.sqrt(delta)) / (2.0 * a)
        root_2 = (-b - np.sqrt(delta)) / (2.0 * a)
    
        return root_1, root_2

The function roots now has two keyword arguments. They are signified by the equals sign.

The function has three required arguments, a, b, c. If the keyword arguments are omitted in the function call, they take on the default values, as specified in the function definition.

In the example, the default for print_discriminant is False and the default for message_to_the_world is 'Bi 1x rules'. Furthermore, ordering of keyword arguments does not matter when calling the function. They are passed in the function call using the same name=value syntax with which they are defined in the function definition.

Let's try it!

In [ ]:
a, b, c = 3.0, -7.0, -6.0

# Call qd.roots
root_1, root_2 = roots(a, b, c)
print('roots:', root_1, root_2)

Since we did not specify the keyword arguments, the defaults were used. We can specify other values.

In [ ]:
root_1, root_2 = roots(a, b, c, print_discriminant=True,
                      message_to_the_world='Bi 1x TAs are the best!')
print('roots:', root_1, root_2)
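
The same rules can be seen in a smaller setting. The function describe_sample below is hypothetical and exists only to illustrate defaults and argument ordering.

```python
def describe_sample(name, volume=1.0, units='mL'):
    """
    Hypothetical example: returns a short description of a sample.
    name is required; volume and units are keyword arguments.
    """
    return '{0}: {1} {2}'.format(name, volume, units)


# Keyword arguments omitted: the defaults are used
default_call = describe_sample('buffer')

# Keyword arguments can be given in any order
reordered = describe_sample('buffer', units='uL', volume=50.0)

# They can also be passed by position, in the order of the definition
positional = describe_sample('buffer', 50.0, 'uL')

print(default_call)
print(reordered)
print(positional)
```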

Intro to NumPy, SciPy, and Matplotlib

If you are trying to do a task that you think might be common, it's probably part of NumPy or some other package. Look, or ask Google, first. In this case, NumPy has a function called roots that computes the roots of a polynomial. To figure out how to use it, we can either look at the doc string, or look in the NumPy and SciPy documentation online (the documentation for np.roots is available here). To look at the doc string, you can enter the following at an IPython prompt:

In [ ]:
np.roots?

We see that we need to pass the coefficients of the polynomial we would like the roots of using an "array_like" object. We will discuss what this means in a moment, but for now, we will just use a list to specify our coefficients and call the np.roots function.

In [ ]:
# Define the coefficients in a list (using square brackets)
coeffs = [3.0, -7.0, -6.0]

# Call np.roots.  It returns an np.ndarray with the roots
roots = np.roots(coeffs)
print('Roots for (a, b, c) = (3, -7, -6):', roots)

# It even handles complex roots!
roots = np.roots([1.0, -2.0, 2.0])
print('Roots for (a, b, c) = (1, -2, 2): ', roots)

Some array_like data types

In the previous example, we used a list as an array_like data type. Python has several native data types. We have already used ints and floats, though we were not very explicit about it. Python's native array_like data types are lists and tuples. Internally, NumPy functions convert these into NumPy arrays, which are the array_like data type we will use most often. NumPy arrays are your new best friend.
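
The conversion can also be done explicitly. The sketch below uses np.asarray, which is one way to turn an array_like object into a NumPy array; it is shown here only to illustrate the idea of array_like.

```python
import numpy as np

# A list and a tuple are both array_like
from_list = np.asarray([1, 2, 3])
from_tuple = np.asarray((1.0, 2.0, 3.0))

# Both come back as NumPy arrays; the data type is inferred
# from the entries (integers versus floats)
print(type(from_list))
print(from_list.dtype)
print(type(from_tuple))
print(from_tuple.dtype)
```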

The np.ndarray: maybe your new best friend

Lists and tuples can be useful, but for many, many applications in data analysis, the np.ndarray, which we will colloquially call a "NumPy array," is most often used. NumPy arrays are created using the np.array function with a list or tuple as an argument. Once created, we can do all sorts of things with them. Let's play!

Let's make some arrays to see what they look like:

In [ ]:
array_1 = np.array([1, 2, 3, 4])
print('array_1:')
print(array_1, '\n')

array_2 = np.array([[1, 2], [1, 2]])
print('array_2:')
print(array_2, '\n')

array_3 = np.array([[1, 2, 3], [1, 2, 3], [1, 2, 3]])
print('array_3:')
print(array_3)

Sometimes you want an array of all zero values, and that can also be done with NumPy.

In [ ]:
zero_array = np.zeros((3,4))
print(zero_array, '\n')
print('the dimensions are', zero_array.shape)

Now let's see how we can do operations on some arrays.

In [ ]:
a = np.array([1, 2, 3])
b = np.array([4.0, 5.0, 6.0])

# Arithmetic operations are done elementwise
print('a:      ', a)
print('b:      ', b)
print('a + b:  ', a + b)
print('a * b:  ', a * b)
print('1.0 + a:', 1.0 + a)
print('a**2:   ', a**2)
print('b**a:   ', b**a)

We can check the data type of our arrays.

In [ ]:
print(a.dtype)
print(b.dtype)

We can also change the type of the entries of our arrays, for example, from integer to floating point.

In [ ]:
array_a = a.astype(float)
print(array_a)

array_b = b.astype(int)
print(array_b)
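
One thing to watch for: converting floats to integers with astype discards the fractional part rather than rounding. The sketch below, with made-up values, contrasts this with rounding first using np.round.

```python
import numpy as np

x = np.array([4.6, -4.6, 1.2])

# astype(int) drops the fractional part (truncates toward zero)
truncated = x.astype(int)

# Rounding first gives the nearest integers instead
rounded = np.round(x).astype(int)

# astype returns a new array; x itself is unchanged
print('x:         {0}'.format(x))
print('truncated: {0}'.format(truncated))
print('rounded:   {0}'.format(rounded))
```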

Slicing is also intuitive.

In [ ]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])

print(a, '\n')
print(a[0:3, 2:3])

In [ ]:
# view second column
a[:,1]

In [ ]:
# see all entries below the value of 10
a[a<10]
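
The last cell deserves a closer look. The comparison a < 10 produces an array of True/False values with the same shape as a, and indexing with that array picks out the entries where it is True. The sketch below spells out the steps; the name mask is just a convenient label.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12],
              [13, 14, 15, 16]])

# Rows 0 through 2 of column 2; slicing with 2:3 keeps two dimensions
sub_block = a[0:3, 2:3]

# All rows of column 1; indexing with a single integer drops a dimension
second_column = a[:, 1]

# The comparison gives a Boolean array with the same shape as a
mask = a < 10

# Indexing with the Boolean array returns a 1D array of the selected entries
selected = a[mask]

print(sub_block)
print(second_column)
print(mask)
print(selected)
```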

Using slices, we can reassign values to the entries in an np.ndarray. For example, say we wanted the third row to be all zeros.

In [ ]:
a[2, :] = np.zeros(a[2, :].shape)

print(a)

We can also reshape arrays.

In [ ]:
a = a.reshape(2, 8)
print(a, '\n')  

a = a.reshape(4,4)
print(a)

Subpackages in NumPy

When you import NumPy, you get a set of core functions, such as np.dot. More specialized functionality is organized into subpackages, such as numpy.random. It is often convenient to import a subpackage by name so that its functions can be called with a shorter prefix. For example, if we wanted to do some random number generation, we could import numpy.random.

In [ ]:
from numpy import random
a = random.rand(4,4)
print(a)

NumPy functions

Here are some useful NumPy functions we think you might want to use! Go through these one-by-one in the console.

In [ ]:
# create evenly spaced points
np.linspace(0,1,10)

# matrix or vector dot products
np.dot(a,a)

# concatenate in row dimensions
np.concatenate((a,a))

# concatenate in the column dimension
np.concatenate((a,a), axis=1)

# transpose (omit semicolon to see output)
np.transpose(a);
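
To see what these functions do to array shapes, here is a sketch using a small 2 by 3 array instead of the random array above. It uses np.arange, which is not otherwise covered here, just to build the example array.

```python
import numpy as np

# A small 2 x 3 array: [[0, 1, 2], [3, 4, 5]]
a = np.arange(6).reshape(2, 3)

# linspace includes both endpoints
grid = np.linspace(0, 1, 5)

# Concatenating along rows (the default, axis=0) stacks vertically
rows = np.concatenate((a, a))

# Concatenating along columns (axis=1) stacks horizontally
cols = np.concatenate((a, a), axis=1)

# The transpose swaps rows and columns
a_t = np.transpose(a)

# A (2, 3) array dotted with a (3, 2) array gives a (2, 2) array
product = np.dot(a, a_t)

print('grid: {0}'.format(grid))
print('rows shape: {0}'.format(rows.shape))
print('cols shape: {0}'.format(cols.shape))
print('transpose shape: {0}'.format(a_t.shape))
print(product)
```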

matplotlib: our primary plotting tool

matplotlib is a tool for plotting the data stored in NumPy arrays. We will mostly use the interactive plotting module, matplotlib.pyplot, which we will abbreviate as plt. Its syntax is quite simple, and best learned by example.

We will now write a script to plot some functions. Make a file called my_first_mpl_plot.py.

In [ ]:
""" 
Make some plots!
"""
import numpy as np
from numpy import random
import matplotlib.pyplot as plt

# Magic function for inline graphics in the IPython notebook.
# (This only works in IPython; leave this line out of a plain .py file.)
%matplotlib inline

# Make an x-variable for plotting
x = np.linspace(0, 2*np.pi, 200)

# This is a nice function
y_1 = np.exp(np.sin(x))

# We can make another one
y_2 = np.exp(np.cos(x))


# We can make some random data to plot as well
x_rand = random.rand(20) * 2 * np.pi
y_rand = random.rand(20) * 3.0

# Now plot them.
plt.plot(x, y_1, '-')  # The '-' means to use a line plot
plt.plot(x, y_2, '-')
plt.plot(x_rand, y_rand, 'ko')  # The 'ko' means to plot as black circles

# Label the axes.  
plt.xlabel('x')
plt.ylabel('y')

# We can save it as a PDF as well
plt.savefig('my_first_mpl_plot.pdf')

Programming style

PEPs (Python Enhancement Proposals) document suggestions and guidelines for the Python language, its development, etc. My personal favorite is PEP8, the Style Guide for Python Code. This was largely written by Guido van Rossum, the inventor of Python and its benevolent dictator for life. It details how you should style your code. As Guido says, code is much more often read than written. I strongly urge you to follow PEP8 the best you can. It's a lot to read, so I will highlight important points here. If you follow these guidelines, your code will be much more readable than otherwise. This is particularly useful because you are working in groups and the TAs need to grade your work.

  • Limit line widths to 79 characters (the line continuation character for Python is \).
  • Use spaces between all operators except **. E.g., a = b + c * d**2.
  • In function calls, use a space after each comma. Use no spaces before and after the equals sign when using keyword arguments. E.g., my_fun(a, b, c=True).
  • Do not use excess space when indexing. E.g., a[i], not a [ i ].
  • Function names should be lowercase, with words separated by underscores as necessary to improve readability.
  • Class names should be CamelCase.
  • Comment lines should appear immediately before the code they describe. Use in-line comments sparingly.
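
As an illustration, here is a short sketch that tries to follow the guidelines above. The function kinetic_energy and its in_joules keyword argument are made up for this example.

```python
def kinetic_energy(mass, velocity, in_joules=True):
    """
    Returns the kinetic energy, 0.5 * m * v**2, for a mass in kg
    and a velocity in m/s.  If in_joules is False, the result is
    converted to kJ.
    """
    # Spaces around operators, except for **
    energy = 0.5 * mass * velocity**2

    # Convert from J to kJ if requested
    if not in_joules:
        energy = energy / 1000.0

    return energy


# A space after each comma; no spaces around = for keyword arguments
energy_j = kinetic_energy(2.0, 3.0)
energy_kj = kinetic_energy(2.0, 3.0, in_joules=False)

print('energy in J: {0}'.format(energy_j))
print('energy in kJ: {0}'.format(energy_kj))
```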

Conclusions

This concludes our introductory tour of Python with some NumPy, SciPy, and matplotlib thrown in for good measure. There is still much to learn, but you will pick up more and more tricks and syntax as we go along.

For the next tutorial, we will use Python to do some image processing. That is, we will write code to extract data of interest from images. As you get more and more proficient, coding (particularly in Python, in my opinion) will be more and more empowering and FUN!