© Valentina Porcu 2018
Valentina Porcu, Python for Data Mining Quick Syntax Reference, https://doi.org/10.1007/978-1-4842-4113-4_6

6. Other Basic Concepts

Valentina Porcu (Nuoro, Italy)

In this chapter, we learn about some important programming concepts (such as modules and methods), list comprehension and class creation, regular expressions, and management of errors and exceptions.

Object-oriented Programming

As mentioned, Python is an object-oriented programming language, so let’s look at some basic concepts of object-oriented programming. Some important concepts are
  • Objects

  • Classes

  • Inheritance

More on Objects

Objects are all the data structures we have created, from small ones containing a single number to large ones containing entire datasets. We apply operations to objects using functions or methods that come preinstalled in libraries, that we import, or that we create ourselves.

Classes

The class concept is a new topic in this book. Python lets us create structures tailored to our needs through classes. A class is an abstract representation of a kind of object; once a class is defined, we can create concrete instances of it. A class is defined by its own characteristics. For example, if we create a Dog class, we can stipulate that it be defined by features such as the shape of the head, the muzzle, the hair type (short or long), and so on. A Book class could include features such as the book genre, the number of pages, the main topic, the type of cover, the ISBN, and so on.
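As a rough illustration of the Book example, the sketch below defines a class with a few of the features named above. The attribute names and the sample values are assumptions made for this sketch, not part of the original text.

```python
# A minimal Book class; attribute names and sample values are illustrative.
class Book:
    def __init__(self, genre, pages, topic, cover):
        self.genre = genre    # book genre, e.g. "reference"
        self.pages = pages    # number of pages
        self.topic = topic    # main topic
        self.cover = cover    # type of cover


# Two instances of the same class hold different values.
book1 = Book("reference", 260, "data mining", "paperback")
book2 = Book("novel", 412, "travel", "hardcover")

print(book1.topic)
print(book2.pages)
```

The class describes which features every book has; each instance fills them in with its own values.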

Inheritance

Another concept behind object-oriented programming is inheritance. It is possible to create new classes from existing classes. New classes inherit the characteristics of the original classes, but they can extend them with new features. Inheritance is convenient, because it allows us to extend old classes without having to change them. Inheritance can be single and multiple. Through single inheritance, a subclass can inherit member data and methods from one existing class; in the case of multiple inheritance, a subclass can inherit characteristics from more than one existing class.

Let’s look at a simple example that creates a Cat class.
>>> class Cat:
...     def __init__(self, name, color, age, race):
...         self.name = name
...         self.color = color
...         self.age = age
...         self.race = race
Cat is defined by name, color, age, and breed (stored in the race attribute). We can create a cat instance:
>>> cat1 = Cat("Fuffy", "white", 3, "tabby")
and then look at its features:
>>> print(cat1.name)
Fuffy
>>> print(cat1.color)
white
We can modify the class by adding methods:
>>> class Cat:
...     def __init__(self, name, color, age, race):
...         self.name = name
...         self.color = color
...         self.age = age
...         self.race = race
...     def cry(self):
...         print("meow")
...     def purr(self):
...         print("purr")
# we create a cat instance with this new class
>>> cat2 = Cat("Candy", "Red", 6, "Balinese")
Now we can query the instance not only on the basis of the features, as we did in the first example,
>>> print(cat2.age)
6
but also by using the methods:
>>> cat2.cry()
meow
>>> cat2.purr()
purr
Let’s create a subclass for tabby, a cat breed, so it inherits its features.
>>> class tabby(Cat):
...     def character(self):
...         print("warm")
# we create a tabby instance
>>> tabby1 = tabby("Pallina", "black", 4, "tabby")
>>> tabby1.purr()
purr
>>> tabby1.character()
warm

Thus we can query the instance not only with regard to the characteristics of the tabby subclass but also with regard to the existing Cat class.
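The Cat example uses single inheritance. Multiple inheritance, mentioned earlier, could be sketched as follows; the Animal, Pet, and HouseCat classes are hypothetical names introduced only for this illustration.

```python
class Animal:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return self.name + " is an animal"


class Pet:
    def owner_greeting(self):
        return "Welcome home!"


# HouseCat inherits from both Animal and Pet (multiple inheritance).
class HouseCat(Animal, Pet):
    def cry(self):
        return "meow"


cat = HouseCat("Fuffy")
print(cat.describe())        # method inherited from Animal
print(cat.owner_greeting())  # method inherited from Pet
print(cat.cry())             # method defined in HouseCat
```

Neither parent class had to be changed: the subclass simply lists both of them in parentheses.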

Modules

Modules are collections of functions that are generally related to a given topic (graphics, data analysis). Modules can belong to one of the following categories:
  • Python modules

  • Precompiled modules

  • Built-in modules

To use a module, we must first import it by using the import instruction:
import ...
For instance,
import numpy

Using import, we imported the entire module.

However, we can import part of a module:
from ... import ...
For example,
from matplotlib import cm
To simplify and speed up the writing of code, we can import a module under another name:
import numpy as np
We can then check our modules with the following command:
help('modules')
After a module is imported (say, in Jupyter), we can display all the methods of that particular module by pressing the Tab key (Figure 6-1).
Figure 6-1. pandas’ methods on Jupyter

In addition, we can import very specific elements of a module. For instance:
from math import sqrt
In this case, we are not importing the entire module, only the square root function. As a result, we don’t need to call sqrt through the math module; we can call it directly:
>>> sqrt(9)
3.0
If we import the entire module, we must call sqrt as a function of the math module:
>>> import math
>>> math.sqrt(12)
3.4641016151377544
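The import forms described in this section all give access to the same function object; only the name used to reach it changes. A small sketch using the standard math module (the alias m is an arbitrary choice):

```python
import math               # whole module: call math.sqrt
import math as m          # whole module under an alias: call m.sqrt
from math import sqrt     # single function: call sqrt directly

print(math.sqrt(9))
print(m.sqrt(9))
print(sqrt(9))

# All three names refer to the same function.
print(sqrt is math.sqrt is m.sqrt)
```

Which form to use is mostly a matter of readability: the module prefix makes it clear where a function comes from, while a direct import keeps long expressions short.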
Some Python distributions, such as Anaconda, install a whole series of modules and packages automatically (Figure 6-2). However, if we need to install a particular module, we can do it from the computer terminal by typing:
$ pip install module_name
Figure 6-2. Installing a package on the computer

As you can see in Figure 6-2, we are on the terminal, not in a Python window, because the “>>>” symbol is missing at the prompt. The “$” symbol indicates that the terminal is being used.

Depending on the package we are installing (for example, from GitHub), we may find different instructions for installation in the package documentation itself. Packages are a collection of modules, often on the same subject. For instance, SciPy and NumPy contain dozens of data analysis modules.

Methods

In Python, everything is an object and, depending on the type to which it belongs, different methods can be applied to each object. Methods are like functions, but they are tied to particular classes and are called on an object with dot notation. This means that lists have their own methods, tuples have different methods, and so on. Each method performs an operation on the object on which it is called.
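One way to see that each type carries its own methods is the built-in hasattr() function, which checks whether an object has a given attribute. The sketch below is only an illustration; the variable names are arbitrary.

```python
my_list = [1, 2, 3]
my_tuple = (1, 2, 3)

# Lists are mutable and provide an append method; tuples do not.
print(hasattr(my_list, "append"))
print(hasattr(my_tuple, "append"))

# Both types share some methods, such as count.
print(my_list.count(2))
print(my_tuple.count(2))

# A method is called on the object; a function receives the object.
my_list.append(4)     # method of the list
print(len(my_list))   # built-in function applied to the list
```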

Depending on the tool we use for programming or data analysis, we may get suggestions about the methods associated with a particular object. This is the case when using Spyder and Jupyter. Figure 6-3 displays an example of methods for an object using Spyder; Figure 6-4 shows the methods using Jupyter.
Figure 6-3. Example on Spyder

Figure 6-4. Example on Jupyter

An example of a function is print(); an example of a method is .upper(). Let’s look at an example:
>>> string1 = "this is a string"
>>> print(string1)
this is a string
>>> string1.upper()
'THIS IS A STRING'
To get information about a specific method, we use the help() function. For example, we can create a list and call help() on its .append method:
>>> x = [1,2,3,4,5,6]
>>> help(x.append)
Figure 6-5. Output of the code above

List Comprehension

List comprehension is a syntax construct that allows us to create new lists from other lists. Suppose we want to apply an operation to every element of a list. We can do this using a for loop:
# we create a number list
>>> numbers = [12, 23, 34, 57, 89, 97]
# using the for loop we add the number 10 to each item on the list
>>> for i in range(len(numbers)):
...     numbers[i] = numbers[i] + 10
...
>>> numbers
[22, 33, 44, 67, 99, 107]
# we can get the same result with a list comprehension
# we overwrite the list numbers with a new one in which 10 is added to every item
>>> numbers = [number + 10 for number in numbers]
>>> numbers
[32, 43, 54, 77, 109, 117]
# the name we give to each element of the list is arbitrary
>>> numbers = [n + 10 for n in numbers]
>>> numbers
[42, 53, 64, 87, 119, 127]
Thus, list comprehension allows us to simplify an iteration that turns our list into a new one. We can apply list comprehension to a list of numbers, as we just saw, but we can also apply it to strings. For example, we can iterate over a list of strings and turn each one into uppercase:
>>> strings = ['this', 'is' ,'a', 'string']
>>> strings2 = [string.upper() for string in strings]
>>> strings2
['THIS', 'IS', 'A', 'STRING']
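To make the correspondence explicit, the sketch below builds the same list twice, once with a for loop and once with a list comprehension. It also shows the optional if clause, which is not covered above and is included here only as a pointer.

```python
numbers = [12, 23, 34, 57, 89, 97]

# Version 1: a for loop that appends to a new list.
plus_ten_loop = []
for n in numbers:
    plus_ten_loop.append(n + 10)

# Version 2: the equivalent list comprehension.
plus_ten_comp = [n + 10 for n in numbers]

print(plus_ten_loop == plus_ten_comp)

# An optional if clause keeps only the items that pass a test.
even_numbers = [n for n in numbers if n % 2 == 0]
print(even_numbers)
```

Unlike the index-based loop shown earlier, both versions here leave the original list unchanged and produce a new one.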

Regular Expressions

Imagine trying to search for a word in a text document. When we do the search, it returns the instances of the word we are looking for. However, a simple search of this type may skip some occurrences, for example, if they start with an uppercase letter or if they are followed or preceded by punctuation. If the word we are looking for also occurs inside other words, we must also take into account the spaces around it to find only the word itself.

The use of regular expressions, also known as regex, makes it much easier to handle all these cases. Regular expressions are patterns that describe generic strings, which makes more elaborate searches possible. They are very useful for searching, replacing, or extracting strings or substrings. Regular expressions can, for example, be used to extract dates, e-mail addresses, and physical mail addresses, because they match a structure rather than a single fixed string. For example, instead of searching for one specific address such as [email protected], we can write a pattern that describes the structure of an e-mail address and extract every address in the document that follows that structure.
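The search problems described above (capitalization, punctuation, and words contained in other words) can be sketched with the re module, which is introduced next. The sample sentence is invented for this illustration; \b marks a word boundary and re.IGNORECASE makes the match case insensitive.

```python
import re

text = "Word games: a word, a sword, and a password."

# Plain substring search also finds "word" inside "sword" and "password",
# and it misses the capitalized "Word".
substring_hits = re.findall("word", text)
print(len(substring_hits))

# \b restricts the match to whole words; IGNORECASE also accepts "Word".
whole_words = re.findall(r"\bword\b", text, re.IGNORECASE)
print(whole_words)
```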

In Python, regular expressions are handled via the module re, which we can import:
>>> import re
# we create a sample string
>>> str1 = "Try searching for a word using regular expressions and the Python module kernel"
We search for an occurrence using re.search():
>>> re.search('word', str1)
<_sre.SRE_Match object at 0x10280bed0>
# this result tells us that the word we were looking for is present in the string
# to get the same result in its simplest form, save the previous line of code in an object and query it with the bool() function:
>>> exre1 = re.search('word', str1)
>>> bool(exre1)
True
We can search for an occurrence by using .findall():
>>> re.findall('Try', str1)
['Try']
# if the item is not present, we receive an empty list
>>> re.findall('some', str1)
[]
# this function is case sensitive, so 'some' is different from 'Some'
We can also divide a string into the elements that comprise it:
>>> re.split(' ', str1)
['Try', 'searching', 'for', 'a', 'word', 'using', 'regular', 'expressions', 'and', 'the', 'Python', 'module', 'kernel']
# in the previous code, we used a space as a splitting element; next we split the string in two using the conjunction ' and ' (with a space on either side, to avoid matching the letters "and" inside another word)
>>> re.split(' and ', str1)
['Try searching for a word using regular expressions', 'the Python module kernel']
# we get the previous result by dividing the string in two according to the position of the conjunction 'and'
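Regular expression symbols (such as *, +, and ?) allow us to specify how many times a character may occur: * means zero or more times, + means one or more times, and ? means zero or one time (a ? placed after another quantifier makes it nongreedy):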
>>> re.findall('ea*', str1)
['ea', 'e', 'e', 'e', 'e', 'e', 'e', 'e']
>>> re.findall('ea+', str1)
['ea']
>>> re.findall('ea?', str1)
['ea', 'e', 'e', 'e', 'e', 'e', 'e', 'e']
>>> re.findall('ea+?', str1)
['ea']
# we can, for instance, extract all the words with a capital letter
>>> re.findall('[A-Z][a-z]*', str1)
['Try', 'Python']
# or all the words in lowercase letters
>>> re.findall('[a-z]*', str1)
['', 'ry', '', 'searching', '', 'for', '', 'a', '', 'word', '', 'using', '', 'regular', '', 'expressions', '', 'and', '', 'the', '', '', 'ython', '', 'module', '', 'kernel', '']
# or all uppercase or lowercase words
>>> re.findall('[a-z]*', str1, re.IGNORECASE)
['Try', '', 'searching', '', 'for', '', 'a', '', 'word', '', 'using', '', 'regular', '', 'expressions',
...
# or we can extract every run of characters that are not a period, hyphen, exclamation mark, or space
>>> re.findall(r'[^.\-! ]+', str1)
['Try', 'searching', 'for', 'a', 'word', 'using', 'regular', 'expressions', 'and', 'the', 'Python', 'module', 'kernel']
# \d finds the digits
>>> str2 = "We are going to meet today at 14:15"
>>> re.findall(r'\d', str2)
['1', '4', '1', '5']
# we can use other symbols to help us target the extraction more explicitly
>>> re.findall(r'\d+', str2)
['14', '15']
# for example, let's look for the letter p followed by the rest of the word
>>> re.findall(r'[p]\S*', str1)
['pressions']
# or for the letter 'p' in either lowercase or uppercase letters
>>> re.findall(r'[p]\S*', str1, re.IGNORECASE)
['pressions', 'Python']
# or we can extract e-mails from a string
>>> str3 = "my email is [email protected], my second email is [email protected]"
>>> re.findall(r"[\w.-]+@[\w.-]+", str3, re.IGNORECASE)
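Because the addresses in the example above are obscured in this copy of the text, the following sketch uses made-up addresses on the reserved example.com and example.org domains to show what such a pattern returns. The pattern is a simple illustration and does not validate e-mail addresses.

```python
import re

# Hypothetical addresses, used only for illustration.
sample = "my email is first@example.com, my second email is second.name@example.org"

# [\w.-]+ matches a run of letters, digits, underscores, dots, and hyphens.
emails = re.findall(r"[\w.-]+@[\w.-]+", sample)
print(emails)
```

The comma after the first address is not part of the character class, so the match stops just before it.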
We can test a regular expression by using tools on the Web, such as http://pythex.org. More information about the re module can be found at https://docs.python.org/2/library/re.html. Table 6-1 lists the various symbols of regular expressions.
Table 6-1. Symbols and Regular Expressions

Symbol        Description
\d            Digit: 0, 1, 2, ..., 9
\D            Not a digit
\s            Whitespace character
\S            Not a whitespace character
\w            Word character (letter, digit, or underscore)
\W            Not a word character
\t            Tab
\n            New line
^             Beginning of the string
$             End of the string
\             Escapes special characters; for example, \\ is “\” and \+ is “+”
|             Alternation; for example, (e|d)n matches “en” and “dn”
.             Any character except a line terminator
[ab]          a or b
[^ab]         Any character except a and b
[0-9]         All digits
[A-Z]         All uppercase letters from A to Z
[a-z]         All lowercase letters from a to z
[A-Za-z]      All uppercase and lowercase letters from a to z
i+            i at least one time
i*            i zero or more times
i?            i zero or one time
i{n}          i exactly n times in sequence
i{n1,n2}      i from n1 to n2 times in sequence
i{n1,n2}?     Nongreedy version of the previous entry (as few repetitions as possible)
i{n,}         i at least n times
[:alnum:]     Alphanumerical characters: [:alpha:] and [:digit:]
[:alpha:]     Alphabetical characters: [:lower:] and [:upper:]
[:blank:]     Blank characters, such as space and tab
[:cntrl:]     Control characters
[:digit:]     Digits: 0 1 2 3 4 5 6 7 8 9
[:graph:]     Graphical characters: [:alnum:] and [:punct:]
[:lower:]     Lowercase letters in the current locale
[:print:]     Printable characters: [:alnum:], [:punct:], and space
[:punct:]     Punctuation characters such as ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
[:space:]     Space characters: tab, new line, vertical tab, form feed, carriage return, space
[:upper:]     Uppercase letters in the current locale
[:xdigit:]    Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f

Note: Python’s re module does not support the POSIX bracket classes ([:alnum:] through [:xdigit:]). They are listed for reference because other regular expression engines accept them; in Python, use equivalents such as \d, \s, and \w.
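A few of the quantifiers in Table 6-1 can be tried on a short string. The sample string is invented for this sketch.

```python
import re

s = "caaat caat cat ct"

one_or_more = re.findall(r"ca+t", s)       # a at least one time
zero_or_more = re.findall(r"ca*t", s)      # a zero or more times
zero_or_one = re.findall(r"ca?t", s)       # a zero or one time
exactly_two = re.findall(r"ca{2}t", s)     # a exactly two times
two_to_three = re.findall(r"ca{2,3}t", s)  # a two to three times

for result in (one_or_more, zero_or_more, zero_or_one, exactly_two, two_to_three):
    print(result)

# Greedy versus nongreedy: a{1,3} takes as many as it can, a{1,3}? as few.
greedy = re.findall(r"a{1,3}", "aaaa")
nongreedy = re.findall(r"a{1,3}?", "aaaa")
print(greedy)
print(nongreedy)
```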

User Input

The input function (raw_input in Python2) lets a program receive a response from the user and act on it (we studied this when defining functions). In Python2, we handle user input with the raw_input() function:
>>> name = raw_input("What is your name? ")
What is your name? Valentina
>>> print(name)
Valentina
>>> print("Nice to meet you, " + name)
Nice to meet you, Valentina

Caution

Python2 and Python3 handle user input differently.

In Python3, we use input() instead of raw_input():
>>> name = input("What is your name? ")
What is your name? Valentina
>>> print(name)
Valentina
>>> print("Nice to meet you, " + name)
Nice to meet you, Valentina
When input is entered in this way, it is read as a string. So, for example, if we want to enter a number, we have to write the code a bit differently. Let’s first see what happens if we do not:
>>> num1 = input('add a number: ')
>>> num2 = input('add a second number ')
>>> print(num1 + num2)
# if we enter 37 and 25, the result is not their sum but the two strings joined together:
3725
The numbers are not summed; they are concatenated, as happens with two strings. To add them, we must specify that the value we are entering is a number, as follows:
>>> num1 = int(input('enter a number: '))
>>> num2 = int(input('add a second number '))
>>> print(num1 + num2)
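The difference between the two versions can be reproduced without typing anything at the prompt by using the strings that input() would return. A small sketch, with 37 and 25 as the assumed entries:

```python
# input() always returns a string, so these stand in for the typed values.
typed1 = "37"
typed2 = "25"

# With strings, + concatenates.
joined = typed1 + typed2
print(joined)

# After conversion with int(), + performs arithmetic addition.
total = int(typed1) + int(typed2)
print(total)

# float() works the same way for numbers with a decimal part.
decimal_total = float("2.5") + float("1.5")
print(decimal_total)
```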

Errors and Exceptions

Errors and exceptions in Python are abnormal or unexpected events that change the normal running of our code. An exception may be the result of invalid input (for example, we ask users to enter a number and they enter a letter), hardware issues, or files or objects that cannot be found. There are three main types of errors:
  1. Syntactic
  2. Semantic
  3. Logical
Syntax errors are mistakes in the way the code is written, such as misspelled keywords, missing quotation marks, or unbalanced parentheses, that prevent Python from reading it.
# example of a syntax error message due to the missing quotation mark at the end of the string
>>> print 'Hello World
  File "<stdin>", line 1
    print 'Hello World
                     ^
SyntaxError: EOL while scanning string literal
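Unlike a syntax error, a logical error produces no message: the code runs but returns the wrong result. The sketch below is an invented illustration; both function names are hypothetical.

```python
# Intended behavior: return the average of a list of numbers.

def average_wrong(values):
    # Logical error: divides by a fixed 2 instead of the number of values.
    return sum(values) / 2


def average_right(values):
    return sum(values) / len(values)


data = [10, 20, 30]
print(average_wrong(data))   # runs without any error, but the value is wrong
print(average_right(data))
```

Errors of this kind can only be found by checking results against known values, because Python has nothing to report.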

Errors and exceptions usually cause error messages, which we can then use to identify the error and determine whether we can remedy it by modifying the code or handling an exception.

When we anticipate that an exception may occur, we can write code to deal with it; this is called a handled exception. Exceptions that our code does not deal with are called unhandled exceptions.

To handle errors and exceptions in Python, we typically use the try, except, and raise statements.

For example, let’s sum two items that cannot be summed, such as a number and a string:
>>> 37 + 'string'
TypeError                                 Traceback (most recent call last)
<ipython-input-1-5dc2db43a4bf> in <module>()
----> 1 37 + 'string'
TypeError: unsupported operand type(s) for +: 'int' and 'str'
# clearly, the result is an error
Let’s look at the type of error in the message: TypeError. We need to create a way to handle this error. For example, let’s ask users to enter two numbers and then return the sum of the two numbers.
>>> try:
...     num1 = int(input('enter a number: '))
...     num2 = int(input('enter a second number '))
...     print(num1 + num2)
... except TypeError:
...     print("There is something wrong! Check again!")
...
enter a number: 37
enter a second number 25
62

We managed the TypeError error. If users insert two numbers, they are summed correctly.

Let’s see what happens if an incorrect value is entered rather than a number:
>>> try:
...     num1 = int(input('enter a number: '))
...     num2 = int(input('enter a second number '))
...     print(num1 + num2)
... except TypeError:
...     print("There is something wrong! Check again!")
...
enter a number: 37
enter a second number string
ValueError                                Traceback (most recent call last)
<ipython-input-16-566345f8fed9> in <module>()
      1 try:
      2     num1 = int(input('enter a number: '))
----> 3     num2 = int(input('enter a second number '))
      4     print(num1 + num2)
      5 except TypeError:
ValueError: invalid literal for int() with base 10: 'string'
There is a problem! We managed the TypeError error, but now there is a different error. We can handle this as follows:
>>> try:
...     num1 = int(input('enter a number: '))
...     num2 = int(input('enter a second number '))
...     print(num1 + num2)
... except TypeError:
...     print("There is something wrong! Check again!")
... except ValueError:
...     print("There is something wrong! Check again!")
...
enter a number: 37
enter a second number string
There is something wrong! Check again!
Or, we manage the exception with Exception:
>>> try:
...     num1 = int(input('enter a number: '))
...     num2 = int(input('enter a second number '))
...     print(num1 + num2)
... except Exception:
...     print("There is something wrong! Check again!")
...
enter a number: 37
enter a second number test
There is something wrong! Check again!
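The examples above use try and except. The raise statement, mentioned at the start of this section, signals an error from our own code. A minimal sketch follows; the function names and the rule that numbers must not be negative are assumptions made for this illustration.

```python
def parse_positive(text):
    """Convert text to an int and reject negative values."""
    number = int(text)   # may raise ValueError for non-numeric text
    if number < 0:
        # raise creates and signals an exception of the chosen type
        raise ValueError("negative numbers are not allowed: " + text)
    return number


def safe_sum(text1, text2):
    """Return the sum of two numeric strings, or None if either is invalid."""
    try:
        return parse_positive(text1) + parse_positive(text2)
    except ValueError as err:
        print("There is something wrong! Check again!", err)
        return None


print(safe_sum("37", "25"))
print(safe_sum("37", "string"))
print(safe_sum("37", "-5"))
```

Both the built-in conversion failure and the error raised by our own check are ValueError instances, so a single except clause handles them.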

Exception is the base class from which most built-in error types derive. Other common types, as we just saw, are TypeError and ValueError, but there are others: AttributeError, EOFError, IOError, IndexError, KeyError, KeyboardInterrupt, NameError, StopIteration, and ZeroDivisionError.
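The relationship between these classes can be checked with the built-in issubclass() function. The sketch assumes Python 3, where KeyboardInterrupt derives from BaseException rather than Exception and IOError is an alias of OSError.

```python
# Most built-in errors derive from Exception.
common_errors = (TypeError, ValueError, KeyError, IndexError, ZeroDivisionError)
for error in common_errors:
    print(error.__name__, issubclass(error, Exception))

# KeyboardInterrupt is not caught by "except Exception".
print(issubclass(KeyboardInterrupt, Exception))
print(issubclass(KeyboardInterrupt, BaseException))

# In Python 3, IOError is another name for OSError.
print(IOError is OSError)
```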

Summary

In this chapter we studied basic concepts of programming in Python: modules and methods, list comprehension and class creation, regular expressions, and errors and exceptions. In Chapter 7, we learn about importing files.
