© Valentina Porcu 2018
Valentina Porcu, Python for Data Mining Quick Syntax Reference, https://doi.org/10.1007/978-1-4842-4113-4_4

4. Functions

Valentina Porcu (Nuoro, Italy)

An object-based programming language is structured around two major concepts: objects and functions. An object is anything we create in a work session using a programming language such as Python. Functions allow us to perform one or more actions on these objects. Let’s learn how to create a function.

Some Words about Functions in Python

With Python, we basically have two types of functions:
  1. The built-in functions that are part of Python and are loaded automatically when we run Python
  2. The functions we can build and use (user defined)

A function is a piece of code that performs one or more operations on an object and returns an output result. Functions are especially useful when we have to do the same thing over multiple objects. We can do this without repeating the same line of code several times.
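As a minimal sketch of this idea, the function below is written once and then applied to two different lists. The name describe and the sample data are hypothetical, chosen only for illustration.

```python
# Hypothetical helper: written once, reused on several objects.
def describe(values):
    """Return the smallest value, the largest value, and the number of items."""
    return min(values), max(values), len(values)

ages = [23, 35, 41]
prices = [9.5, 3.0, 12.25, 7.0]

# The same function works on both lists without repeating its body.
ages_summary = describe(ages)
prices_summary = describe(prices)
print(ages_summary)    # (23, 41, 3)
print(prices_summary)  # (3.0, 12.25, 4)
```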

These two types of functions are complemented by those in the many libraries available for Python. Whenever we need a particular function (or a package, that is, a family of functions), we can install it and use it. Anaconda spares us from installing many of the packages we need because they are already included in the distribution.

If a package is not included in Anaconda, we can install it with conda, using the general form:
$ conda install package_name
Or, we can use pip:
$ pip install package_name

In any case, the exact wording for installing a package is always included in the official documentation of the package itself.
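Before installing anything, it can be handy to check from Python whether a module is already available. The sketch below uses importlib.util.find_spec from the standard library (Python 3); the helper name is_installed is hypothetical. Keep in mind that the name we import is sometimes different from the name we install.

```python
import importlib.util

# Hypothetical helper: checks whether a top-level module can be imported.
def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None

print(is_installed("os"))                    # True: os ships with Python
print(is_installed("surely_missing_module")) # False unless such a module exists
```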

Some Predefined Built-in Functions

Default functions are within the builtins module. Although there are many, some of the most commonly used ones are dir, help, type, and print. We can display them by typing
>>> dir(__builtins__)
['ArithmeticError', 'AssertionError', 'AttributeError', 'BaseException', 'BlockingIOError', 'BrokenPipeError', 'BufferError', 'BytesWarning', 'ChildProcessError', 'ConnectionAbortedError', 'ConnectionError', 'ConnectionRefusedError', 'ConnectionResetError', 'DeprecationWarning', 'EOFError', 'Ellipsis', 'EnvironmentError', 'Exception', 'False', 'FileExistsError', 'FileNotFoundError', 'FloatingPointError', 'FutureWarning', 'GeneratorExit', 'IOError', 'ImportError', 'ImportWarning', 'IndentationError', 'IndexError', 'InterruptedError', 'IsADirectoryError', 'KeyError', 'KeyboardInterrupt', 'LookupError', 'MemoryError', 'NameError', 'None', 'NotADirectoryError', 'NotImplemented', 'NotImplementedError', 'OSError', 'OverflowError', 'PendingDeprecationWarning', 'PermissionError', 'ProcessLookupError', 'RecursionError', 'ReferenceError', 'ResourceWarning', 'RuntimeError', 'RuntimeWarning', 'StopAsyncIteration', 'StopIteration', 'SyntaxError', 'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError', 'TimeoutError', 'True', 'TypeError', 'UnboundLocalError', 'UnicodeDecodeError', 'UnicodeEncodeError', 'UnicodeError', 'UnicodeTranslateError', 'UnicodeWarning', 'UserWarning', 'ValueError', 'Warning', 'ZeroDivisionError', '__build_class__', '__debug__', '__doc__', '__import__', '__loader__', '__name__', '__package__', '__spec__', 'abs', 'all', 'any', 'ascii', 'bin', 'bool', 'bytearray', 'bytes', 'callable', 'chr', 'classmethod', 'compile', 'complex', 'copyright', 'credits', 'delattr', 'dict', 'dir', 'divmod', 'enumerate', 'eval', 'exec', 'exit', 'filter', 'float', 'format', 'frozenset', 'getattr', 'globals', 'hasattr', 'hash', 'help', 'hex', 'id', 'input', 'int', 'isinstance', 'issubclass', 'iter', 'len', 'license', 'list', 'locals', 'map', 'max', 'memoryview', 'min', 'next', 'object', 'oct', 'open', 'ord', 'pow', 'print', 'property', 'quit', 'range', 'repr', 'reversed', 'round', 'set', 'setattr', 'slice', 'sorted', 'staticmethod', 'str', 'sum', 'super', 'tuple', 'type', 'vars', 
'zip']
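The list above mixes exception classes with the functions and types we call every day. As a rough sketch (assuming Python 3, where the module can be imported under the name builtins), we can separate the two groups ourselves. The variable names are illustrative only.

```python
import builtins  # Python 3 name of the module behind __builtins__

exceptions = []
functions_and_types = []
for name in dir(builtins):
    if name.startswith("_"):
        continue  # skip names such as __doc__ and __import__
    obj = getattr(builtins, name)
    if isinstance(obj, type) and issubclass(obj, BaseException):
        exceptions.append(name)           # errors and warnings
    elif callable(obj):
        functions_and_types.append(name)  # print, len, list, and so on

print(len(exceptions), "exception and warning classes")
print(len(functions_and_types), "callable functions and types")
```

The exact counts depend on the Python version, so none are shown here.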
The dir() function is important because it lists the attributes and methods of the object we pass to it. For example,
>>> test1 = ["object1", "object2", "object3", "object4", "object5"]
>>> dir(test1)
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__delslice__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__setslice__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
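This listing comes from Python 2; Python 3 drops the slice methods (such as __getslice__) and adds others, including clear and copy. Methods are nothing more than actions we can take on that particular object, such as adding an item to a list, as we saw in Chapter 3: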
>>> test1.append("pippo")
>>> test1
['object1', 'object2', 'object3', 'object4', 'object5', 'pippo']
We can use the type() function, which shows the type of the object passed to it.
>>> type(test1)
<type 'list'>
# Python 2 output; Python 3 displays <class 'list'>

It is important to remember that when we pass an object (such as a list, tuple, dictionary, and so on) to the dir() function, we get a list of the attributes and methods available for that particular object.
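The output of dir() includes many names wrapped in double underscores, which we rarely call directly. A small sketch of how we might keep only the everyday methods, and check an object's type in code, follows. The variable public_methods is a name chosen for illustration.

```python
test1 = ["object1", "object2", "object3"]

# Keep only the names that do not start with an underscore.
public_methods = [name for name in dir(test1) if not name.startswith("_")]
print(public_methods)

# type() returns the class of the object; isinstance() tests it directly.
print(type(test1) is list)      # True
print(isinstance(test1, list))  # True
```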

When we work with packages and functions written by other data scientists, it is useful to obtain information about their functions and their parameters. Let’s see how to do this.

Obtain Function Information

Every function has its own parameters. To get information about a function and its parameters, type
>>> help(print)
Help on built-in function print in module builtins:
print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.
# help(print) works as written only in Python 3, where print is a function
This displays information about the function. If the help text opens in a pager, press q to quit. We can also get help regarding a particular method:
>>> help(test1.append)
Help on built-in function append:
append(...)
    L.append(object) -- append object to end
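The same information can be read programmatically. The sketch below, assuming Python 3, uses inspect.signature from the standard library and the __doc__ attribute. The function scale is hypothetical and exists only so that we have something to inspect.

```python
import inspect

# Hypothetical function used only as something to inspect.
def scale(value, factor=2):
    """Multiply value by factor (factor defaults to 2)."""
    return value * factor

signature = inspect.signature(scale)
parameter_names = list(signature.parameters)
print(signature)        # (value, factor=2)
print(parameter_names)  # ['value', 'factor']
print(scale.__doc__)    # the docstring that help(scale) would display
```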

You can find the built-in functions for Python 2.7 at https://docs.python.org/2/library/functions.html . You can find the built-in functions for Python 3 at https://docs.python.org/3/library/functions.html .

If you are using Jupyter, you can display the methods by pressing the Tab key. Press Shift+Tab to display the parameters of a function (Figures 4-1 and 4-2).
Figure 4-1. Methods in Jupyter 1

Figure 4-2. Parameters of a function in Jupyter 2

When using Spyder, the information in Figures 4-3 and 4-4 appears automatically.
Figure 4-3. Methods in Spyder 1

Figure 4-4. Parameters of a function in Spyder 1

Create Your Own Functions

In addition to using the built-in functions or importing functions from other libraries, we can also create our own. As mentioned, functions are pieces of code that tell Python how to do something. A function has three parts: name, parameters, and body (Figure 4-5). The statement that allows us to create a function is def:
>>> def goal_fun(x, y):
...     '''(x, y) -> result
...     here we write the documentation of the function, then
...     what the function performs
...     '''
...     return x + y
...
Figure 4-5. How to write a function
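To make the three parts concrete, here is a small annotated sketch. The function mean_value and its sample input are hypothetical; the comments mark the name, the parameter, and the body.

```python
def mean_value(values):          # name: mean_value, parameter: values
    """(list of numbers) -> float

    Return the arithmetic mean of the numbers in values.
    """
    total = sum(values)          # body: the operations performed
    return total / len(values)   # the result handed back to the caller

result = mean_value([2, 4, 9])
print(result)  # 5.0
```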

Let’s create a function that adds 5 to any value x:
>>> def sum1(x):
...     '''add 5 to x
...     '''
...     return x + 5
...
>>> sum1(10)
15
>>> sum1(130)
135
In this function, we entered one parameter, but we can enter more than one:
>>> def mult_xy(x, y):
...     '''multiply x and y
...     '''
...     return x * y
...
>>> mult_xy(5,6)
30
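When a function has several parameters, we can also give some of them default values and pass arguments by name. The variant below is a sketch, not part of the original example: the default y=2 is an assumption added purely for illustration.

```python
def mult_xy(x, y=2):
    """Multiply x and y; y defaults to 2 when it is not supplied."""
    return x * y

print(mult_xy(5, 6))      # 30: both arguments given by position
print(mult_xy(5))         # 10: y falls back to its default
print(mult_xy(y=3, x=4))  # 12: keyword arguments can come in any order
```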

To help us follow the steps one of our functions takes as it runs, we can use online visualization tools such as Python Tutor ( http://pythontutor.com/ ).

Save and Run Your Own Modules and Files

We’ve seen how to create .py scripts and put them in a work directory, which we can find by importing the os module and typing the following:
>>> import os
>>> os.getcwd()
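Once we know the work directory, we may also want to see which scripts it already contains. A brief sketch using only the os module follows; the output depends on the machine, so none is shown.

```python
import os

work_dir = os.getcwd()  # the current work directory

# Collect the names of the .py files found in that directory.
scripts = sorted(name for name in os.listdir(work_dir) if name.endswith(".py"))

print(work_dir)
print(scripts)
```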
We can create a script in any text editor, as long as we save it with the .py extension:
example_script.py
After the script is placed in the work directory, we run it by typing
# type on the computer terminal
$ python example_script.py
If the script is not in the work directory, we need to change the directory from the computer terminal:
# type on the terminal
$ cd directory_address
For instance,
# type on the terminal
$ cd ~/Downloads
After the directory has changed, proceed as we did earlier:
$ python example_script.py
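The chapter does not show what example_script.py contains, so the following is only a hypothetical sketch of such a file. It reuses the sum1 function from earlier and relies on the common if __name__ == "__main__" idiom, so that the code at the bottom runs when the script is launched from the terminal but not when the file is imported as a module.

```python
# example_script.py (hypothetical contents)

def sum1(x):
    """Add 5 to x."""
    return x + 5

def main():
    # Runs only when the file is executed as a script.
    print(sum1(10))

if __name__ == "__main__":
    main()
```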

The Python shell is convenient for testing on the fly, but for a complex script, it is always better to write it using an editor and then run it that way, or copy the script and run it in the Python shell.

Summary

Writing functions is a very important task for a data scientist. Languages such as R have a large number of packages and functions for every statistical need. With Python, however, we often need to write our own functions, as detailed in this chapter. In Chapter 5, we look at more tools that we can use to build a useful function.
