Building function compositions for data processing

The data processing pipeline is a major part of any machine learning system. Before data is fed into a machine learning algorithm for training, we need to process it in various ways to make it suitable for that algorithm. A robust data processing pipeline goes a long way toward building an accurate and scalable machine learning system. Many basic processing functions are available, and a data processing pipeline usually consists of a combination of them. Instead of calling these functions in a nested fashion or inside loops, it's better to use the functional programming paradigm to build the combination. Let's take a look at how to combine these functions to form a reusable function composition. In this recipe, we will create three basic functions and look at how to compose a pipeline.

How to do it…

  1. Create a new Python file, and add the following lines:
    import numpy as np
    from functools import reduce  # built in on Python 2, must be imported on Python 3
  2. Let's define a function to add 3 to each element of the array:
    def add3(input_array):
        return map(lambda x: x+3, input_array)
  3. Let's define a second function to multiply each element of the array by 2:
    def mul2(input_array):
        return map(lambda x: x*2, input_array)
  4. Let's define a third function to subtract 5 from each element of the array:
    def sub5(input_array):
        return map(lambda x: x-5, input_array)
  5. Let's define a function composer that takes functions as input arguments and returns a composed function. The composed function applies all the input functions in sequence, starting with the last argument and ending with the first:
    def function_composer(*args):
        return reduce(lambda f, g: lambda x: f(g(x)), args)

    We use the reduce function to fold all the input functions into a single function. Each step wraps the function built so far around the next argument, so function_composer(f, g, h)(x) evaluates f(g(h(x))).

  6. We are now ready to play with this function composer. Let's define some data and a sequence of operations:
    if __name__=='__main__':
        arr = np.array([2,5,4,7])
    
        print "\nOperation: sub5(mul2(add3(arr)))"
  7. With the regular method, we would apply these functions one after another, as follows:
        arr1 = add3(arr)
        arr2 = mul2(arr1)
        arr3 = sub5(arr2)
        print "Output using the lengthy way:", arr3
  8. Let's use the function composer to achieve the same thing in a single line:
        func_composed = function_composer(sub5, mul2, add3)
        print "Output using function composition:", func_composed(arr) 
  9. The nested form can also be written in a single line, but the notation becomes deeply nested and hard to read. It is also not reusable: you have to write the whole expression again every time you need this sequence of operations. With the function composer, even a longer chain stays flat:
        print "\nOperation: mul2(sub5(mul2(add3(sub5(arr)))))\nOutput:", \
                function_composer(mul2, sub5, mul2, add3, sub5)(arr)
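  10. If you run this code, the Terminal shows the same values, 5, 11, 9, and 15, for both the lengthy way and the function composition, and -10, 2, -2, and 10 for the longer chain in the last step.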
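The order in which the composer applies its arguments is easy to get backwards, so the following is a minimal Python 3 sketch of the same idea (the recipe's listing uses Python 2 print statements). It assumes plain lists instead of NumPy arrays to stay self-contained. The helper name pipeline is hypothetical and not part of the recipe; it is a left-to-right variant added only for comparison.

```python
from functools import reduce  # reduce is not a builtin in Python 3


def add3(values):
    # list(...) is needed in Python 3, where map returns a lazy iterator
    return list(map(lambda x: x + 3, values))


def mul2(values):
    return list(map(lambda x: x * 2, values))


def sub5(values):
    return list(map(lambda x: x - 5, values))


def function_composer(*funcs):
    # Right-to-left: function_composer(f, g, h)(x) == f(g(h(x)))
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)


def pipeline(*funcs):
    # Hypothetical left-to-right variant: pipeline(f, g, h)(x) == h(g(f(x)))
    return reduce(lambda f, g: lambda x: g(f(x)), funcs)


data = [2, 5, 4, 7]
composed = function_composer(sub5, mul2, add3)  # add3 runs first, sub5 last
piped = pipeline(add3, mul2, sub5)              # same steps, listed in run order

print(composed(data))
print(piped(data))
```

Both calls print [5, 11, 9, 15]. Either function can be stored and reused on other inputs. Which argument order reads better is a matter of taste: the right-to-left form mirrors mathematical notation, while the left-to-right form lists the steps in the order they run.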