252 | Big Data Simplified
    comm.Send([data, MPI.INT], dest=1, tag=77)
elif rank == 1:
    data = numpy.empty(1000, dtype='i')
    comm.Recv([data, MPI.INT], source=0, tag=77)
Did You Know?
The pickle library in Python is used to serialize and deserialize Python object structures
(for example, list, dict, etc.) so that they can be saved on disk. In essence, pickling converts
an object to a byte stream containing all the information needed to reconstruct the object.
Below is a brief code snippet showing how to work with the Python pickle library.
import pickle
import numpy as np

def savedata():
    # Having data ready to be pickled ...
    data = np.arange(15, dtype=int)
    data.resize((5, 3))
    # 'ab' appends in binary mode; use 'wb' to overwrite an existing file
    pklfile = open('SavePkl', 'ab')
    # Save data from array object to pickle file ...
    pickle.dump(data, pklfile)
    pklfile.close()

def readdata():
    # Read from the pickle file ...
    pklfile = open('SavePkl', 'rb')
    data = pickle.load(pklfile)
    print(data)
    pklfile.close()

# Run the functions to save data in an object to a file and then
# retrieve and print it...
>>> savedata()
>>> readdata()
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]]
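Pickling does not have to go through a file. As a minimal sketch, the snippet below uses pickle.dumps() and pickle.loads() from the standard library to round-trip an object through an in-memory byte string, which is the form a serialized object takes when it is passed between processes. The record dictionary and its contents are hypothetical sample data, not taken from the text.

```python
import pickle

# Hypothetical sample object; any picklable Python object would do.
record = {"name": "sensor-1", "readings": [1.5, 2.5, 3.5], "active": True}

# dumps() serializes to an in-memory bytes object instead of a file.
payload = pickle.dumps(record)

# loads() rebuilds an equal, but distinct, object from those bytes.
restored = pickle.loads(payload)

print(type(payload).__name__)  # bytes
print(restored == record)      # True
print(restored is record)      # False
```

The restored object compares equal to the original but is a separate copy, so later changes to one do not affect the other.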
9.4 PYTHON-MAPREDUCE USING HADOOP STREAMING
9.4.1 What is Hadoop Streaming?
‘Hadoop Streaming’ is a utility that comes with the Hadoop distribution package as a
specific library. Hadoop Streaming can be performed using different languages, like Python, Java,