Special note for Windows installation

Spark (really Hadoop) needs a temporary storage location for its working set of data. Under Windows this defaults to the \tmp\hive directory. If the directory does not exist when Spark/Hadoop starts, it will be created. Unfortunately, under Windows, the installation does not include the tools needed to set the access privileges on that directory correctly.

In principle, you should be able to run chmod via winutils to set the access privileges on the hive directory. In practice, however, I have found that the chmod function does not work correctly.
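For reference, the winutils invocation typically looks like the following (the `HADOOP_HOME` location and the exact scratch path are assumptions based on a typical install; run it from an Administrator command prompt):

```shell
REM Set world-writable permissions on the Hadoop scratch directory.
REM %HADOOP_HOME% is assumed to point at the directory containing bin\winutils.exe.
%HADOOP_HOME%\bin\winutils.exe chmod -R 777 \tmp\hive
```

As noted above, this often fails to take effect, which is why the manual approach described next is more reliable.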

A better approach is to create the \tmp\hive directory yourself in admin mode, and then, again in admin mode, grant full privileges on the directory to all users.
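On the command line, that manual setup might look like the following sketch (the drive, the `Everyone` group name, and the recursive flags are assumptions; run from an Administrator command prompt):

```shell
REM Create the scratch directory Spark/Hadoop expects.
mkdir \tmp\hive

REM Grant full control (F) to all users, applying the grant
REM recursively (/T) to anything already inside the directory.
icacls \tmp\hive /grant Everyone:F /T
```

Equivalently, you can create the folder in File Explorer and grant Full Control to Everyone on the Security tab of the folder's Properties dialog.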

Without this change, Hadoop fails right away. When you start pyspark, the output (including any errors) is displayed in the command-line window; one of the errors will report insufficient access to this directory.
