Other Big Data Tools and Technologies | 189
Short-answer Type Questions (5 Marks Questions)
1. What is Sqoop in Hadoop? Please
explain its usage.
2. What are the components used in the Hive
query processor?
3. What is Bucket in Hive?
4. For each Sqoop copy into HDFS, how
many MapReduce jobs and tasks will be
submitted? Please explain.
5. I have around 500 tables in a database.
I want to import all the tables from
the database except the tables named Table
498, Table 323 and Table 199. How can
we do this without having to import the
tables one by one?
6. Explain the significance of using the split-by
clause in Apache Sqoop.
7. I want to see the present working directory
in UNIX from Hive. Is it possible to run
this command from Hive?
8. What is the use of explode in Hive?
9. Is it possible to change the default location
of managed tables in Hive? If so, how?
10. Why do we need Hive?
Long-answer Type Questions (10 Marks Questions)
1. What is partitioning? When may we need
to customize the default partition? Please
explain the scenario with an example.
2. If you run a select * query in Hive, why does
it not run MapReduce? Please explain it.
3. What is the difference between an external
table and a managed table?
4. Why do we perform partitioning in Hive?
Please explain the advantage of it.
5. Suppose we create a table that contains
details of all the transactions done by
customers in the year 2018:
CREATE TABLE customer_transaction_details (
cust_id INT, amount FLOAT, month STRING,
country STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
Now, after inserting 50,000 tuples into this
table, we want to know the total revenue
generated for each month. The problem
is that Hive is taking too much time in
processing this query. How will you solve this
problem? List the steps that you would
take in order to do so.
6. Explain the data flow in Hive with a diagram.
Please describe each and every step.
7. What is the usage of the Metastore in Hive? If
the Metastore is not present in Hive, then what
will be the problem?
8. How will you update the rows that are
already exported? Write the Sqoop command
to show all the databases in a MySQL server.
9. I am getting a connection failure exception
while connecting to MySQL through
Sqoop. What is the root cause and fix for
this error scenario?
10. How do you create a table in MySQL and
insert values into the table? Please
import this table into Hive/HDFS using
Apache Sqoop.
11. Please explain how Apache Flume works.
Also describe the flow for extracting
a log file from a source path and
ingesting it into HDFS using Flume.
190 | Big Data Simplified
12. What is Oozie used for, and what are
its main components? Please explain.
13. Please explain the main functionality
of ZooKeeper within the different
components of the Hadoop ecosystem. What
will be the main problem that we face if
ZooKeeper is not present in the cluster?
14. Explain the core components of Apache
Flume. What are an Agent and a Channel?
15. Please explain the benefits of
ZooKeeper. Also explain the ZooKeeper
CLI.