Importing a complete database

The Sqoop executable also provides the capability to import all tables from a relational database into HDFS. The typical syntax for this is as follows:

sqoop import-all-tables (generic-args) (import-args)

Compared with the previous import command, the only difference here is the use of Sqoop's import-all-tables tool. A sample command is shown here:

bin/sqoop import-all-tables --connect jdbc:postgresql://<DB_SERVER_ADDRESS>/<DB_NAME>?currentSchema=<SCHEMA> -m 1 --username <DB_USER_NAME> --password <DB_PASSWORD> --as-avrodatafile
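
If only a subset of the tables is required, the import-all-tables tool also accepts an --exclude-tables argument, which takes a comma-separated list of table names to skip. The following is a minimal sketch, assuming the same connection details as earlier; the table name audit_log is hypothetical and only stands in for a table you do not want copied to HDFS:

bin/sqoop import-all-tables --connect jdbc:postgresql://<DB_SERVER_ADDRESS>/<DB_NAME>?currentSchema=<SCHEMA> -m 1 --username <DB_USER_NAME> --password <DB_PASSWORD> --as-avrodatafile --exclude-tables audit_log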

To see this command in action, we need more than one table in the relational database. Hence, I would suggest creating an additional table of addresses related to customers:

CREATE TABLE public.address
(
    id integer NOT NULL,
    customer_id integer NOT NULL,
    street1 character varying COLLATE pg_catalog."default",
    street2 character varying COLLATE pg_catalog."default",
    city character varying COLLATE pg_catalog."default",
    state character varying COLLATE pg_catalog."default",
    country character varying COLLATE pg_catalog."default",
    zip_pin_postal_code character varying COLLATE pg_catalog."default",
    CONSTRAINT address_pkey PRIMARY KEY (id),
    CONSTRAINT customer_fkey FOREIGN KEY (customer_id)
        REFERENCES public.customer (id) MATCH SIMPLE
        ON UPDATE NO ACTION
        ON DELETE NO ACTION
)
WITH (
    OIDS = FALSE
)
TABLESPACE pg_default;

ALTER TABLE public.address
    OWNER to postgres;
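
To confirm that the table was created with the expected columns and constraints, you can describe it from the psql shell (a quick check, assuming psql is the client you are using):

\d public.address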

Once the table is created, the following statements can be used to load sample data:

INSERT INTO address (id, customer_id, street1, street2, city, state, country, zip_pin_postal_code)
VALUES (0, 0, 'd-40', 'chavez street', 'trivandrum', 'kerala', 'india', '778908');
INSERT INTO address (id, customer_id, street1, street2, city, state, country, zip_pin_postal_code)
VALUES (1, 1, 'l-90', 'cooper street', 'mumbai', 'maharashtra', 'india', '400056');
INSERT INTO address (id, customer_id, street1, street2, city, state, country, zip_pin_postal_code)
VALUES (2, 2, 'a-47', 'sector-11', 'noida', 'uttar pradesh', 'india', '201311');
INSERT INTO address (id, customer_id, street1, street2, city, state, country, zip_pin_postal_code)
VALUES (3, 3, 'r-98', 'sector-37', 'gurgaon', 'haryana', 'india', '122021');
INSERT INTO address (id, customer_id, street1, street2, city, state, country, zip_pin_postal_code)
VALUES (4, 4, '201', 'high street', 'austin', 'texas', 'us', '41101');
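
A quick join can verify that every address row points at a valid customer before we run the import. This is only a sketch; it assumes the customer table has the id column referenced by the foreign key:

SELECT c.id, a.city, a.country
FROM public.customer c
JOIN public.address a ON a.customer_id = c.id;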

If we want to see import-all-tables in action, we will have to remove the existing files in HDFS; otherwise, the command will fail with a message similar to: Output directory hdfs://localhost:9000/user/<username>/customer already exists.

To remove the contents from HDFS, run the following command in the shell:

hdfs dfs -rm -r /user/<username>/*
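
To confirm that the target directories are gone before re-running the import, a listing of the home directory should come back empty (assuming /user/<username> is your HDFS home directory):

hdfs dfs -ls /user/<username>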

After the import of all tables is complete, we can see that Sqoop iterated over each table in turn to retrieve all the information, as shown in the following screenshot:

Figure 17: Import all tables in a database into HDFS
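
Since we imported with --as-avrodatafile, the output files are Avro data files rather than plain text. One way to spot-check a record is with the Avro tools utility; this is a sketch, and the jar version and part-file name here are assumptions that will vary with your setup:

hdfs dfs -ls /user/<username>/address
hdfs dfs -get /user/<username>/address/part-m-00000.avro .
java -jar avro-tools-1.8.2.jar tojson part-m-00000.avro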