You have values in a datafile that are not in a format that is suitable for loading into a table. For example, values are in the wrong units, or two input fields must be combined and inserted into a single column.
LOAD
DATA
has the capability of performing limited preprocessing
of input values before inserting them. This enables you to map input
data onto more appropriate values before loading them into your
table.
Specifying Input Column Order shows how you can
specify a column list for LOAD
DATA
to indicate how input fields
correspond to table columns. The column list also can name
user-defined variables, such that for each input record, the input
fields are assigned to the variables. You can then perform
calculations with those variables before inserting the result into the
table. These calculations are specified in a SET
clause that
names one or more col_name
=
expr
assignments, separated by commas.
Suppose that you have a datafile that has the following columns, with the first line providing column labels:
Date Time Name Weight State 2006-09-01 12:00:00 Bill Wills 200 Nevada 2006-09-02 09:00:00 Jeff Deft 150 Oklahoma 2006-09-04 03:00:00 Bob Hobbs 225 Utah 2006-09-07 08:00:00 Hank Banks 175 Texas
Suppose also that the file must be loaded into a table that has these columns:
CREATE TABLE t ( dt DATETIME, last_name CHAR(10), first_name CHAR(10), weight_kg FLOAT, st_abbrev CHAR(2) );
There are several mismatches between the datafile fields and the table columns that must be addressed to be able to import the file:
The file contains separate date and time columns that must
be combined into date-and-time values for insertion into the
DATETIME
column.
The file contains a name field, which must be split into
separate first and last name values for insertion into the
first_name
and last_name
columns.
The file contains a weight in pounds, which must be
converted to kilograms for insertion into the weight_kg
column. (The conversion factor
is that 1 lb. equals .454
kg.)
The file contains state names, but the table contains
two-letter abbreviations. The name can be mapped to the
abbreviation by performing a lookup in the states
table.
To handle these conversions, assign each input column to a
user-defined variable, and write a SET
clause to perform the calculations.
(Remember to skip the first line that contains the column
labels.)
mysql>LOAD DATA LOCAL INFILE 'data.txt' INTO TABLE t
->IGNORE 1 LINES
->(@date,@time,@name,@weight_lb,@state)
->SET dt = CONCAT(@date,' ',@time),
->first_name = SUBSTRING_INDEX(@name,' ',1),
->last_name = SUBSTRING_INDEX(@name,' ',-1),
->weight_kg = @weight_lb * .454,
->st_abbrev = (SELECT abbrev FROM states WHERE name = @state);
As a result of this import operation, the table contains these rows:
mysql>SELECT * FROM t;
+---------------------+-----------+------------+-----------+-----------+
| dt | last_name | first_name | weight_kg | st_abbrev |
+---------------------+-----------+------------+-----------+-----------+
| 2006-09-01 12:00:00 | Wills | Bill | 90.8 | NV |
| 2006-09-02 09:00:00 | Deft | Jeff | 68.1 | OK |
| 2006-09-04 03:00:00 | Hobbs | Bob | 102.15 | UT |
| 2006-09-07 08:00:00 | Banks | Hank | 79.45 | TX |
+---------------------+-----------+------------+-----------+-----------+
LOAD
DATA
can perform data value reformatting, as
shown in this section. Other examples showing uses for this capability
occur later in the chapter. For example, Importing Non-ISO Date Values uses it to perform rewriting of
non-ISO dates to ISO format during data import. However, although
LOAD
DATA
can map input values to other values,
it cannot outright reject an input record that is found to contain
unsuitable values. To do that, you can either preprocess the input
file to remove these records or issue a DELETE
statement after loading the
file.
3.135.206.254