Thursday, December 23, 2010

How to prepare Oracle insert queries for data present in flat file



    In this article, we are going to see how to load text data present in a file into an Oracle table. Data can be loaded into an Oracle table from a flat file in  two ways:

1. Using SQL Loader.
2. By Preparing Insert queries.

 Let us discuss about the second method of preparing insert queries:

 Assume we have an EMP table in the Oracle database with the following fields:
EMP_ID          NUMBER(4)
EMP_NAME        VARCHAR2(20)
EMP_SAL         NUMBER(5)

1. Let us consider a data file, emp.dat, with the following content:
234, Sada,1000
235, Nilesh,2000
236, Ravi,3000
  Now, we want to prepare the insert queries for the above data.

   The following awk one liner will do the needful for us:
#awk '{print "INSERT INTO EMP VALUES(" $1 c, q $2 q, c $3, ");";}' q="'" c="," FS=, emp.dat
INSERT INTO EMP VALUES(234, 'Sada', 1000);
INSERT INTO EMP VALUES(235,'Nilesh', 2000);
INSERT INTO EMP VALUES(236,'Ravi', 3000);
#
  The above code is self explanatory. The variable q and c is used to put the single quote and comma at appropriate places. The above output can be re-directed to a file and run in any sqlplus session to insert data into the table.

2. Let us consider another file, emp.dat, with the following content:
234, Sada, 1000
235, Nilesh, 2000
236, Ravi, 
  Please note for the employee Ravi the salary is not present. The above awk command will leave it blank in the INSERT statement as well which will fail when run in the Oracle. And hence the salary part for the employee Ravi should be set to NULL while preparing the query.

The following awk one liner will do this:
#awk '{print "INSERT INTO EMP VALUES(" $1 c, q $2 q, c, ($3)?$3:"NULL" , ");";}' q="'" c="," FS=, emp.dat
INSERT INTO EMP VALUES(234, 'Sada', 1000);
INSERT INTO EMP VALUES(235,'Nilesh', 2000);
INSERT INTO EMP VALUES(236,'Ravi', NULL);
#
  The ternary operator is used to check the value of the 3rd field in the above.

3. Let us consider another input file with a different format:
 Amol, 1000
 Timone,2000
 Patil, 3000
   The employee number is not present in the file. The requirement is to generate the insert queries with sequential order of employee ids.

The following awk one liner will generate the needful:
#awk '{print "INSERT INTO EMP VALUES(" ++i,  c, q $1 q, c $2, ");";}' q="'" c="," FS=, emp.dat
INSERT INTO EMP VALUES(1, 'Amol', 1000);
INSERT INTO EMP VALUES(2,'Timone', 2000);
INSERT INTO EMP VALUES(3,'Patil', 3000);
#
4. Along with the above output, if you want to have the line 'COMMIT' at the end:
#awk '{print "INSERT INTO EMP VALUES(" ++i,  c, q $1 q, c $2, ");";}END{print "COMMIT;"}' q="'" c="," FS=, emp.dat
INSERT INTO EMP VALUES(1, 'Amol', 1000);
INSERT INTO EMP VALUES(2,'Timone', 2000);
INSERT INTO EMP VALUES(3,'Patil', 3000);
COMMIT;
#
  The END statement is being used to get the COMMIT line.

In one of our future articles, we will see how to achieve the same using the sql loader.

1 comment:

  1. Interesting, but not needed specifically as the functionality to read data from a comma delimited record set is currently available via Oracle's external table facility as well as other options such as SQL*Loader, utl_file package, etc..

    ReplyDelete