Monday, March 12, 2012

Retrieve table names from Oracle queries




    Sometimes, you might have a requirement where you have a file full of Oracle sql queries, and you want to filter out the table names alone. In this article, we are going to look into a Perl script which is for this very purpose.  Given a file with sql queries which could be comprising of SELECT, UPDATE, INSERT or DELETE statements, the script will parse the sql query file and list out the table names alone.


Test data:
  The following is the test file which contains many sql queries in it.
$ cat test.sql

SELECT ENAME, SAL FROM EMP;

DELETE FROM
   BILL_PLAN
   WHERE PLAN='A';

SELECT ENAME, SAL
  FROM EMP1, EMP2;

SELECT ENAME, SAL
  FROM EMP3 WHERE SAL=30;

SELECT ENAME, SAL FROM EMP4
  WHERE SAL=30;

SELECT ENAME, SAL FROM EMP3 E3,
                       EMP4 E4
                       WHERE E3.EMPNO=E4.MGR AND E3.ENAME EXISTS (
                       SELECT 1 FROM MGR
                       WHERE MGR.SUBORD = E3.ENAME);

SELECT SUM(SALES) FROM STORE_INFORMATION
WHERE STORE_NAME IN
(SELECT STORE_NAME FROM GEOGRAPHY
WHERE REGION_NAME = 'WEST');

DELETE FROM BILL;

UPDATE XYZ SET A=10;

INSERT
INTO INSTB VALUES('A','B','C');

The Perl script below is the one which is going to separate the tables names from the above test data.
Script:
$ cat GetTable.pl
#!/usr/bin/perl
use warnings;

#Function which gets the table names and formats and prints them.

sub printTable {
        my $tab=shift;
        $tab=~s/,\s+/,/g;
        $tab=~s/\s+,/,/g;
        my @out=split(/,/,$tab);
        foreach (@out){
                $_ =~s/ .*//;
                print $opr,$_,"\n";
        }
}

#Function which gets the indivdual queries and separtes the table
#names from the queries. Sub-Queries, co-related queries, etc..
# will also be handled.

sub process {
        local $opr;
        my $line=shift;
        $line=~s/\n/ /g;
        if ($line =~ m/^\s*(select|delete)/i){
                if ($line =~ m/^\s*select/i){
                        $opr="SELECT: ";
                }else
                {
                        $opr="DELETE: ";
                }
                if ($line =~ m/from.*where/i){
                        while($line =~ m/from\s+(.*?)where/ig){
                                &printTable($1);
                        }
                }elsif ($line =~ m/from.*;/i){
                        while($line =~ m/from\s+(.*);/ig){
                                &printTable($1);
                        }
                }
        }elsif ($line =~m/^\s*update\s+(\w+)\s+/i){
                $opr="UPDATE: ";
                &printTable($1);
        }elsif ($line =~m/^\s*insert\s+into\s+(\w+)\s+/i){
                $opr="INSERT: ";
                &printTable($1);
        }
}

#The main function which reads the files and reads the
#query into a variable and sends it to process function.


if (@ARGV !=1){
        print "Usage: GetTable <sql query file>\n";
        exit 1;
}

open QFILE, $ARGV[0] or die "File $ARGV[0]: $! \n";
my $flag=0;
my $query="";
my $conds="select|insert|update|delete";

while(<QFILE>){
        next if (/^$/);
        if ($flag==1){
                $query.=$_;
                if (/;\s*$/){
                        $flag=0;
                        &process($query);
                }
        }elsif (/^\s*($conds).*;\s*/i){
                &process($_);
        }elsif (/^\s*($conds)/i){
                $flag=1;
                $query=$_;
        }
}
close QFILE;
On running the above Perl script, the output we get is as shown below. As shown, the output contains, along with the table names, also the mode of query in which it is being used.:
$ ./GetTable.pl test.sql
SELECT: EMP
DELETE: BILL_PLAN
SELECT: EMP1
SELECT: EMP2
SELECT: EMP3
SELECT: EMP4
SELECT: EMP3
SELECT: EMP4
SELECT: MGR
SELECT: STORE_INFORMATION
SELECT: GEOGRAPHY
DELETE: BILL
UPDATE: XYZ
INSERT: INSTB
If the user wishes to have only the table names alone without the mode of queries:
$ ./GetTable.pl test.sql  | awk '{print $2}'
EMP
BILL_PLAN
EMP1
EMP2
EMP3
EMP4
EMP3
EMP4
MGR
STORE_INFORMATION
GEOGRAPHY
BILL
XYZ
INSTB

5 comments:

  1. Hi, I have no knowledge on perl, and I tried to do the same steps that you noted here. Created a test.sql file and GetTable.pl.
    I get the following error when I run this command :
    ./GetTable.pl test.sql | awk '{print $2}'

    Error:
    -bash: ./GetTable.pl: /usr/bin/perl: bad interpreter: Permission denied

    Is there anything I need to change according to my environment in the script logic?

    Please help me resolve this issue.

    Also, my requirement ultimately is to get the tablenames out of bunch of NETEZZA Tables.

    Thanks!

    ReplyDelete
    Replies
    1. In place of /usr/bin/perl , put the path where Perl is actually installed in your system. You can find it using the command: which perl

      Delete
  2. Hi,
    this works only for sql in upper case can you also make it work for both and also merge statements

    ReplyDelete
  3. Hi am new to perl. can u explain how to run above pl or sql file in perl window

    ReplyDelete
    Replies
    1. You call sql from Perl like a system command

      Delete