I'm trying to import data from Oracle database to HDFS by using sqoop 1.4.6. I have no problem doing simple imports to HDFS or hive tables by using --table table_name or --query SELECT <...> statements.
However, my question: is there a way to import the table with executing a certain PL/SQL procedure first?
For example, assume there exists a procedure set_date(date IN DATE), specified in the Oracle database that is used to set the report date for the data. With usual JDBC tools (i.e., ROracle package if working from R) two queries are required for the same connection to pull the data:
1) BEGIN set_date(#some_date#); END;
2) SELECT * FROM table_name;
Is it possible to achieve the same result using sqoop? I've tried mixing the two queries into one by the following approach:
sqoop import --connect jdbc:oracle:thin@LINK:PORT:SID \
--username user -P \
--target-dir /some/directory \
--query "BEGIN set_date(#some_date#); END; SELECT * FROM table_name"
But apparently the whole query is assumed to be in PL/SQL format, therefore the 2) part of the query is not formulated as expected. Also, I've tried eval'ing only the set_date part of the query, which seems to work fine, so the only problem is to correctly execute the 2) SQL query.
Is it possible to call a second query using the same sqoop connection? Is it possible to mix 1) and 2) statements into one in order to get the same result (for now a simple select table would be enough) using sqoop?
Note: I don't have much experience working with SQL and PL/SQL, so sorry if I didn't specify some parts clearly enough.