JDBC - Julia interface to Java JDBC database drivers
This package enables the use of Java JDBC drivers to access databases from within Julia. It uses the JavaCall.jl package to call into Java in order to use the JDBC drivers.
The API provided by this package consists essentially of two components: a "direct" (i.e. minimally wrapped) interface directly to Java JDBC and a minimal
Julian interface with support for Tables.jl.
This package currently supports only Julia v0.6 and later.
Initialisation and Destruction
To start it up, add the database driver jar file to the classpath, and then initialise the JVM.
using JDBC
JDBC.usedriver("/home/me/derby/derby.jar")
JDBC.init() # or JavaCall.init()
The JVM remains in memory unless you explicitly destroy it. This can be done with
JDBC.destroy() # or JavaCall.destroy()
Low-Level Java Interface
As described above, this package provides functionality very similar to using a JDBC driver in Java. This allows you to write code very similar to how it would
look in Java.
conn = DriverManager.getConnection("jdbc:derby:test/juliatest")
stmt =createStatement(conn)
rs =executeQuery(stmt, "select * from firsttable")
for r in rs
println(getInt(r, 1), getString(r,"NAME"))
end
In JDBC, accessing the data from a SQL call is done by iterating over a ResultSet instance. In Julia therefore, the ResultSet is a regular Julia iterator, and can be iterated in the usual fashion.
To get each row as a Julia tuple, iterate over the result set using JDBCRowIterator. Values in the tuple will be of Nullable type if they are declared to be nullable in the database.
for r inJDBCRowIterator(rs)
println(r)
end
The following accessor functions are defined. Each of these functions take two arguments: the Resultset, and either a field index or a field name. The result of these accessor functions is always a pure Julia object. All conversions from Java types are done before they are returned from these functions.
While inserts and updates can be done via a fully specified SQL string using the Statement instance above, it is much safer to do so via a PreparedStatement. A PreparedStatement has setter functions defined for different types, corresponding to the getter functions shown above.
Similary, a CallableStatement can be used to run stored procedures. A CallableStatement can have both input and output parameters, and thus has both getter and setter functions defined.
Note that as per the JDBC API there are two kinds of execute methods defined on a Statement : executeQuery returns a ResultSet (usually from a select), and executeUpdate returns an Integer which denotes the number of rows effected by a query (usually an update or insert or a DDL). For PreparedStatements and CallableStatements, an additional function execute is defined which returns a boolean which specifies whether a ResultSet has been returned from the query.
Also note that for a Statement, the query itself is specified in the corresponding execute.. call, while for a PreparedStatement and a CallableStatement, the query itself is specified while creating them.
The connections and the statements should be closed via their close(...) functions. commit(connection), rollback(connection) and setAutoCommit(true|false) do the obvious things.
Metadata (Java Interface)
Pass the JResultSet object from executeQuery to getTableMetaData to get an array of (column_name, column_type) tuples.
This package also provides a more Julian interface for interacting with JDBC. This involves creating JDBC.Connection and JDBC.Cursor objects to which query
strings can be passed
cnxn = JDBC.Connection("jdbc:derby:test/juliatest") # create connection
csr =cursor(cnxn) # create cursor from connection# if you don't need access to the connection you can create the cursor directly
csr =cursor("jdbc:derby:test/juliatest")
# execute some SQLexecute!(csr, "insert into pi_table (pi_value) values (3.14);")
execute!(csr, "select * from my_table;")
# to iterate over rowsfor row ∈rows(csr)
# do stuff with rowendclose(csr) # closes Connection, can be called on Connection or Cursor
Tables Interface and Creating DataFrames
JDBC includes a Tables interface. A Tables
Source object can be created from a JDBC.Cursor or a JDBCRowIterator simply by doing
e.g. JDBC.Source(csr). It can be useful for retrieving metadata with Tables.schema.
This is also useful for loading data from a database into another object that implements the Tables interface. For this we
provide also the convenient JDBC.load function.
For example, you can do
src = JDBC.Source(csr) # create a Source from a JDBC.Cursor# here we load into a DataFrame, but can be any Data.Sink
df = JDBC.load(DataFrame, src)
# you can also load from the cursor directly
df = JDBC.load(DataFrame, csr)
Note that in the above we are assuming that a query was already executed.
Absolute Quickest Way to Get DataBase Data into DataFrame
cnxn_str ="jdbc:derby:test/juliatest"# for example
df = JDBC.load(DataFrame, cursor(cnxn_str), "select * from sometable")
Note again that this works not only for DataFrame but any Data.Sink.
There are a few more JDBC.load methods we haven't listed here, see methods(JDBC.load).
Caveats
BLOB's are not yet supported.
While a large part of the JDBC API has been wrapped, not everything is. Please file an issue if you find anything missing that you need. However, it is very easy to call the Java method directly using JavaCall. Please look at the JDBC.jl source for inspiration if you need to do that.
Both Julia DateTime and Java java.sql.Date do not store any timezone information within them. I believe we are doing the right thing here, and everything should be consistent. However timezone is easy to get wrong, so please double check if your application depends on accurate times.
There are many many different JDBC drivers in Java. This package needs testing with a wide variety of those.
请发表评论