The MapReduce programming model lets developers with no experience in parallel and distributed
systems use the resources of a large, multi-CPU system. The Oracle RDBMS has supported the MapReduce paradigm for years through SQL analytics, user-defined pipelined table functions, and aggregation objects. Apache Hadoop is an open-source implementation of the MapReduce model.

In this session, we describe a prototype of an Oracle in-database Hadoop implementation that lets you
write and execute Hadoop-compatible Java applications directly in the database.
The major advantages of our implementation include:
(1) source compatibility with Hadoop,
(2) minimal dependency on the Apache Hadoop infrastructure,
(3) seamless integration of MapReduce functionality in Oracle SQL, and
(4) better parallelism and efficiency, because data is pipelined through table functions with no intermediate materialization.
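
For readers new to the model, the map, group, and reduce phases can be sketched in plain Java as a word count. This is only an illustration under stated assumptions: the class `MapReduceSketch` and its methods are hypothetical names, the code uses no Hadoop or Oracle classes, and it does not represent the prototype's API. It runs on a single thread and groups values in memory, so it shows the programming model rather than the parallel, pipelined execution described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

// Hypothetical, single-JVM illustration of the MapReduce model (not Hadoop's API).
public class MapReduceSketch {

    /** Map step: emit a (word, 1) pair for every word in one input line. */
    static void map(String line, BiConsumer<String, Integer> emit) {
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                emit.accept(word, 1);
            }
        }
    }

    /** Reduce step: sum all values that were emitted for one key. */
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    /** Driver: run map over every line, group emitted values by key, then reduce each group. */
    static Map<String, Integer> wordCount(List<String> lines) {
        // Grouping ("shuffle") phase: collect the values emitted for each key.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            map(line, (k, v) -> grouped.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
        }
        // Reduce phase: one call per distinct key.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("map reduce map", "reduce in the database");
        // Prints {database=1, in=1, map=2, reduce=2, the=1}
        System.out.println(wordCount(lines));
    }
}
```

The developer writes only `map` and `reduce`. Partitioning the input, grouping by key, and scheduling the work in parallel are left to the framework, which is what makes the model approachable without a background in distributed systems.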

