CACR Seminar
We are researching an architecture for a data-intensive computer, a system capable of performing computational tasks that require O(N log N) arithmetic operations on a data set of size N, where N is very large.
We have developed MPI-DB, a software library that gives scientific computing processes an easy-to-use abstraction layer for reading, writing, and performing general computations on large arrays stored in a database.
MPI-DB is a client-server framework that is being developed as a prototype of the operating system for the data-intensive computer, which consists of a computational front end, a fast network, and a database back end. MPI-DB is being used by the Johns Hopkins Turbulence research group to automatically create databases containing simulation results and to expose the stored results for subsequent analysis by researchers.
In this talk, we will describe MPI-DB and discuss the challenges in designing an architecture for a data-intensive computer.
E. Givelberg
Johns Hopkins University