MapReduce-MPI Library

A grain of wisdom is worth an ounce of knowledge, which is worth a ton of data. -- Neil Larson
It is a capital mistake to theorize before one has data. -- Arthur Conan Doyle

This is the home page for the MapReduce-MPI (MR-MPI) library, an open-source implementation of MapReduce written for distributed-memory parallel machines on top of standard MPI message passing. The current version of MR-MPI is dated 13 April 2009.

Features | Documentation | Download
New features & bug fixes | Publications | Open source

MapReduce is a programming paradigm, popularized by Google, that is widely used for processing large data sets in parallel. Its salient feature is that if a task can be formulated as a MapReduce, the user can perform it in parallel without writing any parallel code. Instead, the user writes serial functions (maps and reduces) that operate independently on portions of the data set. The data movement and other necessary parallel operations are performed in an application-independent fashion, in this case by the MR-MPI library.
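As a rough illustration of that division of labor, here is a minimal word-count sketch in plain Python. It does not use the MR-MPI API. The names `map_words`, `reduce_count`, and `run_mapreduce` are hypothetical, and the driver runs serially in one process purely to show which parts a user would write (the map and the reduce) and which part a MapReduce library would normally handle (grouping values by key and, in a real library, moving data between processors).

```python
from collections import defaultdict


def map_words(chunk, emit):
    """User-written serial map: emit a (word, 1) pair for each word in one chunk."""
    for word in chunk.split():
        emit(word.lower(), 1)


def reduce_count(key, values, emit):
    """User-written serial reduce: sum all values collected for one key."""
    emit(key, sum(values))


def run_mapreduce(chunks, map_fn, reduce_fn):
    """Hypothetical stand-in for the library (not MR-MPI itself).

    A real implementation would spread the chunks over processors and
    move key/value pairs between them; here everything runs serially.
    """
    grouped = defaultdict(list)

    def collect(key, value):
        grouped[key].append(value)

    # Map phase: each chunk is processed independently of the others.
    for chunk in chunks:
        map_fn(chunk, collect)

    results = {}

    def store(key, value):
        results[key] = value

    # Reduce phase: each key is processed independently, with all its values.
    for key, values in grouped.items():
        reduce_fn(key, values, store)
    return results


chunks = ["a ton of data", "an ounce of knowledge", "a grain of wisdom"]
counts = run_mapreduce(chunks, map_words, reduce_count)
```

Neither `map_words` nor `reduce_count` contains any parallel logic. Each sees only one chunk or one key at a time, which is what allows a library to distribute the calls across processors without changes to the user's code.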

The MR-MPI library was developed at Sandia National Laboratories, a US Department of Energy facility, for use on informatics problems. It is open-source code, distributed freely under the terms of the modified Berkeley Software Distribution (BSD) License. See this page for more details.

The authors of the library are Steve Plimpton and Karen Devine, who can be contacted at sjplimp at sandia.gov and kddevin at sandia.gov.

These are other software packages that perform MapReduce operations:


Recent MR-MPI News