7. Conclusions and future work
In this paper, we have defined a programming model for material cloud applications that supports multiple different Map and Reduce functions running in parallel. MaMR uses a hybrid shared-memory BSP model that can make full use of the data nodes in a cloud computing system. Based on the BSP model, we have designed an optimized data-sharing strategy to support the data shared by Map and Reduce. We also provide multiple copies of the output to reduce the shuffle overhead. Finally, we add a new Merge phase to Map-Reduce that can efficiently merge data already partitioned and sorted (or hashed) by the map and reduce modules. In future work, we will explore this method further to improve parallel efficiency. The MaMR model still needs to be tested and verified on larger cloud computing systems, and we expect the advantages of the programming model to become more pronounced as more material data become available.
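As an illustration of why merging already-sorted outputs is cheap, the following minimal sketch combines key-sorted outputs in a single linear pass. It is not the MaMR implementation: the function name merge_sorted_outputs, the (key, value) record layout, and the sample records are assumptions made only for this example.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_sorted_outputs(*reduce_outputs):
    """Merge key-sorted lists of (key, value) pairs into (key, [values]) groups.

    Hypothetical helper: each input must already be sorted by key, as the
    outputs of the map and reduce modules are assumed to be.
    """
    # heapq.merge streams the inputs lazily and never re-sorts them.
    merged = heapq.merge(*reduce_outputs, key=itemgetter(0))
    # Equal keys are adjacent in the merged stream, so one groupby pass suffices.
    return [(key, [value for _, value in group])
            for key, group in groupby(merged, key=itemgetter(0))]


# Hypothetical outputs of two different Reduce functions, sorted by key.
output_a = [("Al", "a_Al"), ("Cu", "a_Cu"), ("Fe", "a_Fe")]
output_b = [("Cu", "b_Cu"), ("Fe", "b_Fe"), ("Ti", "b_Ti")]

for key, values in merge_sorted_outputs(output_a, output_b):
    print(key, values)
```

Because each input is consumed in order, the cost grows linearly with the total number of records, plus a logarithmic factor in the number of inputs being merged.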