6. Conclusion and future work
In this paper, we analyzed the impact of virtual network topologies on the performance of cloud-based big data applications, studied the detailed procedures of cloud-based MapReduce operations in multi-host virtual networks built using OpenStack, formulated the data transmission and data processing overheads of MapReduce workflows, and proposed the TOMON mechanism to optimize virtual network topologies. TOMON balances the data transmission latency against the data processing rate of a cloud-based MapReduce cluster, and thereby improves the performance of cloud-based big data applications compared with other greedy deployment policies. Our work takes a first step towards an optimal deployment mechanism for multi-host virtual networks based on OpenStack Neutron.
In future work, we plan to improve TOMON for the optimal deployment of virtual networks on large-scale physical data centers. First, whereas the evaluations in this paper were carried out on centralized physical data centers, we will evaluate TOMON on large-scale data centers with hierarchical architectures. Second, by providing additional performance evaluation mechanisms, we will try to determine the largest physical data center that an OpenStack multi-host virtual network can support. When the physical data center is large enough, the data transmission latency between two physical servers becomes long, and the virtual network topology may no longer be the dominant performance factor. Finally, based on these experimental results, we will revise TOMON to accommodate large-scale data centers with more complex physical architectures. We also intend to explore the properties of heterogeneous virtual MapReduce clusters and to introduce additional metrics into TOMON to improve its topology optimization.