9 CONCLUSIONS
Likely system invariants
can be mined for a variety of service computing systems, including cloud systems, web service infrastructures, datacenters, enterprise systems, IT services and utility computing systems, network services, and distributed systems. They represent operational abstractions of normal system dynamics. Identifying and analyzing their violations supports a range of operational activities, such as runtime anomaly detection, post-mortem troubleshooting, and capacity planning. In this work we used two real-world datasets (the publicly available Google datacenter dataset and a dataset from a commercial SaaS utility computing platform) to assess and compare three techniques for invariant mining. The analysis and comparison were based on three common metrics: coverage, recall, and precision. The results provide insights into the advantages and limitations of each technique, together with practical suggestions for practitioners on how to configure the mining algorithms and how to select the number of invariants. The high-level findings are the following. A relatively small number of invariants is sufficient to reach relatively high coverage, i.e., they characterize the majority of executions. A small increase in the coverage of correct executions may produce a significant drop in recall and precision. The techniques exhibit similar precision, but the supervised decision list technique outperforms the unsupervised ones in recall. Finally, we presented a general heuristic for selecting a set of likely invariants from a dataset. Taken together, these results aim to fill the gap between past scientific studies and the concrete usage of likely system invariants by operations engineers.
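As a reminder of how the three metrics relate to one another, the following minimal sketch computes them for a toy set of labeled executions. It is an illustration under stated assumptions, not the evaluation code used in this work: we assume that coverage is the fraction of correct executions that violate no invariant, recall is the fraction of anomalous executions that violate at least one invariant, and precision is the fraction of flagged executions that are truly anomalous. The invariants, attribute names (`cpu`, `mem`), thresholds, and data are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# An execution is a record of numeric attributes (attribute names are hypothetical).
Execution = Dict[str, float]
# An invariant returns True when it holds for the given execution.
Invariant = Callable[[Execution], bool]


def violates_any(execution: Execution, invariants: Sequence[Invariant]) -> bool:
    """Return True if the execution violates at least one invariant."""
    return any(not inv(execution) for inv in invariants)


def evaluate(
    invariants: Sequence[Invariant],
    executions: Sequence[Execution],
    is_anomalous: Sequence[bool],
) -> Tuple[float, float, float]:
    """Compute (coverage, recall, precision) under the assumed definitions."""
    flagged: List[bool] = [violates_any(e, invariants) for e in executions]
    correct_total = sum(1 for a in is_anomalous if not a)
    anomalous_total = sum(1 for a in is_anomalous if a)
    true_pos = sum(1 for f, a in zip(flagged, is_anomalous) if f and a)
    false_pos = sum(1 for f, a in zip(flagged, is_anomalous) if f and not a)

    # Coverage: share of correct executions for which every invariant holds.
    coverage = (correct_total - false_pos) / correct_total if correct_total else 0.0
    # Recall: share of anomalous executions that are flagged.
    recall = true_pos / anomalous_total if anomalous_total else 0.0
    # Precision: share of flagged executions that are truly anomalous.
    flagged_total = true_pos + false_pos
    precision = true_pos / flagged_total if flagged_total else 0.0
    return coverage, recall, precision


# Two hypothetical invariants (thresholds chosen only for illustration).
invariants: List[Invariant] = [
    lambda e: e["cpu"] <= 0.9,
    lambda e: e["mem"] <= 0.8,
]

# Toy labeled executions: (attributes, is_anomalous).
labeled = [
    ({"cpu": 0.50, "mem": 0.40}, False),
    ({"cpu": 0.95, "mem": 0.30}, True),   # flagged, anomalous
    ({"cpu": 0.70, "mem": 0.85}, False),  # flagged, but correct
    ({"cpu": 0.60, "mem": 0.50}, True),   # anomalous, but missed
    ({"cpu": 0.20, "mem": 0.20}, False),
    ({"cpu": 0.30, "mem": 0.90}, True),   # flagged, anomalous
    ({"cpu": 0.10, "mem": 0.10}, False),
]
executions = [e for e, _ in labeled]
labels = [a for _, a in labeled]

coverage, recall, precision = evaluate(invariants, executions, labels)
print(f"coverage={coverage:.2f} recall={recall:.2f} precision={precision:.2f}")
```

In this toy setting, relaxing an invariant (for example, raising the hypothetical `mem` threshold) would increase coverage by no longer flagging the one correct execution, but could also reduce recall by missing an anomalous one, which mirrors the trade-off summarized above.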