6. Conclusion
This paper described a data-driven diagnostic approach based on clustering qualitative event sequences. The method relies on a sufficiently large number of training traces recorded in different nominal and faulty scenarios. After training, each input trace is categorized (diagnosed) as the most likely scenario according to the training traces. The method has two main phases: off-line training and on-line diagnosis. After preprocessing, the event sequences are converted to vectors in an m-dimensional vector space with a defined distance metric. K-means clustering is then used for every faulty and nominal scenario to find a single centroid. Once every centroid has been found, the on-line diagnosis can be executed. In the on-line diagnosis phase, an arbitrary measured trace is converted to coordinate vector form and the closest centroid is determined; the scenario belonging to that centroid is the result of the diagnosis for the trace. The aim of the simple process example was to examine the diagnostic accuracy of the proposed method on the same composite process system driven by an operational procedure, in the presence of multiple faults and with different output mapping functions. Three types of mapping functions (coarse linear, finer linear, and nonlinear) were used, and their positive or negative effects on the accuracy were compared. We also discussed how the diagnostic algorithm can be used for simultaneous fault detection. A complex diagnostic case study on the Tennessee Eastman process (TEP) benchmark was also presented to illustrate the efficiency of the proposed method and to compare its performance with some statistical methods. It was found that constant step-type faults (disturbances) could be detected with a high fault detection rate, and that faults could also be detected during transient operation of the process.
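The two-phase procedure summarized above can be illustrated with a minimal sketch. It is not the implementation used in this paper. It assumes that traces have already been converted to coordinate vectors and that the distance metric is Euclidean, in which case a single k-means centroid per scenario reduces to the coordinate-wise mean. The function names (`train_centroids`, `diagnose`), the scenario labels, and the toy data are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def train_centroids(traces_by_scenario):
    """Off-line phase: compute one centroid per scenario.

    Assumption: Euclidean metric, so the single-cluster k-means
    centroid is the coordinate-wise mean of the training vectors.
    """
    centroids = {}
    for scenario, vectors in traces_by_scenario.items():
        n = len(vectors)
        centroids[scenario] = [sum(column) / n for column in zip(*vectors)]
    return centroids


def diagnose(vector, centroids):
    """On-line phase: return the scenario whose centroid is closest."""
    return min(centroids, key=lambda s: dist(vector, centroids[s]))


# Hypothetical toy training data in a 2-dimensional vector space.
training = {
    "nominal": [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2]],
    "fault_1": [[1.0, 1.0], [1.2, 1.0], [1.0, 1.2]],
}

centroids = train_centroids(training)
label = diagnose([0.9, 1.1], centroids)
print(label)
```

A different distance metric would change both the centroid computation and the nearest-centroid search, so the mean-based shortcut here holds only under the Euclidean assumption.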