5. Conclusions
This study explored the disadvantage of using AUC for protein remote homology detection and proposed a novel method for finding a proper prediction probability threshold for a test set. Experimental evaluation on an established benchmark showed that the proposed method can effectively improve prediction performance over more commonly employed methods. In the future, we intend to explore the efficiency of using a function, rather than a single threshold, to classify a test set; we expect that a linear function will achieve better performance. Other approaches should also be explored for finding the proper prediction probability threshold, e.g., neural-like computing models [40–43] and Hadoop-based methods [44,45], which have been widely used in pattern recognition.