Abstract
The technological world has grown by incorporating billions of small sensing devices that collect and share huge amounts of diversified data. As the number of such devices grows, it becomes increasingly difficult to manage all these new data sources. Currently there is no uniform way to represent, share, and understand IoT data, which leads to information silos that hinder the realization of complex IoT/M2M scenarios. IoT/M2M scenarios will only achieve their full potential when the devices work and learn together with minimal human intervention. In this paper we discuss the limitations of current storage and analytical solutions, point out the advantages of semantic approaches for context organization, and extend our unsupervised model to learn word categories automatically. Our solution was evaluated against the Miller–Charles dataset and an IoT semantic dataset extracted from a popular IoT platform, achieving a correlation of 0.63.
1. Introduction
With the advent of the Internet of Things (IoT) [1], an increasing number of devices have been equipped with sensing and processing capabilities. These capabilities allow devices to communicate with each other, and even with services on the Internet, to accomplish a given objective. A major component of this connectivity landscape is machine-to-machine (M2M) communications [2]. M2M generally refers to information and communication technologies able to measure, deliver, digest and react upon information autonomously, i.e. with no or minimal human interaction.
8. Conclusions
The number of IoT devices is increasing at a steady pace. Each one of them generates massive amounts of diverse data. However, each device/manufacturer shares context information with a different structure, hindering interoperability in IoT and M2M scenarios.
In this paper we discussed the limitations of conventional storage and analytical tools, and pointed out the advantages of a bottom-up context organization model. We also discussed semantic approaches specifically designed for IoT/M2M scenarios. Our semantic model was extended to support multiple word categories, and a new unsupervised learning method was designed. Distributional profiles extracted from web services may contain noisy dimensions and may conflate several senses of the target word (sense-conflation). These issues decrease accuracy and limit the potential of the model. Our learning method minimizes these issues through dimensional reduction filters and clustering.
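The following minimal Python sketch illustrates the general idea of filtering noisy dimensions and clustering a distributional profile into senses. It is not the learning method evaluated in this paper: the function names, the frequency-based filter, the greedy Jaccard clustering, and the parameters `window`, `threshold` and `min_count` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def contexts_of(target, snippets, window=2):
    """Collect the words around each occurrence of `target` (hypothetical helper)."""
    contexts = []
    for snippet in snippets:
        tokens = snippet.lower().split()
        for i, tok in enumerate(tokens):
            if tok == target:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                contexts.append([t for t in tokens[lo:hi] if t != target])
    return contexts


def cluster_contexts(contexts, threshold=0.2):
    """Greedy single-pass clustering (an assumed stand-in for sense clustering).

    A context joins the first cluster whose vocabulary overlaps with it by at
    least `threshold` (Jaccard index); otherwise it starts a new cluster.
    Each cluster is a Counter, i.e. one distributional profile per sense.
    """
    clusters = []
    for ctx in contexts:
        words = set(ctx)
        if not words:
            continue  # nothing to compare
        for cluster in clusters:
            vocab = set(cluster)
            if len(words & vocab) / len(words | vocab) >= threshold:
                cluster.update(ctx)
                break
        else:
            clusters.append(Counter(ctx))
    return clusters


def filter_dimensions(profile, min_count=2):
    """Drop rarely seen dimensions (an assumed, very simple noise filter)."""
    return {word: count for word, count in profile.items() if count >= min_count}


def cosine(p, q):
    """Cosine similarity between two sparse profiles stored as dicts."""
    dot = sum(value * q.get(word, 0) for word, value in p.items())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0


def similarity(senses_a, senses_b):
    """Score two words by their closest pair of sense profiles."""
    return max((cosine(a, b) for a in senses_a for b in senses_b), default=0.0)
```

In this sketch, a target word is represented by a list of filtered sense profiles, for example `[filter_dimensions(s) for s in cluster_contexts(contexts_of("bank", snippets))]`. Comparing two words by their closest pair of senses keeps an unrelated sense of one word from diluting the score, which is the effect sense-conflation has on a single merged profile.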