Data Mining

The original conception of the Internet of Things (IoT) was of a network of physical objects or “things” embedded with electronics, software, sensors, and connectivity to enable objects to exchange data with a centralized operator and/or other connected devices.

Smart grids, smart homes, and smart cities were all examples of what an IoT could be and do.

The IoT equivalent of the human brain is the cloud-based analysis of the data rising up from sensors to generate insights and decide on actions. Much of the benefit of the Internet of Things lies in our ability to put the useful data we collect with it to work. This is the “analytics of things,” and in many ways it has received the least attention of any part of the IoT. This is unfortunate, because it is analytics that can add the most business, lifestyle, and health value to the IoT. It has been said that “data without meaning, without soul, will not move people to change their behaviors over the long term.”1 Many early adopters of activity trackers say value-added analytics is what they have missed most, and its absence has been their biggest disappointment.

Sensor data have some unique attributes, so related analytics are unique as well. The data are typically continuous and fast-flowing, so there must be processes for continuous analysis of the data. Technologies such as “complex event processing” and “event stream processing” bring the data to the analysis capability, where they are processed in real time, and then results are sent back where they are needed. Because there is so much data, a major focus of the analytics of things is anomaly detection. Is something broken in our operational network? Does a bike ride appear to be in the middle of a corn field? Are you about to end the day without reaching 10,000 steps? Analytics can identify situations that require some form of human intervention.
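As a rough illustration of anomaly detection on a continuous stream, here is a minimal sketch in Python. It assumes a simple rolling z-score rule, which is only one of many possible approaches; the function name `detect_anomalies`, the window size, the threshold, and the sample heart-rate readings are all made up for the example.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean."""
    recent = deque(maxlen=window)  # sliding window of recent, normal-looking values
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Flag the reading when it lies more than `threshold` standard
            # deviations from the window mean. A window with zero spread
            # is skipped rather than dividing by zero.
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                yield i, value
                continue  # keep the outlier out of the baseline window
        recent.append(value)


# Hypothetical heart-rate stream with one obvious spike.
stream = [72, 74, 71, 73, 75, 72, 140, 74, 73, 72]
anomalies = list(detect_anomalies(stream))
print(anomalies)
```

Because the function is a generator, it can consume readings one at a time as they arrive instead of waiting for a complete batch. On the sample stream it flags only the reading of 140 at index 6.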

Some other typical analytical applications for the IoT include the following:

Comparative usage: how your consumption of a resource (for example, calories) compares with that of others in similar situations
Understanding patterns and reasons for variation: developing statistical models that explain variation
Predictive asset maintenance: using sensor data to detect potential problems in machinery (or your body) before they actually occur
Optimization: using sensor data and analysis to optimize a process, as when a lumber mill optimizes the automated cutting of a log, a poultry processor automates the preparation of a chicken, or an activity tracker identifies the healthiest time to go to sleep or the best point in your sleep cycle to wake up
Prescription: employing sensor and other types of data to tell the user what to do, as when an activity tracker nudges you to get off the couch or sit up straight
Situational awareness: piecing together seemingly disconnected events and relating them to a larger repository of data to put together an explanation, as when a series of readings from activity trackers, glucose monitors, connected scales, and other devices tells you that you are in danger of developing diabetes

The analytics of things is often a precursor to cognitive action, that is, taking action based on the results of analyzed sensor data. Comparative usage statistics, for example, might motivate an energy consumer to cut back on usage, while smart thermostats can monitor and optimize the household environment. Predictive asset maintenance suggests the best time to service machinery, which is usually much more efficient than servicing at predetermined intervals. A municipal government could analyze data from traffic sensors in roads and from other sources to determine where to add lanes and how to optimize stoplight timing and other drivers of traffic flow.
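To make the predictive maintenance idea concrete, here is a small sketch that fits a straight-line trend to sensor readings and estimates when the trend will reach a service limit. It assumes that wear grows roughly linearly, which real equipment may not do; the function name `estimate_service_time`, the vibration values, and the limit of 7.0 are hypothetical.

```python
def estimate_service_time(times, readings, limit):
    """Fit a least-squares line and return the time at which it reaches `limit`.

    Returns None when the trend is flat or falling, so no crossing is predicted.
    """
    n = len(times)
    t_mean = sum(times) / n
    r_mean = sum(readings) / n
    # Ordinary least-squares slope and intercept.
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (r - r_mean) for t, r in zip(times, readings))
    slope = sxy / sxx
    intercept = r_mean - slope * t_mean
    if slope <= 0:
        return None
    return (limit - intercept) / slope


# Hypothetical daily vibration readings (mm/s) from a motor bearing.
days = [0, 1, 2, 3, 4]
vibration = [2.0, 2.5, 3.0, 3.5, 4.0]
service_day = estimate_service_time(days, vibration, limit=7.0)
print(service_day)
```

Scheduling service shortly before the estimated crossing is the kind of decision that replaces a fixed maintenance interval. A production system would likely use a richer model and report the uncertainty of the estimate.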

Data mining can be considered a superset of many different methods for extracting insights from data. It applies methods from many different areas to identify previously unknown patterns in data, including traditional statistical algorithms, machine learning, text analytics, time series analysis, and other areas of analytics. Data mining also includes the study and practice of data storage and data manipulation.

Machine Learning

Machine learning shares a goal with statistical modeling: understanding the structure of the data. The main difference lies in how each gets there. Statistical models fit well-understood theoretical distributions to the data, so there is a mathematically proven theory behind the model, but this requires that the data meet certain strong assumptions. Machine learning developed from the ability to use computers to probe the data for structure, even when we have no theory of what that structure looks like. The test for a machine learning model is its validation error on new data, not a theoretical test of a null hypothesis. Because machine learning often uses an iterative approach to learn from data, the learning can be easily automated. Passes are run through the data until a robust pattern is found.
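The idea of repeated passes through the data, judged by validation error on unseen data, can be sketched as follows. This is an illustrative example only: it assumes a one-variable linear model trained by plain gradient descent, and the data, learning rate, number of passes, and function names are all made up.

```python
def fit_line(xs, ys, passes=2000, lr=0.01):
    """Learn y = w*x + b (approximately) by repeated gradient-descent passes."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(passes):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) * 2 / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) * 2 / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mean_squared_error(w, b, xs, ys):
    """Average squared prediction error of the line on the given points."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data that roughly follows y = 2x + 1.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.9, 15.2, 16.9, 19.1]

# Hold out every third point; the model never sees these during training.
train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % 3 != 2]
valid = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % 3 == 2]

w, b = fit_line([x for x, _ in train], [y for _, y in train])
valid_mse = mean_squared_error(w, b, [x for x, _ in valid], [y for _, y in valid])
print(round(w, 2), round(b, 2), round(valid_mse, 3))
```

The model is judged by `valid_mse`, the error on points held out of training, rather than by a significance test on its coefficients.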

Deep Learning

Deep learning combines advances in computing power and special types of neural networks to learn complicated patterns in large amounts of data. Deep learning techniques are currently state of the art for identifying objects in images and words in audio. Researchers are now looking to apply these successes in pattern recognition to more complex tasks such as automatic language translation, medical diagnosis, and numerous other important social and business problems.
