What can you learn from IoT with i2M – Part 3
In the last two installments (Part 1 & Part 2), we discussed the basics of IoT and an example of how the components can be connected and used to provide basic automation and alerting.
These seemingly simple steps can build up to provide very advanced controls of all aspects of the physical world. The challenge can become managing situations that were not expected.
Some time ago, I worked with a company that managed the IoT environments of entire skyscrapers globally, and they had a problem that they were struggling to diagnose. Every couple of weeks, a building somewhere in the world would see a couple of hundred devices stop working at the same moment, and they couldn’t work out why.
They would try to diagnose it remotely and would end up sending an engineer on-site, who would replace all the devices that were failing, and everything would again be okay. Then, a couple of weeks later, the same type of event would happen in another building. All the failed devices would be tested back at their offices, and everything would work perfectly.
What was happening? This was becoming a very expensive “rip and replace” problem that was causing their customers to be very dissatisfied.
It turned out that the hub for each set of devices had a memory card slot, and the card in it held copies of firmware updates for the devices. Every 30 days each device would automatically reboot, and part of the reboot process was to look in a directory on the memory card for updated firmware. If the card was unavailable, the device would simply reboot and try again. That meant that if the memory card failed outright, every device on that hub would go into an infinite loop of trying to reboot. To save a little money, the installers had used a cheaper card than was recommended, and it was those cards that were failing every now and again!
The fix was to install better cards, but identifying this rare but expensive problem was very difficult because the actions that were happening at the time of the failure were not being logged in a way that was auditable.
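To make the failure mode concrete, here is a minimal sketch of a boot-time firmware check that records every attempt and gives up after a fixed number of tries instead of rebooting forever. It is an illustration only, not the vendor's firmware: the function name `check_firmware`, the `MAX_ATTEMPTS` cap, the event field names, and the use of a plain list as the log are all assumptions made for this example.

```python
import json
import time
from pathlib import Path

MAX_ATTEMPTS = 3  # assumed cap; the devices in the story had no limit


def check_firmware(card_dir, device_id, hub_id, log, now=time.time):
    """Look for firmware files on the card, logging every attempt as a JSON line.

    Returns the sorted list of file names found, or None if the card
    could not be read after MAX_ATTEMPTS tries.
    """
    base = {"device": device_id, "hub": hub_id, "step": "firmware_check"}
    for attempt in range(1, MAX_ATTEMPTS + 1):
        event = dict(base, ts=now(), attempt=attempt)
        try:
            files = sorted(p.name for p in Path(card_dir).iterdir())
        except OSError as exc:
            # Card missing or unreadable: record it, then try again.
            event.update(outcome="card_unavailable", error=type(exc).__name__)
            log.append(json.dumps(event))
            continue
        event.update(outcome="ok", files=files)
        log.append(json.dumps(event))
        return files
    # Stop retrying and leave an explicit record of why.
    log.append(json.dumps(dict(base, ts=now(), outcome="gave_up")))
    return None
```

With a record like this, a failed card shows up as a run of `card_unavailable` events followed by `gave_up`, all carrying the same hub identifier, rather than as a device that silently stops responding.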
These kinds of problems are a lot more common than you would think, and solving them relies on a complete record of everything that is happening being stored in a way that can be understood.
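As a rough illustration of why a complete record helps, the sketch below groups failure events by hub and time window to surface the "many devices failing at the same moment" pattern from the story. The event fields (`ts`, `hub`, `device`, `outcome`) and the function name `simultaneous_failures` are hypothetical, chosen for this example, and the fixed-window bucketing is a simplification that can split a burst straddling a window boundary.

```python
from collections import defaultdict


def simultaneous_failures(events, window_s=60, min_devices=2):
    """Return {(hub, window_index): [devices]} for windows in which at least
    min_devices distinct devices on the same hub reported a failure."""
    groups = defaultdict(set)
    for e in events:
        if e["outcome"] != "ok":
            bucket = int(e["ts"] // window_s)  # fixed-size time window
            groups[(e["hub"], bucket)].add(e["device"])
    return {key: sorted(devs) for key, devs in groups.items()
            if len(devs) >= min_devices}


events = [
    {"ts": 10, "hub": "A", "device": "d1", "outcome": "card_unavailable"},
    {"ts": 20, "hub": "A", "device": "d2", "outcome": "card_unavailable"},
    {"ts": 25, "hub": "B", "device": "d3", "outcome": "ok"},
]
clusters = simultaneous_failures(events)
```

Here the two failures on hub `A` fall into the same window, which points at something the devices share (the hub and its card) rather than at the devices themselves.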
At Nastel, we present these kinds of end-to-end event logs as a visualized topology that allows the entire operations team to understand the flow of the transactions. We provide a view into the topology that the user can drill down into in real time and interrogate the data directly and visually.
If you are used to receiving static views in reports from data analysis systems and have never seen the power of clicking into a data point, literally drilling into the detail, and following a line of thought in real time, prepare to be amazed. This allows the root cause of complex problems to be identified much more easily.
Since we recognize that IoT is another form of integration infrastructure (i2), also known as middleware, we can use the same technology that we use to monitor and manage every other form of machine data. We use a methodology we call integration infrastructure management (i2M), which recognizes the importance of integration infrastructure (i2) knowledge in operationally managing every part of the DevOps process, from design and implementation through testing and operations.
Without i2M, IoT management can become expensive and time-consuming.
With i2M, IoT becomes much more reliable and manageable.