Understanding IoT Gateways – The Glue for Industrial Internet of Things

IoT gateways have become a critical component of IoT deployments today. In this post, we try to understand the need for IoT Gateways and the role they play in an IoT solution architecture.

Integrating ‘Things’ With the Cloud

Some IoT appliances are sufficiently advanced to support the full extent of the TCP/IP stack and to securely communicate directly with your IoT Cloud.

However, we often encounter lightweight IoT sensors and actuators that support local communication interfaces only – such as Zigbee, Bluetooth, RS232, RS485 etc. They do not have the capability or the compute power to support a full TCP/IP stack.

In such cases, an IoT Gateway acts as an intermediary device that is deployed on the field. It provides multiple local interfaces – sensors and actuators connect to these local interfaces:

  • ZigBee
  • ZWave
  • Bluetooth
  • BLE
  • RS485
  • RS232
  • SPI
  • Digital IO
  • Analog-to-Digital Converter (ADC)

The software on the gateway is then responsible for aggregating information from sensors and dispatching it to the IoT Cloud. The gateway may also receive commands from the cloud, which it relays to the sensors and actuators via the local interfaces.

IoT gateways take care of the protocol impedance mismatch between your IoT Cloud and your sensors (or actuators).

Edge Filtering

An IoT gateway filters data at the network edge so that only relevant data is dispatched to the IoT cloud. Here are some examples where this is useful:

  • Sensors often ‘chirp’ data periodically. A sensor may emit data at a much higher frequency than your application actually needs.
  • Data from sensors may include edge values and boundary-conditions which could be ignored.
  • Sometimes sensors misfire or provide bad sample values which can be discerned and ignored at the outset.

If all such sample values are dispatched to the Cloud, they consume additional network bandwidth, even though such data may not be useful to your application at all.

An IoT gateway allows you to specify filtering rules, so that only useful data is sent to the cloud.

Edge filtering helps sanitize your sensor data before dispatching to the IoT cloud.
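
The filtering rules described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `filter_samples`, its parameters, and the sample values are hypothetical and do not come from any particular gateway product.

```python
from typing import Iterable, List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, measured value)


def filter_samples(
    samples: Iterable[Sample],
    min_interval_s: float,
    valid_range: Tuple[float, float],
) -> List[Sample]:
    """Keep in-range samples that are at least min_interval_s apart."""
    low, high = valid_range
    kept: List[Sample] = []
    last_sent = None
    for timestamp, value in samples:
        # Rule 1: drop misfires and out-of-range readings.
        if not (low <= value <= high):
            continue
        # Rule 2: drop samples that arrive faster than the application needs.
        if last_sent is not None and timestamp - last_sent < min_interval_s:
            continue
        kept.append((timestamp, value))
        last_sent = timestamp
    return kept


# Hypothetical readings: two bad values and one that arrives too soon.
readings = [(0, 21.5), (1, 21.6), (2, -999.0), (10, 21.9), (11, 500.0), (20, 22.1)]
print(filter_samples(readings, min_interval_s=10, valid_range=(-40.0, 125.0)))
```

Only three of the six readings would be forwarded to the cloud in this example.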

Data Shaping

In addition to filtering sample values, an IoT gateway also offers some stream processing capability to aggregate and to shape data coming from the sensors. For example:

  • Some sensors offer non-linear response curves. Their sampled values may have to be transformed to a linear scale before transmission to the IoT Cloud.
  • Sensor response may span a very wide bit range (say, 128 bits) and may need to be scaled down (say, to 16 bits) when your application does not need such a high resolution of measurement.
  • Sensors may exhibit hysteresis, which needs to be compensated for.
  • Sensors may exhibit temperature sensitivity, which needs to be compensated for.
  • A sensing element may need an average of 5 sample measurements to determine a more precise answer.

Data shaping ensures that any quirks and idiosyncrasies in your sensors are handled before sample values are dispatched to the IoT Cloud.
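
Three of the shaping steps listed above (averaging, reducing resolution, and linearizing) might look roughly like this. The function names and the calibration curve are assumptions made for illustration; a real gateway would take them from the sensor's datasheet.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def average_blocks(samples: Sequence[float], block_size: int = 5) -> List[float]:
    """Replace each full block of raw samples with its mean."""
    full = len(samples) - len(samples) % block_size
    return [mean(samples[i:i + block_size]) for i in range(0, full, block_size)]


def rescale_bits(raw: int, from_bits: int, to_bits: int) -> int:
    """Reduce resolution by discarding the least significant bits."""
    return raw >> (from_bits - to_bits)


def linearize(raw: float, curve: Sequence[Tuple[float, float]]) -> float:
    """Map a raw reading to a linear scale by interpolating a calibration curve.

    curve is a list of (raw, actual) points sorted by raw value.
    """
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x0 <= raw <= x1:
            return y0 + (y1 - y0) * (raw - x0) / (x1 - x0)
    raise ValueError("reading outside the calibrated range")


# Hypothetical calibration points for a non-linear sensor.
calibration = [(0, 0), (10, 100), (20, 400)]
print(average_blocks([1, 2, 3, 4, 5, 10, 20, 30, 40, 50]))
print(rescale_bits(0xFFFFFFFF, from_bits=32, to_bits=16))
print(linearize(15, calibration))
```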

Control Loops

Most IoT applications involve some kind of a ‘control loop’. For example, if the temperature reaches a certain threshold, we need to shut-off the furnace.

A typical control loop involves one or more sensing elements, a decision tree (rules engine), and a command to the actuator. Any control loop exhibits a latency of its own.

While the business logic of the control loop could be implemented on the IoT Cloud, certain applications may require a much faster response time.

In such cases, the business rules (decision making) are localized to the IoT gateway itself. A gateway can trigger an actuator based on certain conditions.

An IoT gateway enables tighter control loops with low response latency.
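
A gateway-local version of the furnace rule could be as small as the sketch below. The threshold, the command string, and the `send_command` callback are hypothetical stand-ins for whatever the local actuator interface actually expects.

```python
from typing import Callable


def make_furnace_guard(
    shutoff_at_c: float, send_command: Callable[[str], None]
) -> Callable[[float], None]:
    """Return a handler that shuts the furnace off once a threshold is reached."""
    state = {"shut_off": False}

    def on_temperature(temp_c: float) -> None:
        # The decision is taken on the gateway: no round trip to the cloud.
        if temp_c >= shutoff_at_c and not state["shut_off"]:
            send_command("furnace/shut-off")  # hypothetical command name
            state["shut_off"] = True

    return on_temperature


commands = []
guard = make_furnace_guard(shutoff_at_c=900.0, send_command=commands.append)
for reading in (850.0, 899.9, 905.2, 910.0):
    guard(reading)
print(commands)
```

The command is issued once, on the first reading at or above the threshold, rather than on every subsequent sample.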

Edge Analytics

Aggregating and rolling-up data at the edge (field) before sending it to the Cloud saves substantial bandwidth. IoT gateways often provide data aggregation and analytics capabilities so that only concise information is dispatched to the cloud for further processing and archival.
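
As a rough illustration of rolling up data at the edge, the sketch below collapses raw samples into one summary record per time window. The record layout and the name `roll_up` are assumptions, not a format required by any cloud platform.

```python
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def roll_up(
    samples: Iterable[Tuple[float, float]], window_s: float
) -> List[Dict[str, float]]:
    """Summarize (timestamp, value) samples into one record per time window."""
    windows: Dict[int, List[float]] = {}
    for timestamp, value in samples:
        windows.setdefault(int(timestamp // window_s), []).append(value)
    return [
        {
            "window_start": index * window_s,
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }
        for index, values in sorted(windows.items())
    ]


# Four raw samples become two summary records.
for record in roll_up([(0, 20.0), (30, 22.0), (59, 24.0), (60, 30.0)], window_s=60):
    print(record)
```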

Edge Security

Enterprise systems often need to ingest telemetry data from the field. However, we need to ensure that appropriate enterprise security mechanisms are enforced before data can be ingested.

For example, lightweight IoT sensors may not have the capability to support TLS, HTTPS, Client Certificates, VPN tunnels etc. which are a standard part of enterprise security today.

An IoT gateway can provide such capabilities while integrating with your enterprise system or with the IoT cloud.

IoT gateways support the necessary enterprise security standards to ensure that only data from trusted client devices is ingested by your enterprise systems.

Cloud Integration

IoT Cloud platforms support a variety of protocols such as HTTPS, WebSockets, MQTT, AMQP etc. IoT Gateways provide the ability to connect to an IoT Cloud platform over these protocols.

Health Monitoring

Another role of IoT Gateways is to monitor the health of deployed sensors on the field, and to notify the IoT Cloud in case of an errant sensor.
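
One simple way a gateway might spot an errant sensor is to track when each sensor last reported and flag the ones that have gone quiet. This is a sketch of that idea only; the sensor IDs and the silence threshold are made up.

```python
from typing import Dict, List


def find_silent_sensors(
    last_seen: Dict[str, float], now: float, max_silence_s: float
) -> List[str]:
    """Return IDs of sensors that have not reported within max_silence_s."""
    return sorted(
        sensor_id
        for sensor_id, seen_at in last_seen.items()
        if now - seen_at > max_silence_s
    )


# Hypothetical last-report times (seconds) for three sensors.
last_seen = {"temp-1": 100.0, "temp-2": 40.0, "humidity-1": 95.0}
print(find_silent_sensors(last_seen, now=100.0, max_silence_s=30.0))
```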

Noteworthy Points

The IoT Gateways referred to in this post are often called Field Gateways, as they are usually installed in the field (such as on a factory floor).

Field gateways are different from Protocol Gateways, which are a common component of IoT Cloud platforms. Protocol gateways are software components which run in the IoT Cloud (not in the field), and offer termination for various IoT protocols such as HTTPS, WebSockets, MQTT etc.

Field gateways can integrate with Protocol Gateways too!

Components of An IoT Gateway

  • Compute Capabilities: CPU, Memory, Persistent Store.
  • Interface Capabilities: RS232, PCI, Zigbee, Bluetooth etc.
  • Network Capabilities: Ethernet, WiFi.
  • Embedded OS: Hardened OS such as WindRiver Linux, Ubuntu Core.

Wrapping Up…

If you’re building smart solutions that involve primitive sensors and actuators, IoT Gateways can be an indispensable part of your solution. They offer the ability to integrate with your sensors locally, to support multiple cloud protocols, and to filter and shape your data before transmission to the IoT Cloud.

 

MQTT: A Protocol for the Internet of Things

Connecting smart appliances requires a robust and lightweight protocol that facilitates efficient M2M (Machine-to-Machine) communication.

Such protocols are expected to work in conditions of low bandwidth and intermittent connectivity. The protocol implementation should also require a small code footprint to run on devices having limited computational power. See this post for further details about the desired capabilities of M2M protocols.

The MQTT (MQ Telemetry Transport) Protocol was designed nearly 15 years ago to meet such constraints and to facilitate efficient transportation of telemetry data from embedded devices. With the emergence of IoT today, this protocol has risen to prominence – MQTT is an ISO standard and the open source community offers an extensive set of SDKs and Libraries that support MQTT.

MQTT Concepts

If you’re familiar with Enterprise Messaging Systems, you are already familiar with some of the essential concepts in MQTT.

A Pub-Sub Delivery Model

MQTT uses a ‘publish-subscribe’ model for message delivery. All parties that wish to communicate with each other connect to a centralized message broker. This broker acts as a mediator, and all messages are routed via the broker itself.

Parties which connect to the broker can be smart appliances, sensors, actuators, and application services.

Senders, such as a temperature sensor, publish messages (temperature values) to the broker. To ensure that messages reach only the relevant recipients, MQTT uses the concept of topics. A topic is a string that represents a virtual address. One or more recipients can indicate their interest in a topic by subscribing to it (beforehand).

Example:

A furnace controller may be interested in receiving messages that contain temperature information, but not messages that contain acceleration information. A logging service, by contrast, may be interested in receiving all messages. In this case, topics could be ‘temperature’ and ‘acceleration’.

Each message thus represents an independent package of information and also contains the name of the ‘topic’ that this message pertains to.

When the broker receives any message, it determines the topic to which the message belongs. It then determines the recipients who are interested in that topic at that point in time. The broker is then responsible for forwarding the message to each of the interested recipients.

The number of subscribers attached to a topic may vary over time as the system evolves. This hub-and-spoke architecture of MQTT ensures that senders and recipients are de-coupled from each other and we have a 1-to-N communication capability between parties.

Simply put, you could think of four stages here:

Connect: All parties ‘connect’ to a message broker over a TCP connection.

Subscribe: Each party informs the broker about specific topics of its interest. Each topic is represented by a unique string.

Publish: Parties publish messages to the broker from time to time, for example when a temperature sensor has sampled temperature information.

Delivery: Broker inspects the topic specified within each message and routes this message to everyone who has subscribed to that topic.

As we shall see later, there is a bit more to it – If the broker determines that an interested recipient is currently not reachable (disconnected), it may queue the messages internally for future delivery.
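
The subscribe, publish, and delivery stages can be modelled with a toy in-memory broker. This sketch is not an MQTT implementation: there is no network connection, topics are matched exactly, and the class name `ToyBroker` is invented for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, str], None]  # receives (topic, payload)


class ToyBroker:
    """In-memory stand-in for a broker: exact-match topics, no network, no QoS."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> int:
        """Deliver payload to every subscriber of topic; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


broker = ToyBroker()
furnace_inbox, logger_inbox = [], []

# The furnace controller only wants temperature; the logger wants everything.
broker.subscribe("temperature", lambda t, p: furnace_inbox.append(p))
broker.subscribe("temperature", lambda t, p: logger_inbox.append((t, p)))
broker.subscribe("acceleration", lambda t, p: logger_inbox.append((t, p)))

broker.publish("temperature", "21.5")
broker.publish("acceleration", "0.98")
print(furnace_inbox)
print(logger_inbox)
```

The publisher never refers to the furnace controller or the logger directly; it only names a topic, which is what keeps senders and recipients decoupled.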

TCP Connectivity

MQTT uses TCP/IP as the underlying transport mechanism. MQTT devices establish a persistent TCP connection with the broker at all times. This acts as a two-way point-to-point messaging channel between the device and the broker itself.

If the TCP connection is broken, the device attempts a reconnection with the broker. While the device is offline, the broker will buffer (queue) messages until that device comes back online.

Note that devices do not have any direct transport layer connectivity with each other. The broker acts as a central hub to which everybody connects.
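
The text above does not say how a device should pace its reconnection attempts. A common choice, assumed here rather than taken from the MQTT specification, is exponential backoff with a cap, so that a fleet of devices does not hammer the broker after an outage. The function name and default values are hypothetical.

```python
import random
from typing import Iterator, Optional


def backoff_delays(
    base_s: float = 1.0,
    cap_s: float = 60.0,
    jitter: float = 0.0,
    rng: Optional[random.Random] = None,
) -> Iterator[float]:
    """Yield reconnect delays that double each time, up to cap_s.

    jitter adds a random extra of up to jitter * delay so that devices
    which dropped at the same moment do not all retry in lockstep.
    """
    rng = rng or random.Random()
    delay = base_s
    while True:
        yield delay + rng.uniform(0.0, jitter * delay)
        delay = min(cap_s, delay * 2)


delays = backoff_delays()
print([next(delays) for _ in range(8)])
```

A device would sleep for the next delay after each failed connection attempt and start a fresh generator once it reconnects successfully.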

MQTT Topics

Every published MQTT message carries a ‘topic’ in its header. The topic is the primary means of routing messages to intended recipients (subscribers). You can think of a topic as a virtual address to which a message is destined.

  • A topic is a simple string such as ‘temperature’. A sensor may be publishing messages (current temperature information) to this topic periodically.
  • Topics can be hierarchically organized by specifying ‘/’ as the separator. For example, if a building has multiple sensors, they could be organized as:
    • building / floor-1 / temperature
    • building / floor-1 / humidity
    • building / floor-2 / temperature
    • building / floor-2 / humidity
  • Topic strings also support wildcard characters ‘+’ and ‘#’
    • A ‘+’ represents a single step within the hierarchy, and it can occur at any step within the hierarchy.
    • A ‘#’ represents any number of remaining steps within the hierarchy, and it can occur only at the end.
  • For Example: A client who subscribes to ‘building/+/temperature’ would receive messages from both: ‘building/floor-1/temperature’ and ‘building/floor-2/temperature’
  • For Example: A client who subscribes to ‘building/floor-1/#’ would receive messages from all topics under ‘building/floor-1/’.
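
The wildcard rules above can be captured in a short matching function. This is a simplified sketch: the name `topic_matches` is made up, and it ignores special cases such as topics that begin with ‘$’.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if a published topic matches a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level of the filter.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        # '+' accepts any single level; anything else must match exactly.
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


print(topic_matches("building/+/temperature", "building/floor-2/temperature"))
print(topic_matches("building/floor-1/#", "building/floor-1/humidity"))
print(topic_matches("building/floor-1/#", "building/floor-2/humidity"))
```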

Transient vs Durable Subscriptions

  • A client connects to the broker using a TCP connection. Due to network conditions, this TCP connection may get dropped from time to time, and the client reconnects each time. Each such connection represents a temporary physical session between the two parties.
  • However, the logical association between a client and its subscribed topics can outlast these temporary session outages. This concept is called a durable subscription.
  • In effect, the client has informed the broker: “Please keep my messages for this topic, while I’m offline. I’ll come back and pick those up later”.
  • When a client connects to the broker, it can specify if this connection is Transient (Clean Session Flag = 1) or Persistent (Clean Session Flag = 0).
  • If the Clean Session Flag = 1, the broker considers this to be a transient connection. If this client abruptly disconnects later, the broker would not ‘keep’ any messages on behalf of this client for the topics of interest.
  • If the Clean Session Flag = 0, the broker considers this to be a durable connection. In this case, if the client abruptly disconnects later, the broker would ‘keep’ the QoS 1 and QoS 2 messages on topics subscribed to by this client, with the assumption that the client would come back in the future to pick those up. When the client connects again (with Clean Session Flag = 0), the queued messages are delivered to this client.

Supporting durable connections means that the broker has to track additional ‘state’ information on behalf of the clients.
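
To make that extra state concrete, here is a toy model of what a broker has to remember per client. It is a sketch under simplifying assumptions: the class `SessionStore` is invented, delivery to online clients is left out, and QoS levels are ignored.

```python
from typing import Dict, List, Set


class SessionStore:
    """Toy model of broker-side session state; not a real MQTT implementation."""

    def __init__(self) -> None:
        self.subscriptions: Dict[str, Set[str]] = {}  # client_id -> topics
        self.online: Set[str] = set()
        self.queued: Dict[str, List[str]] = {}        # client_id -> pending payloads

    def connect(self, client_id: str, clean_session: bool) -> List[str]:
        """Mark a client online and return messages queued while it was away."""
        if clean_session:
            # Transient: discard any state left over from earlier sessions.
            self.subscriptions.pop(client_id, None)
            self.queued.pop(client_id, None)
        self.online.add(client_id)
        return self.queued.pop(client_id, [])

    def subscribe(self, client_id: str, topic: str) -> None:
        self.subscriptions.setdefault(client_id, set()).add(topic)

    def disconnect(self, client_id: str, clean_session: bool) -> None:
        self.online.discard(client_id)
        if clean_session:
            self.subscriptions.pop(client_id, None)

    def publish(self, topic: str, payload: str) -> None:
        """Queue the payload for offline subscribers (online delivery is omitted)."""
        for client_id, topics in self.subscriptions.items():
            if topic in topics and client_id not in self.online:
                self.queued.setdefault(client_id, []).append(payload)


store = SessionStore()
store.connect("logger", clean_session=False)
store.subscribe("logger", "temperature")
store.disconnect("logger", clean_session=False)
store.publish("temperature", "21.5")          # logger is offline: message is kept
print(store.connect("logger", clean_session=False))
```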

Concept of Retained Messages

  • A client can publish a message to a specific topic, and flag this message to be ‘retained’ by the broker.
  • The broker then delivers this message to the current subscribers of this topic and also retains this message for future use.
  • In the future, when any new clients subscribe to this topic, the broker will automatically deliver this ‘retained message’ to those new subscribers right away.

This model is very useful in scenarios where a subscriber should receive the ‘last known good value’ of something. Say a subscriber is interested in receiving temperature information from a sensor: it can receive the ‘last known temperature’ from the broker right away, instead of waiting for the publisher to publish this information again.

Now, the publisher only has to publish temperature values to the broker if there is a change in the temperature. This approach helps reduce the network chatter and conserves energy of the publisher as well.
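
The retained-message behaviour amounts to the broker keeping one ‘latest value’ per topic. The sketch below models only that table; the class name `RetainedStore` is hypothetical and normal delivery to current subscribers is omitted.

```python
from typing import Dict, Optional


class RetainedStore:
    """Toy model of the broker's retained-message table (one message per topic)."""

    def __init__(self) -> None:
        self._retained: Dict[str, str] = {}

    def publish(self, topic: str, payload: str, retain: bool = False) -> None:
        if retain:
            # A newer retained message replaces the previous one for the topic.
            self._retained[topic] = payload

    def on_subscribe(self, topic: str) -> Optional[str]:
        """Return the message a new subscriber receives immediately, if any."""
        return self._retained.get(topic)


retained = RetainedStore()
retained.publish("building/floor-1/temperature", "21.5", retain=True)
retained.publish("building/floor-1/temperature", "22.0", retain=True)

# A dashboard that subscribes later still gets the last known temperature.
print(retained.on_subscribe("building/floor-1/temperature"))
```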

Will and Testament

Given the intermittent network connectivity, it would be useful if interested recipients can be automatically notified when a particular device goes offline. This capability is achieved using a ‘Last Will & Testament’ (LWT) as follows:

  • When a device connects to a broker, it informs the broker about its ‘Last Will’, and the broker remembers this information for the future.
  • Later, if that device abruptly goes offline, the broker automatically dispatches this ‘Will’ (message) to any interested parties who may have previously subscribed to this.

LWT is thus a useful way of notifying all interested parties when an IoT appliance goes offline.

For example: When a security camera abruptly disconnects, an interested Service may receive the LWT message from the broker, and this Service can further send an SMS notification to the home owner.

Keep Alive Messages

Sometimes client devices abruptly crash, or have an abrupt network disconnection. This can often result in a half-open TCP connection on the broker: the broker continues to think that it has an active TCP socket with this client.

The ‘keep alive’ is introduced as a timeout mechanism by which parties can determine if the connection is still alive.

  • During the establishment of the connection, the client specifies the ‘keep alive’ duration to the broker. The broker remembers this value.
  • The ‘keep alive’ interval is the longest duration of time that the client and broker can endure without exchanging any message between themselves.
  • The broker maintains a ‘keep alive’ timer for each client:
    • If the broker does not receive any message from the client (not just published data) for 1.5 × the ‘keep alive’ duration, it assumes that the client has disconnected.
    • Upon receipt of a message from the client within the ‘keep alive duration’, the broker resets the ‘keep alive’ timer for that client.
  • However, if the client does not have any new information to publish within the ‘keep alive’ interval, it can simply send a PINGREQ message to the broker and receive a PINGRESP back. This serves as a heartbeat mechanism between the two parties.
  • It is the responsibility of the client to keep sending messages (data or PINGREQ) within the ‘keep alive’ interval.
  • If the ‘keep alive’ threshold for a client is exceeded, the broker will do the following:
    • Forcibly close the TCP connection with this client.
    • If the client had specified an LWT (Last Will and Testament), publish it to all interested recipients.
    • If the client has a durable subscription, retain all QoS 1 and QoS 2 messages pertaining to the client’s subscriptions until this client connects again.
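
The broker-side bookkeeping for keep-alive timers and wills can be sketched as follows. This is a toy model with invented names (`KeepAliveMonitor`, `expire`); time is passed in explicitly instead of being read from a clock, and closing the TCP connection is left out.

```python
from typing import Dict, List, Optional, Tuple

Will = Tuple[str, str]  # (topic, payload)


class KeepAliveMonitor:
    """Toy model of the broker's per-client keep-alive timers and wills."""

    GRACE_FACTOR = 1.5  # the broker waits 1.5 x the keep-alive interval

    def __init__(self) -> None:
        self._keep_alive_s: Dict[str, float] = {}
        self._last_heard: Dict[str, float] = {}
        self._wills: Dict[str, Will] = {}

    def on_connect(self, client_id: str, keep_alive_s: float, now: float,
                   will: Optional[Will] = None) -> None:
        self._keep_alive_s[client_id] = keep_alive_s
        self._last_heard[client_id] = now
        self._wills.pop(client_id, None)  # forget any will from an old session
        if will is not None:
            self._wills[client_id] = will

    def on_packet(self, client_id: str, now: float) -> None:
        """Any message from a connected client, including PINGREQ, resets its timer."""
        if client_id in self._last_heard:
            self._last_heard[client_id] = now

    def expire(self, now: float) -> List[Will]:
        """Drop timed-out clients and return the wills that should be published."""
        due: List[Will] = []
        for client_id in list(self._last_heard):
            limit = self.GRACE_FACTOR * self._keep_alive_s[client_id]
            if now - self._last_heard[client_id] > limit:
                del self._last_heard[client_id]
                del self._keep_alive_s[client_id]
                will = self._wills.pop(client_id, None)
                if will is not None:
                    due.append(will)
        return due


monitor = KeepAliveMonitor()
# Hypothetical camera with a 60 s keep-alive and an 'offline' will.
monitor.on_connect("camera", keep_alive_s=60, now=0, will=("status/camera", "offline"))
monitor.on_packet("camera", now=50)   # heartbeat arrives in time
print(monitor.expire(now=130))        # still within 50 + 90 seconds
print(monitor.expire(now=141))        # timed out: the will is due
```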

Quality of Service

MQTT deals with each message as an individual package of information. What are the delivery guarantees for a message to reach its intended recipients? MQTT brokers offer three QoS levels as explained below:

Level 0: At Most Once Delivery: There is no guarantee of message delivery by the broker. This needs minimal overhead, but applications need to be designed with the assumption that a message may not be delivered as intended.

Level 1: At Least Once Delivery: There is a guarantee that a message will be delivered to each listener at least once (but it could be more than once). In this case, the handling of the message on the recipient needs to be idempotent, since a particular message may be received more than once.

Level 2: Exactly Once Delivery: There is a guarantee that a message will be delivered to each interested listener exactly once. Ensuring this requires additional overhead in the broker.
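
One way to make a QoS 1 recipient idempotent is to de-duplicate on an identifier carried by the message. The sketch below assumes an application-level `message_id` chosen by the publisher; that ID, and the class name, are illustrative assumptions and are not part of the MQTT protocol itself.

```python
from typing import Callable, Set


class DedupingReceiver:
    """Makes an at-least-once handler idempotent by remembering message IDs."""

    def __init__(self, handler: Callable[[str], None]) -> None:
        self._handler = handler
        self._seen: Set[str] = set()  # grows without bound in this sketch

    def on_message(self, message_id: str, payload: str) -> bool:
        """Process the message once; return False for a redelivered duplicate."""
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        self._handler(payload)
        return True


processed = []
receiver = DedupingReceiver(processed.append)
receiver.on_message("msg-1", "21.5")
receiver.on_message("msg-1", "21.5")  # redelivery of the same message
receiver.on_message("msg-2", "21.7")
print(processed)
```

A production receiver would also need to expire old IDs, for example by keeping them only for a bounded time window.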

Getting Started with MQTT

MQTT Version 3.1.1, the version described in this post, is an OASIS specification; OASIS has since also published MQTT Version 5.0. Below are some additional references to get you started:

MQTT Message Brokers: Eclipse Mosquitto, Mosca, ActiveMQ, and RabbitMQ are some examples of MQTT broker products.

MQTT Client SDKs: The Eclipse Paho project offers an open source implementation of MQTT client SDKs. Libraries are provided for C, Java, Python, JavaScript and other programming languages.

MQTT Cloud Brokers: Most IoT Cloud providers, such as Amazon AWS IoT, provide a device gateway which supports a scalable MQTT message broker that devices can connect to.

MQTT: Summary

  • Asynchronous model of communication using discrete messages (events).
  • Hub-and-Spoke architecture using a centralized broker.
  • Instead of creating new network standards, MQTT uses ubiquitous IP networks with persistent TCP connections as the underlying transport mechanism.
  • Publish-Subscribe mechanism to decouple data producers (publishers) and data consumers (subscribers) using message queues (topics).
  • Topics represent ‘virtual addresses’ to which messages get delivered.
  • Low protocol overhead: the fixed header can be as small as 2 bytes, which keeps network bandwidth usage low.
  • Supports durable connections which outlast temporary TCP sessions.
  • Broker caches messages for each durable connection until the device reconnects.
  • Supports multiple Quality of Service levels for message delivery.
  • Supports keep alive timeout to detect if a device goes offline.
  • Supports Last Will and Testament (LWT) to notify parties if a device goes offline.

Architectural Features of IoT Cloud Platforms

IoT platforms are an essential part of IoT solutions today. They help accelerate the development of IoT applications and also ensure the requisite level of security, remote management, and integration capabilities in your solution.

There are several established platform providers in the market today, such as AWS IoT, ThingWorx, Azure IoT, and Xively. Many of these platforms share common features and architectural patterns.

In this post, we explore the architectural components and essential patterns to be considered in your IoT solutions.

We also share our wishlist of desired features for IoT Cloud Platforms. Such a wishlist is quite useful when trying to evaluate and choose a platform for a specific IoT solution.

Device Connectivity and Protocol Support

IoT devices support a variety of protocols, so any mature IoT platform should include support for multiple protocols such as: MQTT, AMQP, CoAP, STOMP, WebSockets, XMPP etc.

A component within an IoT platform which handles (terminates) these protocols is often called the Cloud Gateway. Such gateways need to be highly scalable, with an ability to process millions of messages each day.

Most IoT protocols use a message-centric, asynchronous communication model instead of the traditional Request-Response model of Web Applications. Hence, IoT platforms often include a scalable message bus infrastructure that is responsible for routing messages between devices and application services. Messages are delivered to one or more recipients using a pub-sub delivery model.

Device connectivity is often divided into two logical channels – control and data. The QoS levels and the exact protocols used for each logical channel may vary depending on specific application needs.

  • A Control Channel: To deliver device commands, health status, updates etc.
  • A Data Channel: To carry actual telemetry data, sampling values, from devices to the platform.

Unified Device Management Capabilities

Device management is a must-have feature for any IoT platform today. This includes the capabilities enumerated below. Such capabilities are typically exposed as an admin dashboard which can be used by IoT Ops personnel.

  • Device Inventory: Tracking inventory of devices (things).
  • Device Health: Capturing heartbeat and health status of devices.
  • Remote Configuration Management: Remote management of device configuration using two-way sync capabilities.
  • Remote Device Management: Remote management of the device state – wipe, lock, activate.
  • Device Firmware Upgrades: Over-the-air firmware upgrades with canary releases.
  • Remote Logging: Remote access to device logs and capturing error reports from devices.

Security Features

Nearly all CIOs rate ‘security’ to be a paramount concern for IoT applications today. Any IoT Platform hence needs to offer robust security features out-of-the-box. These include:

  • Device Identity: Establish a secure device identity using client certificates or other cryptographic means.
  • Device Enrollment: Securely enroll and authorize IoT devices to the platform.
  • Device Policy: Fine-grained authorization control to restrict device traffic coming into the IoT platform. Restrict what devices can publish, and what they can subscribe to.
  • Secure Communication Channels: Provide secure tunnels for communication between devices and the platform (TLS / SSL / IPSec / Private Networks etc).
  • Secure Firmware Delivery: Deliver signed software updates and checksum verifications during firmware upgrades.
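
The checksum part of secure firmware delivery can be illustrated with the standard library alone. Note the limits of this sketch: it only verifies a SHA-256 checksum that is assumed to have reached the device over a trusted channel. Verifying a cryptographic signature on the image is a separate step that is not shown, and the function name is hypothetical.

```python
import hashlib
import hmac


def firmware_checksum_ok(image: bytes, expected_sha256_hex: str) -> bool:
    """Return True if the image hashes to the expected SHA-256 digest."""
    actual = hashlib.sha256(image).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


# Hypothetical firmware image and the checksum published alongside it.
image = b"firmware-v2 contents"
published_checksum = hashlib.sha256(image).hexdigest()

print(firmware_checksum_ok(image, published_checksum))
print(firmware_checksum_ok(image + b"tampered", published_checksum))
```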

Telemetry Analytics

This includes the ability to capture data streams from devices in real time and to perform analytics that drive business decision making.

Analytics can be offered in four flavors:

  • Real-time analytics,
  • Batch analytics,
  • Predictive analytics using machine learning and,
  • Interactive Analytics.

The underlying analytics platform should be ready for scale, with an ability to handle millions (or even billions) of telemetry messages each day.

Support for Business Rules

This component provides ‘extensibility’ to an IoT Platform. This is where business logic (specific to your IoT application) gets codified.

It includes a business rules engine which can be customized to your business requirements, and it also includes a micro-services stack where custom code (business logic, lambda functions etc.) can be deployed by the application developer.

The rules engine often forms an important part of the ‘control loop’ for IoT applications. For example: If the temperature of a furnace exceeds a certain threshold, a specific business rule triggers, and this may send a ‘cut-off’ command to the electric furnace.

Rules engines provide a DSL (Domain Specific Language) to express business rules. A common pattern to express rules is IFTTT (If-This-Then-That). Alternatively, you can codify your business logic in a programming language of your choice and deploy it as micro services.

Rules engines and micro services hook into the message bus so that they are able to receive real-time telemetry data and dispatch commands to devices.
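
An If-This-Then-That rule can be represented as plain data: a condition plus a command. The sketch below is a minimal illustration, not the DSL of any real platform; the rule names, telemetry keys, and command strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    condition: Callable[[Dict[str, float]], bool]  # the "if this" part
    command: str                                   # the "then that" part


def evaluate(rules: List[Rule], telemetry: Dict[str, float]) -> List[str]:
    """Return the commands of every rule whose condition holds for the telemetry."""
    return [rule.command for rule in rules if rule.condition(telemetry)]


rules = [
    Rule("furnace-overheat",
         lambda t: t.get("furnace_temp_c", 0.0) > 900.0,
         "furnace/cut-off"),
    Rule("low-humidity",
         lambda t: t.get("humidity_pct", 100.0) < 20.0,
         "humidifier/on"),
]

# One telemetry message arriving from the message bus.
print(evaluate(rules, {"furnace_temp_c": 912.0, "humidity_pct": 45.0}))
```

In a platform, the returned commands would be published back onto the message bus for delivery to the relevant devices.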

Integration Capabilities

Most enterprise systems offer standard protocols such as REST, SOAP, and HTTPS to facilitate integration with other systems. Enterprise cloud platforms also offer capabilities such as Big Data Stores, Large File Stores, Notification Services etc.

To build a complete IoT solution, devices need to integrate with legacy enterprise solutions and enterprise cloud applications. IoT platforms hence need to provide connectors to such enterprise and cloud services. These connectors would be invoked by the business rules or by the micro services running on the IoT platform.

Wrapping Up…

The rapid growth of IoT paradigms today has made it necessary to accelerate ‘go-to-market’ timelines for IoT solution providers. Leveraging an IoT platform is a great way to achieve this goal.

IoT platforms take care of cross-cutting concerns such as connectivity, security, management, and analytics so that solution developers do not reinvent the wheel. It is critical for you to evaluate your chosen IoT platform against this set of features before you embark on your journey. Now go build something awesome!

 

Protocol Design Challenges in M2M Communication

Connecting smart IoT appliances requires a robust and lightweight protocol that facilitates an efficient Machine-to-Machine (M2M) communication. In this post, we explore some of the interesting challenges in the design of protocols for M2M communication needs.

Low Bandwidth

IoT devices deployed in the field are often connected using networks that have low-bandwidth or offer inconsistent throughput. Hence, the protocol overheads need to be very low – such as having small protocol headers, using variable length headers etc.

For example: Field devices are often connected using 2G / 3G carrier networks that offer low bandwidth.
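
MQTT is one protocol that uses a variable-length header field to keep overhead low: the ‘remaining length’ in its fixed header takes between 1 and 4 bytes, with 7 bits of data per byte and the top bit marking that another byte follows. The sketch below implements that encoding scheme; the function names are my own.

```python
def encode_remaining_length(length: int) -> bytes:
    """Encode a length in 1 to 4 bytes, 7 data bits per byte."""
    if not 0 <= length <= 268_435_455:
        raise ValueError("length does not fit in 4 encoded bytes")
    out = bytearray()
    while True:
        length, digit = divmod(length, 128)
        if length > 0:
            digit |= 0x80  # continuation bit: more bytes follow
        out.append(digit)
        if length == 0:
            return bytes(out)


def decode_remaining_length(data: bytes) -> int:
    """Decode a length produced by encode_remaining_length."""
    value = 0
    multiplier = 1
    for byte in data[:4]:
        value += (byte & 0x7F) * multiplier
        if not byte & 0x80:
            return value
        multiplier *= 128
    raise ValueError("malformed remaining length")


for n in (0, 127, 128, 16_383, 268_435_455):
    encoded = encode_remaining_length(n)
    print(n, encoded.hex(), decode_remaining_length(encoded))
```

Small messages, which are the common case for telemetry, pay for only a single length byte.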

Two-way Communication

IoT devices often require a two-way communication channel, whereby communication can be initiated by either party.

For example: Say, a device needs to send telemetry data to a Backend Application, or the Application wants to dispatch commands to this device. Unlike a traditional HTTP Request-Response model, the communication could be initiated by either of the parties. Polling is an inefficient way of doing this over HTTP.

Intermittent Connectivity

Wide area wireless networks (carrier networks) often experience flaky connectivity. So the underlying protocol needs resilience against connectivity problems. After an abrupt disconnection, when a device comes back online later, any pending messages for this device should be automatically delivered to it.

For example: IoT devices can fade in-and-out of network connectivity as their geo-location changes, or they switch to a ‘hibernation mode’ to conserve power.

Low Compute Footprint

Embedded devices often have low compute capabilities – CPUs with lower energy consumption, lower clock rates, and low memory availability. So the protocol implementation should require a minimal compute and memory overhead on those devices.

One-to-Many Communication Model

Unlike conventional protocols such as HTTP which only allow a 1-to-1 communication, an IoT protocol should allow a one-to-N communication model.

For example: Messages from a temperature sensor could be routed to a telemetry logging service, a temperature control logic, and a monitoring dashboard – all at the same time!

Asynchronous Model

Any party can dispatch messages at will, sometimes even without knowing if the recipient is online at that point in time. As devices fade in-and-out of network connectivity, or hibernate from time to time, it would be inefficient for the senders to keep polling recipients over an HTTP request-response model. Hence, the underlying messaging protocol is expected to be asynchronous in nature.

For example: Most IoT devices sense real world events and trigger messages based on the occurrence of those events. Sensors can dispatch this information without really knowing if the intended recipients are online / offline at that point in time.

For example: Applications need to instruct devices by dispatching commands to them. Applications could dispatch commands without knowing if the recipient device is online / offline at that point.

Decoupling of Participants

The protocol needs an appropriate ‘routing mechanism’ whereby a message can be routed to a set of interested recipients, without the sender necessarily knowing who the exact recipients are. The set of recipients for a certain type of message could change as the ecosystem evolves.

For example: A logger device may be interested in temperature information today. Tomorrow, a new dashboard may also be interested in receiving this information. The temperature emitter (sensor) may be oblivious to who the recipients actually are.

Routing Complexity

The routing complexity and responsibility should be encapsulated within the IoT Cloud (and not within the devices themselves). The device should be offered a simpler model to connect, dispatch, and to receive messages asynchronously.

Security

This is a paramount concern in IoT today. The protocol should support a strong security model to protect data-in-motion using a robust PKI (Public Key Infrastructure). It needs to offer capabilities such as secure endpoint identity (client certificates), real-time authorization controls, and real-time policy enforcement.

Adoption of Standards

Most embedded OSes and all cloud OSes today support the TCP/IP stack. So a suitable IoT protocol is expected to run on top of the standard TCP/IP stack to ensure highest compatibility across multiple hardware and OS platforms.

Wrapping Up…

M2M communication brings its unique set of challenges – Intermittent connectivity, asynchronous communication needs, and a need to decouple participants. Having an intermediate messaging queue (broker) is a great way to address many of these requirements. Hence IoT protocols such as MQTT adopt a pub-sub, queue-based architecture and are thus better suited for IoT applications today.