When we talk about the successful design of IoT solutions, there are two different aspects:
- Common (technology-agnostic) principles and recommendations.
- Knowledge of a specific platform that provides the components to build an IoT solution (like Microsoft Azure).
When working on a specific solution, we first need to know the common principles and then to have knowledge of the specific platform used for the implementation.
- Reduce the Complexity:
One of the biggest challenges you face when you are planning Internet of Things (IoT) solutions is dealing with complexity.
This is a common principle for all solutions, not only for IoT. IoT solutions involve many heterogeneous IoT devices, with sensors that generate data that is then analyzed to gain insights. IoT devices are connected to a network either directly or through a gateway device, communicating with each other and with cloud services and applications.
We need strategies that help us simplify development, manage complexity, and ensure that our IoT solutions remain scalable, flexible, and robust:
Assume a layered architecture
An architecture describes the structure of your IoT solution, including the physical aspects (that is, the things) and the virtual aspects (like services and communication protocols). Adopting a multi-tiered architecture allows you to focus on improving your understanding about how all of the most important aspects of the architecture operate independently before you integrate them within your IoT application. This modular approach helps to manage the complexity of IoT solutions.
The main question is which layered design is the best one for a given solution.
A good design follows from the functional and non-functional requirements:
For data-driven IoT applications that involve edge analytics, a basic three-tiered architecture captures the flow of information from devices to edge services and then out to cloud services. A detailed IoT architecture can also include vertical layers that cut across the other layers, like identity management or data security.
Some IoT solutions are based on only two main layers: devices and back-end services (usually cloud-based). Solutions that do not require preliminary processing or a prompt response after real-time analysis can be based on this kind of architecture.
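The three-tier flow described above can be sketched as three small functions, one per tier. This is only an illustration under stated assumptions: the names `Reading`, `device_layer`, `edge_layer`, and `cloud_layer` are hypothetical, the sample values are hard-coded, and a plain dictionary stands in for a real cloud service.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    """One raw measurement from a device (hypothetical shape)."""
    device_id: str
    temperature_c: float


def device_layer() -> list[Reading]:
    """Device tier: sensors emit raw readings (hard-coded sample values)."""
    return [
        Reading("sensor-1", 21.0),
        Reading("sensor-1", 23.0),
        Reading("sensor-2", 30.0),
    ]


def edge_layer(readings: list[Reading]) -> dict[str, float]:
    """Edge tier: pre-process close to the devices, here by averaging per device."""
    by_device: dict[str, list[float]] = {}
    for reading in readings:
        by_device.setdefault(reading.device_id, []).append(reading.temperature_c)
    return {device: mean(values) for device, values in by_device.items()}


def cloud_layer(summaries: dict[str, float], store: dict[str, float]) -> None:
    """Cloud tier: persist the summaries (a dict stands in for a real service)."""
    store.update(summaries)


cloud_store: dict[str, float] = {}
cloud_layer(edge_layer(device_layer()), cloud_store)
```

In a two-layer solution the edge function would simply be dropped and the devices would send their readings straight to the back end.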
Implement “Security by Design”
Security must be a priority across all of the layers of the IoT architecture. We need to think about security as a cross-cutting concern rather than as a separate layer of the architecture. With so many devices connected, the integrity of the system as a whole needs to be maintained even when individual devices or gateways are compromised. We need security in every module, in every layer, and for the overall IoT solution.
We need to adopt standards and best practices for these aspects of the IoT infrastructure:
- Device, application and user identity, authentication, authorization, and access control
- Key management
- Data security
- Secure communication channels and message integrity (by using encryption)
- Secure development and delivery
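As a small illustration of one item from the list above, message integrity, the following sketch signs a device message with a shared key and verifies it on the receiving side. It uses only the Python standard library (`hmac`, `hashlib`, `json`). The function names, the message shape, and the demo key are assumptions made for this example. Note that an HMAC protects integrity and authenticity only; confidentiality would additionally need an encrypted channel such as TLS.

```python
import hashlib
import hmac
import json


def sign_message(key: bytes, payload: dict) -> dict:
    """Serialize the payload and attach an HMAC-SHA256 tag (hypothetical format)."""
    body = json.dumps(payload, sort_keys=True)
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "hmac": tag}


def verify_message(key: bytes, message: dict) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(
        key, message["body"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, message["hmac"])


# Demo key for illustration only; real keys come from a key management service.
device_key = b"demo-key-do-not-use-in-production"
msg = sign_message(device_key, {"device_id": "sensor-1", "temperature_c": 21.5})

# A message altered in transit no longer matches its tag.
tampered = dict(msg, body=msg["body"].replace("21.5", "99.9"))
```

Here `verify_message(device_key, msg)` succeeds, while the tampered copy and any message checked with a different key are rejected.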
- Data Model
IoT solutions are data-centric. Each IoT application has a data model that includes data from the devices, as well as user generated data and data from outside systems. The data model must support the use cases in accordance with the solution design / customer requirements.
Most IoT systems need more than one data store. Typically we need:
- Raw data storage (blob, file, noSQL)
- Metadata storage (SQL, optional noSQL)
- Aggregated data storage (SQL, noSQL)
- Configuration data storage (SQL, optional noSQL)
- Raw Data Storage
Raw data storage is fundamental for IoT solutions, where one of the main functions is to ingest information from sensors. Usually these messages need no schema, so the message format can be updated without any changes to the database structure.
One of the main requirements for most solutions is that this storage be scalable (partitioned). Relational (SQL) databases are harder to partition, and they also tie us to a specific database schema. SQL databases are also usually more expensive than noSQL solutions or binary/blob storage.
We have several often used options to store the raw data:
- Blob storage or files (file storage)
- Cheap noSQL storage (like Key Value databases)
- Document oriented databases
- Time series databases
Blob storage or files. There are many reasons to consider Binary Large Objects (BLOBs), more commonly known as files. The storage is cheap and easy to use. The main disadvantage is that we need an additional service or a custom implementation to search for specific information in this storage.
Key-value stores save data as associative arrays, where each value is associated with a key that identifies it. Cheap options like Azure Table Storage offer limited options for searching (efficient lookups only on the partition key and the row key). Others (like Redis or the Cosmos DB Table API) allow additional indexes, but at a higher price.
Different options for cheap storage in Microsoft Azure.
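The trade-off of a cheap key-value store can be sketched in a few lines: a lookup by the full key is direct, while a search on any other field has to scan every stored message. The class `RawTable` and its methods are hypothetical and keep everything in memory; they do not mirror the API of Azure Table Storage or any other product.

```python
import json
from typing import Callable, Optional


class RawTable:
    """In-memory stand-in for a key-value table keyed by (partition key, row key)."""

    def __init__(self) -> None:
        self._rows: dict[tuple[str, str], str] = {}

    def put(self, partition_key: str, row_key: str, message: dict) -> None:
        # The message is stored as opaque JSON text, so no schema is enforced.
        self._rows[(partition_key, row_key)] = json.dumps(message)

    def get(self, partition_key: str, row_key: str) -> Optional[dict]:
        # Direct lookup by the full key: the cheap, fast path.
        raw = self._rows.get((partition_key, row_key))
        return json.loads(raw) if raw is not None else None

    def scan(self, predicate: Callable[[dict], bool]) -> list[dict]:
        # Searching on any other field means reading every stored message.
        return [m for m in map(json.loads, self._rows.values()) if predicate(m)]


table = RawTable()
# Two messages with different shapes share the same table.
table.put("sensor-1", "2024-01-01T10:00:00", {"temperature_c": 21.5})
table.put("sensor-2", "2024-01-01T10:00:00", {"humidity_pct": 40, "battery": "low"})

low_battery = table.scan(lambda m: m.get("battery") == "low")
```

Using the device identifier as the partition key and the timestamp as the row key is one common choice, assumed here only for illustration.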
Document oriented databases store semi-structured data. It can be JSON, XML, YAML, or even a Word document. The unit of data is called a document (similar to a row in an RDBMS). A group of documents is called a “Collection” (similar to a table in an RDBMS).
Document oriented database design
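A minimal sketch of the document and collection idea, assuming JSON-like documents held in memory: documents in the same collection may carry different fields, and queries match on whatever fields a document happens to have. The `Collection` class is hypothetical and is not the API of any specific document database.

```python
from typing import Any


class Collection:
    """In-memory stand-in for a collection of semi-structured documents."""

    def __init__(self) -> None:
        self._docs: dict[str, dict[str, Any]] = {}

    def insert(self, doc_id: str, doc: dict[str, Any]) -> None:
        # Store a copy so later changes by the caller do not leak in.
        self._docs[doc_id] = dict(doc)

    def find(self, **criteria: Any) -> list[dict[str, Any]]:
        # A document matches when every requested field has the requested value.
        return [
            doc
            for doc in self._docs.values()
            if all(doc.get(field) == value for field, value in criteria.items())
        ]


readings = Collection()
readings.insert("1", {"device_id": "sensor-1", "temperature_c": 21.5})
# The second document has fields that the first one lacks.
readings.insert("2", {"device_id": "sensor-2", "humidity_pct": 40, "firmware": "1.2"})

matches = readings.find(device_id="sensor-2")
```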
Time series databases (TSDBs) are databases that are optimized for time series data. Software with complex logic or business rules and a high transaction volume for time series data may not be practical with traditional relational database management systems. These solutions can also be built on blob storage or document databases, but they then need additional logic that allows users to create, enumerate, update, and destroy time series and to organize them in some fashion.
Time series database
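The additional logic mentioned above (create, enumerate, update, and destroy series) can be sketched on top of a plain dictionary. The `TimeSeriesStore` class and its method names are hypothetical; points are kept sorted by timestamp with the standard `bisect` module so that range queries do not need a full scan.

```python
import bisect


class TimeSeriesStore:
    """In-memory sketch of series management layered over a generic store."""

    def __init__(self) -> None:
        self._series: dict[str, list[tuple[int, float]]] = {}

    def create(self, name: str) -> None:
        self._series.setdefault(name, [])

    def enumerate_series(self) -> list[str]:
        return sorted(self._series)

    def append(self, name: str, timestamp: int, value: float) -> None:
        # insort keeps the points ordered even when they arrive out of order.
        bisect.insort(self._series[name], (timestamp, value))

    def query(self, name: str, start: int, end: int) -> list[tuple[int, float]]:
        # Return all points with start <= timestamp <= end.
        points = self._series[name]
        lo = bisect.bisect_left(points, (start, float("-inf")))
        hi = bisect.bisect_right(points, (end, float("inf")))
        return points[lo:hi]

    def destroy(self, name: str) -> None:
        self._series.pop(name, None)


store = TimeSeriesStore()
store.create("sensor-1/temperature")
store.append("sensor-1/temperature", 30, 22.0)
store.append("sensor-1/temperature", 10, 20.0)
store.append("sensor-1/temperature", 20, 21.0)

window = store.query("sensor-1/temperature", 10, 20)
```

A dedicated TSDB adds much more (compression, retention policies, downsampling), which is why this layer is usually worth building only for small workloads.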
- Metadata storage
Metadata storage usually contains additional information about the main entities in our system (sensors and other devices, users, and other abstractions related to the domain that the IoT system serves), as well as the relations between these entities. This information is usually not very large, and it is easier to store it in an SQL database. There are solutions where metadata is stored in a document oriented database, but the simplest approach is usually to design the metadata model in an RDBMS.
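A relational metadata model of this kind might look like the sketch below, which uses the standard `sqlite3` module with an in-memory database. The tables `users` and `devices`, their columns, and the sample rows are assumptions chosen for illustration, not a recommended schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical metadata model: every device belongs to one user.
conn.executescript(
    """
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE devices (
        id       TEXT PRIMARY KEY,
        kind     TEXT NOT NULL,
        location TEXT,
        owner_id INTEGER NOT NULL REFERENCES users(id)
    );
    """
)

conn.execute("INSERT INTO users (id, name) VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO devices (id, kind, location, owner_id) VALUES (?, ?, ?, ?)",
    [("sensor-1", "temperature", "hall", 1), ("sensor-2", "humidity", "lab", 1)],
)

# The relation between entities is resolved with a join.
rows = conn.execute(
    """
    SELECT d.id, d.kind
    FROM devices AS d
    JOIN users AS u ON u.id = d.owner_id
    WHERE u.name = ?
    ORDER BY d.id
    """,
    ("Alice",),
).fetchall()
```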
- Aggregated data storage
The collected data is processed (often in real time), and the raw data is enriched and aggregated. Aggregated data is used for reports over specific periods of time. Aggregated data is not as big as the raw data and is usually stored in an SQL database or in a document oriented database.
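As an illustration of such aggregation, the sketch below groups raw readings into hourly buckets per device and computes a few summary values. The function name `aggregate_hourly`, the tuple layout `(device_id, timestamp_in_seconds, value)`, and the sample data are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def aggregate_hourly(raw: list[tuple[str, int, float]]) -> dict:
    """Group (device_id, unix_seconds, value) readings into hourly summaries."""
    buckets: dict[tuple[str, int], list[float]] = defaultdict(list)
    for device_id, timestamp, value in raw:
        hour_start = timestamp - timestamp % 3600  # round down to the full hour
        buckets[(device_id, hour_start)].append(value)
    return {
        key: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": mean(values),
        }
        for key, values in buckets.items()
    }


raw_readings = [
    ("sensor-1", 0, 20.0),
    ("sensor-1", 1800, 22.0),
    ("sensor-1", 3600, 25.0),
]
hourly = aggregate_hourly(raw_readings)
```

Three raw rows become two aggregated rows here; with real sensor rates the reduction is far larger, which is why the aggregated store can stay small.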
- Configuration data storage
All settings for the IoT system need to be stored somewhere. Most often this is an SQL database, because the data is small and an RDBMS supports a rich, easy-to-design schema. It is also possible to use noSQL for configuration settings.