IoTEF: A Federated Edge-Cloud Architecture for Fault-Tolerant IoT Applications

Asad Javed, Jérémy Robert, Keijo Heljanko, Kary Främling

Research output: Contribution to journalArticleScientificpeer-review


The evolution of Internet of Things (IoT) technology has led to an increased emphasis on edge computing for Cyber-Physical Systems (CPS), in which applications rely on processing data closer to the data sources, and sharing the results across heterogeneous clusters. This has simplified the data exchanges between IoT/CPS systems, the cloud, and the edge for managing low latency, minimal bandwidth, and
fault-tolerant applications. Nonetheless, many of these applications administer data collection on the edge and offer data analytic and storage capabilities in the cloud. This raises the problem of separate software stacks between the edge and the cloud with no unified fault-tolerant management, hindering dynamic relocation of data processing. In such systems, the data must also be preserved from being corrupted or duplicated in the case of intermittent long-distance network connectivity issues, malicious harming of edge devices, or other hostile environments. Within this context, the contributions of this paper are threefold: (i) to propose a new Internet of Things Edge-Cloud Federation (IoTEF) architecture for multi-cluster IoT applications by adapting our earlier Cloud and Edge Fault-Tolerant IoT (CEFIoT) layered design. We address the fault tolerance issue by employing the Apache Kafka publish/subscribe platform as the unified data replication solution. We also deploy Kubernetes for fault-tolerant management, combined with the federated scheme, offering a single management interface and allowing automatic reconfiguration of the data processing pipeline, (ii) to formulate functional and non-functional requirements of our proposed solution by comparing several IoT architectures, and (iii) to implement a smart buildings use case of the ongoing Otaniemi3D project as proof-of-concept for assessing IoTEF capabilities. The experimental results conclude that the architecture minimizes latency, saves network bandwidth, and handles both hardware and network connectivity based failures.
Original languageEnglish
JournalJournal of Grid Computing
Issue number1
Pages (from-to)57-80
Number of pages24
Publication statusPublished - Mar 2020
MoE publication typeA1 Journal article-refereed

Fields of Science

  • 113 Computer and information sciences

Cite this