Building Scalable Data Lakes For Internet Of Things (IoT) Data Management

Aravind Nuthalapati

Abstract

The rapid expansion of Internet of Things (IoT) devices has produced an unprecedented influx of heterogeneous data, posing significant challenges for storage, processing, and analysis. This paper presents a scalable data lake architecture, integrated with deep learning techniques, to manage and analyze large volumes of IoT data. The proposed methodology uses Apache Hadoop for distributed storage, Apache Kafka for real-time data ingestion, and Apache Spark for data processing and model training. Three deep learning models, a long short-term memory network (LSTM), a CNN-LSTM hybrid, and a gated recurrent unit network (GRU), were implemented to capture complex temporal and spatial patterns in IoT data. The CNN-LSTM hybrid achieved the lowest mean absolute error (MAE) and root mean squared error (RMSE) of the three, indicating its effectiveness in predicting future sensor readings. This study underscores the advantages of integrating deep learning models within a scalable data lake framework and data strategy, offering improvements in predictive accuracy and scalability for IoT applications.
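For readers unfamiliar with the two error metrics named in the abstract, the following is a minimal sketch of how MAE and RMSE are conventionally computed for a series of sensor forecasts. The function names and the sample readings are illustrative assumptions and are not taken from the paper; the paper's own evaluation code is not shown here.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: the average of |actual - predicted|."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: the square root of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical sensor readings and forecasts, for illustration only.
actual = [20.0, 21.0, 23.0, 22.0]
predicted = [19.0, 21.0, 25.0, 22.0]

print(mae(actual, predicted))
print(rmse(actual, predicted))
```

Because RMSE squares each error before averaging, it penalizes large individual misses more heavily than MAE does, which is one reason the two metrics are often reported together.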

Article Details

How to Cite
Aravind Nuthalapati. (2023). Building Scalable Data Lakes For Internet Of Things (IoT) Data Management. Educational Administration: Theory and Practice, 29(1), 412–424. https://doi.org/10.53555/kuey.v29i1.7323
Section
Articles
Author Biography

Aravind Nuthalapati

Microsoft, Charlotte, NC, United States 28273