reports - Deliverable

FPCA Analysis Tools and Study for Real-Time Data Management in Big Data Environments

The document describes the porting of the Functional Principal Components Analysis (FPCA) methodology, developed and validated in previous activities, to a single Big Data platform. The report also introduces the concept of real-time Big Data (streaming) through a concrete example: the charging of electric vehicles.

The digitalization of the electrical system is transforming the system at several levels. The proliferation of sensors generates large amounts of data, which can provide many useful insights if interpreted and analyzed correctly. This data has typical Big Data attributes and includes both structured and unstructured content. Analyzing it therefore requires a change of paradigm: computing architectures based on clusters, and software tools that support not only statistical analysis but also machine learning techniques. This shift compels utilities to adopt methodologies they have not used before and to acquire new expertise, which they typically lack, in order to manage the transition.

This report presents and applies several techniques and tools needed to address the challenges of analyzing and processing Big Data. The primary example described is the analysis and prediction of electricity consumption based on Functional Principal Components Analysis (FPCA). The starting point was a data processing workflow made up mostly of manual actions performed on different, mostly non-scalable platforms unsuitable for managing Big Data. The activity has consolidated the entire process onto a single Databricks platform, better suited to Big Data processing, with a high degree of automation across the whole ETL (Extract, Transform, Load) pipeline. The process has also been implemented so that it can be applied to datasets other than the one for which it was originally created, demonstrating the potential of the tools used. The report demonstrates the application of the ETL pipeline in a real-world case: the LANPRIS monitoring system.
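To make the FPCA idea concrete, the following is a minimal, dependency-free sketch and not the report's Databricks implementation. It assumes that each consumption curve (for example, one day of load readings) is sampled on the same uniform time grid, in which case FPCA reduces to a principal component analysis of the sampled curves. Only the leading component is extracted, using power iteration; the function name `leading_fpc` and its return layout are hypothetical choices made for illustration.

```python
import math


def leading_fpc(curves, iters=100):
    """Leading functional principal component of curves on a common grid.

    Assumption: all curves share the same uniform grid, so quadrature
    weights are ignored. Needs at least two curves.
    Returns (mean_curve, component, scores, explained_share).
    """
    n, t = len(curves), len(curves[0])
    # Pointwise mean curve and centered curves.
    mean = [sum(c[j] for c in curves) / n for j in range(t)]
    centered = [[c[j] - mean[j] for j in range(t)] for c in curves]
    # Sample covariance between grid points (t x t).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(t)]
           for a in range(t)]
    total_var = sum(cov[j][j] for j in range(t))
    if total_var == 0:
        raise ValueError("curves are identical; no variation to decompose")
    # Power iteration, started from the most deviating centered curve.
    v = max(centered, key=lambda r: sum(x * x for x in r))
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(t)) for a in range(t)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Sign convention: the largest-magnitude entry is made positive.
    pivot = max(range(t), key=lambda j: abs(v[j]))
    if v[pivot] < 0:
        v = [-x for x in v]
    # Eigenvalue (variance along v) and per-curve scores.
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(t))
                     for a in range(t))
    scores = [sum(r[j] * v[j] for j in range(t)) for r in centered]
    return mean, v, scores, eigenvalue / total_var
```

Each curve is then approximated by the mean curve plus its score times the component, so a whole day of readings is summarized by a single number. A production pipeline would normally rely on a numerical library and retain several components rather than one.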

Batch Big Data analysis is shown to be very useful in various fields but is not always sufficient. For this reason, the report introduces the concept of real-time Big Data (streaming), describing some open-source tools currently used for real-time analysis of Big Data streams. As a practical example, the design of a monitoring system for a distributed electric vehicle charging system based on home charging points and/or collective charging stations is described.
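As an illustration of what real-time monitoring of charging points involves, here is a minimal sketch in plain Python. It is not the design described in the report and does not use any of the open-source streaming tools the report covers. It assumes readings arrive in non-decreasing timestamp order as (timestamp in seconds, charging point id, power in kW); the class name `SlidingWindowMonitor` and its parameters are hypothetical.

```python
class SlidingWindowMonitor:
    """Tracks the total power drawn by charging points heard from recently.

    Assumption: timestamps are in seconds and never decrease.
    """

    def __init__(self, window_s, limit_kw):
        self.window_s = window_s    # how long a reading stays valid
        self.limit_kw = limit_kw    # alert threshold for the total load
        self._latest = {}           # point_id -> (timestamp, power_kw)

    def update(self, ts, point_id, power_kw):
        """Ingest one reading; return (total_kw, overloaded)."""
        self._latest[point_id] = (ts, power_kw)
        # Drop charging points whose last reading fell out of the window.
        cutoff = ts - self.window_s
        self._latest = {p: (t, kw) for p, (t, kw) in self._latest.items()
                        if t > cutoff}
        total = sum(kw for _, kw in self._latest.values())
        return total, total > self.limit_kw
```

Unlike a batch job, the state is updated one event at a time and an alert can be raised as soon as the threshold is crossed. A streaming platform would additionally handle out-of-order events, partitioning across charging points, and fault tolerance, none of which this sketch attempts.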

The results presented in this report help utilities implement the digitalization of the electrical system, with a particular focus on managing Big Data through both historical and real-time analysis, in order to provide users with a more resilient network and better services.
