Data Mining Applied to Decision Support Systems for Power Transformers’ Health Diagnostics

Standard

Data Mining Applied to Decision Support Systems for Power Transformers’ Health Diagnostics. / Khalyasmaa, Alexandra I.; Matrenin, Pavel V.; Eroshenko, Stanislav A. et al.
In: Mathematics, Vol. 10, No. 14, 2486, 07.2022.

Research output: Contribution to journal › Article › peer-review

BibTeX

@article{bba08326b3274fb6a1414f89286ec927,

title = "Data Mining Applied to Decision Support Systems for Power Transformers{\textquoteright} Health Diagnostics",

abstract = "This manuscript addresses the problem of technical state assessment of power transformers based on data preprocessing and machine learning. The initial dataset contains diagnostics results of the power transformers, which were collected from a variety of different data sources. It leads to dramatic degradation of the quality of the initial dataset, due to a substantial number of missing values. The problems of such real-life datasets are considered together with the performed efforts to find a balance between data quality and quantity. A data preprocessing method is proposed as a two-iteration data mining technology with simultaneous visualization of objects' observability in a form of an image of the dataset represented by a data area diagram. The visualization improves the decision-making quality in the course of the data preprocessing procedure. On the dataset collected by the authors, the two-iteration data preprocessing technology increased the dataset filling degree from 75% to 94%, thus the number of gaps that had to be filled in with the synthetic values was reduced by 2.5 times. The processed dataset was used to build machine-learning models for power transformers' technical state classification. A comparative analysis of different machine learning models was carried out. The outperforming efficiency of ensembles of decision trees was validated for the fleet of high-voltage power equipment taken under consideration. The resulting classification-quality metric, namely, F-1-score, was estimated to be 83%.",

keywords = "data preprocessing, equipment technical state, feature engineering, identification of technical condition, machine learning applications, power transformer",

author = "Khalyasmaa, {Alexandra I.} and Matrenin, {Pavel V.} and Eroshenko, {Stanislav A.} and Manusov, {Vadim Z.} and Bramm, {Andrey M.} and Romanov, {Alexey M.}",

year = "2022",

month = jul,

doi = "10.3390/math10142486",

language = "English",

volume = "10",

journal = "Mathematics",

issn = "2227-7390",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "14",

}

RIS

TY - JOUR

T1 - Data Mining Applied to Decision Support Systems for Power Transformers’ Health Diagnostics

AU - Khalyasmaa, Alexandra I.

AU - Matrenin, Pavel V.

AU - Eroshenko, Stanislav A.

AU - Manusov, Vadim Z.

AU - Bramm, Andrey M.

AU - Romanov, Alexey M.

PY - 2022/7

Y1 - 2022/7

N2 - This manuscript addresses the problem of technical state assessment of power transformers based on data preprocessing and machine learning. The initial dataset contains diagnostics results of the power transformers, which were collected from a variety of different data sources. It leads to dramatic degradation of the quality of the initial dataset, due to a substantial number of missing values. The problems of such real-life datasets are considered together with the performed efforts to find a balance between data quality and quantity. A data preprocessing method is proposed as a two-iteration data mining technology with simultaneous visualization of objects' observability in a form of an image of the dataset represented by a data area diagram. The visualization improves the decision-making quality in the course of the data preprocessing procedure. On the dataset collected by the authors, the two-iteration data preprocessing technology increased the dataset filling degree from 75% to 94%, thus the number of gaps that had to be filled in with the synthetic values was reduced by 2.5 times. The processed dataset was used to build machine-learning models for power transformers' technical state classification. A comparative analysis of different machine learning models was carried out. The outperforming efficiency of ensembles of decision trees was validated for the fleet of high-voltage power equipment taken under consideration. The resulting classification-quality metric, namely, F-1-score, was estimated to be 83%.

KW - data preprocessing

KW - equipment technical state

KW - feature engineering

KW - identification of technical condition

KW - machine learning applications

KW - power transformer

UR - https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=tsmetrics&SrcApp=tsm_test&DestApp=WOS_CPL&DestLinkType=FullRecord&KeyUT=000833110400001

UR - http://www.scopus.com/inward/record.url?scp=85136936305&partnerID=8YFLogxK

U2 - 10.3390/math10142486

DO - 10.3390/math10142486

M3 - Article

VL - 10

JO - Mathematics

JF - Mathematics

SN - 2227-7390

IS - 14

M1 - 2486

ER -
