Use of Data-Driven Approaches for Defect Classification in Stator Winding Insulation

Partial discharges (PD) in high-voltage insulation systems are both a symptom and a cause of impending and terminal failures. Data-driven methods based on PD measurements can enable predictive strategies to replace traditional maintenance strategies. This paper employs machine learning-based classification models to identify and characterize PD signals originating from lab-made artificial defects in epoxy-mica material samples. Three different PD sources were studied: surface discharges in air, corona discharges, and discharges caused by internal cavities/delaminations. To generate high-quality datasets for the training, validation, and testing of classification models, Phase-Resolved PD (PRPD) data for each test object were obtained at room temperature under 50 Hz AC excitation at 10 % above the PD inception voltage (PDIV) of each sample. Relevant statistical and deterministic features were extracted for each observation, and each observation was labeled based on the defect type (supervised learning). Finally, the trained and validated machine learning (ML) models were used to identify PD sources in service-aged stator winding insulation. Support vector machine (SVM), ensemble, and k-nearest neighbor (kNN) algorithms achieved high defect identification accuracy (≥ 95 %).


Introduction
Intermittent power generation exposes the stator winding insulation of hydrogenerators, which were designed some 50 years ago primarily for continuous operation, to frequent and damaging service failures. These failures result in long downtimes and costly repairs, and thus in significant economic losses [1,2]. Condition monitoring of stator winding insulation is widely performed using both off-line and on-line partial discharge (PD) measurements, which are a vital tool for assessing and monitoring the condition of power equipment. Different sources of PD have different effects on the insulation performance and thus on the reliability of the power apparatus. Therefore, identifying various PD sources at different locations is of great importance for the health assessment of stator winding insulation [3]. Data-driven methods employing artificial intelligence, such as machine learning (ML) algorithms, are now more feasible due to increasing computation power and accessible tools, and a growing interest in such quantitative and predictive methods is expected. Developing such models for generator lifetime/condition qualification and estimation will accelerate the transition from traditional strategies toward predictive strategies, such as upgrading insulation components before they are estimated to fail [4]. At present, human experts usually assess the condition of generator insulation based on experience and qualitative judgment of the data. The identification of PD sources (and their severity) is usually done using Phase-Resolved PD (PRPD) analysis, where each PD event is resolved into the apparent discharge magnitude (Q_a), the phase angle (ϕ), and the number of PDs (n). The use of ML-based data-driven models as a decision support tool can improve the reliability and accuracy of defect identification by uncovering hidden correlations.
ML-based PD diagnostics have been receiving increasing attention in the literature as a way to handle the growing amount of data, reduce the human labor needed for feature engineering, and tap the full potential of the data [5]. However, most of these studies employ similar fundamental features and are limited to specific test setups whose experimental conditions are not reported in systematic detail. These limitations inhibit comparative analysis and reproducibility. To further develop robust ML-based models, different statistical and deterministic PD features should be introduced, and the data acquisition and feature extraction techniques should be described in sufficient detail. The primary purpose of this study is to employ novel features (predictors) extracted from the obtained PRPD datasets and to generate high-quality datasets for the training, validation, and testing of classification models based on various ML algorithms (classifiers). To that end, lab-made artificial defects are made to represent the most common defect types. Then, the qualified classification models (trained and tested) are used to predict possible defect types in service-aged stator winding insulation.

Test Objects
Test samples with known defects were made to emulate the most common discharge sources encountered in practice, which are classified into three main groups: (i) corona discharges in air (e.g., at the junction between the semiconductive coating and the field grading paint or tape at the end windings), (ii) surface discharges (e.g., slot discharges), and (iii) internal discharges (e.g., internal cavities and delaminations). Fig. 1(a) illustrates the arrangement used to induce corona discharges in an air gap (4 cm) between an energized piece of thin aluminum wire (with a tip diameter of 1 mm) and a flat ground electrode. To induce surface discharges, a similar thin wire was attached to the electrically grounded semiconductive coating and was extended towards the high voltage (HV) copper strands of a real hydrogenerator stator bar (≈ 2 cm gap), as depicted in Fig. 1(b). In this work, the terms "stator bar" and "stator winding insulation" are used interchangeably. In addition, three service-aged stator bars (similar to the one shown in Fig. 1(b), but without the wire) were used to predict the PD sources associated with them. We made laboratory objects from resin-rich mica/epoxy/glass-fiber tape to emulate internal discharges originating from voids and delaminations. The object dimensions were 100 mm × 100 mm × 3 mm. Each 1 mm thick plate was formed by stacking six half-overlapped tape layers. We cured the test objects at 160 °C for one hour under pressure. Metal spacers were used to create cylindrical voids during the curing process. Upon pressing the cured plates together, a test object with a definite void dimension was formed. Three different test objects with a total insulation thickness of 3 mm incorporating different void types were made, as exhibited in Fig. 1: (c1) 10 mm void diameter and 1 mm void gap distance, (c2) 5 mm void diameter and 0.5 mm void gap distance with round edges, and (c3) 40 mm void diameter and 1 mm void gap distance.
The main difference between the voids in (c1) and (c3) is that the void in (c3) has electrically unstressed walls and represents large delaminations and voids. In (c2), the void was created only on one plate while the other plate was void-free, i.e., the void is slightly closer to the bottom electrode, representing an asymmetrical void in the insulation. The diameter of the HV electrode was 30 mm.

Test Setup
We generated high-quality PD datasets (free of crosstalk and of noise above a selected threshold value of 1 pC) for the training, validation, and testing of various ML algorithms using a commercial PD acquisition unit (Omicron MPD 600) along with a 100-pF coupling capacitor. Fig. 2 illustrates the HV PD setup used for data generation, where "test object" stands for the lab-made samples with artificial defects. The desired voltage amplitude and frequency were set by a digital-to-analog converter (DAQ) and amplified to HV by an HV amplifier (TREK 20/20C-20 kV, 20 mA) in series with an RLC low-pass filter with a cut-off frequency of 5 kHz. Measurements were performed according to IEC 60270 [6], where the amplitude spectrum was integrated with a center frequency of 250 kHz and a bandwidth of 300 kHz.

Test Procedure and Data Acquisition
The test objects were subjected to voltage preconditioning before the data collection activity to preempt the so-called "memory effect", and thus, the same test sample could be tested multiple times to achieve reliably extensive training and testing datasets. PD inception voltage (PDIV) of each test object had been experimentally determined in accordance with the definition in IEC 60270, and test objects were subjected to voltage preconditioning at 10 % above their corresponding PDIV at 50 Hz for 5 minutes before each test was performed (data collection activity). Given the stochastic nature of the PD phenomenon, generating data using the same test sample should be performed at least five times for acceptable statistical significance [7] as well as to generate an extensive and reliable dataset. For this purpose, each sample was subjected to HV for 600 s at 50 Hz AC voltage. Each measurement (also referred to as "observation") incorporated 1000 AC cycles at 50 Hz (20 s), allowing for up to a total of 30 separate data windows (30 observations) for the same test sample (30 × 20 s = 600 s). Fig. 3 presents the steps followed for data and feature extraction, training, validation (fraction of training data used during training for tuning hyperparameters), testing, and prediction. MATLAB's inherent functions and Classification Learner Toolbox were used for the entire data analysis and ML tasks. Critical remarks are listed below.
1. Employing the obtained PRPD data, statistical and PD quantities were extracted using 20 observations for each test sample out of the 30 available observations: i.a., PD inception/extinction voltage, pulse repetition rate, phase angle, average discharge current, discharge power, quadratic rate, kurtosis, skewness. The generated parameters were labeled based on the defect type (class). The complete list of extracted features is shown in Table 1. The reasons for the selection of features are given in the next section.
2. The extracted features for each observation were labeled based on the artificial defect type (supervised learning). It should be highlighted that observations belonging to c1-c3 type defects were grouped/labeled together as "void discharge." The entire dataset was then split into training and testing sets using the holdout method, where 80 % of the data was allocated for training, and the remaining 20 % of the data was used for testing. In addition, "k-fold cross-validation" was used on the training dataset, where the dataset was arbitrarily divided into "k" groups. In this work, 5-fold cross-validation was used. One of the groups was used as the validation set, and the rest were used as the training set, as illustrated in Fig. 3. Then, the chosen model was trained on the training groups and was evaluated on the held-out validation group. The procedure continued until each unique group had been used once as the validation set.
3. Selected machine learning classifiers, such as support vector machines (SVM), k-nearest neighbor (kNN), ensemble, and decision tree, including their various subtypes, were trained by optimizing their hyperparameters. Then, predictions were made based on the defect type (label or class). Based on the training and test accuracies, the best models were chosen.
4. The qualified ML models were then used to predict/identify PD sources in service-aged (50 years) stator bars.
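The holdout split and 5-fold cross-validation described in step 2 can be sketched as follows. This is only an illustration in plain Python, not the MATLAB Classification Learner workflow used in the study: the function names, the fixed seed, and the unstratified shuffling are assumptions, and the total of 277 observations is taken from the 222 training and 55 testing observations reported in the classification analysis.

```python
import random


def holdout_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into (train, test)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists; each group validates exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# 277 observations in total (222 training + 55 testing, as reported later on).
train_idx, test_idx = holdout_split(277, test_fraction=0.2)

for fold_no, (fit_idx, val_idx) in enumerate(k_fold(train_idx, k=5), start=1):
    # A classifier would be fitted on fit_idx and scored on val_idx here.
    print(f"fold {fold_no}: {len(fit_idx)} fit / {len(val_idx)} validation")
```

The test indices are never touched inside the cross-validation loop, so the test accuracy remains an estimate on data that played no part in hyperparameter tuning.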
Following the test procedure performed for the lab-made artificial defects, the same predictors/features were extracted from the obtained PRPD data of the stator winding insulation for each observation. The observations were then fed into the selected ML models, and possible PD sources were predicted based on the three training classes/labels.
The end-windings of these bars were removed due to the presence of asbestos, and then the semiconductive coating was removed at the ends, and field stress grading paint was applied by following the manufacturer's instructions. Thus, PD sources related to end-windings were eliminated. Also, the semi-conductive and field grading coating on the bars were checked against any abrasion or damage; only the undamaged bars were tested. Therefore, any PDs arising from the stator bars were expected to originate from internal discharges in the mainwall insulation represented by one or more void classes defined in c1-c3. However, the dataset was not labeled due to the uncertainty of unknown defects that might be present in the winding insulation. A detailed discussion of the predicted classes is given in the results.

Data Preprocessing and Feature Extraction
Firstly, the entire dataset was grouped separately for the positive and negative half-cycles of the sinusoidal AC voltage waveform to account for the asymmetry of PD events, which depends on the defect type. The superscripts (+) and (−) denote the positive and negative half-cycles, respectively. Subsequently, numeric (deterministic) features such as the number of charges, the average, median, and maximum charge magnitudes (with the 99th percentile used as the maximum to eliminate outliers), and the discharge power were calculated for the positive and negative half-cycles (see Table 1) using the de-noised ϕ − Q_a − n data exported from the MPD 600. Secondly, statistical post-processing of the ϕ − Q_a − n data was performed in addition to the above-mentioned deterministic parameters. For the statistical analytics, the PRPD data were characterized using H_n(ϕ), the phase distribution of the number of PDs, and H_qn(ϕ), the phase distribution of the mean discharge amplitude. An example of such distributions extracted from the training database is shown in Fig. 4(a). As with the deterministic features, each statistical feature was extracted separately for the positive and negative half-cycles using the corresponding H_n(ϕ) and H_qn(ϕ). We employed Weibull distribution, kurtosis (Ku), and skewness (Sk) analysis for the statistical analytics. The distribution of the obtained discharge amplitudes (Q_a) follows an S-shaped trajectory on the cumulative distribution function (cdf) plot, as presented in Fig. 4(b), indicating the existence of more than one active discharge mechanism [8]. The plot shows that a five-parameter mixed Weibull distribution (with two separate mechanisms) fitted the diffused discharge data well. A mixed Weibull distribution based on the sum rule is given by:
$$F(q) = p\,F_1(q) + (1 - p)\,F_2(q)$$
where q is the discharge magnitude (Q_a in our case), F_1(q) and F_2(q) are the cdfs of each discharge mechanism, and p is the probability of occurrence of the subpopulation F_1(q), with 0 ≤ p ≤ 1 [8].
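Written out explicitly with its five parameters α_1, β_1, α_2, β_2, and p, the mixed distribution is:
$$F(q) = p\left[1 - \exp\left(-\left(\frac{q}{\alpha_1}\right)^{\beta_1}\right)\right] + (1 - p)\left[1 - \exp\left(-\left(\frac{q}{\alpha_2}\right)^{\beta_2}\right)\right]$$
where α is the scale parameter, and β is the shape parameter of each cdf. It is assumed that p = 0.5.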
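As a minimal sketch, the five-parameter mixed Weibull cdf described above can be evaluated as follows, assuming the standard two-parameter Weibull cdf 1 − exp(−(q/α)^β) for each mechanism. The function names and the numerical parameter values are hypothetical and chosen only for illustration; they are not fitted values from this work.

```python
import math


def weibull_cdf(q, alpha, beta):
    """Two-parameter Weibull cdf with scale alpha and shape beta."""
    if q <= 0:
        return 0.0
    return 1.0 - math.exp(-((q / alpha) ** beta))


def mixed_weibull_cdf(q, alpha1, beta1, alpha2, beta2, p=0.5):
    """Sum-rule mixture: p * F1(q) + (1 - p) * F2(q), with 0 <= p <= 1."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return (p * weibull_cdf(q, alpha1, beta1)
            + (1.0 - p) * weibull_cdf(q, alpha2, beta2))


# Hypothetical parameters: one mechanism centered at low charge magnitudes,
# the other at higher magnitudes. Units are pC.
for q in (5.0, 20.0, 50.0, 200.0, 500.0):
    value = mixed_weibull_cdf(q, alpha1=20.0, beta1=2.0,
                              alpha2=200.0, beta2=4.0, p=0.5)
    print(f"F({q:g} pC) = {value:.4f}")
```

With well-separated scale parameters, the mixture rises in two stages, which is consistent with the S-shaped cdf trajectory attributed to two active discharge mechanisms.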
The shapes of H_n(ϕ) and H_qn(ϕ) have characteristic features for different PD sources [9]. Skewness and kurtosis have been widely used for the quantitative classification of PDs in artificial defects [9,10] and hydrogenerators [11]. Skewness describes the degree of asymmetry of a distribution in relation to a normal distribution: Sk is positive (negative) for a leftward (rightward) shift of a normal distribution. Kurtosis, on the other hand, describes the degree of sharpness of a distribution in relation to a normal distribution: Ku is positive (negative) for a pointier (flatter) distribution compared to a normal distribution. Lastly, feature scaling was applied to the data to equalize the influence of the features on the model. In this work, each feature column in the dataset containing all observations (DATA* as delineated in Fig. 3) was scaled to the interval [0, 1].
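The phase distributions, their shape statistics, and the [0, 1] scaling can be sketched as follows. The text does not give the exact formulas, so this sketch assumes a common convention in which H(ϕ) is treated as a discrete probability distribution over phase windows, with Sk as the third standardized moment and Ku as the fourth standardized moment minus 3. All function names, the bin count, and the sample events are hypothetical.

```python
import math


def phase_distributions(events, n_bins, lo, hi):
    """Build H_n (pulse count) and H_qn (mean |charge|) per phase window.

    events: iterable of (phase_deg, charge) pairs; only lo <= phase < hi is used.
    """
    counts = [0] * n_bins
    charge_sums = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for phase, charge in events:
        if lo <= phase < hi:
            b = min(int((phase - lo) / width), n_bins - 1)
            counts[b] += 1
            charge_sums[b] += abs(charge)
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    h_qn = [s / c if c else 0.0 for s, c in zip(charge_sums, counts)]
    return centers, counts, h_qn


def shape_stats(centers, weights):
    """Return (Sk, Ku) of a phase distribution treated as a discrete pdf."""
    total = sum(weights)
    if total == 0:
        raise ValueError("empty distribution")
    probs = [w / total for w in weights]
    mean = sum(x * p for x, p in zip(centers, probs))
    var = sum((x - mean) ** 2 * p for x, p in zip(centers, probs))
    if var == 0:
        raise ValueError("distribution has zero variance")
    sk = sum((x - mean) ** 3 * p for x, p in zip(centers, probs)) / var ** 1.5
    ku = sum((x - mean) ** 4 * p for x, p in zip(centers, probs)) / var ** 2 - 3.0
    return sk, ku


def min_max_scale(column):
    """Scale one feature column to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical positive half-cycle events: (phase in degrees, charge in pC).
events = [(20.0, 12.0), (35.0, 18.0), (40.0, 25.0), (55.0, 30.0), (80.0, 9.0)]
centers, h_n, h_qn = phase_distributions(events, n_bins=6, lo=0.0, hi=180.0)
print(shape_stats(centers, h_n))
```

Negative half-cycle features would follow from the same functions with `lo=180.0` and `hi=360.0`, giving one set of Sk and Ku values per half-cycle and per distribution.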

Results and Discussion
The obtained PRPD plots for 1000 AC cycles (50 Hz) for each defect type and for the winding insulation are shown in Fig. 5. The algorithms were then trained and tested based on the features extracted from the PRPD data to distinguish different PD sources under AC voltage. The input data consisted of the 32 scaled features (Table 1). The predicted classes were rod-to-plane corona discharge (Rod2plane), surface discharge (surfaceDischarge), and internal discharge (Void).

Cluster Analysis and Feature Selection
A quick way to check whether the classes can be grouped based on their extracted features is to visualize the data on a scatter plot and look for obvious patterns or groups. For this purpose, chosen features were plotted against each other on a 3D scatter plot, as shown in Fig. 6. The selected features in the plot are closely related to most of the other features; hence, similar patterns were observed among the classes. The scatter plot suggests that the extracted features can be useful to differentiate between the PD sources because the data points belonging to each class form distinguishable clusters. Also, a box plot comparing the classes in terms of a Weibull parameter, viz. ShapeNeg2 (β_2 of the negative half-cycle data), suggests discernible differences, as presented in Fig. 7.

Classification Analysis
The performance of the trained and tested algorithms is usually interpreted using a confusion matrix, which shows the number of correctly classified instances on the main diagonal and the wrong predictions in the off-diagonal entries. Also, "accuracy" is a useful parameter showing the overall performance and is defined as the number of correct predictions divided by the total number of predictions. Table 2 shows a list of selected classifiers that were trained and validated, along with their accuracies. The optimized version of each classifier refers to the model obtained after hyperparameter optimization (model tuning), as illustrated in Fig. 3. As an example, Fig. 8 presents the confusion matrices for the validation and test results of the tuned kNN classifier. As depicted, the accuracy was 100 % both for the training data (i.e., validation during training; 222 observations) and for the testing data (55 observations). As the next step, selected validated ML models were used to predict/identify PD sources in the service-aged (50 years) stator winding insulation. The same data extraction methodology (de-noising, feature extraction, and data normalization/standardization) was followed before feeding the model with the data. PRPD data from three different service-aged stator bars were acquired at their rated voltage (7.4 kV). 25 observations were obtained from each bar, amounting to a total of 75 observations, where each measurement/observation incorporated 1000 AC cycles at 50 Hz (20 s). As previously mentioned, the stator bars did not contain any visible damage, nor did they have end-windings. Thus, a human expert would expect to observe PDs reminiscent of internal voids and delaminations and, based on the PRPD plots of the winding insulation shown in Fig. 5(d), would label the PD source as internal discharge. However, we had not labeled the observations as "void" but as "unknown."
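The confusion matrix and accuracy defined above can be computed as in the following sketch. The class names are those used in this work, whereas the function names and the toy labels are hypothetical and do not reproduce any result from Table 2 or Fig. 8.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows index the true class, columns the predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, prediction in zip(y_true, y_pred):
        matrix[index[truth]][index[prediction]] += 1
    return matrix


def accuracy(matrix):
    """Correct predictions (main diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


labels = ["Rod2plane", "surfaceDischarge", "Void"]

# Toy example (hypothetical labels, for illustration only).
y_true = ["Rod2plane", "Void", "Void", "surfaceDischarge", "Void"]
y_pred = ["Rod2plane", "Void", "surfaceDischarge", "surfaceDischarge", "Void"]

cm = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, cm):
    print(f"{label:>16}: {row}")
print("accuracy:", accuracy(cm))
```

Any nonzero entry off the main diagonal identifies which pair of PD sources the classifier confuses, which a single accuracy value does not reveal.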
Table 3 shows the PD classes predicted in the winding insulation by the selected models. The tuned SVM model classified 74/75 observations as "void," which was the expected response. Only one observation seems to be misclassified. The optimized ensemble model yielded similar results, while the rest of the models predicted fewer voids. As a caveat, a well-defined PD source in a different experimental setting would ideally have provided a more convincing test of the models' accuracy, even though the initial results seem promising and indicate that the identification of different PD sources within a test object is viable. In future work, further tests will be performed on winding insulation with other types of known defects to validate the models' accuracies more reliably. Then, the selected models will be tested with unseen labeled experimental data representing each PD class. Moreover, not all of the 32 features used in this work are likely to have the same importance for the trained models, because some of them are closely derived from each other, and some may have random values that might mislead the training procedure and cause overfitting problems. After further validating the models using labeled unseen data, feature selection methods (i.a., neighborhood component analysis, MRMR, Chi2, ANOVA) and dimensionality reduction techniques (i.a., principal component analysis, multidimensional scaling) will be applied, and different features will be sought, to see whether the models can be represented with fewer features without compromising the overall accuracy.
An overview of various qualified models will also be given based on the selected features. All pertinent ML classifiers, including neural networks, will be tested. Last but not least, new classes to represent other types of PD sources will be included as well as defining different types of voids studied in this work as separate classes, and then the classifiers will be tested under those circumstances.

Conclusions
The main aim of this work was to test several ML-based models for accurately classifying different PD sources, investigate the benefit of analytics for condition monitoring of HV insulation, and address challenges and knowledge needs for future studies. Lab-made artificial defects were used to generate ϕ − Q_a − n datasets from which novel features were extracted. The generation of high-quality training datasets was a success, and they will be extended to incorporate more PD sources, especially those arising in stator winding insulation. Several ML classifiers were trained based on the extracted features, and their validation accuracies were benchmarked. Several types of SVM, ensemble, and kNN models achieved high accuracies of PD defect identification (≥ 95 %). Then, the selected classification models were used to predict possible defect types in service-aged winding insulation. The chosen models predicted the dominating PD source in the stator winding insulation to be of the void discharge type, agreeing with the initial expectation. However, further tests should be performed on winding insulation with other known defects to verify the models' accuracy more reliably. This study has laid the groundwork for the future investigation of feature selection and reduction, as well as the introduction of novel statistical features that are fingerprints of PD data. The search for features that adequately define and classify different PD sources is still ongoing. In particular, the five-parameter Weibull fit approach yielded promising results and deserves further attention for identifying simultaneously active PD mechanisms, e.g., slot discharges and internal discharges in the winding insulation.