Synthetic Minority Oversampling Technique Enhanced Machine Learning Models for Energy Theft Detection
  • Md Saiful Islam Sajol,
  • Imtiaz Ahmed,
  • Quazi Sanjid Mahmud
Md Saiful Islam Sajol
Louisiana State University
Imtiaz Ahmed
New Mexico Institute of Mining and Technology

Corresponding Author: [email protected]

Quazi Sanjid Mahmud
University of California Riverside

Abstract

Electricity theft poses significant challenges to utility companies worldwide and results in substantial financial losses. This study addresses the problem by using machine learning algorithms to detect energy theft in smart grids. Precise identification of fraudulent activity has long been hindered by two issues: insufficient data on theft conditions and imbalanced datasets. To mitigate these challenges, we curated a dataset from the Open Energy Data Initiative that encompasses sixteen consumer categories and six theft conditions. Our approach uses the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance by generating synthetic samples for the minority classes. We conducted a comparative analysis of several machine-learning-based classification algorithms, including K-Nearest Neighbors (KNN), Decision Tree, Random Forest (RF), Bagging with RF, and Ensemble Learning, and compared the results before and after applying SMOTE to the dataset. We find that SMOTE has its greatest impact on the classes that are hardest to classify. In particular, with the KNN algorithm it yields improvements of 57.00%, 37.88%, and 36.88% for Class 6, Class 1, and Class 3, respectively. The other algorithms also show significant gains in accuracy, kappa, F1-score, and AUC when detecting fraudulent activity. Overall, this research contributes to advancing energy security by highlighting the importance of robust theft detection frameworks for safeguarding energy distribution systems.
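
The abstract describes SMOTE only at a high level. As a rough illustration of the core idea (a synthetic sample is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours), a minimal pure-Python sketch follows. It is not the authors' implementation: the function name `smote`, its parameters, and the toy feature vectors are assumptions made here for illustration, and real experiments would more likely use an established library such as imbalanced-learn.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Illustrative SMOTE sketch (hypothetical helper, not the paper's code).

    Each synthetic point is an interpolation between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up feature vectors for one minority class (not the paper's data).
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote(minority, n_synthetic=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already occupied by that class. In a before-and-after comparison like the one described, oversampling of this kind would normally be applied to the training split only, so that the test metrics are computed on real samples.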
27 Mar 2024: Submitted to TechRxiv
30 Mar 2024: Published in TechRxiv