About the Study
Energy management has emerged as a critical dimension of institutional sustainability, particularly in higher educational settings characterized by expansive infrastructure and continuous energy demands. Efficient energy utilization not only lowers operational expenditures but also reinforces institutional commitment to environmental stewardship. This study examines the energy management practices at Kumaraguru Institutions, a pioneer in sustainability initiatives, leveraging data-driven approaches for optimization. Utilizing data collected from Internet of Things (IoT) sensors installed across strategic locations on the campus, the study analyzes energy consumption patterns in real time. IoT-enabled systems provide granular and continuous data, offering fertile ground for the application of machine learning algorithms. The central objective of this research is to develop a predictive energy management framework that enhances load forecasting, identifies inefficiencies, and facilitates proactive energy distribution strategies. By integrating machine learning with IoT infrastructure, the study proposes a scalable model for real-time, data-informed energy decision-making. Such a model holds the potential to reduce energy wastage, align peak and off-peak demand with usage trends, and ultimately improve operational performance. In doing so, the study contributes to both practical and theoretical advancements in sustainable energy practices within educational ecosystems, setting a benchmark for similar institutions aspiring to integrate intelligent energy management systems.
Problem Statement
Educational institutions, owing to their vast and multifaceted infrastructure, face unique challenges in achieving efficient energy management. Conventional systems often lack the capacity to provide real-time monitoring or predictive insights, resulting in energy inefficiencies, elevated operational costs, and missed opportunities for optimization. Despite the deployment of IoT-based infrastructure at Kumaraguru Institutions—recognized for its sustainability initiatives—there exists a significant gap in translating this data into actionable insights. The absence of a robust, predictive energy management system limits the institution’s ability to maximize the utility of its sensor-generated data. This research seeks to bridge this gap by harnessing machine learning techniques to analyze high-volume, real-time data streams for energy consumption optimization. Without such data-driven interventions, institutions risk persistent inefficiencies and underutilization of technological infrastructure, which undermines both cost-effectiveness and sustainability objectives. Addressing this problem is essential for formulating a systematic, predictive, and replicable approach to intelligent energy management in academic institutions.
Scope of the Study
This research endeavors to design a predictive energy management framework grounded in machine learning techniques, using Kumaraguru Institutions as the primary case study. The scope includes the collection and analysis of energy consumption data through IoT sensors strategically deployed across campus buildings. By identifying consumption patterns, inefficiencies, and load variances, the study aims to develop predictive models that enable dynamic energy optimization. The study not only aspires to enhance the institution’s energy efficiency and cost management but also to contribute to broader environmental goals by minimizing its carbon footprint. The integration of real-time IoT data with machine learning models exemplifies a novel, scalable approach to sustainable energy management. Furthermore, the proposed framework is envisioned to serve as a reference model for similar academic and non-academic institutions seeking to implement data-driven sustainability solutions.
Limitations of the Study
While this study offers significant insights, it is constrained by several limitations. Primarily, the accuracy of the analysis is contingent upon the reliability of data captured by IoT sensors. Any anomalies such as sensor malfunctions, calibration errors, or transmission lags could compromise data quality and, consequently, the robustness of the predictive models. Additionally, the study is based on data from a single institution—Kumaraguru Institutions—whose infrastructural and operational characteristics may not fully represent the diversity found in other educational environments. This contextual specificity limits the generalizability of the findings across broader institutional or geographic contexts. Despite these limitations, the study lays the groundwork for future research and development of adaptive, cross-contextual energy management systems.
Major Players in Indian Higher Education
India’s higher education sector comprises a mix of elite institutions, established private universities, and dynamic regional colleges. Prominent among them are the Indian Institutes of Technology (IITs), Indian Institutes of Management (IIMs), and top-tier private universities such as Amity University, VIT, and BITS Pilani. These institutions set benchmarks in academic excellence, research output, and industry engagement. Regional players such as Kumaraguru College of Technology (KCT) also play a critical role in delivering quality education tailored to the needs of emerging industries. By offering specialized programs in engineering, management, and technology, institutions like KCT contribute significantly to the national talent pool, particularly in sectors such as IT, manufacturing, and renewable energy. Collectively, these players form a dynamic ecosystem addressing the growing demand for skilled professionals in India and abroad.
Emerging Trends in the Higher Education Sector
The Indian higher education landscape is undergoing a transformative shift driven by technological innovation, policy reform, and evolving learner expectations. Key trends include the proliferation of digital and blended learning platforms, enabling flexible and inclusive education. There is a discernible shift toward competency-based curricula that align with industry requirements and emphasize employability. Institutions are increasingly investing in research and innovation ecosystems, fostering entrepreneurship through incubation centers and interdisciplinary collaboration. Moreover, regulatory bodies such as AICTE and NBA are steering institutions toward outcome-based education and continuous quality improvement. The adoption of advanced technologies—such as artificial intelligence, data analytics, and blockchain—is redefining administrative efficiency and pedagogical delivery. Sustainability, too, is gaining traction, with many institutions incorporating green practices and ESG (Environmental, Social, Governance) metrics into their operational strategies.
Challenges Confronting the Sector
Despite its growth, the Indian higher education sector continues to grapple with multifaceted challenges. A major concern is the persistent gap between academic instruction and industry expectations, resulting in skill mismatches among graduates. Many institutions face infrastructural inadequacies, limited access to research funding, and challenges in attracting and retaining qualified faculty. Furthermore, the rapid pace of technological change demands continuous curriculum innovation and faculty upskilling, which many institutions struggle to implement effectively. Issues of equity and accessibility remain pressing, particularly in rural and underrepresented regions. Additionally, institutions are under growing pressure to meet national and international accreditation standards, which often require significant organizational restructuring and resource mobilization. Overcoming these challenges is essential for ensuring the sector’s long-term resilience and global competitiveness.
OBJECTIVES OF THE STUDY
This research methodology outlines the approach to achieving the objectives of the study on energy consumption prediction in educational institutions using hybrid machine learning models. The study uses a combination of Random Forest and Gradient Boosting algorithms to improve the accuracy of power usage forecasting. Key features such as rolling mean, energy per block, and peak load indicators are analysed for their impact on total energy consumption. The methodology also includes model evaluation, performance comparison, and deployment through a user-friendly interface for real-time predictions.
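The engineered features named above can be illustrated with a minimal sketch. It assumes a trailing rolling mean, a fixed threshold for the peak load indicator, and "energy per block" expressed as each block's share of the total. These definitions, the function names, and the sample values are illustrative assumptions; the study does not state the exact formulas it used.

```python
from collections import deque


def rolling_mean(values, window):
    """Trailing rolling mean; early positions average the readings available so far."""
    out = []
    buf = deque(maxlen=window)
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out


def peak_load_flags(values, threshold):
    """Return 1 where a reading meets or exceeds the (assumed) peak threshold, else 0."""
    return [1 if v >= threshold else 0 for v in values]


def energy_share_per_block(block_readings):
    """Express each block's reading as a fraction of the total across blocks."""
    total = sum(block_readings.values())
    return {b: (v / total if total else 0.0) for b, v in block_readings.items()}


# Hypothetical daily totals, not taken from the study's dataset.
totals = [100.0, 120.0, 80.0, 200.0]
features = {
    "rolling_mean_3": rolling_mean(totals, 3),
    "is_peak": peak_load_flags(totals, 150.0),
}
```

A trailing window is used so that each feature value depends only on past and current readings, which avoids leaking future information into a forecasting model.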
METHOD OF DATA COLLECTION
Secondary Dataset: The study utilizes a secondary dataset containing historical energy consumption records (2023–2024) collected from energy monitoring systems within an educational institution. The data was obtained in Excel format and includes key variables such as total power consumption, energy per block, day type classification, and peak load status.
Data Source: The dataset was sourced from the institution's internal energy management reports, compiled by the campus facilities team. These records are routinely maintained for operational monitoring and sustainability tracking.
THEORETICAL FRAMEWORK
Energy consumption forecasting plays a crucial role in optimizing energy usage, reducing costs, and enhancing sustainability efforts. By analysing historical energy consumption data, seasonal trends, and external factors, educational institutions can better plan energy use and implement energy-saving strategies. For this study, historical energy consumption data from 2023 to 2024 is analysed to identify consumption patterns, such as peak loads during specific times of the day or during certain events. External factors like weather conditions, day type (weekdays vs weekends), and the campus's operational schedule (e.g., holidays, exam periods) are also considered as they significantly affect energy usage. Predictive models, such as Random Forest (RF) and Gradient Boosting (GB), are used to capture trends, identify anomalies, and forecast future consumption. The hybrid model combining RF and GB provides more accurate predictions by integrating the strengths of both models, allowing the institution to optimize energy consumption across buildings, reduce wastage, and improve overall efficiency. The ability to predict energy usage accurately helps with better energy management, cost savings, and supports sustainability initiatives by reducing the carbon footprint.
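The text does not specify how the Random Forest and Gradient Boosting outputs are combined. One common option, shown here purely as an assumption, is a weighted average of the two models' predictions. The function name, the default weight, and the numbers below are hypothetical.

```python
def blend_predictions(rf_preds, gb_preds, rf_weight=0.5):
    """Weighted average of two models' predictions (an assumed blending rule)."""
    if len(rf_preds) != len(gb_preds):
        raise ValueError("prediction lists must have the same length")
    if not 0.0 <= rf_weight <= 1.0:
        raise ValueError("rf_weight must lie in [0, 1]")
    gb_weight = 1.0 - rf_weight
    return [rf_weight * r + gb_weight * g for r, g in zip(rf_preds, gb_preds)]


# Hypothetical per-day predictions from the two base models.
rf_preds = [1000.0, 1200.0, 900.0]
gb_preds = [1100.0, 1150.0, 950.0]
hybrid = blend_predictions(rf_preds, gb_preds)
```

With equal weights the blend is the simple mean of the two forecasts. In practice the weight would be chosen on a validation set rather than fixed in advance.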
RESEARCH INSTRUMENT
The study employs quantitative analysis through predictive modeling techniques to forecast energy consumption in an educational institution. The key research instruments used include:
PERIOD OF THE STUDY
The study will be conducted over a 6-month period, starting with data preparation and pilot modelling in the first 2 months. The next 3 months will focus on model development, testing, and validation. The final month will be dedicated to result interpretation, report writing, and recommendations.
ANALYSIS AND INTERPRETATION
DATA DISTRIBUTION & RESIDUALS
A histogram was plotted to analyse the distribution of residuals. The residuals were found to follow a near-normal distribution, suggesting the model's predictions were unbiased.
Fig 6.5.1
Fig 6.5.2
The histogram shows that most power consumption values fall between 5,000 and 20,000 units, indicating a right-skewed distribution. A few extreme values (above 60,000) represent rare but significant anomalies or surges. The sharp drop-off after the peak suggests consistent consumption patterns with occasional high outliers.
Fig 6.5.3
The residual distribution is heavily right-skewed, with most residuals concentrated near zero, indicating a generally good model fit. However, the presence of long positive tails suggests a few instances where the model underpredicted significantly.
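The skew described above can be quantified rather than read off the histogram. The sketch below computes residuals as actual minus predicted (so positive residuals are underpredictions) and the moment-based skewness coefficient. The function names are illustrative, and nothing here reproduces the study's data.

```python
def residuals(actual, predicted):
    """Residual = actual - predicted; positive values mean the model underpredicted."""
    return [a - p for a, p in zip(actual, predicted)]


def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0
    return m3 / m2 ** 1.5
```

A value near zero indicates a roughly symmetric residual distribution, while a clearly positive value indicates a long right tail of underpredictions.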
DESCRIPTIVE STATISTICS
Descriptive statistics summarize the main characteristics of the dataset, providing insights into central tendency (mean, median) and dispersion (standard deviation, range).
Fig 6.6.1
Fig 6.6.2
The descriptive statistics show that POWERHOUSE_1_C_BLOCK has the highest average power consumption (mean: 333.01), while POWERHOUSE_1_D_BLOCK has the lowest (mean: 151.98). All blocks exhibit significant variability, with occasional extreme values like the maximum of 2645.63 in A_BLOCK and 2226.48 in B_BLOCK, indicating potential anomalies or peak usage periods.
Fig 6.6.3
The dataset contains no missing values across any of the listed variables, ensuring completeness.
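The summary measures and the completeness check in this section can be sketched with Python's standard `statistics` module. The readings below are hypothetical, and representing a missing reading as `None` is an assumption about how the data would be loaded.

```python
import statistics


def describe(values):
    """Summarize one block's readings; None marks a missing reading (assumed encoding)."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "std": statistics.stdev(present),  # sample standard deviation
        "min": min(present),
        "max": max(present),
        "range": max(present) - min(present),
    }


# Hypothetical readings for a single block, including one extreme value.
readings = [150.0, 160.0, 155.0, 900.0, 145.0]
summary = describe(readings)
```

In this example the mean sits well above the median because of the single extreme reading, which is the same pattern the block-level maxima in the text point to.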
TIME SERIES AND PATTERN ANALYSIS
A line plot of actual and predicted energy consumption was generated. The hybrid model's predictions closely followed the actual consumption pattern, indicating that it captured the temporal structure of the series.
Fig 6.8.1
Fig 6.8.2
The time series plot shows mostly stable energy consumption with periodic fluctuations from November 2023 to April 2024. There are significant spikes in power usage observed around mid-2024, especially between May and September.
Fig 6.8.3
Fig 6.8.4
To better understand the underlying patterns in the energy consumption data, a time series decomposition was performed on the Total Power Consumption. The decomposition separates the series into trend, seasonal, and residual components, which are shown alongside the observed series:
Observed: This is the actual recorded power consumption. It shows significant fluctuations, with occasional sharp spikes indicating sudden surges in usage during certain days.
Trend: The trend line reveals the overall direction of energy consumption over time. A gradual increase can be observed in the first half of the period, followed by a slight dip towards the end, suggesting seasonal or operational changes in energy demand.
Seasonal: This component displays repeated cyclical patterns across the time series. These cycles correspond to weekly operational routines within the institution, such as weekday vs. weekend activity, and highlight the periodic nature of energy usage.
Residual: The residual plot shows the irregular variations or noise remaining after removing the trend and seasonality. While most values remain close to zero, occasional spikes indicate abnormal or unplanned consumption events, which may require further investigation.
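The decomposition described above can be sketched as a classical additive decomposition with a weekly period. The study does not name the method or library it used, so the centered moving average, the period of 7, and the function name below are assumptions for illustration.

```python
def decompose_additive(series, period=7):
    """Classical additive decomposition: observed = trend + seasonal + residual.

    Trend is a centered moving average (odd period only); positions at the
    edges where the window does not fit are left as None.
    """
    n = len(series)
    if period % 2 == 0 or n < 2 * period:
        raise ValueError("need an odd period and at least two full cycles")
    half = period // 2
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(series[i - half:i + half + 1]) / period
    detrended = [s - t if t is not None else None for s, t in zip(series, trend)]
    # Average the detrended values at each position within the cycle.
    means = []
    for pos in range(period):
        vals = [detrended[i] for i in range(pos, n, period) if detrended[i] is not None]
        means.append(sum(vals) / len(vals))
    # Center the seasonal effects so they sum to zero over one cycle.
    offset = sum(means) / period
    seasonal = [means[i % period] - offset for i in range(n)]
    resid = [s - t - c if t is not None else None
             for s, t, c in zip(series, trend, seasonal)]
    return trend, seasonal, resid
```

Applied to daily totals, large values in the residual component would flag the abnormal consumption events mentioned above, since the weekly routine and the slow-moving trend have already been removed.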
MODEL DEPLOYMENT
The deployment phase bridges the gap between model development and real-world application. For this study, the hybrid model, which combines Random Forest and Gradient Boosting, was deployed to enable real-time prediction of energy consumption in educational institutions. The deployed model also supports proactive decision-making: by continuously monitoring consumption patterns and forecasting future usage, the system can generate timely alerts during peak load conditions, helping facilities management take corrective action to reduce energy wastage. Integration with an interactive platform such as Streamlit makes the model accessible to non-technical stakeholders, while hyperparameter tuning with scikit-learn's GridSearchCV allows the model to be re-tuned as it is extended to larger campus networks. This deployment marks a significant step toward data-driven energy management in educational institutions.
Streamlit: A Python-based web framework, Streamlit was used to develop an interactive prediction interface. It allows users to input features (e.g., block power readings, day type, load indicators) and instantly view predicted power consumption values.
Fig 6.11.1
Fig 6.11.2 (Link: http://localhost:8501/)
The above image shows the real-time energy consumption prediction system built with Streamlit.
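The peak-load alerts mentioned in the deployment description could be generated with logic along the following lines. This is a minimal sketch: the threshold, the labels, and the message format are hypothetical, and the deployed application's actual alerting logic is not described in the text.

```python
def peak_load_alerts(forecasts, threshold):
    """Return one alert message per forecast that exceeds the threshold.

    forecasts: list of (label, predicted_consumption) pairs.
    """
    return [
        f"ALERT {label}: forecast {value:.1f} exceeds threshold {threshold:.1f}"
        for label, value in forecasts
        if value > threshold
    ]


# Hypothetical forecasts and threshold.
alerts = peak_load_alerts([("Mon", 12000.0), ("Tue", 21000.0)], threshold=20000.0)
```

Keeping the alert rule in a plain function separates it from the user interface, so the same check could be reused by a dashboard or a scheduled job.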
FINDINGS
Fig 7.1.2
The above graph compares energy consumption on weekdays with energy consumption on weekends.
Fig 7.1.3
Fig 7.1.4
The above graph highlights the weekends and the energy consumed on those days.
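The weekday and weekend comparison shown in these figures can be reproduced from daily totals with a few lines of Python. The sketch assumes Saturday and Sunday count as the weekend, which may not match the institution's working calendar, and the values are hypothetical.

```python
from datetime import date
from statistics import mean


def weekday_weekend_means(daily_totals):
    """Average daily consumption for weekdays (Mon-Fri) and weekends (Sat-Sun)."""
    weekday = [v for d, v in daily_totals.items() if d.weekday() < 5]
    weekend = [v for d, v in daily_totals.items() if d.weekday() >= 5]
    return {
        "weekday": mean(weekday) if weekday else None,
        "weekend": mean(weekend) if weekend else None,
    }


# Hypothetical daily totals.
daily = {
    date(2024, 1, 1): 12000.0,  # Monday
    date(2024, 1, 2): 13000.0,  # Tuesday
    date(2024, 1, 6): 6000.0,   # Saturday
    date(2024, 1, 7): 5000.0,   # Sunday
}
comparison = weekday_weekend_means(daily)
```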
SUGGESTIONS
Integrating the model's forecasts with campus energy control systems would enable the automation of energy-saving measures, such as dynamic lighting control, HVAC load adjustments, and equipment shutdown scheduling, based on forecasted consumption patterns. This real-time feedback loop can significantly reduce manual intervention and ensure proactive load management aligned with anticipated demand.
Deployment of Interactive Dashboards for Operational Monitoring
The development of real-time, user-friendly dashboards, potentially using platforms like Streamlit or Power BI, is strongly advised to facilitate continuous monitoring by facility management teams. These dashboards should display forecasted energy consumption, peak load alerts, and historical trends in a visual format conducive to operational decision-making. Empowering stakeholders with actionable insights can support rapid responses during peak demand periods and promote transparency in energy governance.
Enrichment of Feature Space with Contextual Variables
Future iterations of the model should incorporate a broader range of contextual data—such as meteorological conditions (temperature, humidity), academic calendars (examination periods, holidays, events), and equipment usage logs—to capture latent drivers of energy consumption. The inclusion of such variables is expected to improve model granularity and predictive accuracy, particularly in scenarios influenced by external environmental or institutional factors.
Continuous Learning Through Model Retraining and Adaptation
It is recommended that the machine learning model be periodically retrained with updated datasets to maintain its relevance and accuracy over time. As energy usage patterns evolve due to changes in infrastructure, scheduling, or occupancy, retraining ensures that the model adapts to new consumption behaviors. Establishing a retraining frequency (e.g., quarterly or semester-wise) will help in sustaining model robustness and operational reliability.
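A fixed retraining frequency of the kind recommended above can be enforced with a simple date check. The 90-day interval stands in for "quarterly" and is an assumption, as is the function name.

```python
from datetime import date


def retraining_due(last_trained, today, interval_days=90):
    """True when at least interval_days have passed since the last training run."""
    return (today - last_trained).days >= interval_days
```

For example, `retraining_due(date(2024, 1, 1), date(2024, 4, 1))` evaluates to `True`, since 91 days separate the two dates.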
CONCLUSION
This study successfully demonstrates the application of machine learning techniques to accurately forecast energy consumption in educational institutions using a hybrid model. By leveraging historical energy usage data, meaningful features such as rolling averages, energy per block, and peak load indicators were engineered to enhance prediction quality. The integration of Random Forest and Gradient Boosting models proved highly effective, with the hybrid model achieving exceptional performance metrics (MAE: 144.24, RMSE: 666.34, R²: 0.9984). These results validate the robustness and reliability of the model in capturing complex consumption patterns across multiple blocks on campus. The predictive insights generated by the model provide a solid foundation for data-driven energy management strategies. Institutions can now proactively prepare for high-demand periods, allocate resources more efficiently, and explore demand-side interventions such as load shifting and smart scheduling. In conclusion, this research not only delivers a reliable forecasting solution but also opens avenues for future work, including integrating weather and occupancy data, implementing anomaly detection for energy misuse, and deploying IoT-based real-time data streams to make the system even more responsive and scalable.
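For reference, the three reported metrics can be computed as in the sketch below, using their standard definitions. The sample values are hypothetical and are not the study's data.

```python
import math


def regression_metrics(actual, predicted):
    """Return (MAE, RMSE, R^2) for paired actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Hypothetical values for illustration.
mae, rmse, r2 = regression_metrics(
    [100.0, 200.0, 300.0, 400.0],
    [110.0, 190.0, 310.0, 390.0],
)
```

Because RMSE squares each error before averaging, it is never smaller than MAE and grows faster when a few errors are large. An RMSE well above the MAE, as in the reported results, is what one would expect when a small number of large errors dominate.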