Advances in Consumer Research
Issue 4 : 676-687
Original Article
Predicting Hospital Length of Stay Using Machine Learning and Spatial Analytics: A Large Open Health Dataset
1 PhD Scholar, IIC University of Technology, Cambodia (Enrolment No. FNR210604).
2 Professor at Prin. L. N. Welingkar Institute of Management Development and Research (PGDM), Mumbai.
3 Geomatics Scientist
4 London School of Management Education
Abstract

This mixed-methods study develops and validates machine learning models for hospital length of stay (LOS) and cost prediction, integrates Geographic Information System (GIS) spatial analytics for population-level healthcare governance, and examines cross-national transferability through qualitative validation with Indian healthcare professionals. Analysing 1,048,575 inpatient discharge records from the New York State SPARCS database, Random Forest regression achieved R² = 0.41 for LOS prediction (MAE = 3.18 days, 71.3% accuracy within ±3 days) and R² = 0.79 for cost prediction. Clinical classification variables, principally APR-DRG severity, accounted for 74.61% of feature importance versus 8.21% for demographics, establishing a nine-to-one ratio that identifies classification infrastructure as the binding constraint on prediction capability. Extreme-severity patients had 5.11-fold longer stays than minor cases (15.58 vs. 3.05 days), males had 19% longer LOS than females (6.30 vs. 5.28 days, p < 0.0001), and patients aged 50+ had 68% longer stays. GIS integration extended individual-level predictions to spatial governance, identifying high-burden ZIP code hotspots (~120,000 cases in ZIP 112), a 4.7-fold severity-stratified LOS gradient across geographic units, and a 2-fold county-level cost disparity (New York County ~$41,000 vs. Clinton County ~$20,000), revealing equity gaps invisible to individual-level models. Cost-effectiveness analysis yielded a dominant strategy with a negative ICER of -$2,499.55 per bed-day avoided (ROI: 561,637%). Qualitative validation found that all seven Indian healthcare professionals interviewed were unfamiliar with APR-DRG, exposing a structural classification gap with cascading consequences for reimbursement fairness, hospital benchmarking, and spatial resource allocation.
The findings support phased adoption of severity-adjusted classification frameworks adapted to India’s disease burden, integrated with spatial analytics for district-level healthcare governance.
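The evaluation figures quoted in the abstract (MAE, R², accuracy within ±3 days, and the fold ratios) follow standard definitions, which can be sketched as below. This is a minimal illustration and not the authors' code: the function names and the five-record toy data are assumptions made for the example, and only the group means passed to `fold_ratio` are taken from the abstract.

```python
from statistics import mean


def mae(actual, predicted):
    """Mean absolute error, in the same unit as the inputs (days for LOS)."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def within_tolerance(actual, predicted, tol_days=3):
    """Share of predictions that land within +/- tol_days of the true LOS."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tol_days)
    return hits / len(actual)


def fold_ratio(group_mean, reference_mean):
    """How many times larger one group mean is than a reference mean."""
    return group_mean / reference_mean


# Hypothetical toy data (NOT from SPARCS): true and predicted LOS in days.
actual_los = [2, 5, 9, 14, 3]
predicted_los = [3, 4, 5, 10, 3]

toy_mae = mae(actual_los, predicted_los)
toy_r2 = r_squared(actual_los, predicted_los)
toy_hit_rate = within_tolerance(actual_los, predicted_los)

# Group means reported in the abstract.
severity_fold = fold_ratio(15.58, 3.05)      # extreme vs. minor severity
sex_fold = fold_ratio(6.30, 5.28)            # male vs. female LOS
importance_fold = fold_ratio(74.61, 8.21)    # clinical vs. demographic importance

print(f"MAE={toy_mae:.2f} days, R2={toy_r2:.2f}, within 3 days={toy_hit_rate:.0%}")
print(f"severity {severity_fold:.2f}x, sex {sex_fold:.2f}x, importance {importance_fold:.1f}x")
```

Applying `fold_ratio` to the reported means reproduces the 5.11-fold severity gap, the 19% sex difference, and the roughly nine-to-one importance ratio stated above.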

 

Volume 3, Issue 4
© Copyright Advances in Consumer Research