AI-Driven Socioeconomic Modeling: Income Prediction and Disparity Detection Among U.S. Citizens Using Machine Learning

Advances in Consumer Research

Issue 4: 2694-2709

Original Article

Syed Ali Reza

Md Khalilor Rahman

Md Sazzad Hossain

Md Nazmul Shakir Rabbi⁴

Abdul Quddus Mozumder⁵

Saniah Safat⁶

Maksuda Begum⁷

Md Wasim Ahmed⁸

Md Abdul Ahad⁴

Department of Data Analytics, University of the Potomac (UOTP), Washington, USA

MBA, Business Analytics, Gannon University, Erie, PA, USA

MBA, Business Analytics, Gannon University, Erie, PA, USA

⁴ Master of Science in Information Technology, Washington University of Science and Technology

⁵ Master of Science in Information System Management, Stanton University

⁶ Computer Science and Engineering, The University of Texas at Arlington

⁷ Master of Business Administration, Trine University

⁸ Master of Law, Green University of Bangladesh

Abstract

This study examines how individual socioeconomic factors relate to income levels among U.S. citizens, using a four-stage machine learning framework. The first stage was prediction: we tested several regression models to estimate annual income from features such as education, employment status, debt, and household composition. Each model captured a slightly different aspect of the data, and together they gave a clearer picture of the income landscape. The second stage was refinement: we applied feature engineering, tuned the models, and introduced ensemble methods to surface deeper patterns, especially those arising from interactions among education, housing, and digital access. The third stage shifted the focus to disparity. Using ANOVA and t-tests, we examined how income varies across groups defined by race, gender, region, and marital status. The gaps were real and often persisted even after we controlled for education or job type. The fourth stage used unsupervised clustering to divide the population into distinct socioeconomic profiles. Some clusters revealed vulnerable combinations, such as high debt, unreliable internet access, and unstable work, that do not always raise red flags individually but matter when they occur together. The overall finding is that income is not shaped by one factor at a time. It results from how different parts of a person's life overlap: region and education, debt and family structure, digital access and job opportunities. By combining prediction, diagnostics, and clustering, this approach gives both a close-up and a wide-angle view of how income works. For researchers, it offers a way to move beyond surface-level forecasting. For policymakers, it offers a clearer path to identifying the groups most likely to fall through the cracks.
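To make the disparity stage concrete, the following is a minimal sketch of the one-way ANOVA F statistic, the kind of test the abstract names for comparing income across groups. It is an illustration under stated assumptions, not the authors' implementation: the function name `one_way_anova_f`, the region labels, and the income figures are all hypothetical, and only the Python standard library is used.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA.

    `groups` maps a group label to a list of income observations.
    Hypothetical helper for illustration; not taken from the study.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = fmean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Between-group sum of squares: distance of each group mean
    # from the grand mean, weighted by group size.
    ss_between = sum(
        len(values) * (fmean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Within-group sum of squares: spread of observations around
    # their own group mean.
    ss_within = sum(
        (x - fmean(values)) ** 2
        for values in groups.values()
        for x in values
    )

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up annual incomes (USD thousands) by region; NOT data from the study.
incomes_by_region = {
    "Northeast": [62, 58, 71, 66],
    "South": [48, 52, 45, 50],
    "West": [60, 64, 57, 69],
}

f_stat, df_b, df_w = one_way_anova_f(incomes_by_region)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value indicates that differences between group means are large relative to the variation within groups. This sketch stops at the statistic; in practice a library routine such as `scipy.stats.f_oneway` would also report the p-value.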

Keywords

Income Prediction

Socioeconomic Disparities

Random Forest

XGBoost

Clustering

PCA

Ensemble Learning

Feature Engineering

U.S. Demographics

ANOVA
