With accelerating urbanization, shared bicycles have become an indispensable component of modern urban transportation systems, playing an important role in improving the utilization of social resources, alleviating traffic pressure, and promoting green travel. However, supply and demand are commonly mismatched in the spatial and temporal distribution of vehicles, which both degrades the user experience and raises operating costs. Accurate prediction of shared bicycle rental demand is therefore crucial for refined operations. This paper presents a high-precision prediction model built with machine learning. The proposed methodology, structured into three modules, begins with detailed exploratory data analysis and comprehensive feature engineering. This includes logarithmic transformation of the target variables, encoding of categorical features, and, critically, the construction of interaction features that capture the complex relationship between variables such as temperature and hour. On this basis, an XGBoost model is used to evaluate feature importance and select an optimal feature subset. Next, a range of machine learning regression models, including linear, tree-based, and gradient boosting models, are systematically compared in a series of experiments. Each model is trained and evaluated under time-series cross-validation, with Root Mean Squared Logarithmic Error (RMSLE) and the coefficient of determination (R²) as the main evaluation metrics. Finally, the hyperparameters of the top-performing models are optimized to further improve prediction accuracy and generalization ability. The results indicate that the tree-based ensemble model LightGBM exhibits superior performance in predicting shared bicycle demand, and that its accuracy improves further after hyperparameter optimization.
This study not only provides an effective demand forecasting solution for shared bicycle operators, but also offers a valuable reference for similar time series forecasting problems.
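The evaluation protocol described above can be illustrated with a minimal sketch, under stated assumptions. The function names `rmsle` and `expanding_window_splits`, the synthetic data, and the ordinary least-squares model are all hypothetical stand-ins chosen for illustration; they are not the paper's code, dataset, or LightGBM model. The sketch shows a log-transformed target, a temperature-by-hour interaction feature, and folds in which training data always precedes test data.

```python
import numpy as np


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error for non-negative counts."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip negative predictions to zero so log1p stays defined.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), 0.0, None)
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))


def expanding_window_splits(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs; training rows always precede test rows."""
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = fold * k
        # The last fold absorbs any remainder rows.
        test_end = n_samples if k == n_splits else train_end + fold
        yield np.arange(train_end), np.arange(train_end, test_end)


# Synthetic hourly data (an assumption for illustration, not real rental records).
rng = np.random.default_rng(0)
n = 240
hour = np.arange(n) % 24
temp = 15 + 10 * np.sin(2 * np.pi * hour / 24) + rng.normal(0, 1, n)
count = np.maximum(0, 5 * temp + 2 * hour + rng.normal(0, 5, n)).round()

# Features: intercept, temperature, hour, and their interaction term.
X = np.column_stack([np.ones(n), temp, hour, temp * hour])
# Log-transform the target; predictions are mapped back with expm1.
y_log = np.log1p(count)

scores = []
for train_idx, test_idx in expanding_window_splits(n, n_splits=4):
    # Least squares stands in for the regression models compared in the paper.
    coef, *_ = np.linalg.lstsq(X[train_idx], y_log[train_idx], rcond=None)
    pred = np.expm1(X[test_idx] @ coef)
    scores.append(rmsle(count[test_idx], pred))

print("RMSLE per fold:", [round(s, 3) for s in scores])
```

Training on the log-transformed target and scoring with RMSLE penalizes relative rather than absolute errors, which suits count data that spans several orders of magnitude between off-peak and peak hours.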
Haipeng Li, Lan Shen, Ying Li, Yanhong Jing, Song Wang, Tian Lv
Shu Shen, Zhao-Qing Wei, Lijuan Sun, Khalida Shaheen Rao, Ruchuan Wang
Yibo Sun, Jiaqi Ma, R. Zhang, Qiongshuai Lyu