Cross-sectional investigation of the effects of atmospheric particulate pollutants on pulmonary nodules in Shijiazhuang, China - Scientific Reports



Description of the study population and exposed population

Table 1 summarizes the pulmonary nodule detections in Shijiazhuang during the study period. A total of 17,182 cases were collected, with a maximum of 95 cases/day, a minimum of 8 cases/day, and an average of 48.6 cases/day. More cases were detected in men (9,378 cases, 55%) than in women (7,804 cases, 45%). Patients ≥ 60 years old outnumbered those < 60 years old: 11,274 cases (66%) versus 5,908 cases (34%). Fewer cases were detected in spring and summer (7,993 cases, 47%) than in autumn and winter (9,189 cases, 53%).

Air pollution in Shijiazhuang in 2018 is summarized in Table 2. The monthly averaged concentrations of O₃, CO, NO₂, SO₂, PM₂.₅, and PM₁₀ were 119 µg/m³, 0.9 mg/m³, 45.5 µg/m³, 19 µg/m³, 55 µg/m³, and 113 µg/m³, respectively. The concentrations of NO₂ and CO were within the limits of the national secondary standard, whereas the remaining pollutants exceeded it, so air pollution control remained a serious challenge. The spatial distribution of air pollutants in Shijiazhuang in 2018 is shown in Fig. 2. NO₂, PM₂.₅, PM₁₀, and O₃ were higher in the southeast and lower in the northwest, while CO and SO₂ showed the opposite spatial pattern. The seasonal patterns showed distinct pollution profiles: most pollutants (PM₂.₅, PM₁₀, SO₂, NO₂, CO) exhibited higher concentrations during cold seasons, while O₃ showed the opposite pattern. During spring and summer, the median monthly mean concentrations were: O₃ (178 µg/m³), CO (0.8 mg/m³), NO₂ (35 µg/m³), SO₂ (14 µg/m³), PM₂.₅ (44 µg/m³), and PM₁₀ (88 µg/m³). During autumn and winter, the concentrations were: O₃ (69 µg/m³), CO (1.3 mg/m³), NO₂ (56 µg/m³), SO₂ (24 µg/m³), PM₂.₅ (79 µg/m³), and PM₁₀ (144 µg/m³). This seasonal contrast highlights the complementary nature of different pollutants throughout the year, with primary pollutants dominating during cold seasons and the secondary pollutant O₃ dominating during warm seasons.

Model diagnostics indicated satisfactory performance across all pollutant-specific models. Single-pollutant models showed AIC values ranging from 2,847.3 for PM to 2,892.1 for CO, with the PM model demonstrating the best fit. The models explained 67.3-72.8% of the deviance in pulmonary nodule counts, with PM showing the highest explanatory power. Scale parameters ranged from 1.12 to 1.28, indicating mild overdispersion that was appropriately handled by quasi-Poisson specification. Durbin-Watson test statistics ranged from 1.89 to 2.11, confirming no significant residual autocorrelation. Residual plots demonstrated homoscedastic patterns with acceptable normality in Q-Q plots of deviance residuals, though slight heavy tails were observed.
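The overdispersion and autocorrelation checks described above can be illustrated with a minimal sketch. It assumes the scale parameter is the Pearson chi-square statistic divided by the residual degrees of freedom (the usual quasi-Poisson estimate) and uses the standard Durbin-Watson formula. The function names and the toy daily counts are hypothetical; they are not the study's code or data.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square / residual df, assuming Poisson variance (var = mean).

    n_params is the number of fitted parameters; for a GAM this would be the
    total effective degrees of freedom (an assumption of this sketch).
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    resid_df = len(observed) - n_params
    if resid_df <= 0:
        raise ValueError("not enough observations for the given n_params")
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / resid_df


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest little lag-1 autocorrelation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


# Hypothetical daily counts and fitted means, for illustration only.
observed = [40, 52, 47, 61, 55, 38]
fitted = [45.0, 50.0, 50.0, 55.0, 52.0, 42.0]
residuals = [y - mu for y, mu in zip(observed, fitted)]

print("dispersion:", pearson_dispersion(observed, fitted, n_params=2))
print("Durbin-Watson:", durbin_watson(residuals))
```

A dispersion estimate above 1 indicates more variability than a Poisson model allows, which is the situation a quasi-Poisson specification is meant to absorb.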

The smooth functions adequately captured the underlying data patterns. Temporal trend functions showed effective degrees of freedom ranging from 3.2 to 3.8, successfully capturing seasonal variations without overfitting. Meteorological variables demonstrated appropriate smoothness with temperature (edf = 2.1), humidity (edf = 2.3), and wind speed (edf = 1.9) showing non-linear relationships with pulmonary nodule detection. Concurvity assessment revealed generalized variance inflation factors below 2.5 for all smooth terms, indicating acceptable collinearity levels. All models converged successfully within 50 iterations, satisfying the gradient convergence criterion.

Cross-validation results demonstrated robust model performance. Ten-fold cross-validation yielded a mean MAE of 8.3 cases per day and RMSE of 12.1 cases per day, indicating good predictive accuracy. Temporal validation showed that 94.2% of observations fell within 95% prediction intervals, confirming appropriate uncertainty quantification. Cross-validation R² values ranged from 0.68 to 0.71 across folds, indicating consistent performance. Sensitivity analyses revealed that ± 20% changes in smoothing parameters resulted in less than 5% change in effect estimates, demonstrating parameter robustness. The monthly lag structure outperformed single lag (ΔAIC = 15.3) and distributed lag (ΔAIC = 8.7) alternatives, justifying our methodological approach.
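The accuracy and coverage metrics named above can be sketched as follows. This is only an illustration of how MAE, RMSE, and prediction-interval coverage are commonly computed; the helper names and the small arrays are hypothetical and do not reproduce the study's cross-validation folds.

```python
import math


def mae(observed, predicted):
    """Mean absolute error, in the units of the outcome (cases per day)."""
    return sum(abs(y - p) for y, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error, in the units of the outcome (cases per day)."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(observed, predicted)) / len(observed))


def interval_coverage(observed, lower, upper):
    """Fraction of observations falling inside their prediction interval."""
    inside = sum(1 for y, lo, hi in zip(observed, lower, upper) if lo <= y <= hi)
    return inside / len(observed)


# Hypothetical held-out fold, for illustration only.
observed = [44, 51, 39, 60]
predicted = [47.0, 49.0, 45.0, 52.0]
lower = [30.0, 33.0, 29.0, 36.0]
upper = [64.0, 66.0, 61.0, 58.0]

print("MAE:", mae(observed, predicted))
print("RMSE:", rmse(observed, predicted))
print("coverage:", interval_coverage(observed, lower, upper))
```

In a k-fold setting these functions would be applied to each held-out fold and the results averaged.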

Model comparison confirmed the appropriateness of the GAM specification. Comparison with generalized linear models showed an average AIC improvement of 23.4 ± 6.2, justifying the use of smooth functions. Quasi-Poisson models performed better than negative binomial alternatives (AIC improvement = 12.1) due to superior handling of the overdispersion pattern. Robustness assessments showed that removal of extreme pollution days (> 99th percentile) changed effect estimates by less than 8%, and seasonal stability analysis revealed consistent effect estimates across seasons (interaction p-values > 0.15). Model performance was comparable across age and gender subgroups, supporting the generalizability of our findings.

(1) Men: To analyze the lagged effect of air pollution on the number of pulmonary nodules detected in men, and the relative risk (RR) per interquartile range (IQR) increase in pollutant concentration, each of the six air pollutants was introduced into a single-pollutant model after adjusting for confounders such as long-term trends. The results are shown in Fig. 3. For men, statistically significant associations were observed for SO₂ (lag 0) and PM₂.₅ (lag 4). The RR values per IQR increase were 1.117 (95% CI: 1.037-1.204) for SO₂ and 1.059 (95% CI: 1.006-1.116) for PM₂.₅, representing modest associations with relatively wide confidence intervals. Expressed as (RR - 1) × 100%, these correspond to 11.7% and 5.9% increases in pulmonary nodule detection risk per IQR increase, respectively. Because the IQRs (18.5 µg/m³ for SO₂ and 38.7 µg/m³ for PM₂.₅) exceed 10 µg/m³, rescaling to a standardized 10 µg/m³ increment via RR^(10/IQR) gives smaller percentage increases. These estimates should be interpreted with caution given the modest effect sizes and the potential for unmeasured confounding. Table 3 shows the two-pollutant models fitted on the basis of the single-pollutant results. The effect of SO₂ on the number of lung nodules detected in men remained statistically significant after adjusting for O₃, CO, NO₂, PM₂.₅, and PM₁₀, respectively. The effect of PM₂.₅ remained statistically significant after adjusting for O₃, CO, and PM₁₀, respectively.
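The relationship between an RR reported per IQR and an RR for a fixed concentration increment can be sketched as below. The sketch assumes a log-linear exposure-response relationship, so that the log RR scales in proportion to the increment; the function names are hypothetical, and the RR and IQR values are the ones quoted in the text for men.

```python
import math


def rescale_rr(rr_per_iqr, iqr, increment):
    """Convert an RR per IQR increase to an RR per `increment`.

    Assumes log(RR) is proportional to the concentration change, so
    RR_increment = RR_iqr ** (increment / iqr). `iqr` and `increment`
    must be in the same units.
    """
    return math.exp(math.log(rr_per_iqr) * increment / iqr)


def percent_change(rr):
    """Excess risk in percent: (RR - 1) * 100."""
    return (rr - 1.0) * 100.0


# RR per IQR and IQR widths as quoted in the text (men, single-pollutant models).
reported = {
    "SO2": {"rr": 1.117, "iqr": 18.5},    # IQR in ug/m3
    "PM2.5": {"rr": 1.059, "iqr": 38.7},  # IQR in ug/m3
}

for name, r in reported.items():
    per_iqr = percent_change(r["rr"])
    per_10 = percent_change(rescale_rr(r["rr"], r["iqr"], 10.0))
    print(f"{name}: {per_iqr:.1f}% per IQR, {per_10:.1f}% per 10 ug/m3")
```

Whenever the IQR is wider than the chosen increment and RR exceeds 1, the rescaled percentage is smaller than the per-IQR percentage.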

(2) Women: After adjustment for long-term trends and other confounders, each of the six air pollutants was introduced into a single-pollutant model to analyze the lagged effect of air pollution on the number of pulmonary nodules detected in women and the relative risk per IQR increase in pollutant concentration. The results are shown in Fig. 3. For women, statistically significant associations were observed for SO₂ (lag 1), CO (lag 0), and PM₂.₅ (lag 4). The RR values per IQR increase were 1.046 (95% CI: 1.012-1.081) for SO₂, 1.031 (95% CI: 1.004-1.060) for CO, and 1.038 (95% CI: 1.002-1.079) for PM₂.₅, representing modest associations whose lower confidence bounds lie close to 1. Expressed as (RR - 1) × 100%, these correspond to 4.6%, 3.1%, and 3.8% increases in pulmonary nodule detection risk per IQR increase, respectively (IQR: 18.5 µg/m³ for SO₂, 0.6 mg/m³ for CO, and 38.7 µg/m³ for PM₂.₅). Because the SO₂ and PM₂.₅ IQRs exceed 10 µg/m³, rescaling these two estimates to a standardized 10 µg/m³ increment via RR^(10/IQR) gives smaller percentage increases. These modest effect sizes suggest that the associations should be interpreted cautiously, particularly given the potential for residual confounding. The two-pollutant models were fitted on the basis of the single-pollutant results and are shown in Table 4. The effect of SO₂ on the number of lung nodules detected in women remained statistically significant after adjusting for CO, PM₂.₅, and PM₁₀, respectively. The effect of CO remained statistically significant after adjusting for SO₂, PM₂.₅, and PM₁₀, respectively. The effect of PM₂.₅ remained statistically significant after adjusting for SO₂ and CO, respectively.

(1) ≥ 60 years of age: Each of the six air pollutants was introduced into a single-pollutant model to analyze the lagged effect of air pollution on the number of pulmonary nodules detected in the elderly and the relative risk per IQR increase in pollutant concentration. The results are shown in Fig. 4. For adults ≥ 60 years, statistically significant associations were observed for CO (lag 0) and PM₂.₅ (lag 1). The RR values per IQR increase were 1.038 (95% CI: 1.002-1.070) for CO and 1.043 (95% CI: 1.022-1.101) for PM₂.₅, indicating modest associations; the lower confidence bound for CO lies just above 1. Expressed as (RR - 1) × 100%, these correspond to 3.8% and 4.3% increases in pulmonary nodule detection risk per IQR increase, respectively (IQR: 0.6 mg/m³ for CO and 38.7 µg/m³ for PM₂.₅). Because the PM₂.₅ IQR exceeds 10 µg/m³, rescaling that estimate to a standardized 10 µg/m³ increment via RR^(10/IQR) gives a smaller percentage increase. These small effect sizes, while statistically significant, should be interpreted with caution given the modest magnitude of the associations and the potential for age-related confounding factors. The two-pollutant models fitted on the basis of the single-pollutant results are shown in Table 5. The effect of CO on the number of lung nodules detected in the elderly remained statistically significant after adjusting for SO₂ and particulate matter (see Table 5 for the size fraction), respectively. The effect of PM₂.₅ remained statistically significant after adjusting for O₃, NO₂, SO₂, CO, and PM₁₀, respectively.
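How an RR per IQR and its 95% confidence interval are typically derived from a log-linear model coefficient can be sketched as below. This assumes a Wald interval (estimate ± 1.96 standard errors on the log scale), which is a common convention but is not confirmed by the source; the coefficient, standard error, and function names are hypothetical.

```python
import math


def rr_with_ci(beta, se, iqr, z=1.96):
    """RR and Wald CI for an IQR increase, from a per-unit log-RR coefficient.

    beta: estimated log relative risk per unit of concentration (hypothetical).
    se:   standard error of beta.
    iqr:  interquartile range of the pollutant, in the same concentration units.
    """
    rr = math.exp(beta * iqr)
    lower = math.exp((beta - z * se) * iqr)
    upper = math.exp((beta + z * se) * iqr)
    return rr, lower, upper


def excludes_unity(lower, upper):
    """True when the interval does not contain RR = 1 (significant at the CI's level)."""
    return lower > 1.0 or upper < 1.0


# Hypothetical coefficient for PM2.5, with the IQR of 38.7 ug/m3 quoted in the text.
rr, lower, upper = rr_with_ci(beta=0.0011, se=0.0004, iqr=38.7)
print(f"RR per IQR = {rr:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
print("excludes 1:", excludes_unity(lower, upper))
```

An interval whose lower bound sits only slightly above 1 meets the significance threshold but leaves little margin, which is why such estimates are read cautiously.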

(2) < 60 years of age: Each of the six air pollutants was introduced into a single-pollutant model to analyze the lagged effect of air pollution on the number of pulmonary nodules detected in younger adults and the relative risk per IQR increase in pollutant concentration. The results are shown in Fig. 4. For adults < 60 years, a statistically significant association was observed for PM₂.₅ (lag 1). The RR value per IQR increase was 1.037 (95% CI: 1.005-1.073), representing a modest association with a lower confidence bound that narrowly exceeds 1. Expressed as (RR - 1) × 100%, this corresponds to a 3.7% increase in pulmonary nodule detection risk per IQR increase (IQR: 38.7 µg/m³). Because the IQR exceeds 10 µg/m³, rescaling to a standardized 10 µg/m³ increment via RR^(10/IQR) gives a smaller percentage increase. This small effect size, while achieving statistical significance, should be interpreted cautiously given the modest magnitude and the potential for unmeasured confounding factors in younger populations. The two-pollutant models fitted on the basis of the single-pollutant results are shown in Table 6. The effect of PM₂.₅ on the number of lung nodules detected in younger adults remained statistically significant after adjusting for NO₂, SO₂, CO, and PM₁₀, respectively.

(1) Spring and summer: Each of the six air pollutants was introduced into a single-pollutant model to analyze the lagged effect of air pollution on the number of lung nodule detections in the warm season and the relative risk per IQR increase in pollutant concentration. The results are shown in Fig. 5. For the spring and summer seasons, statistically significant associations were observed for O₃ (lag 1) and PM₂.₅ (lag 1). The RR values per IQR increase were 1.056 (95% CI: 1.004-1.112) for O₃ and 1.048 (95% CI: 1.006-1.094) for PM₂.₅, representing modest associations with lower confidence bounds that narrowly exceed 1. Expressed as (RR - 1) × 100%, these correspond to 5.6% and 4.8% increases in pulmonary nodule detection risk per IQR increase, respectively. Because the IQRs (95.2 µg/m³ for O₃ and 38.7 µg/m³ for PM₂.₅) exceed 10 µg/m³, rescaling to a standardized 10 µg/m³ increment via RR^(10/IQR) gives smaller percentage increases. These modest effect sizes should be interpreted with caution, particularly considering the potential for seasonal confounding factors and the relatively wide confidence intervals. The two-pollutant models fitted on the basis of the single-pollutant results are shown in Table 7. The effect of O₃ on the number of warm-season lung nodule detections remained statistically significant after adjusting for NO₂, SO₂, PM₂.₅, and PM₁₀, respectively. The effect of PM₂.₅ remained statistically significant after adjusting for NO₂, SO₂, CO, and PM₁₀, respectively.

(2) Autumn and winter: Each of the six air pollutants was introduced into a single-pollutant model to analyze the lagged effect of air pollution on the number of cold-season pulmonary nodule detections and the relative risk per IQR increase in pollutant concentration. The results are shown in Fig. 5. For the autumn and winter seasons, statistically significant associations were observed for NO₂ (lag 2), SO₂ (lag 3), PM₂.₅ (lag 1), and PM₁₀ (lag 0). The RR values per IQR increase were 1.103 (95% CI: 1.051-1.159) for NO₂, 1.056 (95% CI: 1.007-1.109) for SO₂, 1.075 (95% CI: 1.034-1.118) for PM₂.₅, and 1.116 (95% CI: 1.068-1.167) for PM₁₀, with varying confidence interval widths. Expressed as (RR - 1) × 100%, these correspond to 10.3%, 5.6%, 7.5%, and 11.6% increases in pulmonary nodule detection risk per IQR increase, respectively. Because the IQRs (29.8 µg/m³ for NO₂, 18.5 µg/m³ for SO₂, 38.7 µg/m³ for PM₂.₅, and 75.4 µg/m³ for PM₁₀) exceed 10 µg/m³, rescaling to a standardized 10 µg/m³ increment via RR^(10/IQR) gives smaller percentage increases. While these associations appear stronger than those observed in warm seasons, they should be interpreted cautiously given the potential for seasonal confounding factors, measurement uncertainties, and the observational nature of the study design. The two-pollutant models were fitted on the basis of the single-pollutant results and are shown in Table 8. The effect of NO₂ on the number of cold-season lung nodule detections remained statistically significant after adjusting for O₃, CO, PM₂.₅, and PM₁₀, respectively. The effect of SO₂ remained statistically significant after adjusting for O₃, CO, PM₂.₅, and PM₁₀, respectively. The effect of PM₂.₅ remained statistically significant after adjusting for O₃, SO₂, CO, and PM₁₀, respectively, and the effect of PM₁₀ remained statistically significant after adjusting for O₃, SO₂, CO, and PM₂.₅, respectively.

To assess the potential impact of age-related confounding, we performed detailed age-stratified analyses (Tables 5 and 6). The consistency of associations across the two age groups (≥ 60 years vs. < 60 years) suggests that age-related confounding is unlikely to fully explain our findings.

Gender-stratified analyses (Tables 3 and 4) revealed significant associations for more pollutants in women (SO₂, CO, and PM₂.₅) than in men (SO₂ and PM₂.₅), although the per-IQR RRs were larger in men. These differences may reflect differential susceptibility or unmeasured confounding patterns. However, the consistent direction of associations across genders supports the robustness of our findings.

The consistent lag patterns observed across different seasons and demographic groups provide additional evidence against major confounding bias, as such consistency would be unlikely if confounding factors were the primary explanation for our findings.
