
AI-enabled left atrial volumetry in coronary artery calcium scans (AI-CAC) predicts atrial fibrillation as early as one year, improves CHARGE-AF, and outperforms NT-proBNP:

The multi-ethnic study of atherosclerosis


Morteza Naghavi, David Yankelevitz, Anthony P. Reeves, Matthew J. Budoff, Dong Li, Kyle Atlas, Chenyu Zhang, Thomas L. Atlas, Seth Lirette, Jakob Wasserthal, Sion K. Roy, Claudia Henschke, Nathan D. Wong, Christopher Defilippi, Susan R. Heckbert, Philip Greenland


Abstract

Background

Coronary artery calcium (CAC) scans contain actionable information beyond CAC scores that is not currently reported.

Methods

We applied artificial intelligence-enabled automated cardiac chambers volumetry (AI-CAC™) to the CAC scans of 5535 asymptomatic individuals (52.2% women, ages 45–84) that were previously obtained for CAC scoring in the baseline examination (2000–2002) of the Multi-Ethnic Study of Atherosclerosis (MESA). AI-CAC took on average 21 s per CAC scan. We used the 5-year outcomes data for incident atrial fibrillation (AF) and compared discrimination, using the time-dependent area under the curve (AUC), of AI-CAC left atrial (LA) volume against known predictors of AF: the CHARGE-AF Risk Score and NT-proBNP. The mean follow-up time to an AF event was 2.9 ± 1.4 years.

Results

At 1, 2, 3, 4, and 5 years of follow-up, 36, 77, 123, 182, and 236 cases of AF were identified, respectively. The AUC for AI-CAC LA volume was significantly higher than that for CHARGE-AF at Years 1, 2, and 3 (0.83 vs. 0.74, 0.84 vs. 0.80, and 0.81 vs. 0.78, respectively, all p < 0.05) but similar at Years 4 and 5. It was significantly higher than that for NT-proBNP at Years 1–5 (all p < 0.01), but not higher than that for combined CHARGE-AF and NT-proBNP at any year. AI-CAC LA significantly improved the continuous Net Reclassification Index for prediction of AF over years 1–5 when added to CHARGE-AF Risk Score (0.60, 0.28, 0.32, 0.19, 0.24), and NT-proBNP (0.68, 0.44, 0.42, 0.30, 0.37) (all p < 0.01).

Conclusion

AI-CAC LA volume enabled prediction of AF as early as one year and significantly improved on risk classification of CHARGE-AF Risk Score and NT-proBNP.

Keywords

Coronary artery calcium, Atrial fibrillation, Left atrial volume, Artificial intelligence, CHARGE-AF, NT-proBNP


1. Introduction

Coronary artery calcium (CAC) scoring is the strongest predictor of atherosclerotic cardiovascular disease (ASCVD) in asymptomatic individuals available today. However, it is a weak predictor of atrial fibrillation (AF), the most common sustained arrhythmia, which significantly increases the risk of stroke and cardiovascular mortality. Incident AF is on the rise, leading to morbidity and mortality worldwide, both in the elderly and among younger adults. The CHARGE-AF Risk Score is a widely recognized risk model used to predict the 5-year risk of AF in asymptomatic populations. CHARGE-AF is an epidemiological risk calculator created based on both asymptomatic people and patients with cardiovascular disease (CVD) from the Atherosclerosis Risk in Communities (ARIC) and Framingham Heart Study (FHS) cohorts. Amino-terminal pro-B-type natriuretic peptide (NT-proBNP) is a blood protein that is associated with enlarged cardiac chambers and correlates with left atrial (LA) volume. Recent studies have linked NT-proBNP to the incidence of AF and reported incremental predictive value when NT-proBNP is added to the CHARGE-AF Risk Score.

Left atrial diameter and strain are known to be associated with the risk of developing atrial fibrillation, and pioneering efforts from the Heinz Nixdorf Recall Study showed the potential value of non-coronary findings in CAC scans. We therefore hypothesized that AI-powered cardiac chambers volumetry in CAC scans (AI-CAC) could enable AF prediction in asymptomatic individuals. In this study, we present AI-CAC data obtained from existing CAC scans in a large prospective study and compare the predictive value of AI-CAC estimated LA volume versus the CHARGE-AF Score and NT-proBNP for predicting AF.


2. Methods

2.1. Study population

The Multi-Ethnic Study of Atherosclerosis (MESA) is a prospective, population-based, observational cohort study of 6,814 men and women without clinical cardiovascular disease (CVD) at the time of recruitment from six field centers in the United States. As part of the initial evaluation (2000–2002), participants received a comprehensive medical history, clinic examination, and laboratory tests. Demographic information, medical history, and medication use at baseline were obtained by self-report. An ECG-gated non-contrast CT was performed at the baseline examination to measure CAC. Non-CT scan covariates included NT-proBNP and variables used in calculating the CHARGE-AF Risk Score. Details on NT-proBNP assay measurements are described below under NT-proBNP Measurement. The covariates used in the CHARGE-AF Score for our analyses are age, gender, ethnicity, height, weight, systolic blood pressure, diastolic blood pressure, current smoking, hypertension medication, and diabetes, all of which were obtained as part of the previously described MESA baseline exam 1. Additionally, the CHARGE-AF Risk Score includes myocardial infarction and heart failure, which were by default absent in the asymptomatic MESA population at baseline exam 1.

A total of 771 MESA participants who did not consent to commercial use of their data were removed, leaving 6043 participants for analysis. After removing 125 cases with missing slices in CAC scans, 4 cases with missing data for the CHARGE-AF Risk Score, and 168 cases with missing NT-proBNP values, there were 5746 remaining participants. Subsequently, 70 cases with pre-baseline AF, 9 cases with surgical AF, and 132 non-AF deaths were removed, resulting in a total of 5535 cases available for analysis. The 125 cases with missing slices were 49.8% male and 50.2% female. None of these cases had a diagnosis of AF. These cases were random, and investigations did not reveal any association with dependent or independent variables in our study.
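The participant flow described above can be checked with simple arithmetic. The short sketch below uses only the counts reported in this section; the variable names are illustrative and not part of the study.

```python
# Participant flow reported in Section 2.1 (all counts are taken from the text).
enrolled = 6814
no_commercial_consent = 771
consented = enrolled - no_commercial_consent

# Exclusions for incomplete data.
data_exclusions = {
    "missing CT slices": 125,
    "missing CHARGE-AF inputs": 4,
    "missing NT-proBNP": 168,
}
after_data_checks = consented - sum(data_exclusions.values())

# Exclusions for clinical reasons.
clinical_exclusions = {
    "AF before baseline": 70,
    "surgical AF": 9,
    "non-AF death": 132,
}
analytic_cohort = after_data_checks - sum(clinical_exclusions.values())

print(consented, after_data_checks, analytic_cohort)  # prints: 6043 5746 5535
```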

2.2. Outcomes

Participants were contacted by telephone every 9–12 months during follow-up and asked to report all new cardiovascular diagnoses. International Classification of Disease (ICD) codes were obtained. Incident AF was identified by ICD codes 427.3x (version 9) or I48.x (version 10) from inpatient stays and, for participants enrolled in fee-for-service Medicare, from Medicare claims for outpatient and provider services. For participant reports of heart failure, coronary heart disease, stroke, and CVD mortality, detailed medical records were obtained, and diagnoses were adjudicated by the MESA Morbidity and Mortality Committee. Additionally, NT-proBNP data were obtained from the MESA core laboratory for MESA exam 1 participants. A detailed study design for MESA has been published elsewhere. MESA participants have been followed since the year 2000. Incident AF has been identified through December 2018. The 70 cases with AF diagnosed prior to MESA enrollment were removed from the analysis.

2.3. The AI tool for automated cardiac chambers volumetry

The automated cardiac chambers volumetry component of AI-CAC™ in this study is called AutoChamber™ (HeartLung.AI, Houston, TX), a deep learning model that used TotalSegmentator as the base input and was further developed to segment not only each of the four cardiac chambers (LA, left ventricle (LV), right atrium (RA), and right ventricle (RV)) but also the ascending aorta, aortic root and valves, pulmonary arteries, and several other components that are not presented here. The AI-CAC LA volumetry is the focus of this manuscript. Fig. 1 shows the AutoChamber segmentations of enlarged LA along with other cardiac chambers in two cases who developed AF. The base architecture of the TotalSegmentator model was trained on 1139 whole body cases with 447 cases of coronary CT angiography (CCTA) using nnU-Net, a self-configuring method for deep learning-based biomedical image segmentation. The initial input training data were matched non-contrast and contrast-enhanced ECG-gated cardiac CT scans with 1.5 mm slice thickness. Because the images were taken from the same patients in the same session, registration was done with good alignment. Following this transfer of segmentations, an nnU-Net deep learning tool was used for training the model. Additionally, iterative training was implemented whereby human supervisors corrected errors made by the model, and the corrected data were used to further train the model, leading to improved accuracy. To standardize the comparison in MESA, cardiac chamber volumes were reported by gender and ethnicity and adjusted by body surface area (BSA) using residual adjustment techniques (BSA (m²) = 0.007184 × height(cm)^0.725 × weight(kg)^0.425). Additionally, an internal reference was developed based on the field of view size and the posterior height of thoracic vertebral bones. This measure would be used whenever BSA information is unavailable; however, this was not an issue in MESA.
AutoChamber™ AI was run on the 6043 non-contrast CAC scans from participants who consented to commercial data usage, out of the 6814 scans available in MESA exam 1. Expert rules built into the AI model excluded 125 cases due to missing slices in image reconstruction created by some of the electron beam CT scanners used in the MESA baseline. These cases were random, and our investigations did not reveal any particular association with dependent or independent variables in our study (see Section 2.1).
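As an illustration of the body surface area adjustment mentioned above, the following minimal sketch computes BSA with the coefficients quoted in this section. It assumes the standard Du Bois convention of height in centimeters and weight in kilograms; the function name is hypothetical and is not part of AutoChamber.

```python
def bsa_du_bois(height_cm: float, weight_kg: float) -> float:
    """Body surface area in square meters (Du Bois formula).

    Assumption: height is given in centimeters and weight in kilograms.
    """
    return 0.007184 * height_cm ** 0.725 * weight_kg ** 0.425


# Example: a 170 cm, 70 kg adult has a BSA of roughly 1.8 square meters.
example_bsa = bsa_du_bois(170.0, 70.0)
print(round(example_bsa, 2))
```

This value is close to the mean BSA of about 1.9 reported for the cohort in Table 1.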


Fig. 1. Two case examples of AI-CAC detection of enlarged LA, along with segmentations of all cardiac chambers, in a coronary artery calcium CT scan. Both cases fall in the top quartile of LA volume and were flagged by the AutoChamber component of AI-CAC for further investigation. Case 1 developed atrial fibrillation (AFib), while Case 2 developed AFib and stroke.


2.4. CHARGE-AF risk score

The CHARGE-AF risk score was developed to predict the 5-year risk of incident AF in three American cohorts and was validated in two European cohorts. The linear predictor from the CHARGE-AF Risk Score is calculated as: (age in years/5) × 0.5083 + ethnicity (Caucasian/white) × 0.46491 + (height in centimeters/10) × 0.2478 + (weight in kg/15) × 0.1155 + (SBP in mm Hg/20) × 0.1972 − (DBP in mm Hg/10) × 0.1013 + current smoking × 0.35931 + antihypertensive medication use × 0.34889 + DM × 0.23666. The result is the sum of the products of the regression coefficients and the predictor variables; each coefficient represents the change in the log hazard for a one-unit change in the corresponding scaled predictor variable.
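The linear predictor above can be sketched as a small function. The coefficients are those quoted in this section; the function and argument names are illustrative, and the myocardial infarction and heart failure terms are omitted because both were absent by design in MESA at baseline.

```python
def charge_af_linear_predictor(
    age_years: float,
    white: bool,
    height_cm: float,
    weight_kg: float,
    sbp_mmhg: float,
    dbp_mmhg: float,
    current_smoker: bool,
    antihypertensive_use: bool,
    diabetes: bool,
) -> float:
    """CHARGE-AF linear predictor using the coefficients quoted in Section 2.4.

    Assumption: no prior myocardial infarction or heart failure, so those
    terms of the full score are left out.
    """
    return (
        (age_years / 5) * 0.5083
        + int(white) * 0.46491
        + (height_cm / 10) * 0.2478
        + (weight_kg / 15) * 0.1155
        + (sbp_mmhg / 20) * 0.1972
        - (dbp_mmhg / 10) * 0.1013
        + int(current_smoker) * 0.35931
        + int(antihypertensive_use) * 0.34889
        + int(diabetes) * 0.23666
    )


# Hypothetical participant, for illustration only.
score = charge_af_linear_predictor(
    age_years=65, white=True, height_cm=170, weight_kg=75,
    sbp_mmhg=130, dbp_mmhg=80, current_smoker=False,
    antihypertensive_use=True, diabetes=False,
)
print(round(score, 2))
```

For this hypothetical participant the predictor is about 12.7, which falls within the range of scores summarized in Table 1.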

2.5. NT-proBNP measurement

Details on NT-proBNP assays used in MESA have been reported. NT-proBNP is more reproducible than BNP at the lower end of the distribution range, and more stable at room temperature. However, both BNP and NT-proBNP are clinically available. Intra-assay and inter-assay coefficients of variation at various concentrations of NT-proBNP have been previously reported. The analytical measurement range for NT-proBNP in exam 1 was 4.9–11699 pg/ml. The lower limit of detection for the NT-proBNP assay is 5 pg/ml; thus, values above 0 and below 4.99 pg/ml were treated as 4.99 pg/ml. Clinically, values are not reported below 4.99 pg/ml because the analytical accuracy is poor at those low levels (i.e., typically a coefficient of variation of greater than 20% between repeat measures).
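The handling of values below the detection limit, combined with the natural-log transform described later under Statistical analysis, might look like the sketch below. The function names are hypothetical, and the treatment of non-positive inputs (raising an error) is an assumption, since the text only specifies values above 0.

```python
import math

LOWER_LIMIT_PG_ML = 4.99  # floor applied to values below the detection limit


def floor_nt_probnp(value_pg_ml: float) -> float:
    """Replace values above 0 and below 4.99 pg/ml with 4.99 pg/ml."""
    if value_pg_ml <= 0:
        # Assumption: non-positive readings are treated as invalid input.
        raise ValueError("NT-proBNP must be positive")
    return max(value_pg_ml, LOWER_LIMIT_PG_ML)


def ln_nt_probnp(value_pg_ml: float) -> float:
    """Natural log of the floored value, as used in the Cox models."""
    return math.log(floor_nt_probnp(value_pg_ml))


print(floor_nt_probnp(2.0), floor_nt_probnp(51.41))  # prints: 4.99 51.41
```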

2.6. Statistical analysis

We used SAS (SAS Institute Inc., Cary, NC) and R-4.3.3 software for statistical analyses. All values are reported as means ± SD except for NT-proBNP, which was not normally distributed and is presented as median and interquartile range (IQR). All tests of significance were two-tailed, and significance was defined at the p < 0.05 level. Confidence intervals are presented at the 95% level.

To evaluate discrimination at various time points, the time-dependent area under the receiver operating characteristic (ROC) curve (AUC) was calculated based on predicted survival probabilities using the inverse probability of censoring weighting estimator without competing risks. Pointwise confidence intervals and bands were derived from 2000 bootstrapped samples and computed from the asymptotic normality of the time-dependent AUC estimator. Standard errors and confidence intervals were estimated based on the independent and identically distributed (iid) representation of the estimator. All variables were modeled continuously. We analyzed data for AF prediction at 1–5 years of follow-up.
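To make the idea of an inverse probability of censoring weighted (IPCW) time-dependent AUC concrete, the following is a minimal NumPy sketch of a cumulative/dynamic estimator. It is not the software used in this study: the function name is hypothetical, ties between event and censoring times are handled naively, and confidence intervals are not computed.

```python
import numpy as np


def ipcw_auc(times, events, marker, t):
    """Cumulative/dynamic AUC at horizon t with IPCW weights (sketch).

    Cases are subjects with an event at or before t; controls are subjects
    still event-free after t. Each case is weighted by 1 / G(T-), where G is
    the Kaplan-Meier estimate of the censoring survival function.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    marker = np.asarray(marker, dtype=float)

    censor_times = np.unique(times[events == 0])

    def censoring_survival_before(s):
        # Kaplan-Meier product over censoring times strictly before s.
        g = 1.0
        for u in censor_times:
            if u < s:
                at_risk = np.sum(times >= u)
                censored = np.sum((times == u) & (events == 0))
                g *= 1.0 - censored / at_risk
        return g

    is_case = (times <= t) & (events == 1)
    is_control = times > t
    weights = np.array([1.0 / censoring_survival_before(s) for s in times[is_case]])
    m_case = marker[is_case]
    m_control = marker[is_control]

    # A pair is concordant when the case has the higher marker; ties count half.
    concordant = (m_case[:, None] > m_control[None, :]) + 0.5 * (
        m_case[:, None] == m_control[None, :]
    )
    return float((weights[:, None] * concordant).sum() / (weights.sum() * m_control.size))


# Toy data (hypothetical): higher marker values go with earlier events.
auc_toy = ipcw_auc(
    times=[1, 2, 3, 4, 5, 6],
    events=[1, 1, 0, 1, 0, 0],
    marker=[0.9, 0.8, 0.3, 0.7, 0.2, 0.1],
    t=4,
)
print(auc_toy)
```

The weight for controls is constant at a given horizon and cancels out, which is why only cases carry explicit weights here.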

Hazard ratios for 5-year incident AF were calculated per SD increase using Cox proportional hazards regression. NT-proBNP and CAC were natural logarithm-transformed (ln-transformed) to avoid undue influence of large values and to improve the interpretability of hazard ratios. AI-CAC LA volume and CHARGE-AF Risk Score showed a normal distribution. CHARGE-AF score was omitted from the adjustment model due to its derivation from established risk factors.

The category-free (continuous) net reclassification index (NRI) was calculated as the sum of two differences: the proportion of upward minus downward reclassifications among AF events, and the proportion of downward minus upward reclassifications among AF non-events. NRI was developed as a statistical measure to evaluate the improvement in risk prediction models when additional variables are incorporated into a base model. Confidence intervals for NRI are presented at the 95% level.
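The category-free NRI can be illustrated with a few lines of NumPy. This sketch assumes a binary event indicator at a fixed horizon and ignores censoring, which a time-specific analysis such as the one in this study would need to handle; the function name and the toy data are hypothetical.

```python
import numpy as np


def continuous_nri(risk_base, risk_new, event):
    """Category-free NRI comparing a new model against a base model (sketch).

    Assumption: `event` is a simple 0/1 outcome with no censoring.
    """
    risk_base = np.asarray(risk_base, dtype=float)
    risk_new = np.asarray(risk_new, dtype=float)
    event = np.asarray(event, dtype=bool)

    moved_up = risk_new > risk_base
    moved_down = risk_new < risk_base

    # Events should move up; non-events should move down.
    nri_events = moved_up[event].mean() - moved_down[event].mean()
    nri_nonevents = moved_down[~event].mean() - moved_up[~event].mean()
    return float(nri_events + nri_nonevents)


# Toy data: both events move up and both non-events move down.
nri_toy = continuous_nri(
    risk_base=[0.1, 0.2, 0.3, 0.4],
    risk_new=[0.2, 0.1, 0.2, 0.5],
    event=[1, 0, 0, 1],
)
print(nri_toy)  # prints: 2.0
```

The index ranges from −2 to 2, so the toy example shows the theoretical maximum, where every subject is reclassified in the correct direction.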

Cumulative incidence of AF for each predictor was calculated using one minus the Kaplan-Meier survival estimate. Group differences in incidence were determined using the log-rank test.
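As a sketch of how cumulative incidence follows from the Kaplan-Meier estimate, the function below computes one minus the survival probability at a chosen horizon. The function name and toy data are hypothetical, and the log-rank comparison between groups is not shown.

```python
import numpy as np


def cumulative_incidence(times, events, horizon):
    """One minus the Kaplan-Meier survival estimate at `horizon` (sketch)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)

    survival = 1.0
    event_times = np.unique(times[(events == 1) & (times <= horizon)])
    for u in event_times:
        at_risk = np.sum(times >= u)                      # still under follow-up at u
        n_events = np.sum((times == u) & (events == 1))   # events occurring at u
        survival *= 1.0 - n_events / at_risk
    return 1.0 - survival


# Toy data: events at years 1, 3, and 4; one participant censored at year 2.
incidence_3y = cumulative_incidence([1, 2, 3, 4], [1, 0, 1, 1], horizon=3)
print(incidence_3y)  # prints: 0.625
```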

2.7. Ethical approval

As a longitudinal population-based study sponsored by the National Institutes of Health (NIH), MESA has received proper ethical oversight. The MESA protocol was approved by the Institutional Review Boards (IRBs) of the 6 field centers and the National Heart, Lung, and Blood Institute. All subjects gave their informed consent for inclusion before they participated in the study. Data from participants who did not consent to commercial use were removed from our study.


3. Results

In the cohort, ages ranged from 45 to 84, 52.2% were women, 39.7% were White, 26.1% Black, 22% Hispanic, and 12.1% Chinese. Table 1 shows the baseline characteristics of MESA participants who were diagnosed with incident AF versus those who were not over the 5-year follow-up period. At 1, 2, 3, 4, and 5 years of follow-up, 36, 77, 123, 182, and 236 cases of AF were identified, respectively. In univariate comparisons, incident AF cases were older, more likely male, and more likely White. The incident AF cases had higher LA, LV, RA, and LV wall volumes, higher CHARGE-AF Risk Scores, and higher NT-proBNP levels than those without incident AF (all comparisons p < 0.001) (Table 1).


Table 1. Baseline characteristics of the Multi-Ethnic Study of Atherosclerosis (MESA) participants including cases with and without Atrial Fibrillation (AF) at 5 years.

| Characteristic | Overall (N = 5535) | No AF (N = 5319) | AF^a (N = 236) | P Value |
|---|---|---|---|---|
| Age (per 10 years) | | | | |
| Age 45-54 | 28.9% | 29.7% | 2.1% | <0.0001 |
| Age 55-64 | 27.6% | 28.0% | 18.1% | <0.0001 |
| Age 65-74 | 29.4% | 28.8% | 46.1% | <0.0001 |
| Age 75-84 | 14.1% | 13.5% | 33.7% | <0.0001 |
| Female sex (%) | 52.2% | 52.8% | 47.9% | <0.0001 |
| Body Surface Area | 1.90 ± 0.24 | 1.89 ± 0.24 | 1.92 ± 0.25 | <0.0001 |
| Ethnicity | | | | |
| White | 39.5% | 46.6% | 39.4% | 0.0229 |
| Chinese | 12.4% | 11.4% | 12.2% | 0.7054 |
| Black | 25.8% | 21.2% | 26.3% | 0.0615 |
| Hispanic | 22.3% | 20.8% | 22% | 0.6569 |
| AI-Enabled Cardiac Chambers Volumetry | | | | |
| LV volume (cc) | 102.23 ± 24.96 | 102.1 ± 25.0 | 108.0 ± 31.1 | <0.0001 |
| LA volume (cc) | 60.94 ± 15.10 | 60.6 ± 15.3 | 73.5 ± 24.5 | <0.0001 |
| RV volume (cc) | 134.30 ± 34.43 | 134.1 ± 34.4 | 136.0 ± 37.7 | 0.4081 |
| RA volume (cc) | 76.76 ± 18.75 | 76.6 ± 18.4 | 83.3 ± 26.0 | <0.0001 |
| LV Wall volume (g) | 107.53 ± 26.08 | 107.3 ± 26.1 | 114.2 ± 30.6 | <0.0001 |
| Total heart (cc) | 481.76 ± 108.69 | 480.7 ± 108.1 | 514.9 ± 134.9 | <0.0001 |
| CHARGE-AF Score | 11.7 ± 1.2 | 11.7 ± 1.2 | 12.8 ± 0.9 | <0.0001 |
| NT-proBNP (pg/mL) (Median – IQR) | 51.41 (23.19–104.4) | 49.46 (22.54–98.15) | 115.8 (62.42–236) | <0.0001 |
| NT-proBNP (pg/mL) (mean) | 82.1 ± 95.0 | 78.3 ± 89.4 | 175.7 ± 159.6 | <0.0001 |
| CAC (Median – IQR) | 0 (0–80.84) | 0 (0–73.34) | 59.52 (3.16–257.60) | <0.0001 |
| CAC (mean) | 133.7 ± 379.0 | 125.5 ± 358.8 | 333.3 ± 686.8 | <0.0001 |
| Risk Factors | | | | |
| Diabetes | 12.1% | 12.1% | 15.7% | 0.0987 |
| Hypertension | 43.8% | 43.4% | 62.7% | <0.0001 |
| Smoking (Current use) | 12.8% | 13.0% | 10.6% | 0.2816 |
| Alcohol (Current use) | 69.3% | 69.4% | 63.5% | 0.0547 |
| Blood Pressure Lowering Rx | 36.0% | 35.7% | 54.9% | <0.0001 |
| Lipid Lowering Rx | 16.4% | 16.5% | 16.6% | 0.9677 |
| LDL Cholesterol (mg/dL) | 117.2 ± 31 | 117.4 ± 31.2 | 115.4 ± 33.4 | 0.2017 |
| HDL Cholesterol (mg/dL) | 50.9 ± 15 | 51.0 ± 15.0 | 50.0 ± 13.9 | 0.3212 |
| Total Cholesterol (mg/dL) | 194.4 ± 35.3 | 194.5 ± 35.5 | 192.2 ± 38.0 | 0.0021 |

^a AF events beyond 5 years are excluded. Mean follow-up to an AF event: 2.9 ± 1.4 years.


The cumulative incidence of AF over 5 years for AI-estimated LA volume, CHARGE-AF Risk Score, and NT-proBNP is shown in Fig. 2. The incidence of AF in the 99th percentile of AI-LA volume, CHARGE-AF Risk Score, and NT-proBNP was 37.3%, 16.5%, and 27.1%, respectively (p < 0.0001).



Fig. 2. a–d. Cumulative Incidence of Atrial Fibrillation (AF) in the Top Quartile of Artificial Intelligence (AI)-Left Atrial (LA) Volume, CHARGE-AF Score, NT-proBNP, and coronary artery calcium (CAC) over 5 years of follow-up.


The time-dependent AUC [95% CI] at 1-year follow-up for AI-CAC LA (0.83 [0.74, 0.90]) was significantly higher than that for CHARGE-AF (0.74 [0.66, 0.81]), NT-proBNP (0.74 [0.66, 0.82]), and Agatston CAC Score (0.68 [0.56, 0.80]) (Fig. 4). Across 1-to-5-year follow-up, the AUC for AI-CAC LA volume (adjusted by age, gender, BSA) was significantly higher than that for NT-proBNP. The difference in AUC between AI-CAC LA volume and CHARGE-AF was statistically significant (p < 0.05) over years 1–3, but not for year 4 (p = 0.11) or year 5 (p = 0.08). The difference in AUC for AI-estimated LA volume alone versus CHARGE-AF and NT-proBNP combined was not statistically significant for any of the years (Table 2).


Fig. 3. a–b Quartiles of AI-CAC Left Atrial (LA) Volume by predicted 5-year CHARGE-AF Risk and NT-proBNP Quartiles.



Fig. 4. Time-dependent area under the curve (AUC) at 1-year follow-up for atrial fibrillation (AF) prediction, comparing AI-CAC LA volume, NT-proBNP, CHARGE-AF, and Agatston CAC Score.


Table 2. Time-dependent Area Under Curve (AUC) and Net Reclassification Index (NRI) for Atrial Fibrillation (AF) Prediction between Artificial Intelligence (AI)-enabled Left atrial (LA) Volume (AI-CAC), CHARGE-AF Risk Score, and NT-proBNP (BNP) over 1–5 Years in the Multi-Ethnic Study of Atherosclerosis (MESA)

| Predictors | 1 Year: AUC (95% CI) | P value | 2 Years: AUC (95% CI) | P value | 3 Years: AUC (95% CI) | P value | 4 Years: AUC (95% CI) | P value | 5 Years: AUC (95% CI) | P value |
|---|---|---|---|---|---|---|---|---|---|---|
| AF Events | 36 | | 77 | | 123 | | 182 | | 236 | |
| AI-CAC LA Volume | 0.83 (0.74, 0.90) | | 0.84 (0.79, 0.89) | | 0.81 (0.77, 0.85) | | 0.78 (0.75, 0.81) | | 0.77 (0.75, 0.81) | |
| AI-CAC All Cardiac Chambers | 0.82 (0.74, 0.88) | 0.330 | 0.84 (0.78, 0.89) | 0.752 | 0.81 (0.76, 0.84) | 0.496 | 0.78 (0.75, 0.80) | 0.603 | 0.77 (0.75, 0.80) | 0.576 |
| CHARGE-AF | 0.74 (0.66, 0.81) | 0.010 | 0.80 (0.76, 0.85) | 0.003 | 0.78 (0.75, 0.82) | 0.022 | 0.76 (0.74, 0.80) | 0.110 | 0.76 (0.74, 0.79) | 0.080 |
| NT-proBNP | 0.74 (0.66, 0.82) | 0.003 | 0.77 (0.71, 0.83) | 0.005 | 0.75 (0.69, 0.79) | 0.001 | 0.73 (0.68, 0.76) | 0.001 | 0.73 (0.70, 0.77) | 0.001 |
| CHARGE-AF + NT-proBNP | 0.77 (0.68, 0.84) | 0.070 | 0.83 (0.75, 0.88) | 0.660 | 0.81 (0.74, 0.87) | 0.990 | 0.79 (0.73, 0.85) | 0.500 | 0.79 (0.75, 0.83) | 0.410 |
| CHARGE-AF + AI-CAC LA Volume | 0.80 (0.72, 0.89) | 0.076 | 0.82 (0.75, 0.87) | 0.020 | 0.79 (0.73, 0.86) | 0.059 | 0.77 (0.72, 0.84) | 0.309 | 0.77 (0.74, 0.82) | 0.513 |

| Category-Free NRI adding AI-CAC LA | 1 Year: NRI (95% CI) | NRI P value | 2 Years: NRI (95% CI) | NRI P value | 3 Years: NRI (95% CI) | NRI P value | 4 Years: NRI (95% CI) | NRI P value | 5 Years: NRI (95% CI) | NRI P value |
|---|---|---|---|---|---|---|---|---|---|---|
| To Base Model CHARGE-AF | 0.60 (0.27, 0.94) | <0.0001 | 0.38 (0.15, 0.60) | <0.0001 | 0.33 (0.17, 0.54) | <0.0001 | 0.19 (0.02, 0.35) | 0.0233 | 0.23 (0.10, 0.36) | 0.0006 |
| To Base Model NT-proBNP | 0.68 (0.25, 0.97) | <0.0001 | 0.44 (0.19, 0.67) | <0.0001 | 0.42 (0.21, 0.58) | <0.0001 | 0.20 (0.04, 0.35) | 0.0128 | 0.37 (0.18, 0.45) | <0.0001 |
| To Base Model Agatston CAC Score | 0.73 (0.38, 1.05) | <0.0001 | 0.49 (0.20, 0.66) | <0.0001 | 0.53 (0.37, 0.64) | <0.0001 | 0.31 (0.16, 0.46) | <0.0001 | 0.44 (0.26, 0.54) | <0.0001 |


The continuous NRI [95% CI] for prediction of AF when AI-estimated LA volume was added to CAC score as the only predictor in the base model was highly significant for years 1–5 (year 1: 0.73 [CI: 0.38, 1.05], year 2: 0.49 [CI: 0.20, 0.66], year 3: 0.53 [CI: 0.37, 0.64], year 4: 0.31 [CI: 0.16, 0.46], and year 5: 0.44 [CI: 0.26, 0.54]; all p < 0.0001). The NRI for AI-LA volume over 1–5 years when added to a base model with CHARGE-AF Risk Score was significant at every year (year 1: 0.60 [CI: 0.27, 0.94], year 2: 0.28 [CI: 0.15, 0.60], year 3: 0.33 [CI: 0.17, 0.54], year 4: 0.19 [CI: 0.02, 0.35], and year 5: 0.23 [CI: 0.10, 0.36]; all p < 0.05). The NRI for AI-LA volume over 1–5 years when added to a base model with NT-proBNP was also significant at every year (year 1: 0.68 [CI: 0.25, 0.97], year 2: 0.44 [CI: 0.19, 0.67], year 3: 0.42 [CI: 0.21, 0.58], year 4: 0.20 [CI: 0.04, 0.35], and year 5: 0.37 [CI: 0.18, 0.45]; all p < 0.05) (Table 2).

A total of 699 participants classified as low-risk (<2.5%) for 5-year incident AF by CHARGE-AF were flagged by AI-CAC for enlarged LA volume (top quartile). Of these 699 participants, 30 (4.3%) experienced incident AF within 5 years. Similarly, 476 participants in the lowest risk quartile (4.9–23.6 pg/ml) of NT-proBNP had enlarged LA. Of these 476 participants, 65 (13.7%) experienced incident AF within 5 years (Fig. 3a–b).

Univariate and multivariate models assessed the 5-year HR per SD increase in each predictor for incident AF (Table 3). All predictors were statistically significant in univariate models. Only the HRs [95% CI] per SD for AI-CAC LA (1.301 [1.143–1.462]) and NT-proBNP (1.288 [1.080–1.568]) were significant in multivariate models adjusted for age, gender, and BSA (p < 0.0001 and p = 0.0057, respectively). The HR per SD of Agatston CAC Score in the adjusted model was 0.992 [0.859–1.146], p = 0.9176.


Table 3. Five-year atrial fibrillation (AF) risk: Hazard ratios (HR) per standard deviation (SD) increase in AI-CAC LA volume, NT-proBNP, Agatston CAC score (CAC), and CHARGE-AF risk score.

| Predictors | Univariate Model: HR (95% CI) | Beta^a | P-value | Multivariate Model^b: HR (95% CI) | Beta^a | P-value |
|---|---|---|---|---|---|---|
| AI-CAC LA Volume (per 1 SD) | 1.422 (1.219–1.659) | 0.352 | <0.0001 | 1.301 (1.143–1.462) | 0.263 | <0.0001 |
| Ln (NT-proBNP) (per 1 SD) | 1.306 (1.110–1.545) | 0.267 | <0.0001 | 1.288 (1.080–1.568) | 0.253 | 0.0057 |
| Ln (CAC+1) (per 1 SD) | 1.149 (1.025–1.287) | 0.139 | 0.0153 | 0.992 (0.859–1.146) | −0.008 | 0.9176 |
| CHARGE-AF Score^c (per 1 SD) | 1.464 (1.274–1.683) | 0.381 | <0.0001 | | | |

^a Beta per 1 SD increase.

^b AI-CAC LA Volume, CAC, and NT-proBNP adjusted for age, gender, and body surface area (BSA) in a multivariate Cox model.

^c CHARGE-AF Score could not be adjusted for risk factors in a multivariate model, as the score is modeled off risk factors.
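
In a Cox model, the hazard ratio per SD is the exponential of the corresponding beta coefficient. The short check below recomputes the hazard ratios from the betas reported in Table 3; the dictionary layout is illustrative only.

```python
import math

# (beta per 1 SD, reported HR) pairs as listed in Table 3.
table3 = {
    "AI-CAC LA volume, univariate": (0.352, 1.422),
    "AI-CAC LA volume, multivariate": (0.263, 1.301),
    "Ln(NT-proBNP), univariate": (0.267, 1.306),
    "Ln(NT-proBNP), multivariate": (0.253, 1.288),
    "Ln(CAC+1), univariate": (0.139, 1.149),
    "Ln(CAC+1), multivariate": (-0.008, 0.992),
    "CHARGE-AF score, univariate": (0.381, 1.464),
}

# HR = exp(beta); compare the recomputed value with the reported one.
recomputed = {label: math.exp(beta) for label, (beta, _) in table3.items()}
for label, (beta, reported_hr) in table3.items():
    print(f"{label}: exp({beta}) = {recomputed[label]:.3f} (reported {reported_hr})")
```

Each recomputed value agrees with the reported hazard ratio to three decimal places.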

4. Discussion

To our knowledge, this is the first report of AI-enabled automated cardiac chambers volumetry in non-contrast CT scans obtained for coronary calcium scoring in a large multi-ethnic study of asymptomatic individuals. Our study demonstrated that AI-enabled LA volumetry 1) enabled prediction of AF in CAC scans, 2) significantly outperformed NT-proBNP over 1–5 years, 3) significantly outperformed the CHARGE-AF Risk Score over 1–3 years, 4) provided a sizable net reclassification improvement on top of the CHARGE-AF Risk Score and NT-proBNP, and 5) showed comparable performance against a combined model of CHARGE-AF and NT-proBNP over 1–5 years.

CHARGE-AF is an epidemiological risk score for predicting AF based on risk factors at the population level, but it does not lend itself to use as a clinical tool for individualized risk assessment and monitoring of high-risk patients because of the large impact of unmodifiable risk factors. For example, if a patient loses 30 lbs. or lowers systolic blood pressure by 20 mmHg, the linear predictor from the CHARGE-AF Risk Score goes down by only about 0.1 or 0.2, respectively. These are very small changes (roughly 0.8% and 1.5%) given that the average CHARGE-AF score in MESA AF cases was 12.8 ± 0.9. Nonetheless, in the absence of an individualized metric with comparable predictive power, it serves as a useful tool for estimating risk and alerting high-risk populations so that they can reduce future AF risk.
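The size of these changes can be worked out directly from the coefficients given in Section 2.4. The sketch below is a back-of-the-envelope calculation; the pound-to-kilogram factor is the standard conversion, and the variable names are illustrative.

```python
LB_TO_KG = 0.45359237          # standard conversion factor

WEIGHT_COEF_PER_15_KG = 0.1155  # CHARGE-AF coefficient for weight/15
SBP_COEF_PER_20_MMHG = 0.1972   # CHARGE-AF coefficient for SBP/20
MEAN_SCORE_AF_CASES = 12.8      # mean CHARGE-AF score in MESA AF cases

# Reduction in the linear predictor after losing 30 lb.
delta_weight = (30 * LB_TO_KG / 15) * WEIGHT_COEF_PER_15_KG

# Reduction in the linear predictor after lowering SBP by 20 mm Hg.
delta_sbp = (20 / 20) * SBP_COEF_PER_20_MMHG

for label, delta in [("30 lb weight loss", delta_weight), ("20 mm Hg lower SBP", delta_sbp)]:
    share = 100 * delta / MEAN_SCORE_AF_CASES
    print(f"{label}: predictor falls by {delta:.3f} ({share:.1f}% of the mean score)")
```

Under these coefficients the weight change lowers the predictor by about 0.10 and the blood pressure change by about 0.20, both small relative to a mean score of 12.8.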

NT-proBNP is a serum biomarker of cardiac volume overload and has been studied extensively in various cardiovascular diseases, particularly heart failure. Thejus et al. have shown that values above the 80th percentile (97 pg/ml in women and 60 pg/ml in men) present an odds ratio of 2.65 for the incidence of AF. Asselberg et al. found that in the general population, elevated NT-proBNP levels at baseline predicted the development of AF when reassessed at 4 years. The baseline median level was 62.2 pg/ml in those who eventually developed AF compared to 35.7 pg/ml in those who did not (p = 0.001). Our study shows that LA volume consistently outperformed NT-proBNP in MESA over 5 years and improved its predictive value, with an NRI ranging from 0.68 for year 1 to 0.37 for year 5. This may be because NT-proBNP is not specific to LA or RA volume and can be influenced by other factors.

Although ECG-based screening for AF is currently a topic of great clinical interest, it would not be a proper comparison for this study because ECG is primarily used for the detection of prevalent AF, not for prediction of future AF. However, recent studies suggest that AI-enabled ECG could play a role in predicting future AF. In a study by Christopoulos et al. that compared the performance of AI-enabled ECG with the CHARGE-AF Risk Score, there was no significant difference between the cumulative incidence of AF in the top quartile of the two methods, whereas in our study the top percentiles of AI-estimated LA volume detected a significantly higher percentage of AF than CHARGE-AF. Perhaps by directly identifying individuals with a very large LA volume, our approach is inherently more capable of detecting high-risk cases for future AF than other methods, including ECG-based predictive AI models.

The purpose of this study was not to evaluate cost effectiveness of AI-CAC LA volume vs. CHARGE-AF, NT-proBNP, or other methods. However, the main advantage of this AI tool resides in its opportunistic detection of patients with enlarged LA in cardiac or lung CT scans (done for any reason) who otherwise would be missed.

4.1. CAC scans can provide more than CAC scores

Our study corroborates findings from the Heinz Nixdorf Recall Study and others, and further brings to light the value of non-coronary findings in coronary calcium scans for a comprehensive CVD risk assessment beyond coronary heart disease. Although manual and automated LA volumetry in chest CT scans are relatively novel, the pathophysiology of enlarged LA and its relationship with AF is well understood. AI-CAC automated cardiac chamber measurements were compared with manual measurements derived from the gold standard, cardiac MRI, in a separate study, which showed no significant difference.

Several echocardiographic studies have shown that increased LA strain is associated with atrial arrhythmia. Tsang et al. reported in 2001 that larger LA volume in echocardiographic studies was associated with a higher risk of AF in older patients. The predictive value of LA volume was incremental to that of the clinical risk profile and conventional M-mode LA dimension. Kizer et al. showed that LA size was an independent predictor of CVD events. Mahabadi et al. showed in the longitudinal Heinz Nixdorf Recall Study that two-dimensional LA size and epicardial adipose tissue from non-contrast CT were strongly associated with prevalent and incident AF, that LA size diminished the link of epicardial adipose tissue with AF, and that LA size was also associated with incident major CV events independent of risk factors and CAC score.

In a study of 131 cases, AI-CAC cardiac chamber measurements in non-contrast cardiac CT scans were well correlated with automated cardiac chambers volumetry in contrast-enhanced cardiac CT scans using Philips Brilliance Workspace. Similarly, AutoChamber measurements in 169 ECG-gated cardiac versus non-gated chest CT scans in the same patients (paired scans done on the same day) showed strong correlations (R² = 0.85–0.95 for different chambers).

4.2. Limitations

Our study has some limitations. The MESA Exam 1 baseline CT scans, performed between 2000 and 2002, were predominantly conducted using electron-beam computed tomography (EBCT) scanners. This technology is no longer the commonly used method of CAC scanning. Since our AI training was done completely outside of MESA and used a modern multi-detector (256 slice) scanner, we do not anticipate this to affect the generalizability of our findings. Because MESA used ICD codes to identify a history of AF at baseline and newly diagnosed AF, and it is known that ICD-based diagnosis can be inaccurate (PPV 70–96%, median sensitivity 79%), it is likely that MESA missed some cases of AF. The ICD-10 code for AF does not capture AF that has been diagnosed but not yet coded and, most importantly, does not capture AF that has not yet been diagnosed. It is possible that a significant number of individuals with paroxysmal AF were missed.

5. Conclusion

In this study, we presented AI-CAC data obtained from existing CAC scans in a large multi-ethnic prospective study and compared the predictive value of AI-CAC estimated LA volume versus the CHARGE-AF Score and NT-proBNP for predicting AF. AI-CAC LA volumetry enabled prediction of AF and improved on the predictive value of CHARGE-AF Risk Score and NT-proBNP.

6. Clinical perspectives

The potential value of non-coronary findings in coronary calcium scans is significant. The clinical utility of this opportunistic add-on to CAC scans warrants further validation in other longitudinal cohorts. Additionally, the high rate of AF in the 99th percentile of AI-CAC LA volume makes it attractive for selection of participants into AF prevention clinical trials.

Funding

This research was supported by 2R42AR070713 and R01HL146666 and MESA was supported by contracts 75N92020D00001, HHSN268201500003I, N01-HC-95159, 75N92020D00005, N01-HC-95160, 75N92020D00002, N01-HC-95161, 75N92020D00003, N01-HC-95162, 75N92020D00006, N01-HC-95163, 75N92020D00004, N01-HC-95164, 75N92020D00007, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168 and N01-HC-95169 from the National Heart, Lung, and Blood Institute, and by grants UL1-TR-000040, UL1-TR-001079, and UL1-TR-001420 from the National Center for Advancing Translational Sciences (NCATS). Preparation of the manuscript was also supported by funding from HeartLung.AI.

Declaration of competing interest

Several members of the writing group are inventors of the AI tool mentioned in this paper. Dr. Naghavi is the founder of HeartLung.AI. Dr. Reeves, Dr. Atlas, Dr. Yankelevitz, Dr. Wong, and Dr. Li are consultants for HeartLung.AI. Chenyu Zhang is a software developer for HeartLung.AI. Kyle Atlas is a graduate research associate of HeartLung.AI. The remaining authors have nothing to disclose.

Acknowledgements

Special thank you to Philip Greenland and Susan Heckbert for reviewing early versions of the manuscript.

The authors thank the other investigators, the staff, and the participants of the MESA study for their valuable contributions. A full list of participating MESA investigators and institutions can be found at http://www.mesa-nhlbi.org.
