Dietary Pattern and Risk of Hodgkin Lymphoma
Methods
Study Population
Cases of incident HL were recruited from the greater Boston metropolitan area of Massachusetts and the state of Connecticut from August 1, 1997, to December 31, 2000. Eligible patients were aged 15–79 years, living within the target geographical area and without human immunodeficiency virus (HIV) infection at diagnosis. Cases were identified by using the rapid case ascertainment systems of Harvard and Yale universities with additional support from the Massachusetts and Connecticut state tumor registries. There were 677 eligible cases invited to participate in the study, and 84% (n = 567) consented. Certain data used in this study were obtained from the Connecticut Tumor Registry located in the Connecticut Department of Public Health.
Population-based controls were frequency matched to cases by age (within 5 years), sex, and state of residence (Massachusetts or Connecticut) and did not have a personal history of HL. In greater Boston, controls were identified through the "Town Books," annual records documenting all citizens aged ≥17 years, which are 90% complete. Of 720 invited controls in Massachusetts, 51% (n = 367) consented. In Connecticut, 450 eligible controls aged 18–65 years were identified by random-digit dialing, and 61% (n = 276) consented. Of 69 eligible controls in Connecticut aged 66–79 years identified through the Health Care Financing Administration (Medicare), 52% (n = 36) consented to participate.
The original research protocol was approved by the institutional review boards of the Harvard School of Public Health, Yale University School of Medicine, Johns Hopkins University School of Medicine, all participating hospitals, the Massachusetts Cancer Registry, and the Connecticut Tumor Registry in the Connecticut Department of Public Health. The present analysis of nonidentifiable study data was deemed exempt by the Harvard School of Public Health Human Subjects Committee.
Histopathology
Study pathologists reviewed all available pathology material to confirm an HL diagnosis. When possible, cHL cases were further subtyped as nodular sclerosis, mixed cellularity, lymphocyte-depleted cHL, or lymphocyte-rich cHL. Sixteen cases with nodular lymphocyte-predominant subtype HL were excluded from the analysis, as this subtype is considered biologically and clinically distinct from cHL. Tumor tissue was analyzed for EBV through in situ hybridization for EBV-encoded RNA transcripts and/or immunohistochemistry to detect the viral latent membrane protein in Reed-Sternberg cells. A tumor was considered positive for EBV if at least 1 assay was positive, and negative otherwise.
Data Collection
Lifestyle information was collected through a structured telephone interview for 97% of study participants, while 3% completed an abbreviated mailed study questionnaire. Additionally, 511 cases (93%) and 648 controls (95%) completed a validated, semiquantitative food frequency questionnaire (FFQ) to assess average consumption of 61 food and beverage items, plus vitamin and mineral supplements, over the year prior to enrollment. Participants reported the average frequency of consumption for each food according to commonly used units or portion sizes, which were then converted into standard servings per day. Participants were excluded from the present analysis if they left ≥3 FFQ items blank or reported a total energy intake >3 standard deviations from the sex-specific mean on the natural log scale (n = 77).
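The two FFQ exclusion criteria described above can be illustrated with a short sketch. The record layout (`sex`, `kcal`, `items`), the function name `ffq_exclusions`, and the simulated values are hypothetical and are not taken from the study; the sketch also assumes that the standard deviation is the sample standard deviation computed within each sex.

```python
import math
from statistics import mean, stdev

def ffq_exclusions(records, max_blank=2, n_sd=3.0):
    """Return the indices of records that fail either FFQ exclusion criterion.

    Each record is a dict with hypothetical keys:
      "sex"   - grouping variable for the energy criterion
      "kcal"  - total energy intake (kcal/day), must be > 0
      "items" - servings/day per food, None where the item was left blank
    """
    excluded = set()

    # Criterion 1: three or more FFQ items left blank.
    for i, rec in enumerate(records):
        blanks = sum(1 for v in rec["items"] if v is None)
        if blanks > max_blank:
            excluded.add(i)

    # Criterion 2: log energy intake more than n_sd SDs from the sex-specific mean.
    by_sex = {}
    for i, rec in enumerate(records):
        by_sex.setdefault(rec["sex"], []).append((i, math.log(rec["kcal"])))
    for rows in by_sex.values():
        logs = [v for _, v in rows]
        mu, sd = mean(logs), stdev(logs)
        for i, v in rows:
            if abs(v - mu) > n_sd * sd:
                excluded.add(i)
    return excluded

# Simulated example: 30 plausible intakes, 1 implausible intake, 1 sparse FFQ.
records = [
    {"sex": "F", "kcal": 1800 + 10 * k, "items": [1.0, 0.5, 0.0, 2.0]}
    for k in range(30)
]
records.append({"sex": "F", "kcal": 30000, "items": [1.0, 0.5, 0.0, 2.0]})    # index 30
records.append({"sex": "F", "kcal": 2000, "items": [None, None, None, 2.0]})  # index 31
print(sorted(ffq_exclusions(records)))  # [30, 31]
```

Working on the natural log scale keeps the criterion symmetric in relative terms, so that implausibly low and implausibly high intakes are treated alike.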
After exclusions, we had complete FFQ data on 881 participants. An additional 183 study participants had missing data on only 1 or 2 food items. To avoid unnecessarily reducing statistical power, we imputed a value of 0 servings per day for the 43 foods for which ≥20% of the remaining study population reported 0 servings per day (Web Appendix 1, available at http://aje.oxfordjournals.org/content/182/5/405/suppl/DC1), as missing values for infrequently consumed foods are likely to indicate 0 consumption. The cutoff was based on an evaluation of the distribution of missing and reported 0 intake, and changes in the cutoff did not meaningfully affect the results. Missing values on foods for which <20% of the population reported 0 servings per day were retained as missing, and these individuals were excluded from the study population (n = 66). The dietary pattern analysis thus included 435 cases and 563 controls.
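The imputation rule can be sketched as follows. The function name `impute_rare_zero`, the food names, and the toy data are hypothetical; the sketch also assumes that the share of participants reporting 0 servings per day is computed over all remaining participants, which the text does not state explicitly.

```python
def impute_rare_zero(rows, cutoff=0.20):
    """Apply the zero-imputation rule to a list of {food: servings_or_None} dicts.

    Assumption: the zero share uses all remaining participants as the denominator.
    Returns (kept_rows, imputed_foods).
    """
    n = len(rows)
    foods = list(rows[0])

    # Foods for which at least `cutoff` of participants reported 0 servings/day.
    imputed_foods = {
        f for f in foods if sum(1 for r in rows if r[f] == 0) / n >= cutoff
    }

    kept = []
    for r in rows:
        # Fill missing values with 0 only for the infrequently consumed foods.
        filled = {
            f: (0.0 if v is None and f in imputed_foods else v) for f, v in r.items()
        }
        # A value that is still missing belongs to a commonly consumed food,
        # so the participant is excluded.
        if all(v is not None for v in filled.values()):
            kept.append(filled)
    return kept, imputed_foods

rows = [
    {"liver": 0.0, "bread": 1.0},
    {"liver": 0.0, "bread": 2.0},
    {"liver": None, "bread": 1.0},  # missing on a rarely eaten food: imputed as 0
    {"liver": 1.0, "bread": None},  # missing on a commonly eaten food: excluded
    {"liver": 0.0, "bread": 3.0},
]
kept, imputed_foods = impute_rare_zero(rows)
print(len(kept), imputed_foods)  # 4 {'liver'}
```

The participant counts reported above are internally consistent with this flow: 881 + 183 − 66 = 998 = 435 + 563.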
Dietary Patterns
To identify dietary patterns common to the study population, we conducted a principal components analysis of the 61 food and beverage items included on the FFQ, followed by a varimax orthogonal rotation to improve interpretability and minimize correlation between components. The number of principal components (i.e., eigenvectors) retained in the analysis was determined graphically using the scree test, a plot of the eigenvalues (i.e., the amount of total variance explained by each principal component) against the component number. We retained 4 principal components following this assessment, each representing a separate, uncorrelated dietary pattern. Dietary patterns were ranked according to eigenvalue and described through identification of the major foods contributing to the pattern based on each food item's factor loading coefficient. As the dietary pattern scores were not normally distributed, we categorized the scores for each dietary pattern into quartiles. We found similar dietary patterns when the principal components analysis was conducted among the controls only and among cases and controls together; therefore, we retained patterns derived from both cases and controls because of greater stability.
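The pattern-derivation steps can be sketched with NumPy as below. This is an illustration under stated assumptions, not the SAS procedure used in the study: the analysis is assumed to operate on the correlation matrix, the varimax rotation is applied without Kaiser normalization, the function names are hypothetical, and the intake matrix is simulated with 12 items instead of 61 for brevity.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation; returns (rotated_loadings, rotation_matrix)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

def dietary_patterns(intake, n_components):
    """PCA on the correlation matrix, varimax rotation, quartiles of rotated scores."""
    z = (intake - intake.mean(axis=0)) / intake.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    lam, vec = eigvals[:n_components], eigvecs[:, :n_components]
    loadings = vec * np.sqrt(lam)                # item-by-component loadings
    rotated_loadings, rotation = varimax(loadings)

    # Standardized component scores, rotated with the same orthogonal matrix,
    # so the rotated pattern scores remain uncorrelated.
    scores = (z @ vec / np.sqrt(lam)) @ rotation

    cuts = np.quantile(scores, [0.25, 0.5, 0.75], axis=0)
    quartiles = np.column_stack(
        [1 + np.digitize(scores[:, j], cuts[:, j]) for j in range(n_components)]
    )
    return eigvals, rotated_loadings, scores, quartiles

# Simulated, right-skewed intake data (not study data): 300 people, 12 items.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 4))
weights = rng.normal(size=(4, 12))
intake = np.exp(0.5 * latent @ weights + rng.normal(scale=0.5, size=(300, 12)))

eigvals, rotated_loadings, scores, quartiles = dietary_patterns(intake, n_components=4)
print(np.round(eigvals, 2))  # values one would plot against component number (scree)
```

Plotting `eigvals` against component number gives the scree plot; the number of retained components (4 here, mirroring the text) is read from the point where the curve levels off.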
Statistical Analysis
Multivariable unconditional logistic regression models were used to estimate odds ratios and 95% confidence intervals for the association between quartile of a given dietary pattern (lowest quartile as the reference group) and cHL risk. Linear trend across quartiles was assessed in multivariable models by modeling the median values of the quartiles as a semicontinuous variable. All tests of statistical significance were 2 sided. Multivariable models were adjusted for the matching factors, plus total daily caloric intake (kcal/day), body mass index expressed as weight (kg)/height (m)² (<25, 25–29.9, ≥30), and potential confounders previously found to be associated with overall cHL risk: race/ethnicity (non-Hispanic white, other/missing), nursery school/day-care attendance for ≥1 year (yes, no/did not attend/missing), history of smoking ≥10 lifetime packs of cigarettes (yes, no), education (less than high school, high school, more than high school), number of siblings (0, 1, 2, 3, 4, ≥5), alcohol intake (drinker, nondrinker), regular aspirin use (≥2 regular-strength tablets/week; yes, no), and regular acetaminophen use (≥2 regular-strength tablets/week; yes, no). The primary analysis stratified cHL models by age group at diagnosis (<50 vs. ≥50 years) to explore potential heterogeneity in the associations of dietary patterns with younger- or older-adult cHL. Five controls missing data on smoking status (n = 1) or number of siblings (n = 4) were dropped from multivariable models, as separate categories for missing data introduced instability to the models.
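The quartile and trend models can be sketched as follows. This is a minimal illustration, not the SAS code used in the study: the Newton-Raphson fitting routine `fit_logistic` is a hypothetical helper, the data are simulated, and only one adjustment covariate (total energy) is included instead of the full covariate set listed above.

```python
import math
import numpy as np

def fit_logistic(X, y, max_iter=50, tol=1e-10):
    """Unconditional logistic regression by Newton-Raphson.

    X must already contain an intercept column.
    Returns (coefficients, standard_errors).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        info = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        step = np.linalg.solve(info, X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    info = X.T @ (X * (p * (1.0 - p))[:, None])
    return beta, np.sqrt(np.diag(np.linalg.inv(info)))

# Simulated data (not study data): a pattern score with a positive association.
rng = np.random.default_rng(1)
n = 1000
score = rng.normal(size=n)
kcal = rng.normal(2.0, 0.5, size=n)  # total energy, in 1,000 kcal/day
case = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-0.3 + 0.5 * score)))).astype(float)

cuts = np.quantile(score, [0.25, 0.5, 0.75])
quartile = np.digitize(score, cuts)  # 0..3, with 0 = lowest quartile (reference)

# Quartile model: indicator variables for quartiles 2-4 versus quartile 1.
indicators = np.column_stack([(quartile == q).astype(float) for q in (1, 2, 3)])
X_q = np.column_stack([np.ones(n), indicators, kcal])
beta_q, se_q = fit_logistic(X_q, case)
odds_ratios = np.exp(beta_q[1:4])
ci_low = np.exp(beta_q[1:4] - 1.96 * se_q[1:4])
ci_high = np.exp(beta_q[1:4] + 1.96 * se_q[1:4])

# Trend model: each person is assigned the median score of his or her quartile.
medians = np.array([np.median(score[quartile == q]) for q in range(4)])
X_t = np.column_stack([np.ones(n), medians[quartile], kcal])
beta_t, se_t = fit_logistic(X_t, case)
z_trend = beta_t[1] / se_t[1]
p_trend = math.erfc(abs(z_trend) / math.sqrt(2.0))  # 2-sided Wald P value

for q, (o, lo, hi) in enumerate(zip(odds_ratios, ci_low, ci_high), start=2):
    print(f"Q{q} vs. Q1: OR = {o:.2f} (95% CI: {lo:.2f}, {hi:.2f})")
print(f"P for trend = {p_trend:.3g}")
```

Using quartile medians in place of quartile ranks (1 to 4) lets the trend test reflect the actual spacing of the scores between categories.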
Because the distributions of EBV-positive tumors and cHL histological subtypes suggest age-related heterogeneity in the role of cHL risk factors, we conducted secondary analyses to examine the association of dietary pattern with cHL risk separately by tumor EBV status (positive, negative) or histological subtype (mixed cellularity, nodular sclerosis). Age-stratified models for tumor EBV status limited covariable adjustment to matching factors, total caloric intake, and body mass index, while models for older-adult mixed cellularity subtype were not adjusted for nursery school attendance because of sparse categories. All analyses were conducted by using SAS, version 9.2, statistical software (SAS Institute, Inc., Cary, North Carolina).