Prediction of autoimmune diabetes and celiac disease in childhood by genes and perinatal environment: Design and initial aims of the PAGE study

Type 1 diabetes and celiac disease result from misdirected immune mediated destruction of host cells, and are among the most common chronic diseases in children. Despite changes in incidence over the past 3 decades, little is known about non-genetic risk factors (except for dietary gluten for celiac disease). Norway is among the countries in the world with the highest incidence of these two diseases. We describe here plans and study design for the PAGE study (Prediction of Autoimmune diabetes and celiac disease in childhood by Genes and perinatal Environment). PAGE is a sub-study within the Norwegian Mother and Child Cohort study, including follow-up of more than 100,000 pregnancies. Children who develop type 1 diabetes or celiac disease are identified via linkage to the Norwegian Patient Register and the Norwegian Childhood Diabetes Registry, with complementing information from questionnaires. The overall aim is to test hypotheses about potential non-genetic risk factors for type 1 diabetes and for celiac disease, with focus on factors operating early in life. In addition to a full cohort analysis of factors registered in questionnaires, we will analyse biomarkers in maternal blood plasma and cord blood plasma. Mothers and children will be genotyped for well-established susceptibility polymorphisms. Biomarkers will be analysed in cases and controls within the cohort. Factors to be tested in the full cohort include infant feeding, diet and dietary supplements in the mother during pregnancy and in the child, and use of antibiotics and non-prescription drugs. Biomarkers to be tested include 25-hydroxyvitamin D, markers of immune activation, and small metabolites (metabolomics). We will also explore the potential role of maternal cells in the fetal circulation (maternal microchimerism) in later risk of celiac disease and type 1 diabetes. 
This is an open access article distributed under the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


INTRODUCTION
Type 1 diabetes and celiac disease are among the most common chronic diseases with onset in childhood. Type 1 diabetes is due to an autoimmune, selective destruction of the insulin-producing beta-cells in the pancreas (1), leading to excessive blood sugar unless treated with exogenous insulin injections. As optimal therapy is difficult to achieve, many patients suffer acute complications such as hypoglycaemia and diabetic ketoacidosis, and late complications affecting the eyes, kidneys and peripheral nerves, as well as cardiovascular disease. Together with Finland and Sweden, Norway is among the countries in the world with the highest incidence of type 1 diabetes in children (2,3).
Celiac disease results from a misdirected, cell-mediated immune response to dietary gluten that leads to inflammation and flattening of the gut mucosa, with resulting malabsorption and long-term complications such as osteoporosis and gastrointestinal cancers (4). The current treatment is limited to a lifelong gluten-free diet. Both celiac disease and type 1 diabetes are associated with reduced quality of life and life expectancy (5,6). Along with Sweden, Norway has a very high incidence of diagnosed celiac disease among children compared to other countries (7).
The two diseases have a number of interesting features in common, and tend to occur together (8). The majority of celiac disease and type 1 diabetes patients carry the HLA-DQ2 haplotype and/or the DQ8 haplotype (9,10). As these haplotypes are relatively common in the population and only a minority of those with one or even two risk haplotypes develop disease, HLA susceptibility alleles seem necessary but not sufficient for disease. Additional causal factors are therefore likely involved. Circulating autoantibodies may be predictive of clinical disease months or years before onset of both diseases (1,4,11). The fact that these may appear before six months of age and frequently appear before two to three years of age (12,13) suggests that factors influencing seroconversion operate early in life.
Here, we describe the study design and the initial research questions of the PAGE study: Prediction of Autoimmune diabetes and celiac disease in childhood by Genes and perinatal Environment. While insulin replacement and a gluten-free diet are established treatments for type 1 diabetes and celiac disease, respectively, there is currently no cure for either disease, and no established intervention, pharmacological or otherwise, to reduce the risk of disease.
The overall goal of the study is to increase our understanding of the aetiology of these diseases, particularly how specific environmental factors in early life, including in utero, can predict risk of autoimmune type 1 diabetes and celiac disease in childhood. Potential positive findings could be informative for planning of future preventive trials. Another important rationale for the study is that even "negative results" (lack of associations) would be informative for patients and families, who could then lay to rest worries about potential "self-inflicted" causes of these diseases.
In the following sections, we first provide a brief introduction to what is already known about genetic and non-genetic factors in the aetiology of celiac disease and type 1 diabetes (focusing primarily on biomarkers), then describe the specific aims of the study, and finally describe the methods. We close with a brief look at the future.

GENETIC FACTORS AND GENE-ENVIRONMENT INTERACTIONS
An important role of HLA genetics in both diseases is well established. Nevertheless, only a minority of those with HLA susceptibility genes develop disease. A large number of non-HLA SNPs have also been identified in both diseases (4,14,15). Non-HLA SNPs generally confer weak associations, and even when combined, they have limited predictive power compared with that of HLA genes (16,17). Together with evidence discussed below, this suggests that environmental factors are also involved in both diseases. In fact, most diseases and traits are likely a result of both genetic and non-genetic factors (18). The PAGE study will not attempt to identify novel susceptibility genes, as this is better done in large genetics consortia. Rather, the strength of PAGE is the ability to study non-genetic factors with a prospective collection of blood and questionnaires. Prospective cohort studies offer unique advantages for the study of gene-environment interactions, as retrospective assessment of exposure to non-genetic factors is fraught with measurement error and potential bias (19). Because of the considerable resources required, there are few large-scale, population-based prospective studies of non-genetic risk factors for type 1 diabetes or celiac disease.

EARLY ENVIRONMENTAL FACTORS IN CELIAC DISEASE AND TYPE 1 DIABETES
While there are many hypotheses and abundant circumstantial evidence, no aetiological environmental factor has yet been convincingly identified in type 1 diabetes (20-22). In celiac disease, dietary intake of gluten is necessary for expression of the disease, but it is not a sufficient cause. Limited efforts have so far been invested in the search for other non-genetic factors in celiac disease.
The incidence of both diseases varies between countries, even within Europe, and has increased over time (2,23). During 2000-2010, the incidence rate of clinically detected cases of celiac disease in children in South-Eastern Norway increased from 16 to 44 per 100,000 person-years (K. Størdal et al., unpublished observation). Sweden has experienced an epidemic with cohort effects on cumulative incidence (24). Such rapid changes cannot be explained by changes in the genetic composition of the population.
The autoimmune process leading to celiac disease and type 1 diabetes occurs early in life. Islet autoantibodies may appear before 6 months, often in the first 1-4 years of life (12,13), and tissue transglutaminase autoantibodies somewhat later (11,25). Perinatal factors such as birth weight and caesarean section have been associated with both type 1 diabetes and celiac disease (26-28). These factors are likely markers of other, yet unknown processes. We must therefore focus on factors operating very early in life, even in utero, to understand how autoimmunity initiates (21).

Breast feeding and introduction of cereals
There is circumstantial evidence for a role of breastfeeding and timing of introduction of cereals in both type 1 diabetes and celiac disease, mostly based on retrospective studies (29,30). Any such relationships are likely to be complex and may involve numerous mechanisms, such as development and integrity of the infant gut, exposure to foreign antigens, gut microbiota, and protection against infections. A few prospective studies of selected children at increased genetic risk have generally not found any consistent associations with islet autoimmunity or type 1 diabetes (31,32), while studies of celiac disease have indicated a potential increased risk with late introduction of gluten (33,34). Two recent randomised trials have investigated the effect of timing of introduction of gluten on risk of celiac disease, and they found no discernible effect (35,36). One randomised trial tested whether delaying the introduction of intact cow's milk in genetically susceptible infants would reduce the risk of islet autoimmunity, but found no significant effect (37). In all three trials, participants were encouraged to breastfeed, and duration of breastfeeding was not associated with the endpoint in observational analyses in any of the trials.

Plasma metabolome profiling
The plasma metabolome and lipidome refer to the collection of small metabolites and lipids, respectively, in blood plasma. With limited knowledge of the nature of potential factors predictive of type 1 diabetes and celiac disease, it makes sense to explore potential biomarkers using an "omics" approach by comparing children who later develop disease with controls. The actual metabolites that are reliably detected depend on sample handling and quality as well as the available technology for profiling the sample. Metabolomics platforms are usually categorised as targeted or non-targeted, but a variety of factors influence the detectable analytes. In practice, the available technology we aim to use detects up to a few hundred small, hydrophilic (water-soluble) metabolites. The Finnish DIPP study reported patterns in the serum or plasma lipidome and metabolome associated with islet autoimmunity, some of which persisted from birth (38). For instance, the appearance of diabetes-associated autoantibodies (markers of "prediabetes") was preceded by lower serum levels of ketoleucine and elevated glutamic acid in this study. We are not aware of any such studies in celiac disease. The plasma lipidome is very sensitive to sample handling conditions, and we may have to limit or exclude lipidomics from PAGE for this reason, while we aim to include metabolomics.

Vitamin D
Vitamin D status, measured by serum or plasma 25-hydroxyvitamin D, is influenced by both dietary vitamin D and synthesis in the skin upon ultraviolet light exposure. Many studies on type 1 diabetes are based on maternal recall of their children's use of supplements, while the few prospective studies have limited numbers with clinical endpoint (39). Published studies of 25-hydroxyvitamin D in childhood (40) or maternal samples from pregnancy (41,42) are very few, and have shown mixed results. The apparently inconsistent results may be due to methodological factors and timing of samples. Intake of vitamin D is likely to change during pregnancy, in addition to changes in the physiology of metabolites in the vitamin D "pathway". We have preliminary data from another cohort (41,43) that also suggest an association between vitamin D binding protein (DBP) in pregnancy and type 1 diabetes in the offspring. DBP probably has multiple physiological effects apart from acting as a carrier for 25-hydroxyvitamin D and other vitamin D metabolites, and DBP is also known to increase during pregnancy (44,45). If associations between 25-hydroxyvitamin D in late pregnancy and type 1 diabetes (and perhaps also celiac disease) can be confirmed, this would provide a basis for planning future intervention studies with vitamin D supplements to pregnant women. We are not aware of studies of vitamin D in relation to risk of celiac disease, but we aim to study this for the first time in PAGE.

INFLAMMATION
Inflammation probably plays an important but complex role in a large number of diseases of public health importance, including celiac disease and type 1 diabetes (1,46). We will focus on a limited selection of soluble markers of immune activation or inflammation in the perinatal period. Various studies have implicated cytokines such as interleukin (IL)-1β, interferon gamma (IFN-γ) and IL-2, and the chemokines CXCL10 (IP10) and MCP-1 (CCL2), in type 1 diabetes, celiac disease, or related disorders. Pregnancy levels of these in relation to disease in the offspring have been investigated in a few moderately sized studies (47,48).
Neopterin is a recently "rediscovered" biomarker, which is considered a relatively stable marker of cellular immune activation. It is produced by macrophages under the influence of interferon-gamma (49), which is characteristic of Th-1 immune responses. In fact, an early study found that urinary neopterin seemed to be increased in celiac disease, but the finding was not followed up further (50). Neopterin may change during pregnancy, and has been associated with, for instance, preeclampsia (51). Another marker we plan to measure is the kynurenine:tryptophan ratio (52). This is also a marker of T-helper 1 immune activation, and is considered more stable during sample handling than cytokines (53).

MATERNAL MICROCHIMERISM
An understudied area that may play a role is maternal microchimerism (54). The engraftment of maternal cells in the foetus, known as maternal microchimerism (MMc), was initially recognized in children with severe combined immunodeficiency. Maternal cells were identified in umbilical cord blood samples from healthy male infants in 1995 (55). The placenta is therefore not an impenetrable barrier to cellular traffic, and it has been shown that MMc can persist for many years in healthy individuals (56). This suggests that immune tolerance to genetically distinct maternal cells is maintained in health. The mechanism underlying this naturally acquired tolerance is unknown. It has recently been shown that substantial numbers of maternal cells cross the placenta to reside in fetal lymph nodes, inducing development of CD4+CD25highFoxP3+ Tregs that suppress fetal anti-maternal immunity and persist for many years (57). One might therefore hypothesise that the increased levels of MMc that have been reported in the periphery in several forms of autoimmunity, including type 1 diabetes (58), may indicate a failure in anti-maternal tolerance. We are not aware of such studies in celiac disease, and to our knowledge no study of cord blood MMc and type 1 diabetes has yet been done.

Specific aims for biomarker studies
The specific objectives of the current project are to test whether the factors listed below, together with established susceptibility genes, can predict risk of type 1 diabetes and celiac disease (these will also be tested for gene-environment interactions):
• Lower levels of 25-hydroxyvitamin D in cord blood or maternal blood near delivery. (We will also measure vitamin D binding protein, and genetic markers such as Gc/DBP and CYP27B1 in addition to HLA genotype, as part of this sub-project. Vitamin D metabolites will be measured in maternal samples from mid-pregnancy and at delivery as well as in cord blood, and we will also investigate changes during pregnancy and the relation between maternal and fetal levels in relation to disease outcome.)
• Pattern identified by metabolomics profiling of cord vein plasma (lipidomics profiling will be considered, depending on the quality of the specimens).
• Maternal markers of immune activation or inflammation, including selected cytokines mentioned above, C-reactive protein, neopterin and the kynurenine:tryptophan ratio.
• Estimated quantity of maternal cells in the fetal circulation, as determined by the quantity of maternal DNA. (See Methods for more details. While the exact role and mechanism of the increased levels of maternal microchimerism observed in previous studies are not known, the novel research question in the PAGE study is whether an increased level of maternal cells in the child's circulation can be observed already at birth, as measured in cord blood.)

Specific aims for questionnaire based studies:
Specific factors prospectively collected in questionnaires will be tested for statistical association with higher or lower risk for type 1 diabetes and celiac disease (these will also be tested for gene-environment interaction):
• Short duration of breastfeeding, early introduction of gluten and other solid foods during weaning, use of dietary supplements and cod liver oil by the mother during pregnancy and by the child in the first three years of life.
• Reported diseases and symptoms during the child's first three years of life, including gastrointestinal infections and respiratory tract infections.
• Asthma, eczema and allergies in the children (the literature on this topic is inconsistent, but early childhood eczema has frequently been found to be associated with lower risk of type 1 diabetes, primarily in retrospective studies (61)).
• Maternal education and other socioeconomic indicators: The scattered literature is inconsistent, but our hypothesis is based on a suggestive association between high maternal education and lower risk of type 1 diabetes (6). Describing the potential association of socioeconomic status with both celiac disease and type 1 diabetes, or lack thereof, is important in order to learn more about the nature of the elusive environmental triggers.

Study design
The PAGE study is based on biobanked plasma and DNA samples and questionnaire data collected from over 100,000 pregnancies in the Norwegian Mother and Child Cohort study (MoBa, (62,63)). Cases with type 1 diabetes will be identified by register linkage to the Norwegian Childhood Diabetes Registry (NCDR), which stores blood from diagnosis as well as detailed data on clinical characteristics (3,64), and to the Norwegian Patient Registry (NPR) (7). The NCDR is based on consent from the participants and/or the parents/guardians (www.barnediabetes.no). By register linkage to NorPD (the prescription drug register, using insulin purchase as an indicator of type 1 diabetes diagnosis), we estimated that the ascertainment for children diagnosed before 15 years of age during 2004-2008 was 91% (3). A few cases of type 1 diabetes may therefore be missed, some of whom may be captured in the NPR. As long as the ascertainment is >90%, we consider it most important for validity that the captured cases are true type 1 diabetes cases. Missing a few true cases is less important than including false cases (non-diabetes, or other forms of diabetes such as monogenic forms or type 2 diabetes). Celiac disease will be identified from a combination of the NPR and standard MoBa questionnaires. A minimum of two entries with celiac disease (K90.0) in the NPR is used as a criterion, to avoid false positives arising from a tentative diagnosis recorded during the diagnostic process. Questionnaires sent to the potential cases identified from the NPR and the MoBa questionnaires to confirm the diagnosis indicate that 95% of the identified cases are true celiac cases. Additionally, information regarding the diagnosis and symptoms in cases has been collected. A set of controls randomly selected from the whole MoBa cohort will be used in common for both diseases. The established genetic markers will be investigated together with other biomarkers and questionnaire data, including gene-environment interaction analyses.
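The register-based criterion for celiac disease described above (at least two NPR entries with the code K90.0) can be sketched as follows. This is a minimal illustration and not the study's actual data pipeline: the record layout (pairs of person identifier and diagnosis code), the function name and the example records are all assumptions made for this example.

```python
from collections import Counter

CELIAC_CODE = "K90.0"  # diagnosis code named in the text
MIN_ENTRIES = 2        # minimum number of register entries required


def probable_celiac_cases(entries, code=CELIAC_CODE, min_entries=MIN_ENTRIES):
    """Return the IDs that have at least `min_entries` entries with `code`.

    `entries` is an iterable of (person_id, diagnosis_code) pairs. This
    layout is hypothetical and stands in for the real register extract.
    """
    counts = Counter(pid for pid, dx in entries if dx == code)
    return {pid for pid, n in counts.items() if n >= min_entries}


# Made-up records, for illustration only.
example_entries = [
    ("A", "K90.0"), ("A", "K90.0"),                  # two entries: retained
    ("B", "K90.0"),                                  # single entry: excluded
    ("C", "E10.9"), ("C", "K90.0"), ("C", "K90.0"),  # other codes are ignored
]
cases = probable_celiac_cases(example_entries)
```

Requiring repeated entries trades a small loss of sensitivity for fewer false positives, which matches the priority on case validity stated above.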

Conception
Figure 1 outlines the concept and design of the PAGE study.
The MoBa questionnaires we propose to utilize, with information on diet, lifestyle, medication, socio-economic status and other environmental factors, are collected at gestational weeks 15 and 22, at birth, and when the child is 6, 18 and 36 months old. The prospective collection of information during pregnancy minimises the risk of recall bias, a problem in most other studies. The Norwegian Patient Registry (NPR) has collected information on diagnoses in the health care system, with data traceable to individuals, since 2008. In Norway, celiac disease has to be diagnosed by a paediatrician or a gastroenterologist. We also send specific questionnaires to celiac disease cases identified in the NPR, to obtain confirmatory information about the diagnosis and additional information about the timing and nature of initial symptoms.
For type 1 diabetes, clinical characteristics, including family history, islet autoantibodies, and screening results for celiac disease, are routinely collected at diagnosis and thereafter in the Norwegian Childhood Diabetes Registry. Phenotypic classification is based on clinical and genetic information, including HLA genotype, and suspected cases of monogenic forms of diabetes are screened.

Vitamin D metabolite assays
Plasma 25-hydroxyvitamin D will be assayed using liquid chromatography followed by tandem mass spectrometry (LC-MS/MS). Vitamin D binding protein (DBP) and 1,25-dihydroxyvitamin D will both be measured using commercial radioimmunoassays (RIA). Genotyping of SNPs in the DBP/GC gene and the CYP27B1 gene will be done as described under Genotyping below.

Metabolomics and lipidomics profiling
Cord vein plasma and maternal plasma from pregnancy will be subjected to metabolomics profiling using ultra-high pressure liquid chromatography (UPLC) combined with mass spectrometry (MS), and by two-dimensional gas chromatography coupled to time-of-flight mass spectrometry (GCxGC-TOF/MS) (38).

Markers of inflammation
A limited number of selected cytokines and chemokines of particular interest, including IL-1β, IFN-γ, IL-2, CXCL10 (IP10) and MCP-1 (CCL2), as well as C-reactive protein, will be measured in maternal mid-pregnancy venous blood plasma using Luminex technology. Neopterin, kynurenine and tryptophan will be measured using liquid chromatography/tandem mass spectrometry (LC-MS/MS) (65).

Maternal microchimerism assays
DNA has been isolated from cord blood in MoBa, and the Norwegian Childhood Diabetes Registry has stored DNA isolated from venous blood drawn at diagnosis from type 1 diabetes cases. The non-inherited maternal HLA allele will be determined from the HLA genotypes of the mother and child. Quantitative PCR (qPCR) specific for the untransmitted HLA allele will be carried out as described (58). Briefly, the method quantifies DNA originating from maternal cells, and the result is expressed as genomic equivalents (an estimate of the number of maternal cells).
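The step of determining the non-inherited maternal allele from the genotypes of mother and child can be illustrated with a small sketch. It assumes that each genotype is given as a pair of allele labels at one HLA locus; the function names and the allele labels in the example are hypothetical, and the published assay (58) may apply further criteria when choosing a qPCR target.

```python
def non_inherited_maternal_alleles(mother, child):
    """Return the maternal alleles that are absent from the child's genotype.

    `mother` and `child` are pairs of allele labels at a single locus.
    """
    return sorted(set(mother) - set(child))


def is_informative(mother, child):
    """A pair is usable only if exactly one maternal allele is absent in the child.

    No absent allele: the child carries both maternal alleles (for example a
    homozygous mother, or a father who shares the untransmitted allele), so
    maternal DNA cannot be told apart from the child's own DNA at this locus.
    Two absent alleles: the genotypes are inconsistent with maternity.
    """
    return len(non_inherited_maternal_alleles(mother, child)) == 1


# Illustrative allele labels only, not data from the study.
mother = ("A1", "A2")
child = ("A1", "A3")
target = non_inherited_maternal_alleles(mother, child)
```

In this example the mother transmitted A1, so A2 is the allele an assay would target to detect maternal DNA in the child's sample.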

Genotyping
For the purpose of the study, we will use tagSNPs to determine HLA genotypes of relevance to risk of type 1 diabetes and celiac disease, and to identify non-inherited maternal alleles. This is an alternative to labour-intensive (and costly) traditional HLA genotyping with a large number of allele-specific primers and/or probes. It will imply a certain loss of genotyping accuracy, which is acceptable for most aspects of the study. For the maternal microchimerism analyses, we will seek to confirm the maternal HLA allele predicted from the SNP typing using traditional, in-house allele-specific PCR primer set assays. We will also genotype currently established non-HLA susceptibility loci for type 1 diabetes and celiac disease (approximately 150 non-HLA SNPs (14,15)) using an Illumina GoldenGate custom chip.

Data analysis
Analysis will primarily be based on logistic and linear regression models. Continuous variables will be tested primarily by tests for trend, and dose-response effects evaluated using categorisation and smoothing splines. Interactions will be tested by adding interaction terms to the models. qPCR data from the maternal microchimerism part of the project will be analysed using linear regression models applied to ranks of the MMc values. Results from the metabolomics data will be adjusted for multiple testing using the (FDR-related) q-value (66). For each type of predictor investigated, analyses will be adjusted for and stratified by markers of genetic susceptibility.
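To illustrate the idea of FDR-based adjustment for the many metabolite tests, the sketch below implements Benjamini-Hochberg adjusted p-values. Note that this is a simpler, related procedure and not the q-value method cited in the text (66); the two coincide only under the conservative assumption that all tested hypotheses are null. The function name and the example p-values are assumptions made for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    For the p-value with rank r (1 = smallest) out of m tests, the adjusted
    value is the minimum of p * m / r over that rank and all larger ranks,
    capped at 1.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the running minimum enforces
    # monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four hypothetical metabolites.
example_p = [0.01, 0.04, 0.03, 0.005]
example_adjusted = bh_adjust(example_p)
```

A metabolite would then be reported as significant at a chosen FDR level (for instance 5%) if its adjusted value falls below that level.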

Sample size
Based on external incidence data for celiac disease and type 1 diabetes and known proportions of missing blood samples and questionnaires, we projected that around 200 cases of type 1 diabetes and more than 400 cases of celiac disease would be available for analysis in MoBa by 2013. This has now (Aug 2014) been confirmed. Since many of the assays are relatively expensive, we decided, based on a cost versus power trade-off, to include 600 controls (3:1 for diabetes and 1.5:1 for celiac disease). In 2013, the PAGE study was funded by the Research Council of Norway to carry out these planned analyses. We project that by the time all MoBa children have turned 15 years old, nearly 500 cases of type 1 diabetes will have been diagnosed, and a much larger number will have been diagnosed with celiac disease.

Statistical power considerations
The approximately 200 type 1 diabetes cases and 600 common controls will provide between 81 and 99% power to detect significant differences at the 5% level, for dichotomous exposure variables with between 10 and 80% prevalence, as long as the true odds ratio for association is 2.0, and higher power if the true odds ratio is higher (and higher power for celiac disease, for which there are 400 cases). For maternal plasma 25-hydroxyvitamin D, we have published a significant (logit-) linear association with type 1 diabetes with 109 cases and 219 controls in an independent study cohort (41). The mean was 66 nM and 73 nM in cases and controls, respectively, and the standard deviation 27 nM in both groups. With 200 cases and 600 controls, we will have 91% power to detect a similar difference in mean between cases and controls in the currently proposed study. Some aspects of the project are novel, and the magnitude of potential differences is impossible to project, with resulting uncertainties in power calculations.
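The kind of calculation behind these power figures can be sketched with simple normal approximations. The method used for the figures quoted above is not stated, so this is only an approximation under assumptions chosen for the example: two-sided tests at the 5% level, a two-sample z-test for the difference in means, a two-proportion z-test for the dichotomous exposure, and exposure prevalence taken to refer to the controls. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_two_means(mean_diff, sd, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in means."""
    se = sd * sqrt(1 / n_cases + 1 / n_controls)
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # The negligible probability of rejecting in the wrong direction is ignored.
    return _STD_NORMAL.cdf(abs(mean_diff) / se - z_crit)


def power_odds_ratio(prev_controls, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-proportion z-test for a given odds ratio."""
    odds_cases = odds_ratio * prev_controls / (1 - prev_controls)
    prev_cases = odds_cases / (1 + odds_cases)
    pooled = (n_cases * prev_cases + n_controls * prev_controls) / (n_cases + n_controls)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    se_alt = sqrt(prev_cases * (1 - prev_cases) / n_cases
                  + prev_controls * (1 - prev_controls) / n_controls)
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf((abs(prev_cases - prev_controls) - z_crit * se_null) / se_alt)


# Inputs taken from the text: 66 vs 73 nM, SD 27 nM, 200 cases, 600 controls.
vitamin_d_power = power_two_means(73 - 66, 27, 200, 600)
# Odds ratio 2.0 at 10% and 40% exposure prevalence among controls.
or_power_low_prev = power_odds_ratio(0.10, 2.0, 200, 600)
or_power_mid_prev = power_odds_ratio(0.40, 2.0, 200, 600)
```

Under these assumptions the results fall in the same general range as the figures quoted above without matching them exactly, which is expected given that the approximation and the published calculation may differ in method.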

COLLABORATIONS WITH OTHER COHORTS
Experience from genetic epidemiology has shown that replication is the most reliable way of protecting against chance findings (false positive findings). Because a relatively low percentage of children in the general population develop celiac disease or type 1 diabetes, prospective studies must be very large, with years of follow-up, for a large number of endpoints to accrue. For this reason, there are very few cohorts available. Depending on the nature of the findings in PAGE, replication of some results can probably be sought in one of the existing cohorts of genetically susceptible children, such as MIDIA (http://www.fhi.no/MIDIA) (12) and other similar cohorts (67). However, we must keep in mind that these studies differ from PAGE in many aspects of design and target group.
The only other study we are aware of that is directly comparable to MoBa and PAGE is MoBa's Danish "sister cohort", the Danish National Birth Cohort (68). We have specific plans with Danish collaborators to coordinate analyses of several potential risk factors for both type 1 diabetes and celiac disease. The Danish cohort is of comparable size and will contribute a comparable number of cases with type 1 diabetes, but a smaller number of celiac disease cases, because of the enigmatic lower incidence of celiac disease in Denmark compared with Norway. Jointly, the two cohorts will amount to the largest population-based birth cohort of celiac disease and type 1 diabetes in the world.

CONCLUDING REMARKS
With increasingly long follow-up time, more cases will accrue, with a consequent increase in total statistical power and the potential to replicate initial findings using future cases developing in the cohort. It is perfectly possible that the truly causative non-genetic risk factors we are searching for are not on the list of current hypotheses, but the biobanks make it possible to test novel factors in the future and thus have tremendous value. Depending on funding, a number of additional biomarkers could be tested. However, we must always consider the trade-off between testing current hypotheses and saving material for the future to replicate novel hypotheses. PAGE has its strength in the wealth of data from pregnancy and early postnatal life, and in the large sample size, but lacks the detailed follow-up data and longitudinally collected biological material available in studies such as MIDIA and the other high-risk cohorts mentioned above. Together with the handful of other prospective studies in the field, we feel confident that the PAGE study will contribute to our ultimate goals: an enhanced understanding of the aetiology of celiac disease and type 1 diabetes and, one day, hopefully, an intervention to reduce the incidence of these diseases.
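Figure 1. Overview of concept and design of the PAGE study. In the Norwegian Mother and Child Cohort Study, over 108,000 pregnancies are followed prospectively. Blood samples are available from the mother at 17-20 gestational weeks and at delivery, and from the umbilical vein, representing the fetal circulation. Information regarding demographic factors, diet, lifestyle and relevant environmental exposures is collected twice during pregnancy and when the child is 6, 18 and 36 months, indicated by Q in the figure. Type 1 diabetes (T1D) is ascertained by linkage to the Norwegian Childhood Diabetes Registry (www.barnediabetes.no), which also stores blood collected at diagnosis. Celiac disease will be ascertained primarily by linkage to the Norwegian Patient Registry, but also by supplemental information from questionnaires.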