
Multiple sclerosis genomic map implicates peripheral immune cells and microglia in susceptibility – Science Magazine


Genetic roots of multiple sclerosis

The genetics underlying who develops multiple sclerosis (MS) have been difficult to work out. Examining more than 47,000 cases and 68,000 controls with multiple genome-wide association studies, the International Multiple Sclerosis Genetics Consortium identified more than 200 risk loci in MS (see the Perspective by Briggs). Focusing on the best candidate genes, including a model of the major histocompatibility complex region, the authors identified statistically independent effects at the genome level. Gene expression studies detected that every major immune cell type is enriched for MS susceptibility genes and that MS risk variants are enriched in brain-resident immune cells, especially microglia. Up to 48% of the genetic contribution of MS can be explained through this analysis.

Science, this issue p. eaav7188; see also p. 1383

Structured Abstract

INTRODUCTION

Multiple sclerosis (MS) is an inflammatory and degenerative disease of the central nervous system (CNS) that often presents in young adults. Over the past decade, certain elements of the genetic architecture of susceptibility have gradually emerged, but most of the genetic risk for MS remained unknown.

RATIONALE

Earlier versions of the MS genetic map had highlighted the role of the adaptive arm of the immune system, implicating multiple different T cell subsets. We expanded our knowledge of MS susceptibility by performing a genetic association study in MS that leveraged genotype data from 47,429 MS cases and 68,374 control subjects. We enhanced this analysis with an in-depth and comprehensive evaluation of the functional impact of the susceptibility variants that we uncovered.

RESULTS

We identified 233 statistically independent associations with MS susceptibility that are genome-wide significant. The major histocompatibility complex (MHC) contains 32 of these associations, and one, the first MS locus on a sex chromosome, is found in chromosome X. The remaining 200 associations are found in the autosomal non-MHC genome. Our genome-wide partitioning approach and large-scale replication effort allowed the evaluation of other variants that did not meet our strict threshold of significance, such as 416 variants that had evidence of statistical replication but did not reach the level of genome-wide statistical significance. Many of these loci are likely to be true susceptibility loci. The genome-wide and suggestive effects jointly explain ~48% of the estimated heritability for MS.

Using atlases of gene expression patterns and epigenomic features, we documented that enrichment for MS susceptibility loci was apparent in many different immune cell types and tissues, whereas there was an absence of enrichment in tissue-level brain profiles. We extended the annotation analyses by analyzing new data generated from human induced pluripotent stem cell–derived neurons as well as from purified primary human astrocytes and microglia, observing that enrichment for MS genes is seen in human microglia, the resident immune cells of the brain, but not in astrocytes or neurons. Further, we have characterized the functional consequences of many MS susceptibility variants by identifying those that influence the expression of nearby genes in immune cells or brain. Last, we applied an ensemble of methods to prioritize 551 putative MS susceptibility genes that may be the target of the MS variants that meet a threshold of genome-wide significance. This extensive list of MS susceptibility genes expands our knowledge more than twofold and highlights processes relating to the development, maturation, and terminal differentiation of B, T, natural killer, and myeloid cells that may contribute to the onset of MS. These analyses focus our attention on a number of different cells in which the function of MS variants should be further investigated.

Using reference protein-protein interaction maps, these MS genes can also be assembled into 13 communities of genes encoding proteins that interact with one another; this higher-order architecture begins to assemble groups of susceptibility variants whose functional consequences may converge on certain protein complexes that can be prioritized for further evaluation as targets for MS prevention strategies.

CONCLUSION

We report a detailed genetic and genomic map of MS susceptibility, one that explains almost half of this disease’s heritability. We highlight the importance of several cells of the peripheral and brain resident immune systems—implicating both the adaptive and innate arms—in the translation of MS genetic risk into an auto-immune inflammatory process that targets the CNS and triggers a neurodegenerative cascade. In particular, the myeloid component highlights a possible role for microglia that requires further investigation, and the B cell component connects to the narrative of effective B cell–directed therapies in MS. These insights set the stage for a new generation of functional studies to uncover the sequence of molecular events that lead to disease onset. This perspective on the trajectory of disease onset will lay the foundation for developing primary prevention strategies that mitigate the risk of developing MS.

The MS genetic map implicates microglia as well as multiple different peripheral immune cell populations in the onset of the disease.

We list some of the immune cells in which we found an excess of MS susceptibility genes, implicating these cells as contributing to the earliest events that trigger MS. The sample size of our genome-wide association study is listed along with a circos plot illustrating the main results.


Abstract

We analyzed genetic data of 47,429 multiple sclerosis (MS) and 68,374 control subjects and established a reference map of the genetic architecture of MS that includes 200 autosomal susceptibility variants outside the major histocompatibility complex (MHC), one chromosome X variant, and 32 variants within the extended MHC. We used an ensemble of methods to prioritize 551 putative susceptibility genes that implicate multiple innate and adaptive pathways distributed across the cellular components of the immune system. Using expression profiles from purified human microglia, we observed enrichment for MS genes in these brain-resident immune cells, suggesting that these may have a role in targeting an autoimmune process to the central nervous system, although MS is most likely initially triggered by perturbation of peripheral immune responses.

Over the past decade, elements of the genetic architecture of multiple sclerosis (MS) susceptibility have gradually emerged from genome-wide and targeted studies (1–6). The role of the adaptive arm of the immune system, particularly its CD4+ T cell component, has become clearer, with multiple different T cell subsets being implicated (4). Although the T cell component plays an important role, functional and epigenomic annotation studies have begun to suggest that other elements of the immune system may be involved as well (7, 8). We assembled available genome-wide MS data to perform a meta-analysis followed by a systematic, comprehensive replication effort in large independent sets of subjects. This effort has yielded a detailed genome-wide genetic map that includes the first successful evaluation of the X chromosome in MS and provides a powerful platform for the creation of a detailed genomic map, outlining the functional consequences of most variants and their assembly into susceptibility networks (fig. S1).

Discovery and replication of genetic associations

We organized available (1, 2, 4, 5) and newly genotyped genome-wide data in 15 data sets, totaling 14,802 subjects with MS and 26,703 controls for our discovery study (tables S1 to S3) (9). After rigorous per-data-set quality control, we imputed all samples using the 1000 Genomes Project European panel, resulting in an average of 7.8 million imputed single-nucleotide polymorphisms (SNPs) with a minor allele frequency (MAF) of at least 1% (9). We then performed a meta-analysis, penalized for within–data set residual genomic inflation, of a total of 8,278,136 SNPs with data in at least two data sets (9). Of these, 26,395 SNPs reached genome-wide significance (P < 5 × 10⁻⁸; fixed-effects inverse-variance meta-analysis), and another 576,204 SNPs had at least nominal evidence of association (5 × 10⁻⁸ < P < 0.05; fixed-effects inverse-variance meta-analysis). In order to identify statistically independent SNPs in the discovery set and to prioritize variants for replication, we applied a genome-partitioning approach (9). Briefly, we first excluded an extended region of ~12 Mb around the major histocompatibility complex (MHC) locus to scrutinize this distinct region separately, and we then applied an iterative method to discover statistically independent SNPs in the rest of the genome using conditional modeling. We partitioned the genome into regions by extracting 1 Mb on either side of the most statistically significant SNP and repeating this procedure until there were no SNPs with P < 0.05 (fixed-effects inverse-variance meta-analysis) left in the genome. Within each region, we applied conditional modeling to identify statistically independent effects (fig. S2). As a result, we identified 1961 non-MHC autosomal regions that included 4842 presumably statistically independent SNPs. We refer to these 4842 prioritized SNPs as “effects,” assuming that these SNPs tag a true causal genetic effect. Of these, 82 effects were genome-wide significant in the discovery analysis, and another 125 had P < 1 × 10⁻⁵ (fixed-effects inverse-variance meta-analysis).
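The region-building step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the consortium's pipeline: the `Snp` record, the `partition_genome` function, and the dictionary layout of a region are hypothetical names introduced here, and the conditional modeling applied within each region is not implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    """Hypothetical record for one SNP in the discovery meta-analysis."""
    chrom: str
    pos: int    # base-pair position
    p: float    # discovery P value


def partition_genome(snps, window=1_000_000, p_max=0.05):
    """Greedy partitioning: take the most significant remaining SNP,
    carve out `window` bp on either side of it, and repeat until no
    SNP with P < p_max is left in the pool."""
    pool = list(snps)
    regions = []
    while True:
        candidates = [s for s in pool if s.p < p_max]
        if not candidates:
            break
        lead = min(candidates, key=lambda s: s.p)
        start, end = max(0, lead.pos - window), lead.pos + window

        def inside(s):
            return s.chrom == lead.chrom and start <= s.pos <= end

        regions.append({
            "chrom": lead.chrom,
            "start": start,
            "end": end,
            "lead": lead,
            "snps": [s for s in pool if inside(s)],
        })
        # Remove every SNP in the region so it cannot seed another one.
        pool = [s for s in pool if not inside(s)]
    return regions
```

Because the lead SNP always falls inside its own window, the pool shrinks on every pass and the loop terminates. In the study, each region produced this way would then be passed to conditional modeling to find statistically independent effects.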

In order to replicate these 4842 effects, we analyzed two large-scale independent sets of data. First, we designed the MS Chip to directly replicate each of the prioritized effects (9) and, after stringent quality check (table S4) (9), analyzed 20,360 MS subjects and 19,047 controls, which were organized into nine data sets. Second, we incorporated targeted genotyping data generated using the ImmunoChip platform on an additional 12,267 MS subjects and 22,625 control subjects that had not been used in either the discovery or the MS Chip subject sets (table S5) (3). Overall, we jointly analyzed data from 47,429 MS cases and 68,374 control subjects to provide a comprehensive genetic evaluation of MS susceptibility.
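The joint analysis relies throughout on fixed-effects inverse-variance meta-analysis. The following is a minimal sketch of that standard estimator, assuming each data set contributes a log odds ratio and its standard error for a given SNP. The function name `fixed_effects_meta` is hypothetical, and the penalty for within-data-set residual genomic inflation mentioned in the text is not applied here.

```python
import math


def fixed_effects_meta(betas, ses):
    """Combine per-data-set log odds ratios (betas) and their standard
    errors (ses) with inverse-variance weights.

    Returns the pooled log odds ratio, its standard error, the z score,
    and a two-sided P value from the normal distribution."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p
```

The pooled odds ratio is `math.exp(beta)`. Data sets with smaller standard errors (typically the larger ones) receive proportionally more weight, which is why adding the replication sets can move a nominally associated SNP past the genome-wide threshold.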

For 4311 of the 4842 effects (89%) that were prioritized in the discovery analysis, we could identify at least one tagging SNP in the replication data (table S6) (9); 156 regions had at least one genome-wide effect, and overall, 200 prioritized effects reached a level of genome-wide significance (GW) in these 156 regions (Fig. 1). Of these 200 effects, 62 represent secondary, independent effects that emerged from conditional modeling within a given locus (table S7 and fig. S3) (9). The odds ratios (ORs) of these genome-wide effects ranged from 1.06 to 2.06, and the allele frequencies of the respective risk allele ranged from 2.1 to 98.4% in the European samples of the 1000 Genomes Project reference (mean, 51.3%; standard deviation, 24.5%) (table S8 and fig. S4). Of these 156 regions, 19.9% (31 out of 156) harbored more than one statistically independent GW effect. One of the most complex regions was the one harboring the EVI5 gene, which has been the subject of several reports with contradictory results (10–13). In this locus, we identified four statistically independent genome-wide effects, three of which were found under the same association peak (Fig. 2A), illustrating how our approach and the large sample size clarify associations described in smaller studies and can facilitate functional follow-up of complex loci.

Fig. 1 The genetic map of multiple sclerosis.

The circos plot displays the 4842 prioritized autosomal non-MHC effects and the associations in chromosome X. Joint analysis (discovery and replication) P values are plotted as lines (fixed-effects inverse-variance meta-analysis). The green inner layer displays genome-wide significance (P < 5 × 10⁻⁸), the blue inner layer displays suggestive P values (5 × 10⁻⁸ < P < 1 × 10⁻⁵), and the gray layer displays P values > 1 × 10⁻⁵. The 200 autosomal non-MHC genome-wide effects and the one on chromosome X are listed. Each vertical line in the inner layers represents one effect, and its color displays the replication status (supplementary materials, materials and methods): green (genome-wide), blue (suggestive), and red (nonreplicated). Plotted on the outer surface are 551 prioritized genes. The inner circle space includes protein-protein interactions (PPIs) among genome-wide genes (green) and between genome-wide genes and suggestive genes (blue) that are identified as candidates by using PPI networks (9).


Fig. 2 Multiple independent effects in the EVI5 locus and chromosome X associations.

(A) Regional association plot of the EVI5 locus. Discovery P values (fixed-effects inverse-variance meta-analysis) are displayed. The layer tagged “Step 0” plots the associations of the marginal analysis, with the most statistically significant SNP being rs11809700 (odds ratio for the T allele, OR_T = 1.16; P = 3.51 × 10⁻¹⁵). “Step 1” plots the associations conditioning on rs11809700; rs12133753 is the most statistically significant SNP (OR_C = 1.14; P = 8.53 × 10⁻⁹). “Step 2” plots the results conditioning on rs11809700 and rs12133753, with rs1415069 displaying the lowest P value (OR_G = 1.10; P = 4.01 × 10⁻⁵). Last, “Step 3” plots the associations conditioning on rs11809700, rs12133753, and rs1415069, identifying rs58394161 as the most statistically significant SNP (OR_C = 1.10; P = 8.63 × 10⁻⁴). All four SNPs reached genome-wide significance in the respective joint (discovery plus replication) analyses (table S7). Each of the four independent lead SNPs is highlighted by a triangle in the respective layer. (B) Regional association plot for the genome-wide chromosome X variant. Joint analysis P values (fixed-effects inverse-variance meta-analysis) are displayed. Linkage disequilibrium, in terms of r² based on the 1000 Genomes European panel, is indicated by a combination of color grade and symbol size. All positions are in human genome build 19.


We also performed a joint analysis of available data on sex chromosome variants (9) and identified rs2807267 as genome-wide significant [odds ratio (OR) for the T allele (OR_T) = 1.07, P = 6.86 × 10⁻⁹; fixed-effects inverse-variance meta-analysis] (tables S9 and S10). This variant lies within an enhancer peak specific for T cells and is 948 base pairs (bp) downstream of the RNA U6 small nuclear 320 pseudogene (RNU6-320P), a component of the U6 small nuclear ribonucleoprotein (snRNP) that is part of the spliceosome and responsible for the splicing of introns from pre-mRNA (Fig. 2B) (14). The nearest gene is VGLL1 (27,486 bp upstream), which has been proposed to be a co-activator of mammalian transcription factors (15). No variant in the Y chromosome had a P value lower than 0.05 (fixed-effects inverse-variance meta-analysis).

The MHC was the first MS susceptibility locus to be identified, and prior studies have found that the MHC harbors multiple independent susceptibility variants, including interactions within the class II human leukocyte antigen (HLA) genes (16, 17). We undertook a detailed modeling of this region to account for its long-range linkage disequilibrium and allelic heterogeneity using SNP data as well as imputed classical alleles and amino acids of the HLA genes in the assembled data. We confirmed prior MHC susceptibility variants (including a nonclassical HLA effect located in the TNFA/LST1 long haplotype) and extended the association map to uncover a total of 31 statistically independent effects at the genome-wide level within the MHC (Fig. 3 and table S11). Multiple HLA and nearby non-HLA genes have several independent effects that can now be identified because of our large sample; for example, the HLA-DRB1 locus has six statistically independent effects. Another finding involves HLA-B, which also appears to harbor six independent effects on MS susceptibility. The role of the nonclassical HLA and non-HLA genome in the MHC is also highlighted. One-third (9 out of 31) of the identified variants lie within either intergenic regions or in a long-range haplotype that contains several nonclassical HLA and other non-HLA genes (17).

Fig. 3 Independent associations in the major histocompatibility locus.

Regional association plot in the MHC locus. Only genome-wide statistically independent effects are listed. The order of variants on the x axis represents the order in which they were identified. The size of each circle represents the value of −log₁₀(P value) (fixed-effects inverse-variance meta-analysis). Different colors are used to depict class I, II, III, and non-HLA effects. The y axis displays position in millions of base pairs.


Recently, we reported an interaction between HLA-DRB1*15:01 and HLA-DQA1*01:01 by analyzing imputed HLA alleles (16). In this work, we reinforced this analysis by analyzing SNPs, HLA alleles, and respective amino acids. We replicated the presence of interactions among class II alleles, but the second interaction term, besides HLA-DRB1*15:01, can vary depending on the other independent variants that are included in the model. First, we found that there are interaction models of HLA-DRB1*15:01 with other variants in the MHC that explain the data better than our previously reported HLA-DRB1*15:01/HLA-DQA1*01:01 interaction term (fig. S5). Second, we observed that there is a group of HLA-DQB1 and HLA-DQA1 SNPs, alleles, and amino acids that consistently rank among the best models with HLA-DRB1*15:01 interaction terms (fig. S6). This group of HLA-DRB1*15:01–interacting variants is consistently identified regardless of the marginal effects of other statistically independent variants that are added to the model, implying that these interaction terms capture a different subset of phenotypic variance and can be explored after the identification of the marginal effects. Last, we performed a sensitivity analysis by including interaction terms of HLA-DRB1*15:01 in each step and selecting the model with the lowest Bayesian information criterion instead of testing only the marginal results of the variants, as we did in the main analysis (classical model MHC analysis) (table S12). This sensitivity analysis also resulted in 32 statistically independent effects with a genome-wide significant P value (fixed-effects inverse-variance meta-analysis) (table S12), of which one-third (9 out of 32) were different from the ones in the classical model MHC analysis. The main differences between the results of the two approaches were the inclusion of the interaction of HLA-DRB1*15:01 and rs1049058 in step 3 and the stronger association of HLA-DPB1/2 effects over HLA-DRB1 effects in the sensitivity model (tables S12 and S13 and fig. S6). Thus, overall, our MHC results are not strongly affected by the analytic model that we have selected.
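The sensitivity analysis selects, at each step, the model with the lowest Bayesian information criterion (BIC). A minimal sketch of that selection rule follows, using the standard definition BIC = k ln(n) − 2 ln(L). The function names, the model labels, and the log-likelihood values in the usage note are hypothetical and are not taken from the study.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_by_bic(models, n_obs):
    """Return the name of the model with the lowest BIC.

    `models` maps a model name to a (log_likelihood, n_params) pair."""
    return min(models, key=lambda name: bic(*models[name], n_obs))
```

For example, with 10,000 observations, a model adding one interaction term is preferred only if it raises the log-likelihood by more than ln(10,000)/2, roughly 4.6 units: `select_by_bic({"marginal": (-1000.0, 3), "interaction": (-990.0, 4)}, 10_000)` picks `"interaction"`, whereas an improvement of a single log-likelihood unit would not justify the extra parameter.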

Characterization of non–genome-wide effects

The commonly used threshold of genome-wide significance (P = 5 × 10⁻⁸) has played an important role in making human genetic study results robust; however, several studies have demonstrated that non–genome-wide effects explain an important proportion of the effect of genetic variation on disease susceptibility (18, 19). More importantly, several such effects are eventually identified as genome-wide significant, given enough sample size and true effects (3). Thus, we also evaluated the non–genome-wide effects that were selected for replication, had available replication data (n = 4111 effects), but did not meet a standard threshold of genome-wide significance (P < 5 × 10⁻⁸). Specifically, we decided to stratify these 4111 effects into two main categories: (i) suggestive effects (S; n = 416), and (ii) nonreplicated effects (NR; n = 3695) (9). We used these categories in downstream analyses to further characterize the prioritized effects from the discovery study in terms of potential to eventually be replicated. We also included a third category: effects for which there were no data for replication in any of the replication sets [no data (ND); n = 532]. Furthermore, to add granularity, we substratified the suggestive effects into two groups: (1a) strongly suggestive (sS; n = 117; 5 × 10⁻⁸ < P < 1 × 10⁻⁵, fixed-effects inverse-variance meta-analysis) and (1b) underpowered suggestive (unS; n = 299). Of these two categories of suggestive effects, the ones in the sS category have a high probability of reaching genome-wide significance as we increase our sample size in future studies (table S14) (9).
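The two P value cutoffs stated above can be expressed as a small classifier. This sketch covers only the thresholds given in the text; the criteria separating underpowered suggestive (unS) from nonreplicated (NR) effects are described in the supplementary methods (9) and are not reproduced, so both fall under a placeholder label here. The function name and the `"other"` label are hypothetical, and assigning a P value exactly equal to 5 × 10⁻⁸ to the sS group is an assumption.

```python
GENOME_WIDE = 5e-8          # genome-wide significance threshold
STRONGLY_SUGGESTIVE = 1e-5  # upper bound of the strongly suggestive band

# Category sizes as reported in the text.
COUNTS = {"sS": 117, "unS": 299, "NR": 3695}


def classify_joint_p(p):
    """Assign a joint-analysis P value to GW, sS, or a placeholder class."""
    if p < GENOME_WIDE:
        return "GW"
    if p < STRONGLY_SUGGESTIVE:
        return "sS"
    # unS versus NR depends on criteria not stated in this section.
    return "other"


suggestive_total = COUNTS["sS"] + COUNTS["unS"]       # 416 suggestive effects
non_gw_total = suggestive_total + COUNTS["NR"]        # 4111 non-genome-wide effects
```

The two totals reproduce the counts given in the text: 117 + 299 = 416 suggestive effects, and 416 + 3695 = 4111 effects with replication data that did not reach genome-wide significance.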

Heritability explained

To estimate the extent to which we have characterized the genetic architecture of MS susceptibility with our 200 genome-wide non-MHC autosomal MS effects, we calculated the narrow-sense heritability captured by common variation (h²g), the ratio of additive genetic variance to the total phenotypic variance (18, 20). Only the 15 strata of data from the discovery set had true genome-wide coverage, and hence, we used these 14,802 MS subjects and 26,703 controls for the heritability analyses. The overall heritability estimate for MS susceptibility in the discovery set of subjects was 19.2% [95% confidence interval (CI), 18.5 to 19.8%]. Heritability partitioning by using minor allele frequency or P value thresholds has led to substantial insights in previous studies (21), and we therefore applied a similar partitioning approach but in a fashion that took into consideration the study design and the existence of replication information from the two large-scale replication cohorts. First, we partitioned the autosomal genome into three components: (i) the super extended MHC (SE MHC), (ii) a component with the 1961 regions prioritized for replication (Regions), and (iii) the rest of the genome that had P > 0.05 (fixed-effects inverse-variance meta-analysis) in the discovery study (Nonassociated regions). Then, we estimated the h²g that can be attributed to each component as a proportion of the overall narrow-sense heritability observed. The SE MHC explained 21.4% of the h²g, with the remaining 78.6% being captured by the second component (Fig. 4A). Then, we further partitioned the non-MHC component into one that captured all 4842 statistically independent effects (Prioritized for replication), which explained the vast majority of the overall estimated heritability: 68.3%. The “Nonprioritized” SNPs in the 1961 regions explained 11.6% of the heritability, which suggests that there may be residual linkage disequilibrium (LD) with prioritized effects or true effects that have not yet been identified (Fig. 4B).

Fig. 4 Heritability partitioning.

Proportion of the overall narrow-sense heritability under the liability model (~19.2%) explained with different genetic components. (A) The overall heritability is partitioned into the SE MHC, the 1961 regions that include all SNPs with P < 0.05 (Regions; fixed-effects inverse-variance meta-analysis), and the rest of the genome with P > 0.05 (Nonassociated regions). (B) The Regions are further partitioned into the seemingly statistically independent effects (Prioritized) and the residual effects (Nonprioritized). (C) The Prioritized component is partitioned on the basis of the replication knowledge into genome-wide effects (GW), suggestive (S), nonreplicated (NR), and no data (ND). The lines connecting the pie charts depict the component that is partitioned. All values were estimated by using the discovery data sets (n = 14,802 cases and 26,703 controls).


We then used the replication-based categories described above to further partition the “Prioritized” heritability component, namely “GW,” “S,” “NR,” and “ND” (Fig. 4C). The GW effects captured 18.3% of the overall heritability. Thus, along with the contribution of the SE MHC (20.2% in the same model), we can explain ~39% of the genetic predisposition to MS with the validated susceptibility alleles. This can be extended to ~48% if we include the suggestive (S) effects (9.0%). The nonreplicated (NR) effects captured 38.8% of the heritability, which could imply that some of these effects are falsely nonreplicated: they may be true effects that need further data to emerge robustly, or true effects that are present in only a subset of the data. However, few of the 3695 NR effects would fall into either of these two cases; the vast majority are likely to be false-positive results.
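
The ~39% and ~48% figures follow from simple addition of the component shares quoted in this paragraph. The short Python sketch below reproduces that arithmetic; the variable names are illustrative, and the final conversion to liability-scale percentage points is an added illustration based on the ~19.2% overall heritability in the Fig. 4 caption, not a value reported in the text.

```python
# Shares of the overall heritability quoted in the text, in percent.
SE_MHC = 20.2       # SE MHC component
GW = 18.3           # genome-wide (GW) effects
SUGGESTIVE = 9.0    # suggestive (S) effects
OVERALL_H2 = 19.2   # overall narrow-sense heritability (liability scale), Fig. 4


def explained_share(*components):
    """Sum component shares (percent of the overall heritability)."""
    return sum(components)


validated = explained_share(SE_MHC, GW)                    # 38.5, quoted as ~39%
with_suggestive = explained_share(SE_MHC, GW, SUGGESTIVE)  # 47.5, quoted as ~48%

# Illustrative conversion (an assumption, not a figure from the paper):
# the same share expressed in liability-scale percentage points.
absolute_points = OVERALL_H2 * with_suggestive / 100

print(f"validated: {validated:.1f}%  with suggestive: {with_suggestive:.1f}%")
print(f"liability-scale points: {absolute_points:.2f}")
```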

Functional implications of the MS loci, enriched pathways, and gene sets

Next, we began to annotate the MS effects. To prioritize the cell types or tissues in which the 200 non-MHC autosomal effects may exert their effect, we used two different approaches: one that leverages atlases of gene expression patterns and another that uses a catalog of epigenomic features such as deoxyribonuclease hypersensitivity sites (DHSs) (8, 9, 22–24). Significant enrichment for MS susceptibility loci was apparent in many different immune cell types and tissues, whereas there was an absence of enrichment in tissue-level central nervous system (CNS) profiles (Fig. 5). The enrichment is observed not only in immune cells that have long been studied in MS, such as T cells, but also in B cells, whose role has emerged more recently (25). Furthermore, although the adaptive immune system has been proposed to play a predominant role in MS onset (26), we now demonstrate that many elements of innate immunity, such as natural killer (NK) cells and dendritic cells, also display strong enrichment for MS susceptibility genes. At the tissue level, the role of the thymus is also highlighted, possibly suggesting a role of genetic variation in thymic selection of autoreactive T cells in MS (27). Public tissue-level CNS data, which are derived from a complex mixture of cell types, do not show an excess of MS susceptibility variants in annotation analyses. However, because MS is a disease of the CNS, we extended the annotation analyses by analyzing data generated from human iPSC-derived neurons as well as from purified primary human astrocytes and microglia (9). As seen in Fig. 6, enrichment for MS genes is observed in human microglia (P = 5 × 10⁻¹⁴) but not in astrocytes or neurons, suggesting that the resident immune cells of the brain may also play a role in MS susceptibility.

Fig. 5 Tissue- and cell-type enrichment analyses.

(A) Gene Atlas tissues and cell types gene expression enrichment. (B) DHS enrichment for tissues and cell types from the NIH Epigenetic Roadmap. Rows are sorted from immune cells or tissues to CNS-related ones. Both x axes display –log10 of Benjamini and Hochberg P values (FDR). The vertical black line highlights the threshold of significance for the enrichment analysis.

Fig. 6 Dissection of cortical RNA-seq data.

(A) A heatmap of the results of our analysis assessing whether a cortical eQTL is likely to come from one of the component cell types of the cortex: neurons, oligodendrocytes, endothelial cells, microglia, and astrocytes (in rows). Each column presents results for one of the MS brain eQTLs. The color scheme relates to the P value of the interaction term (linear regression), with red denoting a more extreme result. (B) The same results in a different form, comparing results of assessing for interaction with neuronal proportion (y axis) and microglial proportion (x axis). The SLC12A5 eQTL is significantly stronger when accounting for neuronal proportion, and CLECL1 is significantly stronger when accounting for microglia. The Bonferroni-corrected threshold of significance is highlighted by the dashed line. (C) Locus view of the SLC12A5/CD40 locus, illustrating the distribution of MS susceptibility and the SLC12A5 brain eQTL in a segment of chromosome 20 (x axis); the y axis presents the P value of association with MS susceptibility (top; fixed effects inverse-variance meta-analysis) or SLC12A5 RNA expression (bottom; linear regression). The lead MS SNP is denoted by a triangle; other SNPs are circles, with the intensity of the red color denoting the strength of LD with the lead MS SNP. (D) Plot of the level of expression, transcriptome-wide, for each measured gene in our cortical RNA-seq dataset (n = 455) (y axis) and purified human microglia (n = 10) (x axis) from the same cortical region. In blue, we highlight those genes with greater than fourfold increased expression in microglia relative to bulk cortical tissue and are expressed at a reasonable level in microglia. Each dot is one gene. Gray dots denote the 551 putative MS genes from our integrated analysis. SLC12A5 and CLECL1 are highlighted in red; in blue, we highlight a selected subset of the MS genes—many of them well-validated—which are enriched in microglia. 
For clarity, we did not include all of the MS genes that fall in this category.

We repeated the enrichment analyses for the S and NR effects to test whether they have an enrichment pattern similar to that of the 200 GW effects. The S effects exhibited a pattern of enrichment similar to that of the GW effects, with only B cell expression reaching the threshold of statistical significance (fig. S7). This provides additional circumstantial evidence that this category of variants may harbor true causal associations. By contrast, the NR enrichment results appear to follow a largely random pattern, suggesting that most of these effects are indeed not truly MS-related (fig. S7).

The strong enrichment of the GW effects in immune cell types motivated us to prioritize candidate MS susceptibility genes by identifying those susceptibility variants that affect RNA expression of nearby genes [cis expression quantitative trait loci (cis-eQTL) effects] [±500 kilobase pairs (kbp) around the effect SNP] (9). Thus, we interrogated the potential function of MS susceptibility variants in naïve CD4+ T cells and monocytes from 211 healthy subjects as well as peripheral blood mononuclear cells (PBMCs) from 225 subjects with relapsing-remitting MS. Out of the 200 GW MS effects, 36 (18%) had at least one tagging SNP (r² ≥ 0.5) that altered the expression of 46 genes [false discovery rate (FDR) < 5%] in naïve CD4+ T cells (tables S15 and S16), and 36 MS effects (18%; 10 in common with the naïve CD4+ T cells) influenced the expression of 48 genes in monocytes (11 genes in common with T cells). In the MS PBMC samples, 30% of the GW effects (60 out of the 200) were cis-eQTLs for 92 genes, with several loci being shared with those found in healthy T cells and monocytes (26 effects and 27 genes in T cells, and 21 effects and 24 genes in monocytes, respectively) (tables S15 and S16).

Because MS is a disease of the CNS, we also investigated a large collection of dorsolateral prefrontal cortex RNA sequencing profiles from two longitudinal cohort studies of aging (n = 455 subjects), which recruit cognitively nonimpaired individuals (9). This cortical sample provides a tissue-level profile derived from a complex mixture of neurons, astrocytes, and other parenchymal cells, such as microglia and occasional peripheral immune cells. In these data, we found that 66 of the GW MS effects (33% of the 200 effects) were cis-eQTLs for 104 genes. Across this CNS data set and the three immune data sets, 104 GW effects were cis-eQTLs for 203 different genes (n = 211 cis-eQTLs), with several appearing to be specific to one of the cell or tissue types (table S16). Specifically, 21.2% (45 out of 211 cis-eQTLs) of these cortical cis-eQTLs displayed no evidence of association [P > 0.05, for linear regression (9), with any SNP with r² > 0.1] in the immune cell and PBMC results and are less likely to be immune-related (tables S16 and S17).

To further explore the challenging and critical question of whether some of the MS variants have an effect that is primarily exerted through a nonimmune cell, we performed a secondary analysis of our cortical RNA-sequencing (RNA-seq) data in which we attempted to ascribe a brain cis-eQTL to a particular cell type. Specifically, we assessed our tissue-level profile and adjusted each cis-eQTL analysis for the proportion of neurons, astrocytes, microglia, and oligodendrocytes estimated to be present in the tissue: The hypothesis was that the effect of a SNP with a cell type–specific cis-eQTL would be stronger if we adjusted for the proportion of the target cell type (Fig. 6 and fig. S8). As anticipated, almost all of the MS variants present in cortex remain ambiguous; it is likely that many of them influence gene function in multiple immune and nonimmune cell types. However, the SLC12A5 locus is different; here, the effect of the SNP is significantly stronger when we account for the proportion of neurons (Fig. 6, A and B), and the CLECL1 locus emerges when we account for the proportion of microglia. SLC12A5 is a potassium/chloride transporter that is known to be expressed in neurons, and a rare variant in SLC12A5 causes a form of pediatric epilepsy (28, 29). Although this MS locus may therefore appear to be a good candidate to have a primarily neuronal effect, further evaluation found that this MS susceptibility haplotype also harbors susceptibility to rheumatoid arthritis (30) and a cis-eQTL in B cells for the CD40 gene (31). Thus, the same haplotype harbors different functional effects in very different contexts, illustrating the challenge in dissecting the functional consequences of autoimmune variants in immune function as opposed to the tissue targeted in autoimmune disease. 
However, CLECL1 represents a simpler case of a known susceptibility effect that has previously been linked to altered CLECL1 RNA expression in monocytes (26, 32); its enrichment in microglial cells, which share many molecular pathways with other myeloid cells, is more straightforward to understand. CLECL1 is expressed at low levels in our cortical RNA-seq profiles because microglia represent just a small fraction of cells at the cortical tissue level, and CLECL1’s expression level is 20-fold greater when we compare its level of expression in purified human cortical microglia with the bulk cortical tissue (Fig. 6). CLECL1 therefore suggests a potential role of microglia in MS susceptibility, which is underestimated in bulk tissue profiles that are available in epigenomic and transcriptomic reference data. Overall, many genes that are eQTL targets of MS variants in the human cortex are most likely to affect multiple cell types. These brain eQTL results and the enrichment found in analyses of our purified human microglia data therefore highlight the need for more targeted, cell-type–specific data for the CNS to adequately determine the extent of its role in MS susceptibility.
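
One way to formalize the cell-type question discussed above is a linear regression with a genotype-by-proportion interaction term, as mentioned in the Fig. 6 caption. The sketch below is a minimal illustration on simulated data, not the consortium's pipeline: the function names, the simulated effect sizes, and the absence of other covariates are all assumptions. It fits expression ~ genotype + proportion + genotype × proportion by ordinary least squares in pure Python and returns the interaction coefficient with its t statistic.

```python
import random


def ols(X, y):
    """Ordinary least squares via the normal equations.

    Returns (coefficients, standard errors).
    """
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [xtx[i] + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(aug[row][c]))  # partial pivoting
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for row in range(k):
            if row != c:
                f = aug[row][c]
                aug[row] = [a - f * b for a, b in zip(aug[row], aug[c])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    se = [(sigma2 * inv[i][i]) ** 0.5 for i in range(k)]
    return beta, se


def interaction_test(genotype, proportion, expression):
    """Return (coefficient, t statistic) of the genotype x proportion term."""
    X = [[1.0, g, p, g * p] for g, p in zip(genotype, proportion)]
    beta, se = ols(X, expression)
    return beta[3], beta[3] / se[3]


# Simulated example (hypothetical values): the eQTL effect grows with the
# estimated proportion of the target cell type in each tissue sample.
random.seed(0)
n = 400
genotype = [random.choice([0, 1, 2]) for _ in range(n)]
proportion = [random.uniform(0.05, 0.6) for _ in range(n)]
expression = [
    1.0 + 0.1 * g + 0.5 * p + 2.0 * g * p + random.gauss(0, 0.3)
    for g, p in zip(genotype, proportion)
]

coef, t_stat = interaction_test(genotype, proportion, expression)
print(f"interaction coefficient: {coef:.2f}, t = {t_stat:.1f}")
```

A large interaction statistic for one cell type but not the others would be read, under this model, as evidence that the eQTL is driven by that cell type. In real data the estimated proportions are themselves noisy, which is one reason most loci remain ambiguous.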

These eQTL studies begin to transition our genetic map into a resource outlining the likely MS susceptibility gene(s) in a locus and the potential functional consequences of certain MS variants. To assemble these single-locus results into a higher-order perspective of MS susceptibility, we turned to pathway analyses to evaluate how the extended list of genome-wide effects provides new insights into the pathophysiology of the disease. Acknowledging that there is no available method to identify all causal genes after genome-wide association study (GWAS) discoveries, we prioritized genes for pathway analyses while allowing several different hypotheses for mechanisms of action (9). In brief, we prioritized genes that (i) were cis-eQTL targets in any of the eQTL data sets outlined above, (ii) had at least one exonic variant at r² ≥ 0.1 with any of the 200 effects, (iii) had high scores of regulatory potential by using a cell-specific network approach, and (iv) had a similar coexpression pattern as identified with DEPICT (33). Sensitivity analyses were performed that included different combinations of the above categories and included genes with intronic variants at r² ≥ 0.5 with any of the 200 effects (9). Overall, we prioritized 551 candidate MS genes (table S18; sensitivity analyses are provided in table S19) to test for statistical enrichment of known pathways. Approximately 39.6% (142 out of 358) of the Ingenuity Pathway Analysis canonical pathways (34) that had overlap with at least one of the identified genes were enriched for MS genes at an FDR < 5% (table S20). Sensitivity analyses that included different criteria to prioritize genes revealed a similar pattern of pathway enrichment (table S21) (9). The extensive list of susceptibility genes, which more than doubles the previous knowledge in MS, captures processes of development, maturation, and terminal differentiation of several immune cells that potentially interact to predispose to MS. 
In particular, the role of B cells, dendritic cells, and NK cells has emerged more clearly, broadening the prior narrative of T cell dysregulation that emerged from earlier studies (4). Given the overrepresentation of immune pathways in these databases, ambiguity remains as to where some variants may have their effect: Neurons and particularly astrocytes repurpose the component genes of many “immune” signaling pathways, such as the ciliary neurotrophic factor, nerve growth factor, and neuregulin signaling pathways that are highly significant in our analysis (table S20). These results—along with the results relating to microglia—emphasize the need for further dissection of these pathways in specific cell types to resolve where a variant is exerting its effect; it is possible that multiple, different cell types could be involved in disease because they all experience the effect of the variant.
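
The enrichment results above are reported against an FDR threshold of 5%. As a minimal sketch of how such a threshold can be applied, the Python function below computes Benjamini-Hochberg adjusted P values, the adjustment named in the Fig. 5 caption. Whether the pathway analysis used exactly this procedure is an assumption, and the P values in the example are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values are monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P values for eight pathways (not taken from the paper).
pathway_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
adjusted = benjamini_hochberg(pathway_p)
enriched = [i for i, q in enumerate(adjusted) if q < 0.05]
print(f"{len(enriched)} of {len(pathway_p)} pathways pass FDR < 5%")
```

In this toy input, only the two smallest P values survive the adjustment even though five of the raw values are below 0.05, which illustrates why the fraction of enriched pathways is reported at an FDR threshold rather than at a nominal P value.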

Pathway and gene-set enrichment analyses can only identify statistically significant connections of genes in already reported, and in some cases validated, mechanisms of action. However, the function of many genes is yet to be uncovered, and even for well-studied genes, the full repertoire of possible mechanisms is still incomplete. To complement the pathway analysis approach and to explore the connectivity of our prioritized GW genes, we performed a protein-protein interaction (PPI) analysis using GeNets (9, 35). About one-third of the 551 prioritized genes (n = 190; 34.5%) were connected (P = 0.052; permutation-based P value), and these could be organized into 13 communities, that is, subnetworks with higher connectivity (P < 0.002; permutation-based P value) (table S22). This compares with nine communities that could be identified by the previously reported MS susceptibility list (81 connected genes out of 307) (table S23) (3). Next, we leveraged GeNets to predict candidate genes on the basis of network connectivity and pathway membership similarity and tested whether our previously known MS susceptibility list could have predicted any of the genes prioritized by the newly identified effects. Of the 244 genes prioritized by new findings (out of the 551 overall prioritized genes), only five could be predicted given the old results (out of 70 candidates that emerge from the extrapolation of earlier data) (fig. S9 and table S24). In a similar fashion, we estimated that the list of 551 prioritized genes could predict 102 new candidate genes, four of which can be prioritized because they are in the list of suggestive effects (Fig. 1, fig. S10, and table S25).

Discussion

This detailed genetic map of MS is a powerful substrate for annotation and functional studies and provides a new level of understanding of the molecular events that contribute to MS susceptibility. Although the exact estimate of MS heritability varies with the data and method used (36–38), we report that our findings can explain up to 48% of the heritability that can be estimated by using large-scale GWAS data. It is clear that these events are widely distributed across the many different cellular components of both the innate and adaptive arms of the immune system: Every major immune cell type is enriched for MS susceptibility genes. An important caveat is that many of the implicated molecular pathways, such as response to tumor necrosis factor–α and type I interferons, are repurposed in many different cell types, leading to an important ambiguity: Is risk of disease driven by altered function of only one of the implicated cell types, or are all of them contributing to susceptibility equally? This question highlights the important issue of the context in which these variants are exerting their effects. We have been thorough in our evaluation of available reference epigenomic data, but many different cell types and cell states remain to be characterized and could alter our summary. Further, interindividual variability has not been established in such reference data, which are typically produced from one or a handful of individuals; thus, this issue is better evaluated in the eQTL data, where we have examined a range of samples and states in a large number of subjects. Overall, although we have identified putative functional consequences for the identified MS variants, the functional consequence of most of these MS variants requires further investigation.

Even where a function is reported, further work is needed to demonstrate that the effect is the causal functional change. This is particularly true of the role of the CNS in MS susceptibility; we mostly have data at the level of the human cortex, a complex tissue with many different cell types, including resident microglia and a small number of infiltrating macrophages and lymphocytes. MS variants clearly influence gene expression in this tissue, and we must now (i) resolve the implicated cell types and whether pathways shared with immune cells are having their MS susceptibility effect in the periphery or in the brain, and (ii) more deeply identify additional functional consequences that may be present in only a subset of cells, such as microglia or activated astrocytes, that are obscured in the cortical tissue-level profile. A handful of loci are intriguing in that they alter gene expression in the human cortex but not in the sampled immune cells; these MS susceptibility variants deserve close examination to resolve the important question of the extent to which the CNS is involved in disease onset. Thus, our study suggests that although MS is a disease whose origin may lie primarily within the peripheral immune compartment, where dysregulation of all branches of the immune system leads to organ-specific autoimmunity, there is a subset of loci with a key role in directing the tissue-specific autoimmune response. This is similar to our previous examination of ulcerative colitis, in which we observed enrichment of genetic variants mapping to colon tissue (7). This view is consistent with our understanding of the mechanism of important MS therapies, such as natalizumab and fingolimod, that sequester pathogenic immune cell populations in the peripheral circulation to prevent episodes of acute CNS inflammation. It also has important implications as we begin to consider prevention strategies to block the onset of the disease by early targeting of peripheral immune cells.

An important step forward in MS genetics, for a disease with a 3:1 preponderance of women being affected, is robust evidence for a susceptibility locus on the X chromosome. Although chromosome X associations cannot be the sole explanation for the preponderance of women among MS patients, the discovery of an MS locus on the X chromosome is an exciting first step toward understanding the genetic contributions of this strong sex bias. This result also highlights the need for additional, dedicated genetic studies of the sex chromosomes in MS because existing data have not been fully leveraged (39). Future studies will also need to incorporate the interaction of the autosomal genome with factors that can affect the sex bias, such as hormones (40).

This genomic map of MS (the genetic map and its integrated functional annotation) is a foundation on which the next generation of projects will be developed. It is an important substrate with which to further dissect the genetic architecture of MS by accounting for the contribution of sex, evaluating the possibility of interaction among loci, and assessing other important factors, such as heterogeneity of effects across human populations or certain subsets of patients given the heterogeneity of this disease. In the current study, we have included individuals with either the relapsing-remitting or the progressive form of MS because they are currently conceptualized to belong to the same disease spectrum. Further investigation may lead to the identification of variants that have an effect on the neurodegenerative component of MS, which is largely genetically distinct from MS susceptibility (41). Beyond the characterization of the molecular events that trigger MS, this map will also inform the development of primary prevention strategies because we can leverage this information to identify the subset of individuals who are at greatest risk of developing MS. Although insufficient by itself, an MS genetic risk score has a role to play in guiding the management of the population of individuals “at risk” of MS (such as family members) when deployed in combination with other measures of risk and biomarkers that capture intermediate phenotypes along the trajectory from health to disease (42). We thus report an important milestone in the investigation of MS and share a roadmap for future work: the establishment of a map with which to guide the development of the next generation of studies with high-dimensional molecular data to explore, first, the initial steps of immune dysregulation across both the adaptive and innate arms of the immune system and, second, the translation of this autoimmune process to the CNS, where it triggers a neurodegenerative cascade.

Materials and methods

Detailed materials and methods are listed in the supplementary materials (9). In brief, we analyzed genetic data from 15 GWASs of MS. For the autosomal non-MHC genome, we applied a partitioning approach to create regions of ±1 Mbp around the most statistically significant SNP. Then, we performed stepwise conditional analyses within each region to identify statistically independent effects (n = 4842). We replicated these effects in two large-scale replication cohorts: (i) nine data sets genotyped with the MS Replication Chip and (ii) eleven data sets genotyped with the ImmunoChip. Chromosomes X and Y were analyzed jointly across all the data sets, both discovery and replication. The extended MHC region was also analyzed jointly across all data sets. We further imputed HLA class I and II alleles and corresponding amino acids. Statistically independent effects in the autosomal non-MHC genome were grouped into four categories after replication: (i) genome-wide effects (GW), (ii) suggestive effects (S), (iii) nonreplicated (NR), and (iv) no replication data (ND). Narrow-sense heritability was estimated for various combinations of these effects, and the extended MHC region, to quantify the amount of the heritability our findings could explain. Next, we leveraged enrichment methods and tissue or cell reference data sets to characterize the potential involvement of the identified MS effects in the immune and central nervous systems, at the tissue and cellular level. We developed an ensemble approach to prioritize genes putatively associated with the identified effects, leveraging cell-specific eQTL studies, network approaches, and genomic annotations. We performed pathway analyses to characterize canonical pathways statistically enriched for the putative causal genes. Last, we leveraged protein-protein interaction networks to quantify the degree of connectivity of the putative causal genes and identify new mechanisms of action.
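
The region-partitioning step described above can be pictured with a small greedy sketch in Python. This is only one plausible reading of "regions of ±1 Mbp around the most statistically significant SNP": the greedy order, the `p_threshold` stopping rule (borrowed from the P < 0.05 definition of Regions in the Fig. 4 caption), the `SNP` record, and the SNP names are all assumptions, and the stepwise conditional analysis within each region is not shown.

```python
from typing import NamedTuple


class SNP(NamedTuple):
    name: str
    chrom: str
    pos: int      # base-pair position
    p: float      # association P value


def partition_regions(snps, flank=1_000_000, p_threshold=0.05):
    """Greedily build regions of +/- `flank` bp around the most significant SNP.

    Returns a list of (lead_snp, members) tuples. Regions may overlap at their
    edges in this simplified sketch.
    """
    remaining = sorted(snps, key=lambda s: s.p)
    regions = []
    while remaining and remaining[0].p < p_threshold:
        lead = remaining[0]
        members = [
            s for s in remaining
            if s.chrom == lead.chrom and abs(s.pos - lead.pos) <= flank
        ]
        regions.append((lead, members))
        taken = set(members)
        remaining = [s for s in remaining if s not in taken]
    return regions


# Hypothetical summary statistics for illustration only.
snps = [
    SNP("snp_a", "chr1", 5_000_000, 1e-12),
    SNP("snp_b", "chr1", 5_400_000, 1e-5),
    SNP("snp_c", "chr1", 7_500_000, 1e-4),
    SNP("snp_d", "chr2", 1_000_000, 0.2),
    SNP("snp_e", "chr1", 6_900_000, 0.01),
]

regions = partition_regions(snps)
for lead, members in regions:
    print(lead.name, [s.name for s in members])
```

Here `snp_a` seeds the first region and absorbs `snp_b`, `snp_c` seeds a second region that contains `snp_e`, and `snp_d` is left unassigned because its P value does not pass the threshold.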

The International Multiple Sclerosis Genetics Consortium

Nikolaos A Patsopoulos1,2,3,4, Sergio E. Baranzini5, Adam Santaniello5, Parisa Shoostari4,6,7*, Chris Cotsapas4,6,7, Garrett Wong1,3, Ashley H. Beecham8, Tojo James9, Joseph Replogle2,3,4,10, Ioannis S. Vlachos1,3,4, Cristin McCabe4, Tune H. Pers11, Aaron Brandes4, Charles White4,10, Brendan Keenan12, Maria Cimpean10, Phoebe Winn10, Ioannis-Pavlos Panteliadis1,4, Allison Robbins10, Till F. M. Andlauer13,14,15, Onigiusz Zarzycki1,4, Bénédicte Dubois16, An Goris16, Helle Bach Søndergaard17, Finn Sellebjerg17, Per Soelberg Sorensen17, Henrik Ullum18, Lise Wegner Thørner18, Janna Saarela19, Isabelle Cournu-Rebeix20, Vincent Damotte20,21, Bertrand Fontaine20,22, Lena Guillot-Noel20, Mark Lathrop23,24,25, Sandra Vukusic26,27,28, Achim Berthele14,15, Viola Pongratz14,15, Dorothea Buck14,15, Christiane Gasperi14,15, Christiane Graetz15,29, Verena Grummel14,15, Bernhard Hemmer14,15,30, Muni Hoshi14,15, Benjamin Knier14,15, Thomas Korn14,15,30, Christina M. Lill15,31,32, Felix Luessi15,31, Mark Mühlau14,15, Frauke Zipp15,31, Efthimios Dardiotis33, Cristina Agliardi34, Antonio Amoroso35, Nadia Barizzone36, Maria D. Benedetti37,38, Luisa Bernardinelli39, Paola Cavalla40, Ferdinando Clarelli41, Giancarlo Comi41,42, Daniele Cusi43, Federica Esposito41,44, Laura Ferrè44, Daniela Galimberti45,46, Clara Guaschino41,44, Maurizio A. Leone47, Vittorio Martinelli44, Lucia Moiola44, Marco Salvetti48,49, Melissa Sorosina41, Domizia Vecchio50, Andrea Zauli41, Silvia Santoro41, Nicasio Mancini51, Miriam Zuccalà52, Julia Mescheriakova53, Cornelia van Duijn53,54, Steffan D. Bos55, Elisabeth G. Celius55,56, Anne Spurkland57, Manuel Comabella58, Xavier Montalban58, Lars Alfredsson59, Izaura L. 
Bomfim60, David Gomez-Cabrero60,61,62, Jan Hillert60, Maja Jagodic60, Magdalena Lindén60, Fredrik Piehl60, Ilijas Jelčić63,64, Roland Martin63,64, Mirela Sospedra63,64, Amie Baker65, Maria Ban66, Clive Hawkins66, Pirro Hysi67, Seema Kalra68, Fredrik Karpe68, Jyoti Khadake69, Genevieve Lachance67, Paul Molyneux67, Matthew Neville68, John Thorpe70, Elizabeth Bradshaw10, Stacy J. Caillier5, Peter Calabresi71, Bruce A. C. Cree5, Anne Cross72, Mary Davis73, Paul W. I. de Bakker2,3,4†, Silvia Delgado74, Marieme Dembele71, Keith Edwards75, Kate Fitzgerald71, Irene Y. Frohlich10, Pierre-Antoine Gourraud5,76, Jonathan L Haines77, Hakon Hakonarson78,79, Dorlan Kimbrough3,80, Noriko Isobe5,81, Ioanna Konidari8, Ellen Lathi82, Michelle H. Lee10, Taibo Li83, David An83, Andrew Zimmer83, Lohith Madireddy5, Clara P. Manrique8, Mitja Mitrovic4,6,7, Marta Olah10, Ellis Patrick10,84,85, Margaret A. Pericak-Vance8, Laura Piccio71, Cathy Schaefer86, Howard Weiner87, Kasper Lage82, ANZgene, IIBDGC, WTCCC2, Alastair Compston64, David Hafler4,88, Hanne F. Harbo54,55, Stephen L. Hauser5, Graeme Stewart89, Sandra D’Alfonso90, Georgios Hadjigeorgiou33, Bruce Taylor91, Lisa F. Barcellos92, David Booth93, Rogier Hintzen94, Ingrid Kockum9, Filippo Martinelli-Boneschi41,42, Jacob L. McCauley8, Jorge R. Oksenberg5, Annette Oturai16, Stephen Sawcer62, Adrian J. Ivinson93, Tomas Olsson9, Philip L. De Jager4,10

1Systems Biology and Computer Science Program, Ann Romney Center for Neurological Diseases, Department of Neurology, Brigham & Women’s Hospital, Boston, MA 02115, USA. 2Division of Genetics, Department of Medicine, Brigham & Women’s Hospital, Harvard Medical School, Boston, MA, USA. 3Harvard Medical School, Boston, MA 02115, USA. 4Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA, USA. 5Department of Neurology, University of California at San Francisco, Sandler Neurosciences Center, 675 Nelson Rising Lane, San Francisco, CA 94158, USA. 6Department of Neurology, Yale University School of Medicine, New Haven, CT 06520, USA. 7Department of Genetics, Yale School of Medicine, New Haven, CT 06520, USA. 8John P. Hussman Institute for Human Genomics, University of Miami, Miller School of Medicine, Miami, FL 33136, USA. 9Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden. 10Center for Translational and Computational Neuroimmunology, Multiple Sclerosis Center, Department of Neurology, Columbia University Medical Center, New York, NY, USA. 11The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, 2100, Denmark. 12Center for Sleep and Circadian Neurobiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA. 13Max Planck Institute of Psychiatry, 80804 Munich, Germany. 14Department of Neurology, Klinikum rechts der Isar, Technical University of Munich, 81675 Munich, Germany. 15German competence network for multiple sclerosis. 16KU Leuven Department of Neurosciences, Laboratory for Neuroimmunology, Herestraat 49 bus 1022, 3000 Leuven, Belgium. 17Danish Multiple Sclerosis Center, Department of Neurology, Rigshospitalet, University of Copenhagen, Section 6311, 2100 Copenhagen, Denmark. 18Department of Clinical Immunology, Rigshospitalet, University of Copenhagen, Section 2082, 2100 Copenhagen, Denmark. 
19Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland. 20ICM-UMR 1127, INSERM, Sorbonne University, Hôpital Universitaire Pitié-Salpêtrière 47 Boulevard de l’Hôpital, F-75013 Paris. 21UMR1167 Université de Lille, Inserm, CHU Lille, Institut Pasteur de Lille. 22CRM-UMR974 Department of Neurology Hôpital Universitaire Pitié-Salpêtrière 47 Boulevard de l’Hôpital F-75013 Paris. 23Commissariat à l′Energie Atomique, Institut Genomique, Centre National de Génotypage, Evry, France. 24Fondation Jean Dausset – Centre d’Etude du Polymorphisme Humain, Paris, France. 25McGill University and Genome Quebec Innovation Center, Montreal, Canada. 26Hospices Civils de Lyon, Service de Neurologie, sclérose en plaques, pathologies de la myéline et neuro-inflammation, F-69677 Bron, France. 27Observatoire Français de la Sclérose en Plaques, Centre de Recherche en Neurosciences de Lyon, INSERM 1028 et CNRS UMR 5292, F-69003 Lyon, France. 28Université de Lyon, Université Claude Bernard Lyon 1, F-69000 Lyon, France; Eugène Devic EDMUS Foundation against multiple sclerosis, F-69677 Bron, France. 29Focus Program Translational Neuroscience (FTN), Rhine Main Neuroscience Network (rmn2), Johannes Gutenberg University-Medical Center, Mainz, Germany. 30Munich Cluster for Systems Neurology (SyNergy), 81377 Munich, Germany. 31Department of Neurology, Focus Program Translational Neuroscience (FTN), and Immunology (FZI), Rhine-Main Neuroscience Network (rmn2), University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany. 32Genetic and Molecular Epidemiology Group, Institute of Neurogenetics, University of Luebeck, Luebeck, Germany. 33Neurology Department, Neurogenetics Lab, University Hospital of Larissa, Greece. 34Laboratory of Molecular Medicine and Biotechnology, Don C. Gnocchi Foundation ONLUS, IRCCS S. Maria Nascente, Milan, Italy. 35Department of Medical Sciences, Torino University, Turin, Italy. 
36Department of Health Sciences and Interdisciplinary Research Center of Autoimmune Diseases (IRCAD), University of Eastern Piedmont, Novara, Italy. 37Centro Regionale Sclerosi Multipla, Neurologia B, AOUI Verona, Italy. 38Fondazione IRCCS Cà Granda, Ospedale Maggiore Policlinico, Italy. 39Medical Research Council Biostatistics Unit, Robinson Way, Cambridge CB2 0SR, UK. 40MS Center, Department of Neuroscience, A.O. Città della Salute e della Scienza di Torino and University of Turin, Torino, Italy. 41Laboratory of Human Genetics of Neurological complex disorder, Institute of Experimental Neurology (INSPE), Division of Neuroscience, San Raffaele Scientific Institute, Via Olgettina 58, 20132, Milan, Italy. 42Department of Biomedical Sciences for Health, University of Milan, Milan, Italy. 43University of Milan, Department of Health Sciences, San Paolo Hospital and Filarete Foundation, viale Ortles 22/4, 20139 Milan, Italy. 44Department of Neurology, Institute of Experimental Neurology (INSPE), Division of Neuroscience, San Raffaele Scientific Institute, Via Olgettina 58, 20132, Milan, Italy. 45Neurology Unit, Department of Pathophysiology and Transplantation, University of Milan, Dino Ferrari Center, Milan, Italy. 46Fondazione IRCCS Ca’ Granda, Ospedale Policlinico, Milan, Italy. 47Fondazione IRCCS Casa Sollievo della Sofferenza, Unit of Neurology, San Giovanni Rotondo (FG), Italy. 48Center for Experimental Neurological Therapies, Sant’Andrea Hospital, Department of Neurosciences, Mental Health and Sensory Organs, Sapienza University, Rome, Italy. 49Istituto Neurologico Mediterraneo (INM) Neuromed, Pozzilli, Isernia, Italy. 50Department of Neurology, Ospedale Maggiore, Novara, Italy. 51Laboratory of Microbiology and Virology, University Vita-Salute San Raffaele, Hospital San Raffaele, Milan, Italy. 52Department of Health Sciences and Interdisciplinary Research Center of Autoimmune Diseases (IRCAD), University of Eastern Piedmont, Novara, Italy. 
53Department of Neurology, Erasmus MC, Rotterdam, Netherlands. 54Nuffield Department of Population Health, Big Data Institute, University of Oxford, Li Ka Shing Centre for Health Information and Discovery, Old Road Campus, Oxford OX3 7LF, UK. 55Department of Neurology, Institute of Clinical Medicine, University of Oslo, Norway. 56Department of Neurology, Oslo University Hospital, Oslo, Norway. 57Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway. 58Servei de Neurologia-Neuroimmunologia, Centre d’Esclerosi Múltiple de Catalunya (Cemcat), Institut de Recerca Vall d’Hebron (VHIR), Hospital Universitari Vall d’Hebron, Spain. 59Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden. 60Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden. 61Translational Bioinformatics Unit, NavarraBiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), IdiSNA, Pamplona, Navarra, Spain. 62Mucosal and Salivary Biology Division, King’s College, London Dental Institute, London, UK. 63Neuroimmunology and MS Research (nims), Neurology Clinic, University Hospital Zurich, Frauenklinikstrasse 26, 8091 Zurich, Switzerland. 64Department of Neuroimmunology and MS Research, Neurology Clinic, University Hospital Zürich, Frauenklinikstrasse 26, 8091 Zürich, Switzerland. 65University of Cambridge, Department of Clinical Neurosciences, Addenbrooke’s Hospital, BOX 165, Hills Road, Cambridge CB2 0QQ, UK. 66Keele University Medical School, University Hospital of North Staffordshire, Stoke-on-Trent ST4 7NY, UK. 67Department of Twin Research and Genetic Epidemiology, King’s College London, London, SE1 7EH, UK. 68NIHR Oxford Biomedical Research Centre, Diabetes and Metabolism Theme, OCDEM, Churchill Hospital, Oxford UK. 69NIHR BioResource, Box 299,University of Cambridge and Cambridge University Hospitals NHS Foundation Trust Hills Road, Cambridge CB2 0QQ, UK. 
70Department of Neurology, Peterborough City Hospital, Edith Cavell Campus, Bretton Gate, Peterborough PE3 9GZ, UK. 71Department of Neurology, Johns Hopkins University School of medicine, Baltimore MD. 72Multiple sclerosis center, Department of neurology, School of medicine, Washington University St Louis, St Louis MO. 73Center for Human Genetics Research, Vanderbilt University Medical Center, 525 Light Hall, 2215 Garland Avenue, Nashville, TN 37232, USA. 74Multiple Sclerosis Division, Department of Neurology, University of Miami, Miller School of Medicine, Miami, FL 33136, USA. 75MS Center of Northeastern NY 1205 Troy Schenectady Rd, Latham, NY 12110, USA. 76Université de Nantes, INSERM, Centre de Recherche en Transplantation et Immunologie, UMR 1064, ATIP-Avenir, Equipe 5, Nantes, France. 77Population and Quantitative Health Sciences, Department of Epidemiology and Biostatistics, Case Western Reserve University, 10900 Euclid Avenue, Cleveland, OH 44106-4945 USA. 78Center for Applied Genomics, The Children’s Hospital of Philadelphia, 3615 Civic Center Blvd., Philadelphia, PA 19104, USA. 79Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia PA, USA. 80Department of Neurology, Brigham & Women’s Hospital, Boston, 02115 MA, USA. 81Departments of Neurology and Neurological Therapeutics, Neurological Institute, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka City, Fukuoka 812-8582 Japan. 82The Elliot Lewis Center, 110 Cedar St, Wellesley MA, 02481, USA. 83Broad Institute of Harvard University and MIT, Cambridge, 02142 MA, USA. 84School of Mathematics and Statistics, University of Sydney, Sydney, NSW 2006, Australia. 85Westmead Institute for Medical Research, University of Sydney, Westmead, NSW 2145, Australia. 86Kaiser Permanente Division of Research, Oakland, CA, USA. 
87Ann Romney Center for Neurological Diseases, Department of Neurology, Brigham & Women’s Hospital, Boston, 02115 MA, USA. 88Departments of Neurology and Immunobiology, Yale University School of Medicine, New Haven, CT 06520, USA. 89Westmead Millennium Institute, University of Sydney, New South Wales, Australia. 90Department of Health Sciences and Interdisciplinary Research Center of Autoimmune Diseases (IRCAD), University of Eastern Piedmont, Novara, Italy. 91Menzies Research Institute Tasmania, University of Tasmania, Australia. 92UC Berkeley School of Public Health and Center for Computational Biology, USA. 93Westmead Millennium Institute, University of Sydney, New South Wales, Australia. 94Department of Neurology and Department of Immunology, Erasmus MC, Rotterdam, Netherlands. 95UK Dementia Research Institute, University College London, Gower Street, London WC1E 6BT, UK.

*Present address: Center for Computational Medicine, Peter Gilgan Centre for Research and Learning, Hospital for Sick Children (SickKids), Toronto, ON M5G 0A4, Canada.

†Present address: Vertex Pharmaceuticals, 50 Northern Avenue, Boston, MA 02210, USA.

Acknowledgments: We thank the Harvard Aging Brain Study (HABS; P01AG036694). We thank the Biorepository Facility and the Center for Genome Technology laboratory personnel (specifically S. West, S. Clarke, D. Martinez, and P. Whitehead) within the John P. Hussman Institute for Human Genomics at the University of Miami for centralized DNA handling and genotyping for this project. The IMSGC acknowledges W. Edgerly and L. Edgerly, J. Carlos and E. Carlos, M. Crowninshield, and W. Fowler and C. Fowler, whose enduring commitments were critical in creation of the Consortium. We thank the volunteers from the Oxford Biobank (www.oxfordbiobank.org.uk) and the Oxford National Institute for Health Research (NIHR) Bioresource for their participation. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR, or the Department of Health, German Ministry for Education and Research, German Competence Network MS (BMBF KKNMS). Funding: This investigation was supported in part by grants from the National MS Society (NMSS) to A.J.I. on behalf of the International MS Genetics Consortium (RG 4198-A-1 and AP-3758-A-16). Other grants include a Harry Weaver Neuroscience Scholar Award from the NMSS to P.L.D. (JF2138A1) as well as NIH grants RC2 GM093080 and R01AG036836. This work was also supported by a postdoctoral fellowship from the National Multiple Sclerosis Society (FG 1938-A-1) and a Career Independence Award from the National Multiple Sclerosis Society (TA 3056-A-2) to N.A.P. and National Multiple Sclerosis Society award AP3758-A-16. N.A.P. has been supported by Harvard NeuroDiscovery Center and an Intel Parallel Computing Center award, the U.S. National Multiple Sclerosis Society (grants RG 4680-A-1), and the NIH/NINDS (grant R01NS096212). T.A. 
was supported by the German Federal Ministry of Education and Research (BMBF) through the Integrated Network IntegraMent, under the auspices of the e:Med Programme (01ZX1614J), Swedish Medical Research Council; Swedish Research Council for Health, Working Life and Welfare, Knut and Alice Wallenberg Foundation, AFA insurance, Swedish Brain Foundation, and the Swedish Association for Persons with Neurological Disabilities. This study makes use of data generated as part of the Wellcome Trust Case Control Consortium 2 project (085475/B/08/Z and 085475/Z/08/Z), including data from the British 1958 Birth Cohort DNA collection (funded by the Medical Research Council grant G0000934 and the Wellcome Trust grant 068545/Z/02) and the UK National Blood Service controls (funded by the Wellcome Trust). The study was supported by the Cambridge NIHR Biomedical Research Centre, UK Medical Research Council (G1100125), and the UK MS society (861/07); NIH/NINDS: R01 NS049477, NIH/NIAID: R01 AI059829, NIH/NIEHS: R01 ES0495103; Research Council of Norway grant 196776 and 240102; NINDS/NIH R01NS088155; Oslo MS association and the Norwegian MS Registry and Biobank and the Norwegian Bone Marrow Registry; Research Council KU Leuven, Research Foundation Flanders; AFM, AFM-Généthon, CIC, ARSEP, ANR-10-INBS-01 and ANR-10-IAIHU-06; Research Council KU Leuven, Research Foundation Flanders; Inserm ATIP-Avenir Fellowship and Connect-Talents Award; German Ministry for Education and Research, German Competence Network MS (BMBF KKNMS); and Dutch MS Research Foundation. TwinsUK is funded by the Wellcome Trust, Medical Research Council, European Union, the NIHR-funded BioResource, Clinical Research Facility and Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust in partnership with King’s College London. 
The recall process was supported by the NIHR Oxford Biomedical Research Centre Programme; Italian Foundation of Multiple Sclerosis (FISM grants, Special Project “Immunochip” 2011/R/1, 2015/R/10); NMSS (RG 4680A1/1); and the MultipleMS EU project. We acknowledge the Lundbeck Foundation and Benzon Foundation for support (THP). This research was supported by grants from the Danish Multiple Sclerosis Society, the Danish Council for Strategic Research [grant 2142-08-0039], Novartis, Biogen (Denmark) A/S, the Sofus Carl Emil Friis og Hustru Olga Doris Friis Foundation, and the Foundation for Research in Neurology. The Observatoire Français de la Sclérose en Plaques (OFSEP) is supported by a grant provided by the French government and handled by the Agence Nationale de la Recherche, within the framework of the Investments for the Future program, under the reference ANR-10-COHO-002, by the Eugène Devic EDMUS Foundation against multiple sclerosis, and by the ARSEP Foundation. Competing interests: H.F.H. has received modest travel support and honoraria for advice or lecturing from Norwegian branches of Biogen Idec, Sanofi-Genzyme, Merck, Novartis, Roche, and Teva and a modest unrestricted research grant from Novartis, as well as research grants from the South Eastern Norwegian Health Authorities and the Research Council of Norway. D.H. has been a consultant or SAB for the following companies: Compass Therapeutics, EMD Serono, Genentech, Novartis Pharmaceuticals, Proclara Bioscience, Sanofi Genzyme, and Versant Venture. We believe none of these consulting activities or grants represent a conflict of interest for this work. M.S. received consulting fees or speaking honoraria from Biogen, Merck, Novartis, Sanofi, Teva, Roche. K.L. is a paid consultant and Chair of the Genetics Advisory Panel of Biogen (Cambridge, USA). K.L. is a cofounder and co-owner of Intomics (Copenhagen, DK). K.L. has been a paid consultant to Merck (Boston, USA) and GoldFinch Bio (Cambridge, USA). H.W. 
receives grant funding from the National Institutes of Health, National Multiple Sclerosis Society, Verily Life Sciences, EMD Serono, Biogen, Teva Pharmaceuticals, Sanofi, Novartis, Genentech, and Tilos Therapeutics as well as consulting fees from Genentech, Tiziana Life Sciences, IM Therapeutics, Magnolia Therapeutics, MedDay Pharmaceuticals, and vTv Therapeutics. P.d.B. owns stock in Vertex Pharmaceuticals. H.F.H. receives funds from the South Eastern Norwegian Health Authorities. Data availability: The GWASs used in the discovery phase are available in the following repositories: (i) dbGaP, phs000275.v1.p1, phs000139.v1.p1, phs000294.v1.p1, and phs000171.v1.p1; and (ii) European Genome-phenome Archive database, EGAD00000000120, EGAD00000000022, and EGAD00000000021. The MS Chip and ImmunoChip data are available from the respective EGA accession nos.: EGAS00001003216 and EGAS00001003219. The ImmVar data are available in the Gene Expression Omnibus (GEO): GSE56035. The MS PBMC data are available in GEO: GSE16214. Human Gene Atlas: http://snpsea.readthedocs.io/en/latest/data.html#geneatlas2004-gct-gz. ImmGen: http://snpsea.readthedocs.io/en/latest/data.html#immgen2012-gct-gz. The brain-related data are available in Synapse: www.synapse.org/#!Synapse:syn2580853/wiki/409844. The list of putative associated MS genes is available for public access in GeNets (https://apps.broadinstitute.org/genets). A subset of the data are available for MS-related studies only because the parent studies’ consent does not permit deposition into a repository of genetic data: (i) The ANZGENE GWAS data are available by means of request to the ANZGENE Consortium; a direct request can be made through https://msra.org.au. The request should state the purpose of the study, its relation to MS, and a list of the data being requested. 
(ii) Sharing of individual participant data was not included in the informed consent of the Rotterdam MS study because there is potential risk of revealing participants’ identities, and it is not possible to completely anonymize the data. The Rotterdam MS data are available by email to k.kreft@erasmusmc.nl. The request should state the purpose of the study, its relation to MS, and a list of the data being requested. (iii) The Rotterdam Study control data are available to interested researchers upon request. Requests can be directed to the study’s data manager Frank J. A. van Rooij (f.vanrooij@erasmusmc.nl). The following website contains more information about this cohort: www.ergo-onderzoek.nl/wp/contact. Sharing of individual participant data was not included in the informed consent of the Rotterdam MS study, and there is potential risk of revealing participants’ identities because it is not possible to completely anonymize the data. (iv) The genetic data from MS cases and controls recruited through the Kaiser Permanente Division of Research and the University of California, Berkeley, are available from the Institutional Data Access/Ethics Committee at the University of California, Berkeley, for researchers who meet the criteria for access to confidential data (contact R. Harris, rharris@berkeley.edu; please reference the manuscript title and corresponding author in your communication). Corresponding summary statistics for these three GWASs (ANZGENE, Rotterdam, and Berkeley) are available upon request.
