AIRE is a transcriptional regulator playing a functional role in thymocyte education and negative selection by controlling the expression of peripheral antigens in the thymus. Recently, the AIRE gene was identified as a genetic risk factor for rheumatoid arthritis (RA) in genome wide association (GWA) studies performed in the Japanese population. According to the available data this association is restricted to the Asian population. However, different facts could influence the lack of association in Caucasian populations. The aim of this study was to further investigate the possible role of the AIRE gene in susceptibility to RA in a Caucasian population.
A total of 472 Spanish Caucasian RA patients and 475 ethnically matched controls were included in the study. Three single-nucleotide polymorphisms (SNPs) (rs2776377, rs878081 and rs1055311) with a minor allele frequency >0.05 in the Caucasian population which were not included in the high-throughput platforms used in the GWA studies performed in susceptibility to RA, and two SNPs (rs2075876 and rs1800520) associated with RA in the Japanese population, were selected and genotyped using TaqMan assays.
No significant differences in the distribution of the alleles of rs2776377, rs2075876, rs1055311 and rs1800520 SNPs between RA patients and controls were observed. Nevertheless, the frequency of the C allele of rs878081 was significantly higher among RA patients (80.5% vs. 74.6% in the control group, pc = 0.012, OR = 1.41, 95%CI 1.13-1.75). Regarding the distribution of the rs878081 genotypes, a higher frequency of CC homozygous individuals was found in the RA patient group (65.56% vs. 56.47% in the control group, pc = 0.013, OR = 1.47, 95%CI 1.12-1.93). The in silico analysis predicted lower affinity to the binding-site of a motif of the transcription NF-κB family and lower transcription levels of AIRE gene for the rs878081C risk variant
Our findings suggest that the AIRE gene is associated with susceptibility to RA in the Spanish population. Probably, this association has not been detected in the European population in the GWA studies because the earliest high-throughput platforms did not include SNP suitable markers (e.g. rs878081).
Rheumatoid arthritis (RA) is a chronic systemic inflammatory disease that often leads to disability from joint damage and inflammation. Although RA is an uncommon disease with a worldwide prevalence of approximately 1%, this pathology has a large economic and societal cost in terms of work-related disability . Both environmental and genetic factors are considered to be associated with the onset and progression of this disease. Genome-wide associations (GWA) studies have identified common genetic variations associated with numerous complex diseases . Contrary to the candidate gene approach, in which a limited number of genes chosen on the basis of known or suspected biological considerations are tested, the aim of GWA studies is to check association in the whole genome without a priori hypotheses. Many gene loci have been identified as risk factors for RA in different GWA studies in European and East Asian populations. Some of these loci have been found to be restricted to a particular ethnic group but others, such as, CCR6, STAT4 and TNFAIP3, have been described as associated with RA in different populations . Recently, the AIRE gene was identified as a genetic risk factor for RA in a GWA study performed in a Japanese population . AIRE is a transcriptional regulator primarily expressed in medullary thymic epithelial cells, playing a functional role in thymocyte education and negative selection by controlling the expression of peripheral antigens in the thymus . Therefore, AIRE is a good functional candidate in autoimmune diseases regardless of the population. In fact, mutations in this gene cause autoimmune polyendocrinopathy syndrome (APS1), which is one of the few known monogenic autoimmune diseases. Nevertheless, AIRE has not been identified as associated to RA in the European population, either in a large-scale GWA study or in a meta-analysis of GWA studies [6-10]. However, both of these studies had strong detection power, and therefore, the association of AIRE with RA, like that of PAD14, could be specific to some populations, such as in the Japanese study . However, this gene has different linkage disequilibrium (LD) blocks in European and Asian populations (Figure 1), and the earliest GWA high-throughput platforms do not include any adequate tag single-nucleotide polymorphisms (SNPs) for the European population. This fact could influence the lack of association in Caucasian populations, therefore, we decided to further investigate the possible role of the AIRE gene in susceptibility to RA in a Spanish population.
Figure 1. Haplotype blocks described in the region Chr21:44529072..44541978 of the HapMap Project in the CEU and the JPT populations. The linkage disequilibrium (LD) solid spine approach was used to define haplotype blocks. Standard color-coding was used for LD plots: white (r2 = 0), shaded gray (0 <r2 < 1), black (r2 = 1). Squares without a number indicate D' = 1. CEU, Utah residents with Northern and Western European ancestry from the CEPH collection; JPT, Japanese in Tokyo, Japan.
Materials and methods
A total of 472 RA patients (351 women and 121 men) who were unrelated were included in the study. All patients meet the American College of Rheumatology (ACR) revised RA criteria . The mean (SD) age of onset was 49.23 (14.8) years. Information on rheumatoid factor, anti-cyclic citrullinated peptide antibodies and Disease Activity Score in 28 joints (DAS28) for RA activity was obtained. Patients were rheumatoid factor-positive in 85.2% of the cases and anti-cyclic citrullinated peptide antibodies were present in 82.2% of them. The control population consisted of 475 ethnically matched healthy bone marrow donors who were unrelated. All the subjects were Spanish Caucasian, and they were recruited from two Southern Spanish hospitals: Hospital Universitario Virgen del Rocío (Sevilla) and Hospital Universitario Virgen de las Nieves (Granada). No significant differences in clinical features, ratio of rheumatoid factor and anti-cyclic citrullinated peptide antibodies and DAS28 score were observed between the two patient cohorts. The study was approved by all local ethical committees of the corresponding hospitals. Blood samples were obtained from subjects after they provided written informed consent, and were sent to Hospital Universitario Virgen del Rocio where the genotyping study was performed. Genomic DNA was extracted from blood leukocytes using QIAmp DNA Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's recommendations and stored at -20ºC.
AIRE SNPs genotyping
An AIRE region (Chr21:44529072..44541978) [NCBI Reference Sequence: NC_000021.7] LD plot was generated from the Utah residents with Northern and Western European ancestry from the CEPH collection (CEU) with data obtained from HapMap (Release 28, Phase II+III, NCBI build 36 assembly, dbSNP b126). The LD solid spine approach was used to define haplotype blocks in the CEU population using the Haploview program (available at the website ) (Figure 1). The rs2075876, which is associated with RA in the Japanese population, was located in the LD block Chr21:44529072...44534480 encompassing a 5.4 Kb region between 5'UTR and exon 6 in the CEU population (Figure 1, CEU-block 1). A total of seven SNPs were found in the HapMap project within this block, four of them have a minor allele frequency (MAF) > 0.05 (rs2776377, rs878081, rs2075876 and rs1055311) and were the ones selected for this study (Table 1). The SNP haplotypes frequency estimation was performed using the Haploview sotfware. Additionally, rs1800520, a non-synonymous SNP (S278R), which is not included in HapMap project, was also genotyped. Genotyping of controls and RA patients was performed using TaqMan® SNP Genotyping Assays (Applied Biosystems, Barcelona, Spain) in a LightCycler 480 (Roche, Barcelona, Spain).
Table 1. MAF of the tag SNPs included in HapMap within the Chr21:44529072
Bioinformatic analysis of AIRE expression
To check the effect of the variants in the AIRE gene expression, two different approaches were used. First, the effects of the variants on regulatory motifs were investigated using the HaploReg database  and quantified as: Predicted relative affinity = LOD(alt) - LOD(ref) , where LOD is the logarithm of the odds, ref is the reference sequence, and alt is the alternative sequence. Therefore, a negative result means that the predicted relative affinity is higher for the reference sequence whereas a positive result states that the predicted relative affinity is higher for the alternative. Second, in order to analyze the predicted expression levels, a gene-expression dataset of lymphoblastoid cell lines derived from 210 unrelated individuals from different populations of HapMap was obtained from the Gene Expression Omnibus [GEO, accession number GSE6536] database .
Allele and haplotype frequency distributions were compared using the chi square test, and a corrected P-value (Pc) was calculated from 10,000 permutations (Haploview program). Pc-values < 0.05 were considered statistically significant. Odds ratios (ORs) and 95% CI were calculated according to Wolf's method using the Statcalc program (Epi Info 2002; Centers for Disease Control and Prevention, Atlanta, GA, USA). Statistical analysis of mRNA AIRE expression was performed by the Joncheere-Terepstra method using the SPSS sotfware (version 18).
Genotypes of rs2776377, rs878081, rs2075876, rs1055311 and rs1800520 were unequivocally assigned in 471 healthy controls and 465 RA patients. The successful rate of genotyping was > 98% for all the SNPs included, and the study population was found to be in Hardy-Weinberg equilibrium for all the polymorphisms analyzed (P > 0.05). Table 2 shows the frequencies of the four tag SNPs and the rs1800520 in RA patients and healthy controls. No significant differences in the distribution of the alleles of the tag SNPs, rs2776377, rs2075876 and rs1055311 were observed between RA patients and controls. Nevertheless, the frequency of the C allele of the rs878081 SNP was significantly higher among RA patients (80.5 vs. 74.6% in the control group, Pc = 0.012, OR 1.41, 95% CI 1.13, 1.75) and regarding the distribution of the rs878081 genotypes (Table 3), a higher frequency of CC homozygous individuals was found in the RA patient group (65.56% vs. 56.47% in the control group, Pc = 0.013, OR 1.47, 95% CI 1.12, 1.93). Alleles of the non-synonymous rs1800520 had the same frequency among patients and controls.
Table 2. Allelic frequencies of the SNPs studied in the AIRE in RA Spanish patients and controls
Table 3. Genotype frequencies of the SNP rs878081 in RA Spanish patients and controls
In the haplotype analysis, taking into account rs2776377, rs878081, rs2075876 and rs1055311, six haplotypes with frequencies > 0.05 were identified in our population (Table 4). According to our results, the four haplotypes bearing the rs878081C allele showed a higher frequency in RA patients compared with controls, although only one of them (GCGC) reached statistical significance before permutation testing (P = 0.018, OR 1.38, 95% CI 1.05, 1.82). On the contrary, the two haplotypes bearing the rs878081T alleles had a lower frequency in the RA patients and one of them (GTGC) remained statistically significant after permutation testing (Pc = 0.014, OR 0.67, 95% CI 0.52, 0.87). All the major haplotypes found in our population had the rs1800520C allele except the haplotype GCAC, which was found with both rs1800520 alleles (C 84% and G 16%).
Table 4. Allelic frequencies of the major haplotypes found in RA patients and controls
Case-only phenotype analysis of RA patients revealed no association between rs878081 alleles or haplotypes and mean age at diagnosis, rheumatoid factor, anti-cyclic citrullinated peptide antibodies or DAS28 score (data not shown).
The in silico study using the HaploReg database , predicts a motif binding site that spans the rs878081 location for transcription factors of the nuclear factor-kappa B (NF-κB family  The difference between the LOD score of the T allele and the LOD score of the C allele was +2 (positive), therefore, this model predicts alterations of the affinity to regulatory motifs, which is higher for the variant T than for the C allele. Additionally, when the AIRE expression levels were analyzed using the expression profiles from the GEO database , a statistically significant difference in the mean levels of expression of the rs878081 alleles was found (P = 0.013). The transcription of AIRE was decreased by the C allele compared with the T allele (Figure 2).
Figure 2. Comparison of expression levels of AIRE obtained from the GEO database according to the genotype of rs878081. Results of AIRE mRNA expression are shown in a boxplot. The P-value was obtained by the Joncheere-Terepstra method.
Replication studies in different populations have an important role in the validation of common genetic variation associated to complex diseases, strengthening the confidence of initial association reports. In the present work in a Spanish population we have replicated the association between the AIRE gene and RA that was recently identified in the Japanese population .
In the GWA study performed in the Japanese population, two SNPs located in the AIRE gene region, rs2075876 and rs760426, were described to be associated with RA. Nevertheless, association with this gene had not been found in European populations in previous GWA studies [6-10]. Failure to identify the association in European populations could be explained by several reasons. Some of these reasons may be based on differences between European and Asian populations in the polymorphism of this gene. According to HapMap data, the region Chr21:44529072..44541978 included 18 tag SNPs, of which 15 have an MAF > 0.20 in Japanese in Tokyo, Japan (JPT) vs. 10 in CEU (Table 1). Additionally, four of these tag SNPs with an MAF > 0.20 in JPT have an MAF < 0.05 in CEU, whereas in contrast, MAF > 0.20 in CEU and MAF < 0.05 is found only in one case. Specifically, the SNPs identified as risk markers in the Japanese population, have a much lower MAF in the Caucasian populations included in the HapMap project (rs2075876A 0.123 in CEU and 0.096 in Tuscan in Italy (TSI) and rs760426G in 0.159 CEU and 0.163 in TSI) than in the JPT (0.347 and 0.359, respectively) and other Asian populations (0.437 and 0.389 in Han Chinese in Beijing, China (CHB) and 0.417 and 0.413 in Chinese in Metropolitan Denver, Colorado (CHD) respectively) included in this project. Furthermore, the LD blocks in the AIRE region locus are different in Caucasian and Asian populations (Figure 1). The AIRE region is not included in GWA high-throughput platforms of Affimetrix® used in several RA susceptibility studies . Additionally, the earliest GWA high-throughput platforms of Illumina® used in RA susceptibility studies do not include any tag SNPs with an MAF higher than 0.2 for the European population within the LD block where the rs2075876 is located (the LD block between 5'UTR and exon 6 of the AIRE region) [7,9]. For these reasons, we focused our study on the LD block encompassing a 5.4 Kb around of 5' AIRE region locus. The rs878081 SNP was found to be associated to RA in our population, even with a higher OR (1.41) than the repoted in the Japanese population for the rs2075876 (OR = 1.18). Therefore, according to our results, some genetic variants with associations to disease that have been described to be restricted to a specific ethnic group could lack this association in other populations, because the tag SNPs of some genome regions included in GWA platforms are more or less suitable according to the population. This fact could also explain why some associations found in candidate gene studies may not be replicated in GWA studies, due to suboptimal representation and coverage of the risk variants with the tag SNPs used in the high-throughput platforms [17,18].
The association study performed with haplotypes showed that the susceptibility is strongly influenced by the rs878081 SNP, because all those haplotypes bearing the rs878081C allele showed a higher frequency in RA patients compared with controls. Despite the different distribution of haplotypes among European and Asian populations, the risk allele rs2075876A in the Japanese population is in LD with the risk allele rs878081C in the Spanish population. The rs878081 is located in exon 5 but it is a synonymous polymorphism, and therefore, it does not introduce functional alterations in the AIRE protein; hence, other non-synonymous SNPs could contribute to the disease. The HaploReg database predicts a lower affinity of the factor-binding site to the motif of transcription of the NF-κB family for the rs878081C risk allele [14,16]. Additionally, results of the in silico analysis of the dataset of lymphoblastoid cell lines of GEO showed that similarl to the rs2075876A risk allele , transcription levels of AIRE are lower for the rs878081C risk allele . It has been demonstrated that NF-κB2 is required for the transcriptional regulation of the AIRE gene in the thymus, contributing to the maintenance of central tolerance in an AIRE-dependent manner . According to our results, the lower affinity to the C allele for the transcription factor motif could be the cause of the lower transcription levels found in the in silico analysis. These lower levels of transcription of the gene could promote failure of negative selection in the thymus, and as a consequence, increase the survival of autoreactive T cell clones.
We also checked the association of the non-synonymous SNP, rs1800520 (S278R) located in the exon 7, which was found to be associated to RA in the Japanese population. Nevertheless, no differences in allele frequencies of this SNP were found when comparing patients and controls in our population. The MAF of the rs1800520 in the Spanish population was 0.017 vs. 0.420 in the Japanese population . This is probably the cause of the discrepancies observed in the association with rs1800520 in both populations. Finally, in our study we did not include the rs760426, which was also found to be associated with RA in the Japanese population, because the MAF of this SNP in the CEU population is < 0.2 and it is located in another LD block.
In summary, this is the first study establishing a relationship between the AIRE gene and the susceptibility to RA in a European population. This relationship could not have been detected in the GWA studies because of the differences in both the frequency of SNPs and in the structure of the linkage blocks between the different ethnic groups that would determine better or worse adequacy of the SNP markers included in the platforms.
ACR: American College of Rheumatology; APS1: autoimmune polyendocrinopathy syndrome; CEU: Utah residents with Northern and Western European ancestry from the CEPH collection; CHB: Han Chinese in Beijing, China; CHD: Chinese in Metropolitan Denver, Colorado; DAS28: Disease Activity Score in 28 joints; GEO: Gene Expression Omnibus; GWA: genome-wide association; JPT: Japanese in Tokyo, Japan; LD: linkage disequilibrium; MAF: minor allele frequency; OR: odds ratio; RA: arthritis rheumatoid; SNP: single-nucleotide polymorphism; TSI: Tuscan in Italy.
The authors declare that they have no competing interests.
JRGL, MAMC, ANR and MFGE contributed to the conception and design of the research, analysis and interpretation of the data, and drafting and revision of the manuscript. BTA, LOF and MCA performed the experiments and analyzed the data. MT, AG and JM contributed samples and patient information. All authors read and approved the final version.
This work was supported by Fondo de Investigaciones Sanitarias (FIS 10/1701), Fondos FEDER, Plan Andaluz de Investigación (PAI CTS-0197) and Consejería de Salud de la Junta de Andalucía (Exp. 0260/08).
Terao C, Yamada R, Ohmura K, Takahashi M, Kawaguchi T, Kochi Y, Human Disease Genomics Working Group; RA Clinical and Genetic Study Consortium, Okada Y, Nakamura Y, Yamamoto K, Melchers I, Lathrop M, Mimori T, Matsuda F: The human AIRE gene at chromosome 21q22 is a genetic determinant for the predisposition to rheumatoid arthritis in Japanese population.
Anderson MS, Venanzi ES, Klein L, Chen Z, Berzins SP, Turley SJ, von Boehmer H, Bronson R, Dierich A, Benoist C, Mathis D: Projection of an immunological self shadow within the thymus by the aire protein.
Plenge RM, Seielstad M, Padyukov L, Lee AT, Remmers EF, Ding B, Liew A, Khalili H, Chandrasekaran A, Davies LR, Li W, Tan AK, Bonnard C, Ong RT, Thalamuthu A, Pettersson S, Liu C, Tian C, Chen WV, Carulli JP, Beckman EM, Altshuler D, Alfredsson L, Criswell LA, Amos CI, Seldin MF, Kastner DL, Klareskog L, Gregersen PK: TRAF1-C5 as a risk locus for rheumatoid arthritis. A genomewide study.
Raychaudhuri S, Remmers EF, Lee AT, Hackett R, Guiducci C, Burtt NP, Gianniny L, Korman BD, Padyukov L, Kurreeman FA, Chang M, Catanese JJ, Ding B, Wong S, van der Helm-van Mil AH, Neale BM, Coblyn J, Cui J, Tak PP, Wolbink GJ, Crusius JB, van der Horst-Bruinsma IE, Criswell LA, Amos CI, Seldin MF, Kastner DL, Ardlie KG, Alfredsson L, Costenbader KH, Altshuler D, et al.: Common variants at CD40 and other loci confer risk of rheumatoid arthritis.
Gregersen PK, Amos CI, Lee AT, Lu Y, Remmers EF, Kastner DL, Seldin MF, Criswell LA, Plenge RM, Holers VM, Mikuls TR, Sokka T, Moreland LW, Bridges SL Jr, Xie G, Begovich AB, Siminovitch KA: REL, encoding a member of the NF-kappaB family of transcription factors, is a newly defined risk locus for rheumatoid arthritis.
Stahl EA, Raychaudhuri S, Remmers EF, Xie G, Eyre S, Thomson BP, Li Y, Kurreeman FA, Zhernakova A, Hinks A, Guiducci C, Chen R, Alfredsson L, Amos CI, Ardlie KG, BIRAC Consortium, Barton A, Bowes J, Brouwer E, Burtt NP, Catanese JJ, Coblyn J, Coenen MJ, Costenbader KH, Criswell LA, Crusius JB, Cui J, de Bakker PI, De Jager PL, Ding B, Emery P, et al.: Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci.
Arnett FC, Edworthy SM, Bloch DA, McShane DJ, Fries JF, Cooper NS, Healey LA, Kaplan SR, Liang MH, Luthra HS, Medsger TA Jr, Mitchell DM, Neustadt DH, Pinals RS, Schaller JG, Sharp JT, Wilder RL, Hunder GG: The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis.
Haploview program [http://www.broad.mit.edu/mpg/haploview/download.php] webcite
Stranger BE, Forrest MS, Dunning M, Ingle CE, Beazley C, Thorne N, Redon R, Bird CP, de Grassi A, Lee C, Tyler-Smith C, Carter N, Scherer SW, Tavaré S, Deloukas P, Hurles ME, Dermitzakis ET: Relative impact of nucleotide and copy number variation on gene expression phenotypes.
Wong D, Teixeira A, Oikonomopoulos S, Humburg P, Lone IN, Saliba D, Siggers T, Bulyk M, Angelov D, Dimitrov S, Udalova IA, Ragoussis J: Extensive characterization of NF-κB binding uncovers non-canonical motifs and advances the interpretation of genetic functional traits.