Recently, we proposed a classification of HLA-DRB1 alleles that reshapes the shared epitope hypothesis in rheumatoid arthritis (RA); according to this model, RA is associated with the RAA shared epitope sequence (72–74 positions) and the association is modulated by the amino acids at positions 70 and 71, resulting in six genotypes with different RA risks. This was the first model to take into account the association between the HLA-DRB1 gene and RA, and linkage data for that gene. In the present study we tested this classification for validity in an independent sample. A new sample of the same size and population (100 RA French Caucasian families) was genotyped for the HLA-DRB1 gene. The alleles were grouped as proposed in the new classification: S1 alleles for the sequences A-RAA or E-RAA; S2 for Q or D-K-RAA; S3D for D-R-RAA; S3P for Q or R-R-RAA; and X alleles for no RAA sequence. Transmission of the alleles was investigated. Genotype odds ratio (OR) calculations were performed through conditional logistic regression, and we tested the homogeneity of these ORs with those of the 100 first trio families (one case and both parents) previously reported. As previously observed, the S2 and S3P alleles were significantly over-transmitted and the S1, S3D and X alleles were under-transmitted. The latter were grouped as L alleles, resulting in the same three-allele classification. The risk hierarchy of the six derived genotypes was the same: (by decreasing OR and with L/L being the reference genotype) S2/S3P, S2/S2, S3P/S3P, S2/L and S3P/L. The homogeneity test between the ORs of the initial and the replication samples revealed no significant differences. The new classification was therefore considered validated, and both samples were pooled to provide improved estimates of RA risk genotypes from the highest (S2/S3P [OR 22.2, 95% confidence interval 9.9–49.7]) to the lowest (S3P/L [OR 4.4, 95% confidence interval 2.3–8.4]).
The pathogenesis of rheumatoid arthritis (RA) is multifactorial, involving both genetic and environmental factors. Although associations between some HLA-DRB1 alleles and RA were reported nearly three decades ago, the biological mechanism underlying this association remains unknown. The presence of the RAA sequence at positions 72–74 of the HLA-DR β-chain molecule in all HLA-DRB1 alleles known to be associated with RA led to the shared epitope (SE) hypothesis. This hypothesis received support from numerous case-control association studies in both Caucasian and non-Caucasian populations. However, studies testing the SE hypothesis have rejected this simple model, which stipulates that each SE allele confers the same risk [2-5].
Recently, Tezenas du Montcel and coworkers proposed a model of the SE component in RA. Those investigators reconsidered the SE hypothesis and generated a new classification of HLA-DRB1 alleles, based on their investigation using the MASC (Marker Association Segregation Chi Square) method, which was conducted in 100 trio families (one case and both parents) and 132 index cases from affected sibling pair families, all from the French Caucasian population. They proposed that the risk for developing RA depends on whether the RAA sequence occupies positions 72–74 and, if this is the case, on the amino acids at positions 71 and 70. For those RAA alleles, lysine (K) at position 71 conferred the highest risk, arginine (R) an intermediate risk, and alanine (A) or glutamic acid (E) the lowest risk. Glutamine (Q) or arginine (R) at position 70 conferred greater risk than did aspartic acid (D). This resulted initially in five allele groups, which were simplified to three allele groups defining six genotypes with different RA risks. This study was the first to model the HLA component in RA taking into account both association and linkage data, resulting in a reshaped SE hypothesis.
Here, we tested this classification for validity by replication in a new, independent sample of 100 French Caucasian trio families, evaluating the risk hierarchy of the proposed classification for homogeneity with that of the initial sample.
Materials and methods
Study design and study population
An association study using conditional logistic regression was performed to investigate the hierarchy of risks associated with HLA-DRB1 genotypes in an independent sample of trio families. The new independent sample (sample B), similar to that used to generate the new classification (sample A), included 100 trio families (one RA patient and both parents) of French Caucasian origin (criteria fulfilled for each of the four grandparents). DNA from all of the trio families included in samples A and B was collected between 1994 and 1998, as were initial clinical characteristics of the RA index patients. RA diagnosis met the 1987 American College of Rheumatology (formerly, the American Rheumatism Association) criteria. All individuals provided written informed consent, and the study was approved by the Hospital Bicêtre ethics committee (Kremlin-Bicêtre, Assistance Publique-Hôpitaux de Paris).
Clinical characteristics were updated in 2001 and 2002 for sample A and in 2004 for sample B. Four RA index patients in sample A and two RA index patients in sample B died between the time of DNA collection and the present study. The updated clinical characteristics of sample B were similar to those of sample A (the initial sample): 90% of RA patients in sample B were female versus 87% in sample A; the mean (± standard error) age at RA onset was 31 ± 9 years versus 32 ± 10 years; the mean (± standard error) disease duration was 16 ± 8 years versus 18 ± 7 years; erosions were present in 79% versus 90%; 76% were positive for serum rheumatoid factor versus 81%; and nodules were present in 19% versus 31%. Rheumatoid factor was considered positive when there was at least one positive rheumatoid factor finding during the course of the disease, as determined using latex fixation, Waaler Rose assay, or laser nephelometry.
Blood samples were collected for DNA extraction and genotyping. HLA-DRB1 typing was performed by the polymerase chain reaction-sequence specific primer (SSP) method, using the Dynal Classic SSP DR low resolution kit and the Dynal Classic high resolution SSP kit (Dynal Biotech, Lake Success, NY, USA) for subtyping of HLA-DRB1*01, *04, *11, *13 and *15 alleles. Exon 2 of HLA-DRB1 was sequenced for the four HLA-DRB1*04 alleles that remained ambiguous with the Dynal Classic method. HLA-DRB1 allele frequencies of control genotypes (obtained by combining untransmitted parental alleles for each family) were similar between samples and were comparable to the allele frequencies reported for the French population in the 11th Histocompatibility Workshop.
HLA-DRB1 allele classification
HLA-DRB1 alleles were divided into two groups according to the presence or absence of the RAA sequence at positions 72–74, defining S and X alleles (Table 1). The S alleles were then subdivided into three categories according to the amino acid at position 71, as follows: S1 when an alanine or a glutamic acid was present at position 71 (A-RAA or E-RAA sequences; A-RAA alleles were too infrequent not to be pooled, as described previously); S2 when a lysine was present (K-RAA sequence); and S3 when an arginine was present (R-RAA sequence). The S3 alleles were then subdivided according to the amino acid at position 70: S3D alleles encoding the D-R-RAA sequence and S3P alleles encoding the Q-R-RAA or R-R-RAA sequence. Because the S2 alleles had either Q or D at position 70, they had, in this '70-71-72/74' nomenclature, the Q-K-RAA or D-K-RAA sequence.
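The classification rules above can be written as a small decision function. The following Python sketch is illustrative only: the function names and the handling of unlisted residues (raising an error) are our assumptions, and the pooling of S1, S3D and X into L follows the three-allele simplification used later in the analysis.

```python
def classify_allele(pos70: str, pos71: str, pos72_74: str) -> str:
    """Return S1, S2, S3D, S3P or X from the residues at positions 70, 71 and 72-74.

    Hypothetical helper: residues are given as one-letter amino acid codes.
    """
    if pos72_74 != "RAA":
        return "X"                      # no RAA sequence at positions 72-74
    if pos71 in ("A", "E"):
        return "S1"                     # A-RAA or E-RAA
    if pos71 == "K":
        return "S2"                     # Q-K-RAA or D-K-RAA
    if pos71 == "R":
        if pos70 == "D":
            return "S3D"                # D-R-RAA
        if pos70 in ("Q", "R"):
            return "S3P"                # Q-R-RAA or R-R-RAA
    # Assumption: combinations not covered by the classification are rejected.
    raise ValueError(f"unclassified motif {pos70}-{pos71}-{pos72_74}")


def pool_low_risk(group: str) -> str:
    """Collapse the five allele groups into the three-allele scheme (S2, S3P, L)."""
    return group if group in ("S2", "S3P") else "L"


if __name__ == "__main__":
    for motif in [("Q", "K", "RAA"), ("D", "R", "RAA"), ("R", "R", "RAA")]:
        group = classify_allele(*motif)
        print("-".join(motif), group, pool_low_risk(group))
```

Raising an error for unlisted motifs, rather than silently assigning a group, keeps any allele outside the published scheme visible to the analyst.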
Table 1. Classification of HLA-DRB1 alleles
We first investigated transmission of the five alleles (S1, S2, S3D, S3P and X) using a χ² test with one degree of freedom for each allele. Alleles with significant over-transmission from heterozygous parents to RA patients (>50%) are linked to and associated with RA. Alleles with significant under-transmission (<50%) exhibit no RA association and could be pooled for further analysis.
Then, for each genotype i, the odds ratio $\mathrm{OR}_i$ relative to a reference genotype and its 95% confidence interval (CI) were calculated by conditional logistic regression. In this analysis, the genotypes observed for the RA patients were conditioned on the parents' genotypes [10,11]. Using a likelihood ratio test, the RA patient genotypes were compared with the pseudo-controls (i.e. the three other genotypes that could be formed by parental gametes). Given a reference genotype with baseline risk termed $\beta_0$, each coefficient $\beta_i$ (i = 1 ... n) was estimated by maximization of the log likelihood (L):
$$\ln(L) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n$$
where $X_i$ is an indicator taking value 1 for genotype i and 0 for the other genotypes, and $\beta_i = \log \mathrm{OR}_i$, with $\beta_0$ being the baseline risk for the reference genotype. Likelihood computations and estimation were performed using the program developed by Clayton. All the results were produced using STATA software (David Clayton, Cambridge, UK).
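To make the case/pseudo-control construction concrete, the sketch below builds the four genotypes that two parents can transmit and evaluates one family's contribution to the conditional log likelihood. It assumes the usual matched-set form of conditional logistic regression (case term divided by the sum over the case and its three pseudo-controls); it is not the Stata program used in the study, and all function names are hypothetical. The odds ratios in the usage example are those reported for sample B with L/L as reference, used purely for illustration.

```python
import math
from itertools import product


def genotype(a: str, b: str) -> tuple:
    """Unordered genotype represented as a sorted tuple of two allele groups."""
    return tuple(sorted((a, b)))


def case_and_pseudo_controls(father, mother, child):
    """Return the case genotype and the three other genotypes the parents could form."""
    possible = [genotype(p, m) for p, m in product(father, mother)]
    case = genotype(*child)
    if case not in possible:
        raise ValueError("child genotype is incompatible with parental genotypes")
    pseudo = list(possible)
    pseudo.remove(case)                 # drop the case once; three pseudo-controls remain
    return case, pseudo


def family_log_likelihood(case, pseudo, log_or) -> float:
    """Log of exp(beta_case) / sum of exp(beta_g) over the case and pseudo-controls."""
    denom = sum(math.exp(log_or[g]) for g in [case] + pseudo)
    return log_or[case] - math.log(denom)


if __name__ == "__main__":
    odds_ratios = {
        ("S2", "S3P"): 19.5, ("S2", "S2"): 18.0, ("S3P", "S3P"): 8.7,
        ("L", "S2"): 5.3, ("L", "S3P"): 3.1, ("L", "L"): 1.0,
    }
    log_or = {g: math.log(v) for g, v in odds_ratios.items()}
    case, pseudo = case_and_pseudo_controls(("S2", "L"), ("S3P", "L"), ("S2", "S3P"))
    print(case, pseudo, math.exp(family_log_likelihood(case, pseudo, log_or)))
```

In a full analysis the coefficients would be chosen to maximize the sum of these per-family terms over all trios; here they are fixed only to show how a single family enters the likelihood.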
In case of replication of the genotype risk hierarchy, a homogeneity test on genotypic ORs was performed between the two trio family samples. In this test, we considered that, if homogeneity was present, then $Q = -2[\ln L_{AB} - (\ln L_A + \ln L_B)]$ would follow a $\chi^2$ distribution with n degrees of freedom (n being the number of $\beta_i$ estimated). $L_A$, $L_B$ and $L_{AB}$ were the maximized likelihoods over the $\beta_i$ in sample A, sample B and pooled samples A and B, respectively.
If homogeneity between the two samples was confirmed, then the classification was considered validated, and OR (95% CI) were estimated by conditional logistic regression on the entire sample (samples A and B combined).
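The homogeneity statistic can be computed directly from the three maximized log likelihoods. The Python sketch below is a minimal illustration: the log-likelihood values are hypothetical (chosen so that Q equals 1.3, the value reported in the Results), and the decision is made by comparing Q with 11.07, the 5% critical value of a χ² distribution with five degrees of freedom.

```python
CHI2_CRIT_5DF_ALPHA_05 = 11.07  # 95th percentile of chi-square with 5 degrees of freedom


def homogeneity_q(lnl_a: float, lnl_b: float, lnl_ab: float) -> float:
    """Q = -2 [ln L_AB - (ln L_A + ln L_B)] from maximized log likelihoods."""
    return -2.0 * (lnl_ab - (lnl_a + lnl_b))


if __name__ == "__main__":
    # Hypothetical maximized log likelihoods, not values from the study.
    q = homogeneity_q(lnl_a=-100.0, lnl_b=-95.0, lnl_ab=-195.65)
    homogeneous = q < CHI2_CRIT_5DF_ALPHA_05
    print(f"Q = {q:.2f}, homogeneity not rejected: {homogeneous}")
```

Because the pooled model constrains both samples to share the same coefficients, its maximized log likelihood cannot exceed the sum of the two separate fits, so Q is non-negative.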
Test of the shared epitope allele classification in the new independent sample
We first observed significant over-transmission of S2 alleles (53 S2 alleles transmitted versus 33.5 alleles expected; P = 1.9 × 10⁻⁶) and of S3P alleles (47 S3P alleles transmitted versus 33.5 expected; P = 0.001), as was previously reported. S1, S3D and X alleles were under-transmitted: 28 S1 alleles were transmitted versus 40 expected (P = 0.007), 11 S3D alleles were transmitted versus 18 expected (P = 0.02), and 30 X alleles were transmitted versus 44 expected (P = 0.003). These three low-risk alleles (S1, S3D and X) were pooled as L alleles, as reported previously. Thus, in subsequent analyses we considered only three alleles (S2, S3P and L alleles), with six corresponding genotypes.
The conditional logistic regression analysis provided the following hierarchy of genotype risks: S2/S3P and S2/S2 genotypes were associated with greatest risk for RA, with ORs of 19.5 and 18.0, respectively; these were followed by S3P/S3P, S2/L and S3P/L genotypes, with ORs of 8.7, 5.3 and 3.1, respectively (with the reference genotype being L/L; Table 2). This hierarchy was precisely the same as observed previously.
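The transmission counts above can be re-derived with a short calculation. The sketch below assumes that the "expected" count is half of the informative transmissions from heterozygous parents, so that the one-degree-of-freedom statistic is 2(T − E)²/E for T transmitted and E expected alleles; this reading of the expected counts is our assumption, and the function name is hypothetical. Only the Python standard library is used, with the χ² tail probability for one degree of freedom obtained from the complementary error function.

```python
import math


def transmission_test(transmitted: float, expected: float):
    """Chi-square (1 df) for transmitted versus untransmitted alleles.

    Assumes the untransmitted count is 2 * expected - transmitted.
    """
    chi2 = 2.0 * (transmitted - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))   # upper tail of chi-square, 1 df
    return chi2, p_value


if __name__ == "__main__":
    # (transmitted, expected) counts as reported in the text
    counts = {"S2": (53, 33.5), "S3P": (47, 33.5), "S1": (28, 40), "S3D": (11, 18), "X": (30, 44)}
    for allele, (t, e) in counts.items():
        chi2, p = transmission_test(t, e)
        direction = "over" if t > e else "under"
        print(f"{allele}: chi2 = {chi2:.2f}, P = {p:.2g} ({direction}-transmitted)")
```

Under this assumption the computed P-values should fall close to those quoted in the text.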
Table 2. Results of the odds ratio calculation on the replication sample (sample B)
Results of the homogeneity test
The homogeneity test on genotypic ORs between the new sample and the initial one yielded a χ² of 1.3 with five degrees of freedom (P = 0.80). Because this test was not statistically significant, we considered the two samples to be homogeneous and the new classification to be valid.
Odds ratio estimation on the pooled sample of 200 trio families
Because the two samples were homogeneous, ORs were estimated, by conditional logistic regression, for the pooled sample of 200 trio families (Table 3).
Table 3. Results of the odds ratio calculation on the global sample (samples A and B combined)
In the present study we validated the classification of HLA-DRB1 SE alleles in RA proposed by Tezenas du Montcel and coworkers. This is the first study to validate a model of the HLA-DRB1 component of RA based on the SE hypothesis, with detailed investigation of the SE through the contribution of SE single amino acids to RA susceptibility, taking into account both linkage and association data. This work results in a risk genotype hierarchy, for which we provide OR estimates. The ORs were obtained exclusively from trio families, providing unbiased estimates for the sample investigated; this contrasts with estimations derived from case-control studies, for which the population matching between cases and controls can be questioned.
Further studies in other Caucasian and non-Caucasian populations are required to validate this new classification fully and investigate population-specific effects. The ORs reported here relate to relatively early onset RA, as is found in trio families. Because the mean duration of RA in both samples was long (18 years in sample A and 16 years in sample B), selection (survivor) bias would be possible even if we had considered those RA index patients who died between the time of DNA collection and the present study. Investigation of a population with common, sporadic RA is needed to assess the potential clinical relevance of this new classification. Studies with larger sample sizes would be able to narrow the 95% CIs of the ORs. In the present study non-overlapping 95% CIs were observed only between the S2/S3P highest risk genotype (OR 22.2, 95% CI 9.9–49.7) and the S3P/L lowest risk genotype (OR 4.4, 95% CI 2.3–8.4). Significant differences between the other associated genotypes remain to be established. Such differences would provide major clues that may help in deciphering the genetic component of RA, if they could be correlated with distinct pathophysiological mechanisms. It was recently reported that the SE-RA association was confined to rheumatoid factor positive patients or to anti-citrullinated peptide antibody positive RA patients. The precise relationship between the HLA risk genotypes and rheumatoid factor or anti-citrullinated peptide antibodies should therefore also be determined. The interaction between HLA-DRB1 genotypes and any new RA gene established by association and linkage, such as PTPN22 [15,16], could be investigated taking this new classification into account. Ultimately, this could help in identifying other RA genetic factors that may specifically interact with only one of the HLA-DRB1 genotypes. Several previous studies indicated that other genes within HLA, such as the HLA class III region, probably contribute to RA risk [17,18]. The search for interactions between additional HLA class III genetic variants, not considered in the present study, and HLA-DRB1 genotypes taking this new classification into account would be of great interest.
Large sample size studies could refine the classification for infrequent alleles. In the present study we were unable to examine rigorously the amino acid at position 71 or at position 70, particularly for the S1 allele group, in which small sample size prevented study of the role played by the different alleles encoding the D-E-RAA motif. This D-E-RAA motif has been reported to be protective in the literature and constitutes an alternative SE hypothesis, although we obtained no support for it during our initial study. The different S2 sequences Q-K-RAA (*0401) and D-K-RAA (*1303) should be evaluated separately, because the presence of an aspartic acid at position 70 has been reported to influence susceptibility to RA. Similarly, the S3P sequences Q-R-RAA (*0101, *0102, *0404, *0405, *0408) and R-R-RAA (*1001) should be differentiated. The investigation of other amino acid positions from the third hypervariable region of the HLA-DR β-chain would be interesting, especially for positions 67 (for which the presence of an isoleucine might be important) and 86 (as proposed by Gao and coworkers).
Because numerous association studies have suggested that the primary role played by the SE might lie in the development of severe RA, the relevance of this classification should be evaluated for RA prognosis in prospective cohorts. A first investigation with the new classification already provides some support for a correlation with progression of radiographic damage. Indeed, it would be of great help to be able to identify those RA patients at risk for development of more severe disease, who may require more aggressive therapeutic management than patients with better prognosis.
In the present study we validated a first model of the effect of HLA-DRB1 on RA, reshaping the SE hypothesis and providing initial estimates for the resulting risk genotypes. Building on this new HLA genotype classification could lead to improvement in our understanding of the genetics, pathophysiology and potential clinical use in management of RA.
CI = confidence interval; OR = odds ratio; RA = rheumatoid arthritis; SE = shared epitope.
The authors declare that they have no competing interests.
LM, PC, EP-T, STM, FC-D and FC designed the study. LM, EP-T, IL, CP, JO, WF, SL, PQ, TB and FC acquired the data. LM, PC, EP-T, STM, BP, FCD and FC analyzed and interpreted the data. All authors read and approved the final manuscript.
The authors are grateful to the RA family members for their participation; Dr Pierre Fritz for reviewing the clinical data; and Dr J F Prudhomme, Dr C Bouchier, Pr J Weissenbach (Généthon), Mrs MF Legrand and Pr G Thomas (Fondation Jean-Dausset-CEPH) for technical help with the DNA samples.
The European Consortium on Rheumatoid Arthritis Families (ECRAF) was initiated with funding from the European Commission (BIOMED2) by T Bardin, D Charron, F Cornélis (coordinator), S Fauré, D Kuntz, M Martinez, JF Prudhomme and J Weissenbach (France); R Westhovens and J Dequeker (Belgium); A Balsa and D Pascuale-Salcedo (Spain); M Spyropoulou and C Stavropoulos (Greece); P Migliorini and S Bombardieri (Italy); P Barrera and L Van de Putte (The Netherlands); and H Alves and A Lopez-Vaz (Portugal).
This work was supported by Association Française des Polyarthritiques, Société Française de Rhumatologie, Association Rhumatisme et Travail, Association Polyarctique, Groupe Taitbout, Académie de Médecine, Association de Recherche sur la Polyarthrite, Genopole, Conseil Régional Ile de France, Fondation pour la Recherche Médicale, Université Evry-Val d'Essonne and unrestricted institutional support from Wyeth, Schering-Plough, Pfizer and Amgen.
Arthritis Rheum 1987, 30:1205-1213.
Am J Hum Genet 1996, 58:371-383.
Génin E, Babron MC, McDermott MF, Mulcahy B, Waldron-Lynch F, Adams C, Clegg DO, Ward RH, Shanahan F, Molloy MG, et al.: Modelling the major histocompatibility complex susceptibility to RA using the MASC method.
Tezenas du Montcel S, Michou L, Petit-Teixeira E, Osorio J, Lemaire I, Lasbleiz S, Pierlot C, Quillet P, Bardin T, Prum B, et al.: New classification of HLA-DRB1 alleles supports the shared epitope hypothesis of rheumatoid arthritis susceptibility.
Ann Hum Genet 1988, 52:247-258.
Arnett FC, Edworthy SM, Bloch DA, McShane DJ, Fries JF, Cooper NS, Healey LA, Kaplan SR, Liang MH, Luthra HS, et al.: The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis.
Arthritis Rheum 1988, 31:315-324.
Biometrics 1991, 47:53-61.
Cordell HJ, Clayton DG: A unified stepwise regression procedure for evaluating the relative effects of polymorphisms within a gene using case/control or family data: application to HLA in type 1 diabetes.
Stata software by Clayton D [http://www-gene.cimr.cam.ac.uk/clayton/software/stata]
Klareskog L, Stolt P, Lundberg K, Kallberg H, Bengtsson C, Grunewald J, Ronnelig J, Harris HE, Ulfgren AK, Rantapaa-Dahlqvist S, et al.: A new model for an etiology of rheumatoid arthritis: smoking may trigger HLA-DR (shared epitope)-restricted immune reactions to autoantigens modified by citrullination.
Begovich AB, Carlton VE, Honigberg LA, Schrodi SJ, Chokkalingam AP, Alexander HC, Ardlie KG, Huang Q, Smith AM, Spoerke JM, et al.: A missense single-nucleotide polymorphism in a gene encoding a protein tyrosine phosphatase (PTPN22) is associated with rheumatoid arthritis.
Dieudé P, Garnier S, Michou L, Petit-Teixeira E, Glikmans E, Pierlot C, Lasbleiz S, Bardin T, Prum B, Cornélis F: Rheumatoid arthritis seropositive for the rheumatoid factor is linked to the protein tyrosine phosphatase nonreceptor 22–620 W allele.
Jawaheer D, Li W, Graham RR, Chen W, Damle A, Xiao X, Monteiro J, Khalili H, Lee A, Lundsten R, et al.: Dissecting the genetic complexity of the association between human leukocyte antigen and rheumatoid arthritis.
J Rheumatol 2001, 28:232-239.
Arthritis Rheum 1990, 33:939-941.
Gourraud PA, Boyer JF, Barnetche T, Abbal M, Cambon-Thomsen A, Cantagrel A, Constantin A: A new classification of HLA-DRB1 alleles differentiates predisposing and protective alleles for rheumatoid arthritis structural severity.