To provide a systematic overview of expression profiles, to determine a ranking score, and to give insight into the heterogeneity of individual patients within the group of arthritis patients.
Synovial tissue specimens from 10 rheumatoid arthritis (RA), 10 osteoarthritis and 10 normal donors were subjected to GeneChip HG-U133A expression profiling according to the standard protocol, starting with 5 μg total RNA and using 15 μg cRNA for hybridization. Signals were generated with the GeneChip Operating System and scaled to equal intensities of the whole array. Further analysis included t test and analysis of variance (ANOVA) statistics as well as functional profile component analysis. Classification was performed according to the Prediction Analysis for Microarrays algorithm. Systematic multiple testing with statistics and classification tools was programmed in Perl using subgroups of patients and subgroups of genes.
To characterize the homogeneity of each group, ANOVA and t test statistics were applied using 'leave one out' and 'leave two out' for candidate selection. Subsequently, these one or two donors were tested for the predictive value of the selected candidates. This revealed that one RA donor, if not participating in the selection process, grouped with osteoarthritis. Analysis for functional profile components showed less infiltration and less inflammation in this donor. However, if this RA donor contributed to the candidate selection, all RA patients were correctly classified. Furthermore, donors of the other groups were also classified without error. This demonstrates that RA with reduced molecular markers of inflammation can still be separated from osteoarthritis and that incorporation of such RA patients in the selection process of candidate genes is mandatory for correct classification.
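The 'leave one out' procedure described above can be illustrated with a minimal sketch. It is not the original Perl implementation: Python is used here for illustration, the data are synthetic, and a plain nearest-centroid rule stands in for the Prediction Analysis for Microarrays algorithm (which uses shrunken centroids). The function names, the number of genes, the cut-off `top_n` and the effect size are all assumptions made for this example.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic for two lists of expression values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def select_genes(data, labels, top_n):
    """Return the indices of the top_n genes ranked by |t| between two classes."""
    first = sorted(set(labels))[0]
    scores = []
    for g in range(len(data[0])):
        a = [row[g] for row, lab in zip(data, labels) if lab == first]
        b = [row[g] for row, lab in zip(data, labels) if lab != first]
        scores.append((abs(welch_t(a, b)), g))
    return [g for _, g in sorted(scores, reverse=True)[:top_n]]


def nearest_centroid(train, train_labels, sample, genes):
    """Assign the sample to the class whose mean profile is closest (simplified stand-in for PAM)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_labels)):
        rows = [row for row, lab in zip(train, train_labels) if lab == label]
        dist = sum((sample[g] - mean([r[g] for r in rows])) ** 2 for g in genes)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(data, labels, top_n=5):
    """Select genes without donor i, then predict donor i; repeat for every donor."""
    predictions = []
    for i in range(len(data)):
        train = data[:i] + data[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        genes = select_genes(train, train_labels, top_n)
        predictions.append(nearest_centroid(train, train_labels, data[i], genes))
    return predictions


# Synthetic example (assumption): 10 RA and 10 OA donors, 20 genes,
# of which the first 5 are shifted upward in RA.
rng = random.Random(0)
labels = ["RA"] * 10 + ["OA"] * 10
data = [
    [rng.gauss(5.0 if lab == "RA" and g < 5 else 0.0, 0.5) for g in range(20)]
    for lab in labels
]
predictions = leave_one_out(data, labels)
misclassified = [i for i, (p, t) in enumerate(zip(predictions, labels)) if p != t]
```

A donor that appears in `misclassified` would correspond to the situation reported above, where a donor excluded from candidate selection is assigned to the wrong group. A 'leave two out' variant would hold out pairs of donors instead of single donors.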
To characterize the importance of each gene for classification, candidates were ranked according to their significance level in t testing. Multiple subgroups were systematically tested and the ranking of genes was compared. Using an averaged rank list, gene sets were expanded stepwise and systematically tested for classification potential. In addition, the contribution of each gene to the correct classification was assigned to each donor. Altogether, this information can be visualized in a gene- and donor-specific way, including annotation of significance, classification and proportionate contribution to classification.
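The averaged rank list and the stepwise expansion of gene sets can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the original Perl code: subgroups are formed here by leaving out one donor at a time, genes are ranked by the absolute Welch t statistic, and the data are synthetic. All function names and parameters are hypothetical.

```python
import math
import random
from statistics import mean, variance


def abs_t(a, b):
    """Absolute Welch t statistic for two lists of expression values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_genes(data, labels):
    """Return {gene_index: rank}; rank 1 is the gene with the largest |t|."""
    first = sorted(set(labels))[0]
    scores = []
    for g in range(len(data[0])):
        a = [row[g] for row, lab in zip(data, labels) if lab == first]
        b = [row[g] for row, lab in zip(data, labels) if lab != first]
        scores.append((abs_t(a, b), g))
    ordered = sorted(scores, reverse=True)
    return {g: rank for rank, (_, g) in enumerate(ordered, start=1)}


def averaged_rank_list(data, labels):
    """Average each gene's rank over all leave-one-out donor subgroups."""
    n_genes = len(data[0])
    totals = [0.0] * n_genes
    for i in range(len(data)):
        ranks = rank_genes(data[:i] + data[i + 1:], labels[:i] + labels[i + 1:])
        for g, r in ranks.items():
            totals[g] += r
    mean_rank = [t / len(data) for t in totals]
    return sorted(range(n_genes), key=lambda g: mean_rank[g])


def expanding_gene_sets(ranked, step=1):
    """Build gene sets of growing size from the top of the averaged rank list."""
    return [ranked[:k] for k in range(step, len(ranked) + 1, step)]


# Synthetic example (assumption): 12 genes, the first 3 differ between groups.
rng = random.Random(1)
labels = ["RA"] * 10 + ["OA"] * 10
data = [
    [rng.gauss(4.0 if lab == "RA" and g < 3 else 0.0, 0.5) for g in range(12)]
    for lab in labels
]
averaged = averaged_rank_list(data, labels)
gene_sets = expanding_gene_sets(averaged)
```

Each entry of `gene_sets` could then be passed to a classifier to test how classification potential changes as lower-ranked genes are added.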
Systematic multiple testing of gene expression profiles provides a precise overview of the quality of array data. It allows ranking of gene candidates, provides insight into the patient-specific contribution to classification, and thus enables an individualized interpretation of gene expression data.