Research Article - (2016) Volume 7, Issue 2
Keywords: Indels; Coronary heart disease; ANNOVAR; CVD treatment
Cardiovascular diseases (CVD) remain the leading cause of death among Europeans and around the world. The Global Burden of Disease study estimated that 29.6% of all deaths worldwide (15.6 million deaths) were caused by CVD in 2010, more than all communicable, maternal, neonatal and nutritional disorders combined, and double the number of deaths caused by cancers [1].
Although the number of diagnosed cases of CVD, and of coronary heart disease (CHD) in particular, has been constantly rising, there is clear evidence in most countries with available data that mortality and case-fatality rates from CHD and stroke have decreased substantially over the last 5–10 years, but at differing rates. The number of deaths caused by CVD has fallen substantially in Western European countries, while several Eastern European countries, including Russia, Ukraine and Lithuania, have shown little evidence of such trends [2].
These trends suggest that the contrast between the changes in CVD-caused hospitalizations and CVD-caused deaths may be attributed to advances in healthcare quality associated with medicine-based treatment and risk factor management. Pharmacogenomics-based CVD drug prescription may help further reduce the number of CVD-caused deaths and reduce adverse drug reactions, which have been reported about 5 times more frequently in Lithuania between the years of 2007 and 2015 [3].
There is significant interindividual variability in the response to pharmacologic agents used for CVD treatment [4]. Plasma drug levels can differ by orders of magnitude when the same drug dose is administered to two patients who are similar in age and weight [5]. Many environmental factors can influence the variability in drug responses between patients, including drug-drug interactions, concomitant diseases and ethnicity. However, genetic factors are also highly likely to play a major role, especially in the case of non-inducible enzymes like CYP2D6 [6].
The aim of this study was to investigate the pharmacogenomic peculiarities of the Lithuanian population by analyzing the distribution patterns of allele and genotype frequencies of common and rare variants in the genes associated with drug efficacy and adverse drug reactions (ADR).
Study group
The study group included 98 (49 males and 49 females) self-reported healthy unrelated individuals from the Lithuanian population whose families had lived in Lithuania for at least 3 generations. The individuals included in the study group adequately represent the Lithuanian population, as the group includes individuals from all six Lithuanian ethnolinguistic groups (Figure 1).
Genomic marker selection
The most commonly used drugs for the treatment of CVD belong to the classes of non-steroid anti-inflammatory agents (NSAID), ACE inhibitors, β-blockers, antithrombotic and antiplatelet agents, statins and others. Genes were chosen for the analysis in this study based on the strength of evidence associating drugs belonging to these classes with genetic factors that affect their efficacy or risk of ADRs. The frequency of use of the drugs from these drug classes in the Lithuanian population was also taken into consideration. The selected genes are presented in Table 1.
Gene | Drug class | Gene | Drug class |
---|---|---|---|
ABCB1 | Antithrombotics, Antiarrhythmics, Diuretics, Statins | HTR3B | Statins |
ABCC2 | Antithrombotics, Statins | HTR7 | Statins |
ACE | ACE inhibitors, Statins | ITGB3 | Antithrombotics |
ADAMTS1 | Statins | KCNE1 | Antiarrhythmics |
ADRB1 | β-blockers | KCNE2 | Antiarrhythmics |
ADRB2 | β-blockers | KCNH2 | Antiarrhythmics |
AGT | ACE inhibitors | KCNJ2 | Antiarrhythmics |
AKAP9 | Antiarrhythmics | KCNQ1 | Antiarrhythmics |
APOE | Statins | KIF6 | Statins |
BCHE | Vasodilators | LPL | Statins |
BDKRB1 | ACE inhibitors | LTC4S | NSAIDs |
BDKRB2 | ACE inhibitors | MTHFR | Statins |
CBS | Statins | NOS3 | Vasodilators, β-blockers |
CETP | Statins | NAT1 | Vasodilators, Antiarrhythmics |
CYP11B2 | Vasodilators, Diuretics, β-blockers | NAT2 | NSAIDs, Antiarrhythmics |
CYP1A2 | Antiarrhythmics | NPC1L1 | Statins |
CYP2C19 | β-blockers, Antithrombotics, Statins | P2RY1 | NSAIDs, Antithrombotics |
CYP2C8 | NSAIDs, Antiarrhythmics | P2RY12 | NSAIDs, Antithrombotics |
CYP2C9 | NSAIDs, β-blockers, Antithrombotics, Statins | PON1 | Antithrombotics |
CYP2D6 | β-blockers, Antiarrhythmics | PTGS1 | NSAIDs, Antithrombotics |
CYP3A4 | Statins, Antiarrhythmics | PTGS2 | NSAIDs, Antithrombotics |
CYP3A43 | Statins | SCN5A | Antiarrhythmics |
CYP3A5 | Statins, Antithrombotics | SLCO1B1 | Statins |
CYP4F2 | Antithrombotics | SNTA1 | Antiarrhythmics |
FCAR | Statins | UGT1A1 | Statins |
GNB3 | Diuretics | VKORC1 | Antithrombotics |
GRK5 | Vasodilators, β-blockers | XPNPEP2 | ACE inhibitors |
HMGCR | Statins | | |
Table 1: Pharmacogenomically relevant genes and associated drug classes.
Whole exome sequencing
For the analysis of SNVs and small indels, whole exome sequencing (WES) was performed using the Applied Biosystems 5500 SOLiD™ Sequencer with the Agilent SureSelectXT Target Enrichment System and the Life Technologies TargetSeq™ Exome Enrichment System, without Exact Call Chemistry (ECC). The protocols supplied by the kit providers were optimized and used [7]. With either exome enrichment system, important variants may be found outside of the targeted exonic regions [8]. Therefore, all identified variants with sufficient coverage and quality scores were included in the analysis. Over 80% of the variants identified in the selected genes in this study have already been described and included in public databases, including the dbSNP, 1000 Genomes and Exome Aggregation Consortium (ExAC) datasets. The frequencies of indels and identified SNV alleles were also compared to those of the Caucasian populations included in these datasets.
Bioinformatics and statistical analysis
Primary analysis (image acquisition and bead processing, application of quality metrics and color calling) was performed within the SOLiD sequencer. Secondary and tertiary exome sequencing data analyses were carried out using the Life Technologies™ proprietary LifeScope™ 2.5.1 genomic analysis software. The standard analysis workflow of the exome data was carried out in three stages using the targeted.resequencing.frag workflow with the default parameter set [9]. Only variants with coverage of more than 10x were retained, in order to filter out the less reliable calls [10]. A minimum read mapping quality value of 8 was used. A minimum allele ratio of 0.15 was used for a heterozygous genotype call. Because the SOLiD system uses color space coding, a minimum color quality value of 7 was required for a candidate allele to be called.
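The thresholds listed above can be summarized as a simple per-call filter. The following is a minimal sketch in Python, not LifeScope code: the `VariantCall` record and its field names are hypothetical, and the read-level mapping quality and color quality checks are collapsed into per-call summary values purely for illustration.

```python
from dataclasses import dataclass

# Thresholds taken from the text; all names below are illustrative assumptions.
MIN_COVERAGE_EXCLUSIVE = 10   # calls need coverage of more than 10x
MIN_MAPPING_QUALITY = 8       # minimum read mapping quality value
MIN_HET_ALLELE_RATIO = 0.15   # minimum allele ratio for a heterozygous call
MIN_COLOR_QUALITY = 7         # minimum color quality value (color space coding)


@dataclass
class VariantCall:
    """Hypothetical per-call summary; not a LifeScope data structure."""
    coverage: int
    mapping_quality: int
    alt_allele_ratio: float
    color_quality: int


def passes_filters(call: VariantCall) -> bool:
    """Return True only if the call satisfies all four thresholds."""
    return (
        call.coverage > MIN_COVERAGE_EXCLUSIVE
        and call.mapping_quality >= MIN_MAPPING_QUALITY
        and call.alt_allele_ratio >= MIN_HET_ALLELE_RATIO
        and call.color_quality >= MIN_COLOR_QUALITY
    )


# Made-up calls: the second fails on coverage, the third on allele ratio.
calls = [
    VariantCall(coverage=35, mapping_quality=20, alt_allele_ratio=0.48, color_quality=12),
    VariantCall(coverage=10, mapping_quality=20, alt_allele_ratio=0.50, color_quality=12),
    VariantCall(coverage=40, mapping_quality=20, alt_allele_ratio=0.10, color_quality=12),
]
kept = [c for c in calls if passes_filters(c)]
print(len(kept))  # 1
```

In the actual workflow these criteria are applied inside LifeScope at the read and allele level, so the sketch only conveys the logic of the cut-offs, not the software's behavior.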
ANNOVAR software was used to annotate the exome sequencing data and identify variants likely to be relevant for CVD treatment [11]. The chi-square test and, where appropriate, Fisher's exact test were used to compare allele and genotype frequency distributions between populations and datasets, with the significance threshold set at p<0.05.
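The allele frequency comparison described above can be illustrated with a 2x2 table of alternate and reference allele counts. The sketch below uses only the Python standard library and rests on several assumptions: the study does not name the statistical software used, the rule for switching to Fisher's exact test (any expected cell count below 5) is a common convention rather than the authors' stated criterion, and the reference panel counts in the usage lines are hypothetical.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def weight(x: int) -> int:
        # Unnormalized hypergeometric probability of x in the top-left cell.
        return comb(col1, x) * comb(n - col1, row1 - x)

    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    observed = weight(a)
    # Sum all tables that are no more likely than the observed one.
    extreme = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    return extreme / comb(n, row1)


def compare_allele_counts(alt1: int, ref1: int, alt2: int, ref2: int) -> tuple[str, float]:
    """Pick the test by the smallest expected count (assumed convention)."""
    n = alt1 + ref1 + alt2 + ref2
    rows, cols = (alt1 + ref1, alt2 + ref2), (alt1 + alt2, ref1 + ref2)
    smallest_expected = min(r * c / n for r in rows for c in cols)
    if smallest_expected < 5:
        return "fisher", fisher_exact_2x2(alt1, ref1, alt2, ref2)
    return "chi-square", chi_square_2x2(alt1, ref1, alt2, ref2)[1]


# 98 individuals give 196 chromosomes; the reference counts are hypothetical.
method, p_value = compare_allele_counts(13, 183, 125, 875)
print(method, f"p={p_value:.3g}", "significant" if p_value < 0.05 else "not significant")
```

Counts rather than percentages are passed in because both tests depend on sample size: the same frequency difference can be significant against a large reference panel and non-significant against a small one.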
Small insertion and deletion analysis
Most of the indel variants identified in this study were previously unreported in the literature or in the analyzed datasets. They were also present in the Lithuanian population at a low frequency.
Chr | Start Pos | End Pos | Gene | Ref | Alt | 1000g EUR Alt | ExAC EUR Alt | LTU Alt |
---|---|---|---|---|---|---|---|---|
3 | 38622618 | 38622618 | SCN5A | G | - | - | - | 0.005 |
6 | 39507972 | 39507973 | KIF6 | TT | - | - | - | 0.005 |
7 | 87195476 | 87195476 | ABCB1 | A | - | - | - | 0.005 |
7 | 99273810 | 99273810 | CYP3A5 | - | C | 0.012 | 0.011 | 0.005 |
7 | 99434078 | 99434078 | CYP3A43 | A | - | 0.038 | 0.037 | 0.036 |
8 | 18258073 | 18258073 | NAT2 | - | A | - | - | 0.010 |
9 | 125145847 | 125145847 | PTGS1 | C | - | - | - | 0.005 |
10 | 101552021 | 101552021 | ABCC2 | G | - | - | - | 0.005 |
10 | 101577212 | 101577212 | ABCC2 | G | - | - | - | 0.005 |
10 | 115804365 | 115804365 | ADRB1 | - | T | - | - | 0.005 |
12 | 21391965 | 21391969 | SLCO1B1 | TATAT | - | - | 7.71E-05 | 0.005 |
19 | 16000503 | 16000503 | CYP4F2 | C | - | - | 1.50E-05 | 0.010 |
22 | 42524339 | 42524339 | CYP2D6 | A | - | - | - | 0.005 |
Table 2: Frequencies of exonic frameshift indels potentially relevant for drug metabolism.
Using the default LifeScope™ software parameters, 31,623 unique short indels were called in the study group from the Lithuanian population, over 70 of which were called in the pharmacogenomically relevant genes. Thirteen of the called variants were identified as exonic frameshift variants (3 insertions and 10 deletions) (Table 2). Only two of the called variants were previously identified and included in the dbSNP142 database: CYP3A43 rs61469810 (c.74delA) and CYP3A5 rs200579169 (c.92dupG). Neither of these variants has previously been reported to be associated with the efficacy of, or ADRs caused by, drugs metabolized by CYP3A43 and CYP3A5.
The CYP3A43 rs61469810 deletion allele was reported at a frequency of 10.5% in the total 1000 Genomes dataset and 5.4% in the ExAC dataset, while it was found at a frequency of 3.8% in the 1000 Genomes European dataset and 3.7% in the ExAC NFE dataset. The data for the Lithuanian population were in concordance with the datasets of European individuals. In the Lithuanian study group, a single individual was identified as heterozygous for the rs61469810 deletion allele and 3 individuals were homozygous for the deletion allele (deletion allele frequency 3.6%). The CYP3A5 rs200579169 insertion allele was identified in a single heterozygous individual, representing a lower frequency (0.5%) than identified in the 1000 Genomes European and ExAC NFE datasets (1.2% and 1.1% respectively).
A single exonic non-frameshift insertion in the AKAP9 gene (c.4003_4004insAAC, rs10644111) was identified. 34 individuals were heterozygous and 8 individuals were homozygous for this variant. The 25.5% insertion allele frequency was significantly lower than the frequencies reported in both the total and European-origin 1000 Genomes datasets (42.4% and 38.7% respectively, p<0.001). It was also significantly lower than the frequencies in both the total and non-Finnish European ExAC (ExAC NFE) datasets (both at 40.1%, p<0.0001).
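The allele frequencies quoted for the Lithuanian study group follow directly from the genotype counts. As a minimal sketch, assuming autosomal diploid sites and that all 98 individuals were genotyped at each site (the function name is illustrative):

```python
def alt_allele_frequency(n_het: int, n_hom_alt: int, n_individuals: int) -> float:
    """Alternate allele frequency at an autosomal site from genotype counts."""
    if n_het + n_hom_alt > n_individuals:
        raise ValueError("more carriers than individuals")
    # Heterozygotes carry one alternate allele, homozygotes carry two.
    return (n_het + 2 * n_hom_alt) / (2 * n_individuals)


N = 98  # study group size, i.e. 196 chromosomes

print(f"{alt_allele_frequency(1, 3, N):.1%}")   # CYP3A43 rs61469810: 3.6%
print(f"{alt_allele_frequency(1, 0, N):.1%}")   # CYP3A5 rs200579169: 0.5%
print(f"{alt_allele_frequency(34, 8, N):.1%}")  # AKAP9 rs10644111: 25.5%
```

A single heterozygous carrier among 98 individuals therefore corresponds to 1 of 196 chromosomes, which matches the value of 0.005 listed for most variants in Table 2.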
The analysis of indels in the introns of pharmacogenomically relevant genes revealed a further 53 variants (39 deletions and 14 insertions), one of which (not found in dbSNP) was identified as a splice site mutation in the GRK5 gene (g.121086027delG).
The identified indels that were included in the dbSNP, 1000 Genomes and ExAC datasets have not previously been associated with the efficacy of CVD treatment drugs or with ADRs caused by them. Yet in a relatively small population sample, 20% of subjects had at least one exonic indel variant in clinically actionable genes identified by the Clinical Pharmacogenetics Implementation Consortium (CPIC) and included in the PharmGKB database, showing the value of the non-candidate gene approach [12].
Single nucleotide variant analysis
In the 98 individuals from the Lithuanian population, a total of 243,192 unique SNVs were called. Over 300 unique SNVs were called in the exons of the genes potentially important for CVD treatment: 184 were called as non-synonymous SNVs and 148 as synonymous SNVs. A further 385 SNVs were identified in the introns, 28 in the 3’ UTR and 11 in the 5’ UTR regions. Of the total 775 identified variants, 631 were included in the dbSNP142 database. Of the identified exonic variants, 256 SNVs were included in the dbSNP142 database.
Of the identified 256 exonic SNVs, 156 had a variant allele frequency of <1% and were not included in further frequency analysis. Of these variants 86 were included in the dbSNP142 dataset. The analysis of allele frequencies of these variants showed no significant differences between the Lithuanian population and the study groups of European descent included in the 1000 Genomes and ExAC datasets. The only exception was CYP11B2 rs4539 (c.518A>G), which had a significantly higher G allele frequency (p<0.0001) in the EUR group of the 1000 Genomes dataset (48.4%) and in the ExAC NFE dataset (45.2%) compared to the Lithuanian population (1.2%).
Chr | Gene | SNV | Ref | Alt | 1000g EUR Alt | ExAC EUR Alt | LTU Alt |
---|---|---|---|---|---|---|---|
chr8 | NAT2 | rs1801280 | T | C | 0.449 | 0.456 | 0.423 |
chr8 | NAT2 | rs1799930 | G | A | 0.282 | 0.290 | 0.224 |
chr8 | NAT2 | rs1208 | G | A | 0.562 | 0.567 | 0.526 |
chr8 | NAT2 | rs1799931 | G | A | 0.023 | 0.025 | 0.061 |
chr10 | ADRB1 | rs1801253 | G | C | 0.685 | 0.722 | 0.102 |
chr10 | CYP2C19 | rs28399504 | A | G | 0.001 | 0.003 | 0.005 |
chr10 | CYP2C19 | rs4244285 | G | A | 0.145 | 0.148 | 0.138 |
chr10 | CYP2C9 | rs1799853 | C | T | 0.124 | 0.127 | 0.066 |
chr10 | CYP2C9 | rs1057910 | A | C | 0.073 | 0.069 | 0.051 |
chr22 | CYP2D6 | rs1135840 | G | C | 0.454 | 0.448 | 0.383 |
chr22 | CYP2D6 | rs16947 | A | G | 0.657 | 0.657 | 0.541 |
chr22 | CYP2D6 | rs1065852 | G | A | 0.202 | 0.249 | 0.143 |
chr16 | VKORC1 | rs9934438 | G | A | 0.388 | NA | 0.087 |
Table 3: Frequencies of SNVs identified in the ClinVar database as associated with drug response.
Variants in the CYP11B2 gene (rs1799998) have previously been associated with arterial hypertension and with the treatment of arterial hypertension with diuretics like furosemide, β-blockers like atenolol and others [13,14]. Using the ClinVar database, which reports the relationships among human variations and phenotypes based on supporting evidence, 13 variants in the analyzed exomes were identified as related to drug response (Table 3) [15].
Arylamine N-acetyltransferase 2 (NAT2) gene
A high level of variation was identified in the NAT2 gene. The gene encodes an arylamine N-acetyltransferase (NAT), a xenobiotic-metabolizing enzyme. NAT2 expression is found predominantly in the liver, small intestine and colon tissues, and NAT2 is thus regarded as a typical xenobiotic-metabolizing enzyme, though basal NAT2 mRNA levels can be found in most tissues [16,17]. Humans exhibit genetic polymorphism in NAT2 resulting in rapid, intermediate and slow acetylator phenotypes. Over 65 NAT2 variants possessing one or more SNPs in the 870-bp NAT2 coding region have been reported. Of the seven most frequent SNPs, rs1801279 (c.191G>A), rs1041983 (c.282C>T), rs1801280 (c.341T>C), rs1799929 (c.481C>T), rs1799930 (c.590G>A), rs1208 (c.803A>G) and rs1799931 (c.857G>A), we identified rs1801280, rs1208, rs1799930 and rs1799931, which allows the inference of acetylator status with over 92% accuracy [18].
rs1801280 (c.341T>C) is a signature SNP for the NAT2*5 allelic group, which was shown in over 30 publications to confer a slow acetylation phenotype [19]. Based on the rs1801280 polymorphism, the frequency of the NAT2*5 allele in the Lithuanian population is 42.3%, which is lower than, but not significantly different from, the frequencies in the 1000 Genomes and ExAC datasets.
The rs1208 G allele is present in a range of allelic NAT2 variants, most notably *4 (wildtype) and *12, which are associated with rapid acetylator status. However, the rs1208 G allele is present in multiple NAT2 alleles that confer different phenotypes. On its own it therefore does not provide sufficient information to attribute acetylator status to the subject, and coverage of the other SNPs in these alleles is required.
rs1799930 (c.590G>A, p.Arg197Gln) is the signature SNP for the NAT2*6 allelic group. *6A is a common allele, with a global frequency of 26%, ranging from 13% in Baka Pygmies, 22% in Koreans, 27% in US Caucasians, and 32% in Russians [20]. In the Lithuanian population the A allele is found at a frequency of 22.4%, which is significantly lower than in the 1000 Genomes and ExAC datasets of European descent (28.2% and 29% respectively).
rs1799931 (c.857G>A, p.Gly286Glu) is the signature SNP for the NAT2*7 allelic group. *7A and *7B confer a slow acetylator phenotype, but this may be dependent on substrate. This variant is often grouped with other variants that confer a slow acetylation phenotype. The rs1799931 A allele is the only variant of the NAT2*7A allele, which should only be defined if other positions have been examined and confirmed to be wildtype by genotyping. The rs1799931 A allele is rare in Caucasian populations, as represented by frequencies of 2.3% in the 1000 Genomes dataset and 2.5% in the ExAC dataset (Table 3). The allele was identified significantly more often in the Lithuanian population, at 6.1%.
None of the individually genotyped variants is sufficient to identify specific NAT2 alleles, yet their ability to predict allelic groups shows that rare acetylator status is significantly less common in the Lithuanian population than in other populations of European descent identified in the 1000 Genomes and ExAC datasets.
The cytochrome P450 superfamily genes
The SNV rs28399504 (c.1A>G) in the CYP2C19 gene, the alternate G allele of which is strongly associated with a decreased response to the popular antiplatelet agent clopidogrel and is present in the CYP2C19*4, CYP2C19*4A and CYP2C19*4B haplotypes, was identified at a frequency of 0.5% in the Lithuanian population, which is higher than the frequencies reported in the 1000 Genomes and ExAC datasets (0.1% and 0.3% respectively, Table 3) [21]. The A allele of the CYP2C19 rs4244285 (c.681G>A) SNV was less common in the Lithuanian population compared to the same datasets. The lower frequency of this allele may decrease the likelihood of treatment with clopidogrel being ineffective, yet no statistical significance was achieved when comparing the allele frequency distributions [22].
The cytochrome P450 2D6 (CYP2D6) is an enzyme highly important for pharmacogenomics and is now thought to be involved in the metabolism of up to 25% of commonly used drugs [23]. CYP2D6 polymorphisms have implications across many different therapeutic areas, as a diverse array of clinically used drugs are metabolized by CYP2D6, including antiarrhythmics (Flecainide, Mexiletine, Propafenone) and beta-blockers (Carvedilol, Metoprolol, Timolol) [24]. Depending on the activity of the CYP2D6 enzyme, subjects can be divided, in order from highest to lowest function, into ultrarapid metabolizers (UM), extensive metabolizers (EM), intermediate metabolizers (IM) and poor metabolizers (PM) [24]. In the literature, CYP2D6 allele frequencies are usually reported in terms of haplotypes. The separate variants interrogated in this study, by themselves or in combination with each other, are not sufficient for the identification of specific haplotypes, as the rs1135840 (c.1304G>C) C allele is present in over 50 different CYP2D6 haplotypes and rs16947 (c.T733C) in over 40. However, as the variant T allele of rs1065852 (c.100C>T) is present in both the non-functional CYP2D6*4 haplotype and the reduced-function CYP2D6*10 haplotypes, it is important to note that the T allele is less common in the Lithuanian population (14.3%) than in the reference datasets used for this study (1000 Genomes and ExAC, 20.2% and 24.9% respectively).
CYP2C9 is a phase I drug-metabolizing cytochrome P450 (CYP450) enzyme that plays a major role in the oxidation of both xenobiotic and endogenous compounds. CYP2C9 is the enzyme responsible for the metabolism of the S-isomer of warfarin, which is the key isoform responsible for the anticoagulant effect of the drug.
Patients with the poor metabolizer *2 (rs1799853, c.430C>T) and *3 (rs1057910, c.1075A>C) haplotypes require lower doses of warfarin to achieve a similar anticoagulant effect compared with patients with at least one *1 (wild-type) haplotype [25,26]. The CYP2C9 genotype accounts for only part of the variability in warfarin efficacy, as the VKORC1 genotype as well as phenotypes including age and weight are also key factors in predicting the therapeutic dose of warfarin [27,28]. Both variants associated with the decreased CYP2C9 activity phenotypes were identified in the study. The rs1057910 C allele was found in the Lithuanian population at a lower frequency (5.1%) than in the chosen reference datasets, 1000 Genomes (7.3%) and ExAC (6.9%), but statistical significance was not achieved (p>0.05). The rs1799853 T allele was significantly less common in the Lithuanian study group: 6.6% vs. 12.4% and 12.7% respectively (p values 0.017 and 0.011 respectively).
Genes important for drug pharmacodynamics
rs9934438 (g.1173C>T) is a SNP in the first intron of the VKORC1 gene and is in near-perfect linkage disequilibrium with rs9923231. rs9934438 was the first SNP associated with the low-dose warfarin requirement [29]. Although it is considered to have no functional effect, rs9934438 is commonly used to infer the rs9923231 genotype and haplotypes containing this variant. The T allele is associated with response to a lower warfarin dose, and its frequency in the Lithuanian population (8.7%) differs significantly from that in the European descent subgroup of the 1000 Genomes dataset (38.8% in Table 3; the SNP was not identified in the ExAC dataset). It is reasonable to assume that a patient from the Lithuanian population treated with warfarin is likely to require a lower dose to achieve a therapeutic effect and is at a higher risk of adverse bleeding reactions when treated with a regular warfarin dose than patients from other Caucasian populations, especially considering the low frequency of the CYP2C9 *2 and *3 alleles in the Lithuanian population.
The rs1801253 (c.1165G>C, p.Gly389Arg) variant in the ADRB1 gene has been shown in multiple studies to be important for the treatment of hypertension, CHD and other CVDs [30]. The rs1801253 polymorphism is located in the cytoplasmic tail, in the G-protein coupling domain [31]. In vitro studies of the c.1165G>C variant indicate that basal and agonist-stimulated adenylyl cyclase activity is higher with the C allele present, as a result of enhanced coupling to Gs [32]. Significant associations have been shown between the rs1801253 alleles and the efficacy of various drugs, including the β-blockers carvedilol, metoprolol, dobutamine, atenolol and others, as well as ACE inhibitors and diuretics [33,34]. The C allele, which is associated with a decreased response to atenolol, carvedilol, diltiazem, metoprolol or verapamil, is significantly less common in the Lithuanian population (10.2%) compared to the 1000 Genomes and ExAC project European subsets (68.5% and 72.2% respectively, p<0.0001 in both cases), indicating that in the Lithuanian population a lack of response to treatment with a range of β-blockers may be less likely than in the other compared Caucasian populations [35].
The exome sequencing approach to testing the prevalence of pharmacogenomically relevant variants in a population proved effective, as a range of different types of variants at varying frequencies was identified in the study group from the Lithuanian population. Using conservative quality control criteria to preserve specificity, the applied method revealed over 300 exonic variants (13 indels and 300 SNVs) in the genes relevant for cardiovascular disease treatment. Every one of the 98 studied individuals had at least 2 variants that could be relevant for CVD treatment. Given the number of different types of medications used to treat common CVDs like CHD, the number of identified variants is significant. While a large number of the exonic variants identified in the pharmacogenomically relevant genes in this study have not previously been associated with efficacy or ADRs caused by specific CVD drugs, the amino acid changes caused by the variants, as well as protein function prediction (SIFT, PolyPhen) and conservation (PhyloP) scores, indicate potential relevance, which should be investigated in further functional studies [36]. Furthermore, the method provided sufficient coverage of intronic regions and UTRs to identify variants that have been shown to affect the efficacy of, or ADRs caused by, drugs used for CVD treatment, further showing the utility of the approach. The in-depth analysis of both exome small indels and SNVs shows that several pharmacogenomically relevant variants found in the Lithuanian population distinguish it from other populations of European descent. Some of the identified variants, in the NAT2, CYP2C9 and VKORC1 genes in particular, make a case for the move towards genetic testing in clinical practice, as standard phenotype-based drug dose attributions may not only fail to achieve the required therapeutic effect but may also cause adverse drug reactions.