Commentary - (2015) Volume 0, Issue 0
In order to co-express multiple genes for biotechnological and biomedical applications, several approaches have been used with varying degrees of success. Currently, internal ribosome entry site (IRES) elements and “self-cleaving” 2A peptides are the most widely used. The length of the IRES can be prohibitive and IRES-dependent translation of the second open reading frame is often significantly reduced. 2A peptides have gained in popularity due to their small size and ability to consistently produce discrete proteins at an equal level. Here, we promote the use of these sequences as the “go-to” technology for co-expression of multiple proteins.
Keywords: Protein co-expression, 2A, Biotechnology, Biomedicine
Many biotechnological and biomedical applications rely on the effective co-expression of multiple proteins. So far, multiple genes have been expressed via: (i) monocistronic vectors, e.g. viral co-infection or co-transfection of plasmids expressing one protein each, fusion proteins, or fusion proteins incorporating proteinase cleavage sites, and (ii) polycistronic vectors, where multiple genes are assembled either under the control of multiple promoters or a single promoter. IRES sequences have been used as a method to separate two coding sequences under the control of a single promoter. Despite their widespread use, they are relatively large (~600 base pairs) and expression of the downstream gene is significantly less efficient than that of the upstream gene [1]. A different approach using the 2A oligopeptide sequence allows multiple discrete proteins to be synthesized from a single strand of RNA that also functions as a messenger RNA (mRNA). Not only is the 2A sequence smaller (54-174 bp) than IRES elements, but co-expression of proteins linked via 2A is also independent of the cell type (cleavage activity depends only on eukaryotic ribosomes) and proteins are produced with equimolar stoichiometry [2-4]. The potential of this system is vividly demonstrated in the yeast Pichia pastoris by the functional co-expression of the carotenoid and violacein biosynthetic pathways, encoded by nine genes, from a polycistronic construct based on 2A peptides: the highest number of genes expressed in such a co-ordinated fashion so far [5]. IRES and 2A sequences have been used successfully in an impressive array of studies: at least 200 in the case of IRES elements and almost 900 in the case of 2A peptides, making 2A the “go-to” option for protein co-expression [6-9]. Here we provide a short update on the history of 2A and cover some applications of 2A co-expression technology.
The Foot-and-mouth disease virus (FMDV) 2A sequence (hereafter “F2A”) mediates “self-processing” by a novel translational effect variously referred to as ‘ribosome skipping’, ‘stop-go’ and ‘stop-carry on’ translation [10-12]. 2A peptide cleavage has been studied in various cell types using various recombinant polyproteins and artificial reporter polyprotein systems comprising chloramphenicol acetyltransferase (CAT), β-glucuronidase (GUS), and fluorescent proteins (FPs, e.g. GFP, RFP, YFP) [13-16]. It was demonstrated that F2A plus the N-terminal proline of the downstream protein 2B co-translationally ‘self-cleaved’ at the glycyl-prolyl pair corresponding to the 2A/2B junction (LLNFDLLKLAGDVESNPG↓P-) (Figure 1) [17-25]. Later work showed that the length of the F2A used is also important for cleavage in vitro and in vivo: higher cleavage efficiency was reported when longer versions of 2A, with N-terminal extensions derived from the FMDV capsid protein 1D upstream of 2A, were used [18,20,26-33]. Importantly, the co-translational nature of this cleavage means that by including various signal sequences within 2A polyproteins, either up- or downstream of 2A, proteins can be targeted either co-translationally to the exocytic pathway or post-translationally to different cellular compartments [34-36]. Of the many 2A peptides identified to date [10,27,37], four have been widely used in biotechnology and biomedicine: F2A, equine rhinitis A virus (“E2A”), porcine teschovirus-1 (“P2A”) and Thosea asigna virus (“T2A”) (Table 1) [2-4,36,38-43].
Abbreviation | Source | 2A/2A-like sequence |
---|---|---|
F2A | Foot-and-mouth disease virus (FMDV) | -PVKQLLNFDLLKLAGDVESNPG P- |
E2A | Equine rhinitis A virus | -QCTNYALLKLAGDVESNPG P- |
P2A | Porcine teschovirus-1 | -ATNFSLLKQAGDVEENPG P- |
T2A | Thosea asigna virus | -EGRGSLLTCGDVESNPG P- |
Table 1: 2A and 2A-like sequences widely used in biotechnology and biomedicine.
Figure 1: Schematic presentation of co-translational cleavage via a “self-cleaving” 2A peptide. Gene sequences 1 (stop codon removed) and 2 are linked together into a single (trans)gene via a 2A sequence. The translation products are synthesized in an equimolar ratio, although protein 1 upstream of 2A bears a C-terminal extension of 2A, and protein 2 bears an N-terminal proline residue.
Finally, it should be noted that (i) 2A remains as a C-terminal extension of the upstream protein, and (ii) proline forms the N-terminus of the downstream protein. The presence of an N-terminal proline does not seem to affect proteins which are metabolically stable [44]. When the authentic C-terminus is required for the activity or subcellular targeting of a protein, however, that protein should either be encoded at the C-terminus of the polyprotein or be followed by cleavage sequences of the mammalian Kex2p homologue, furin (-↓RRRR-, -↓RKRR-, -↓RRKR-), which removes the 2A “tag” [21,38]. Conversely, the presence of 2A can be exploited to detect protein expression and localization using anti-2A antibodies [2,3].
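The outcome described above can be illustrated with a minimal Python sketch. It only mimics, at the level of amino-acid strings, where the polyprotein is divided (between the final glycine of 2A and the following proline); it does not model the ribosomal mechanism. The 2A sequences are copied as listed in Table 1, while the two short protein sequences and the function names are hypothetical placeholders chosen for illustration.

```python
# 2A sequences as listed in Table 1, up to the final glycine.
# The proline that follows becomes the first residue of the downstream product.
TWO_A = {
    "F2A": "PVKQLLNFDLLKLAGDVESNPG",
    "E2A": "QCTNYALLKLAGDVESNPG",
    "P2A": "ATNFSLLKQAGDVEENPG",
    "T2A": "EGRGSLLTCGDVESNPG",
}


def build_polyprotein(upstream, downstream, name):
    """Join two protein sequences via a 2A peptide plus the obligatory proline."""
    return upstream + TWO_A[name] + "P" + downstream


def split_at_2a(polyprotein, name):
    """Return the two products expected when translation 'skips' at G|P."""
    site = TWO_A[name] + "P"
    start = polyprotein.find(site)
    if start == -1:
        raise ValueError(f"{name} site not found in polyprotein")
    cut = start + len(TWO_A[name])  # position of the proline after the last glycine
    return polyprotein[:cut], polyprotein[cut:]


# Hypothetical toy sequences, not real proteins.
protein1 = "MKTAYIAK"
protein2 = "MSEQNNTE"

poly = build_polyprotein(protein1, protein2, "F2A")
up, down = split_at_2a(poly, "F2A")
print(up)    # protein 1 carrying the 2A peptide as a C-terminal extension
print(down)  # protein 2 beginning with the extra proline
```

In this toy model the upstream product always ends in ...NPG and the downstream product always starts with P, which is the pattern that the furin-site strategy and anti-2A antibody detection both build on.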
F2A is the most widely used 2A sequence in plant biotechnology and has been used to target multiple proteins to various subcellular compartments [13,34,45,46], and to improve disease resistance [47,48], drought resistance [49] and nutritional value through metabolome engineering [10,50]. Vitamin A deficiency (VAD) is a major global health issue which affects hundreds of millions of people. This problem arises because rice, the staple food source in countries where VAD is prevalent, does not produce vitamin A or its precursor β-carotene, which have a number of vital functions in the body, including growth. Since 2000, researchers have been engineering a transgenic variety of rice referred to as “golden rice” (Oryza sativa, GR) that includes the biosynthetic pathway for production of β-carotene [51,52]. Engineering the pathway into (carotenoid-free) rice endosperm requires two carotenoid biosynthetic genes, phytoene synthase (psy) and carotene desaturase (crtI) [53]. To avoid the problems of promoter interference (GR1 and GR2 require two promoters), psy from Capsicum and crtI from Pantoea were linked via 2A (psy-F2A-crtI; pPAC construct) or IRES (psy-IRES-crtI; pPIC construct) and placed under the control of the rice globulin promoter [50]. GR3 of transgenic PAC had a much more intense golden colour than did the PIC transformants, demonstrating that the 2A construct performed better than the IRES construct in terms of carotenoid production.
F2A and ‘2A-like’ sequences have been used extensively in genetic engineering of T cells for adoptive cell therapies [39-41], human stem cells [42,54,55] and in the induction of pluripotent stem cells [23,25,43]. For example, the genes encoding four transmembrane proteins necessary for the assembly of the CD3 complex were co-expressed from a polycistronic vector containing three 2As [36], and co-expression of four transcription factors in a [KLF4-E2A-OCT3/4-T2A-SOX2-P2A-c-Myc]-IRES-hrGFP construct resulted in the generation of induced pluripotent stem (iPS) cells from somatic cells [56]. High levels of functional monoclonal antibodies were produced by linking the antibody heavy and light chain sequences with F2A in the context of AAV-mediated gene transfer [21]. 2As were used to co-express the α- and β-chains of T-cell receptors (TCRs) specific for tumour-associated antigens to treat metastatic melanoma, colorectal cancer, renal cell carcinomas and many other types of malignant diseases [57-59]. Significant anti-tumour responses were observed in the clinic using monoclonal antibodies produced by co-expressing the heavy and light chains of antibodies against cytotoxic T-lymphocyte-associated antigen 4 (CTLA-4) [60].
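As a rough illustration of how such multi-2A constructs translate into discrete products, the following Python sketch takes an ordered layout of genes and 2A linkers and lists, for each expected product, which 2A peptide it retains at its C-terminus and whether it starts with the extra proline. The function name and the list-based layout format are assumptions made for this example, and only the 2A-linked portion of the iPS construct is modelled (the downstream IRES-hrGFP element is left out).

```python
def products_from_layout(layout):
    """Describe the products of a 2A-linked polycistron.

    `layout` alternates gene names and 2A names: [gene, 2A, gene, 2A, gene].
    Returns one tuple per gene:
    (gene, retained C-terminal 2A tag or None, has N-terminal proline).
    """
    if len(layout) % 2 == 0:
        raise ValueError("layout must start and end with a gene")
    genes = layout[0::2]    # every other element, starting with the first gene
    linkers = layout[1::2]  # the 2A peptides between consecutive genes
    products = []
    for i, gene in enumerate(genes):
        # Every gene except the last keeps the 2A that follows it.
        tag = linkers[i] if i < len(linkers) else None
        # Every gene except the first starts with the proline after a 2A.
        products.append((gene, tag, i > 0))
    return products


# 2A-linked part of the iPS construct described in the text.
layout = ["KLF4", "E2A", "OCT3/4", "T2A", "SOX2", "P2A", "c-Myc"]
products = products_from_layout(layout)
for gene, tag, proline in products:
    print(gene, tag, proline)
```

Three 2A linkers give four products, which matches the "number of 2As plus one" pattern seen in both the CD3 and the iPS examples.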
Finally, we propose that ‘2A-like’ sequences are able to function both as a signal sequence and as a translational recoding element, which leads to partitioning of the translation products between two subcellular sites (dual protein targeting). We have identified some 2A-like sequences at the N-terminus of NLRs in the genome of the purple sea urchin Strongylocentrotus purpuratus that were putative signal sequences. Constructs encoding wild-type [Sp2A-cherryFP-T2A-GFP] or a mutated, cleavage-inactive form of Sp2A were used to transfect mammalian HeLa cells. With both constructs GFP was evenly distributed throughout the cell; wild-type Sp2A led to cherryFP localization throughout the cell, whereas mutated Sp2A, acting as a signal, led to cherryFP localization in the exocytic pathway [37,61].
In conclusion, with the number of studies that have successfully used F2A and ‘2A-like’ sequences approaching 1000, these small peptides are proving to be the ‘go-to’ technology for co-expression of multiple proteins.