Whole genome sequencing and phylogeny for members of the microalgae genus Prototheca
2nd International Conference on Big Data Analysis and Data Mining
November 30-December 01, 2015 San Antonio, USA

Aren Ewing, Shane Brubaker, Aravind Somanchi, George Rudenko, Esther Yu, Nina Reyes, Karen Espina, Leon Xing, Riyaz Bhat, Arthur Grossman and Scott Franklin

Solazyme Inc., USA

Posters-Accepted Abstracts: J Data Mining Genomics Proteomics

Abstract:

The taxonomy of the microalgae genus Prototheca has changed very little in the last 100 years. The genus was originally named and classified using morphological techniques, and next-generation sequencing (NGS) methods have much to contribute to a more precise elucidation of its structure. We used NGS to sequence the whole genomes of 20 Prototheca species and performed de novo assemblies for all 20 genomes. We used these genomic assemblies to determine robust phylogenetic relationships between Prototheca species. Phylogenetic trees constructed from the plastid 16S and 23S rRNA-encoding sequences form paraphyletic groups, suggesting that the original strain classification and naming of some species may not hold under newer sequence-based methodologies. Whole plastid sequences were extracted from the genome assemblies and compared with one another for gene content and level of conservation. Conserved genes were analyzed, and additional phylogenetic trees were built from them to further examine the relationships between the different species. Chlorella protothecoides, which has a functional photosynthetic plastid, was used as an outgroup throughout this study, while the Prototheca species have lost their photosynthetic capacity. We constructed a conserved set of Prototheca plastid genes that can be used for phylogenetic comparisons and, potentially, for observing the transfer of genes from the plastid to the nuclear genome. This work provides a rich resource for studying the transition from photosynthetic to heterotrophic plastids and offers insight into the power of NGS for refining phylogenetic relationships.
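
The idea of a conserved plastid gene set can be illustrated with a minimal sketch, assuming each plastid annotation has been reduced to a set of gene names. The function name, strain labels, and gene lists below are hypothetical toy values chosen for illustration; they are not data or results from this study, and the abstract does not state how the conserved set was actually computed.

```python
def conserved_genes(gene_sets):
    """Return the genes present in every annotation (a simple set intersection)."""
    sets = [set(genes) for genes in gene_sets.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical toy annotations, NOT results from the study.
plastid_genes = {
    "outgroup_photosynthetic": {"rrs", "rrl", "tufA", "atpB", "psbA", "rbcL"},
    "strain_A": {"rrs", "rrl", "tufA", "atpB"},
    "strain_B": {"rrs", "rrl", "tufA"},
}

# Genes shared by every plastid: candidate markers for phylogenetic comparison.
core = conserved_genes(plastid_genes)

# Genes found in the outgroup but in none of the other plastids:
# candidates for loss or for transfer to the nuclear genome.
ingroup = [genes for name, genes in plastid_genes.items()
           if name != "outgroup_photosynthetic"]
absent_from_ingroup = plastid_genes["outgroup_photosynthetic"] - set.union(*ingroup)

print(sorted(core))
print(sorted(absent_from_ingroup))
```

A strict intersection is the simplest choice; a real analysis would also need to handle annotation gaps and gene-name synonyms before comparing sets.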

Biography:

Aren Ewing currently works at Solazyme Inc. as a Bioinformatics Scientist in the areas of phylogenetics, genome curation, and transcriptome analysis. He has previously worked at the Joint Genome Institute and Roche Pharmaceuticals as a Bioinformatician. He has a Master’s degree from the University of Hawaii – Manoa, and a B.S. from UC Davis in Biochemistry and Molecular Biology.