We have generated the genome sequence of Catharanthus roseus variety Sunstorm Apricot using whole genome shotgun sequencing. The analysis of the genome is described in Kellner et al. 2015. We generated an assembly of 523 Mb with an N50 scaffold size of 26.2 kbp and annotated a total of 33,829 genes. The genome can be downloaded here and analysis tools including a genome browser and BLAST search site have been constructed.
The Medicinal Plant Metabolomics Resource website, the metabolomics partner of the Medicinal Plant Genomics Resource, is available with metabolomics data from Atropa belladonna and Digitalis purpurea. Metabolite data from the remaining 12 medicinal plant species covered by this project will be made available soon. A press release is available with complete information about this metabolite resource.
Methods for transcriptome assembly as performed in the MPGR project are described in Gongora-Castillo et al. 2012. Analysis of the Camptotheca acuminata, Catharanthus roseus, and Rauvolfia serpentina transcriptomes is described in PLoS ONE, and analysis of the Valeriana officinalis transcriptome assembly is described in Yeo et al. 2013. For more information about the release of the Final Version of the transcriptome data, please see the press release.
For more information about the release of Version 1 of the transcriptome data, please see the press release.
Natural products from plants serve as rich resources for drug development, with almost 100 plant-derived compounds in clinical trials in 2007. Plant-derived natural products have had a profound and lasting impact on human health and include compounds successfully used for decades, such as digitalis, vincristine, Taxol, and morphine, isolated from foxglove, periwinkle, yew, and opium poppy, respectively. The enormous structural diversity and biological activities of plant-derived compounds suggest that additional, medicinally relevant compounds remain to be discovered in plants.
While plant natural products continue to be a prime target for drug development, as evidenced by the number of ongoing clinical trials, the clinical potential of these compounds is often curtailed by low production levels in plant species. For example, use of the blockbuster drug Taxol almost stopped in the early 1990s because the primary source, yew tree bark, could not be harvested sustainably. In this particular instance, a Taxol precursor happened to be more readily available in a renewable part of the tree, and a semi-synthetic protocol could be developed to convert it into the drug. That outcome was fortuitous. More generalized solutions, such as metabolic engineering of effective plant and microbial production platforms, are urgently needed to ensure that the wealth of bioactive compounds found in plants enters the clinical pipeline and finds widespread use in medicine.
High-throughput transcriptome sequencing approaches provide a straightforward means of accessing the gene content of organisms with large genomes (i.e., > 100 Mb). Essentially any tissue, independent of genome size and of the availability of genetic or molecular tools in the organism, can be used to generate cDNAs from mRNA populations. The cDNAs are sequenced to generate Expressed Sequence Tags (ESTs), which are assembled into a non-redundant set of sequences (contigs and singleton ESTs) that represents the transcriptome. The transcriptome sequences are then annotated for putative function using a suite of bioinformatic approaches such as sequence searches of protein databases, motif/domain identification, biochemical pathway mapping, and subcellular localization predictions. Transcript abundance data can also be used to provide in-depth expression profiles of individual genes on a per-tissue or per-treatment basis. The deduced function, coupled with expression frequency, can facilitate identification of candidate genes pertinent to the pathway of interest as well as non-pathway targets (e.g., primary/intermediary metabolism) whose expression is consistent with synthesis of compounds.
Joe Chappell, Department of Plant & Soil Sciences, University of Kentucky, Lexington, KY, 40546-0312
Email Joe a comment
Joe Chappell's Home Page
Dean DellaPenna, Department of Biochemistry & Molecular Biology, Michigan State University, East Lansing, MI, 48824-1319
Email Dean a comment
Dean DellaPenna's Home Page
Sarah E. O’Connor, Department of Biological Chemistry, John Innes Centre, Colney, Norwich, NR4 7UH, UK
Email Sarah a comment
O'Connor Group Home Page
We have released the genome of Catharanthus roseus variety Sunstorm Apricot as described in Kellner et al. 2015. See the genome browser, BLAST search site, and Dryad download page for access to the data.
Expression data released for 14 medicinal plants. Expression levels for the representative transcript (the longest transcript isoform) are provided for an array of tissues whose expression abundances were measured by RNA-seq. Expression levels are provided as FPKM values (fragments per kilobase of transcript per million mapped reads). The information can be downloaded from the MPC FTP site.
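For readers unfamiliar with the unit, the following is a minimal sketch of how an FPKM value is computed from its definition. It is not the pipeline used to produce the released data; the function name, argument names, and example numbers are hypothetical and chosen only for illustration.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    All three argument names are illustrative assumptions, not names
    taken from the MPGR project.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    # Divide by transcript length in kilobases (length / 1e3) and by the
    # library size in millions (total / 1e6); combined, that is a factor of 1e9.
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a library of 20 million mapped fragments.
example_value = fpkm(500, 2000, 20_000_000)
print(example_value)
```

Normalizing by both transcript length and library size is what allows values to be compared across transcripts of different lengths and across tissues sequenced to different depths.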
Metabolomics data released for Atropa belladonna and Digitalis purpurea. The information is available via Metabolomics.
We have released a set of functional annotation search tools for the assembled transcriptomes of each of the 14 medicinal plant species; searches can use keywords, Pfam domains, and sequence identifiers. The transcript and predicted protein sequences for the transcript assemblies are provided. Note that the assembly process using Velvet and Oases generates isoforms, and each isoform has been annotated. Functional annotation includes alignments to UniRef, identification of InterPro domains, alignments to Arabidopsis thaliana genes, and alignments to ESTs and peptides from existing public sequences for these 14 medicinal plants. Expression levels for the representative transcript (the longest transcript isoform) are provided for an array of tissues whose expression abundances were measured by RNA-seq. Expression levels are provided as FPKM values (fragments per kilobase of transcript per million mapped reads). Note that the alternative isoforms generated by Oases are not annotated for expression abundances.
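The selection of a representative transcript described above (the longest isoform per locus) can be sketched as follows. This is an illustrative assumption about how such a selection might be written, not the project's actual code; the input layout (locus ID, transcript ID, sequence) and all identifiers are hypothetical, and ties are broken here by keeping the first isoform seen.

```python
def representative_transcripts(isoforms):
    """Return {locus_id: transcript_id} for the longest isoform of each locus.

    `isoforms` is an iterable of (locus_id, transcript_id, sequence) tuples.
    This layout is a hypothetical convenience, not an MPGR file format.
    """
    best = {}
    for locus_id, transcript_id, sequence in isoforms:
        current = best.get(locus_id)
        # Keep the first isoform seen for a locus, and replace it only
        # when a strictly longer isoform turns up.
        if current is None or len(sequence) > len(current[1]):
            best[locus_id] = (transcript_id, sequence)
    return {locus: transcript_id for locus, (transcript_id, _seq) in best.items()}


# Hypothetical isoforms for two loci.
example_isoforms = [
    ("locus_1", "locus_1_iso_1", "ATGC"),
    ("locus_1", "locus_1_iso_2", "ATGCATGC"),
    ("locus_2", "locus_2_iso_1", "AT"),
]
print(representative_transcripts(example_isoforms))
```

Under this scheme only the selected isoform of each locus carries expression values, which matches the note above that alternative isoforms are not annotated for expression abundances.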
We have planned server maintenance on the first Wednesday of every month. Web pages may be unavailable or only partially functional during server maintenance. There will be additional maintenance on the following dates, when the MPGR web pages will be unavailable: August 5, 2013; December 2, 2013; and April 7, 2014.