Dr Richard J. Edwards

Supplementary material for Jones et al. (2011)

This website provides access to the supplementary data and software to accompany the Emiliania huxleyi proteomics analysis of Jones et al. (2011):

  • Jones BM*, Edwards RJ*, Skipp PJ, O'Connor CD & Iglesias-Rodriguez MD (2011): Shotgun Proteomic Analysis of Emiliania huxleyi, a Marine Phytoplankton Species of Major Biogeochemical Importance. Marine Biotechnology 13(3): 496-504.

    Final tables of identifications are available as Supplementary Tables (MS Word) for Protein (PICSI) and EST (BUDAPEST) database searches.

    Supplementary Data

    MASCOT Search Data

    The raw MASCOT results for the searches against the E. huxleyi ESTs and taxonomically-restricted protein datasets can be found in the Supplementary Data (MS Excel). The EST sequences (~20Mb zipped) and the protein search dataset (~54Mb zipped) are available on request.

    Results for the decoy datasets are also available here. Decoy sequence files (~38Mb zipped) are available on request. NCBI-nr search results, processed using PICSI (see software, below), are available here.

    BUDAPEST Processing

    BUDAPEST is freely available from the SeqSuite website. The main results tables are available in Supplementary Data (MS Excel). Raw results files are also available here:

    File | Description | Data
    *.budapest.tdt | Main BUDAPEST output table (tab-delimited) | NZEH; CCMP1516; CCMP371
    *.budapest.fas | Hit consensus sequences (FASTA format) | NZEH; CCMP1516; CCMP371
    *.summary.txt | Summary BUDAPEST text file | NZEH; CCMP1516; CCMP371
    *.details.txt | BUDAPEST details file by processed EST reading frame | NZEH; CCMP1516; CCMP371
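
Since the main *.budapest.tdt output is tab-delimited, it can be loaded with standard tools. The following is a minimal sketch, assuming the file has a single header row; the column names `hit` and `score` and the example rows are hypothetical placeholders, not the real BUDAPEST fields.

```python
import csv
import io


def read_tdt(handle):
    """Parse a tab-delimited table with a header row into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


# Hypothetical two-row table standing in for a real *.budapest.tdt file;
# the actual BUDAPEST column names may differ.
example = io.StringIO("hit\tscore\nseqA\t42\nseqB\t17\n")
rows = read_tdt(example)
print(len(rows), rows[0]["hit"])
```

For a file on disk, the same function could be called on an opened handle, for example `open(path, newline="")`, where `path` points to one of the downloaded tables.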

    Post-BUDAPEST Processing

    BUDAPEST consensus sequences (generated using FIESTA) were processed using HAQESAC. All alignments and trees generated by HAQESAC are available as Supplementary Data for both Protein and EST searches.

    For more information or materials, please e-mail:


    All software developed during this project is freely available from the SeqSuite homepage. The key tools used in this study were:

    Program | Description | Cite
    BUDAPEST | Bioinformatics Utility for Data Analysis of Proteomics on ESTs. Main analysis pipeline for cleaning up results from Mascot searches against EST libraries. Developed for this paper. | Jones et al. (2011) Marine Biotechnology 13(3): 496-504.
    FIESTA | Fasta Input EST Analysis. EST assembly and annotation pipeline. | Jones et al. (2011) Marine Biotechnology 13(3): 496-504.
    PICSI | Protein Identification from Cross-Species Inference. Analysis pipeline for cleaning up results from Mascot searches against large, redundant cross-species protein databases (e.g. NCBI-nr). Developed for this paper. | Jones et al. (2011) Marine Biotechnology 13(3): 496-504.
    HAQESAC | Homologue Alignment Quality, Establishment of Subfamilies and Ancestor Construction. High-throughput generation of quality multiple sequence alignments and phylogenetic trees of protein families to assist protein annotation. | Edwards et al. (2007) Nature Chem. Biol. 3(2): 108-112.
    RJE_SEQ | DNA/Protein sequence module. Used for generating random protein sequences for decoy database searches. | http://www.soton.ac.uk/~re1u06/software/packages/seqsuite/
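
The decoy searches mentioned above rely on randomised protein sequences. How RJE_SEQ generates them is not described on this page, so the following is only a generic sketch of one common approach: shuffling the residues of each target sequence, which preserves its length and amino acid composition. The function names, the `decoy_` prefix and the example sequence are illustrative assumptions, not RJE_SEQ's interface.

```python
import random


def shuffled_decoy(sequence, rng):
    """Return a random permutation of the residues in one sequence."""
    residues = list(sequence)
    rng.shuffle(residues)
    return "".join(residues)


def make_decoys(records, seed=0):
    """Build one shuffled decoy per target sequence.

    records maps sequence names to protein sequences; the seed makes
    the decoy set reproducible. The 'decoy_' prefix is an assumption.
    """
    rng = random.Random(seed)
    return {f"decoy_{name}": shuffled_decoy(seq, rng)
            for name, seq in records.items()}


# Hypothetical target sequence used only for illustration.
targets = {"prot1": "MKTAYIAKQRQISFVKSHFSRQ"}
decoys = make_decoys(targets, seed=1)
print(decoys["decoy_prot1"])
```

Fixing the seed keeps the decoy database identical between runs, which matters if search results against it are to be compared later.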

  • © RJ Edwards 2011. Last modified 9th February 2012.