Proteomic Tool Kit

Tools and resources for the analysis, visualization and characterization of proteomic data, including publicly available proteomic databases of interest to plant science researchers.


AraCyc

AraCyc is a metabolic pathway database for Arabidopsis thaliana that contains information about both predicted and experimentally determined pathways, reactions, compounds, genes and enzymes. The Omics Viewer is a software tool for displaying large-scale data, such as microarray gene expression results or proteomic data, in the context of biochemical pathways.


ARAMEMNON

ARAMEMNON is a database of plant membrane proteins that uses Arabidopsis thaliana as the reference model plant. The database also holds all putative membrane proteins of five other plant species, including rice and maize. It stores data on protein topology, predicted homologies and sequences.


The interactions database includes all interactions present in the Arabidopsis thaliana Protein Interactome Database, the Predicted Interactome for Arabidopsis, and Arabidopsis protein-protein interaction data curated from the literature by TAIR curators, BIOGRID and IntAct.


BRENDA

Comprehensive enzyme information edited from the scientific literature for many proteins across a range of organisms including Arabidopsis.

From the abstract: The BRENDA (BRaunschweig ENzyme Database) enzyme information system is the main collection of enzyme functional and property data for the scientific community. The content covers information on function, structure, occurrence, preparation and application of enzymes as well as properties of mutants and engineered variants. The number of manually annotated references is more than 100,000, the number of ligand structures almost 100,000. BRENDA now provides new viewing options such as the display of the statistics of functional parameters and the 3D view of protein sequence and structure features. Furthermore a ligand summary shows comprehensive information on the BRENDA ligands. The enzymes are linked to their respective pathways and can be viewed in pathway maps. It is possible to submit new, not yet classified enzymes to BRENDA, which then are reviewed and classified by the International Union of Biochemistry and Molecular Biology.


CESG, the Center for Eukaryotic Structural Genomics

The Center for Eukaryotic Structural Genomics aims to increase the production of available 3-D protein structures. As part of this project, CESG has produced a large number of Arabidopsis ORF Gateway clones, protein expression clones, small amounts of purified protein and over 40 3-D structures. Information on all ORFs studied to date is available by BLAST search. Protocols for producing recombinant proteins are also available.


Links to an extensive range of proteomics analysis software.

IntAct of EMBL and EBI

Protein-Protein Interaction Database from a range of organisms, including Arabidopsis.

From the abstract: IntAct is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. As from September 2011, IntAct contains approximately 275 000 curated binary interaction evidences from over 5000 publications.

MASC Proteomics Subcommittee, MASCP

Webpage of the MASC Proteomics Subcommittee, established to facilitate the coordination of international research in Arabidopsis thaliana in the area of proteomics. Includes links to numerous tools and resources.

MIAPE: The Minimum Information About a Proteomics Experiment

MIAPE sets the standards for describing proteomic experiments and samples. The website includes links to a list of papers published in Nature Biotechnology on standards for different proteomic analysis techniques. 


P3DB

A web-based, searchable database of protein phosphorylation data from oilseed rape, Arabidopsis, maize and other species.

From the abstract: P(3)DB provides a resource of protein phosphorylation data from multiple plants. With a web-based user interface, the database is browsable, downloadable and searchable by protein accession number, description and sequence. A BLAST utility was integrated and a phosphopeptide BLAST browser was implemented to allow users to query the database for phosphopeptides similar to protein sequences of their interest. With the large-scale phosphorylation data and associated web-based tools, P(3)DB will be a valuable resource for both plant and nonplant biologists in the field of protein phosphorylation.


pep2pro

Super-database containing information on the Arabidopsis thaliana proteome.

From the abstract: The pep2pro dataset is an organ-specific characterisation of the Arabidopsis thaliana proteome containing 14522 identified proteins based on 2.6 million peptide spectrum assignments. This dataset provides evidence of protein expression and reveals organ-specific processes.

PhosPhAt, the Arabidopsis Protein Phosphorylation Site Database

The Arabidopsis Protein Phosphorylation Site Database (PhosPhAt 3.0) contains information on Arabidopsis phosphorylation sites that were identified by mass spectrometry in large-scale experiments by different research groups. The PhosPhAt service has a built-in, plant-specific phosphorylation site predictor trained on the experimental dataset for serine, threonine and tyrosine phosphorylation. Protein sequences or an Arabidopsis AGI gene identifier can be submitted to the predictor.

From the abstract: The PhosPhAt database provides a resource consolidating our current knowledge of mass spectrometry-based identified phosphorylation sites in Arabidopsis and combines it with phosphorylation site prediction specifically trained on experimentally identified Arabidopsis phosphorylation motifs. The database currently contains 1187 unique tryptic peptide sequences encompassing 1053 Arabidopsis proteins.
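
Because the predictor accepts either a protein sequence or an AGI gene identifier, it can help to check which form an input takes before submitting it. The sketch below is an illustration only and is not part of PhosPhAt: the function name `classify_predictor_input` is hypothetical, and it assumes the conventional AGI locus format (`AT`, a chromosome code of 1-5, C or M, then `G` and five digits, optionally followed by a gene-model suffix such as `.1`) and the 20 standard one-letter amino acid codes.

```python
import re

# Assumed AGI locus pattern: AT + chromosome (1-5, C, M) + G + five digits,
# with an optional gene-model suffix such as ".1".
AGI_PATTERN = re.compile(r"^AT[1-5CM]G\d{5}(\.\d+)?$")

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def classify_predictor_input(text):
    """Return 'agi', 'sequence' or 'unknown' for a candidate input string."""
    # Drop all whitespace (including line breaks) and normalise case.
    cleaned = "".join(text.split()).upper()
    if not cleaned:
        return "unknown"
    if AGI_PATTERN.match(cleaned):
        return "agi"
    # Anything made up only of standard residue letters is treated as a sequence.
    if set(cleaned) <= AMINO_ACIDS:
        return "sequence"
    return "unknown"
```

Inputs that come back as `unknown` (for example, an identifier with a chromosome number outside the assumed range, or a sequence containing non-standard residue codes) would need manual checking before submission.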

Plant Protein Phosphorylation DataBase

P3DB version 2.0.0 hosts protein phosphorylation data for 6 species from 23 experimental studies, containing 11,601 phosphoproteins, harboring 32,963 phosphosites. Datasets for a number of plant species are available, including Arabidopsis, rice and maize. 

Plant Specific Database

A list of Arabidopsis proteins categorised according to their presence in other species. Information from multiple public databases (e.g. TIGR, MIPS, TAIR, SIGNAL) has been integrated. There is also a flexible search tool that allows the analysis of single proteins as well as lists of proteins of interest to individual researchers.


plprot

The plastid protein database allows users to run a BLAST search or to search for plastid type information from several plant plastid proteomes.

From the abstract: plprot was established as a plastid proteome database to provide information about the proteomes of chloroplasts, etioplasts and undifferentiated plastids.

Proteomics Standards Initiative

The Proteomics Standards Initiative (PSI) aims to define community standards for data representation in proteomics to facilitate data comparison, exchange and verification.


Structures of 33 Arabidopsis thaliana proteins described and visualised.

Seed Proteome

Seed proteome databases for Arabidopsis and sugar beet (coming soon). Includes protein catalogues and protocols. 

SUBA, SUB-cellular location database for Arabidopsis proteins

A tool to investigate subcellular localisation of proteins in Arabidopsis through the unification of disparate datasets. The web-accessible interface allows the construction of powerful user-defined queries, resulting in a one-stop shop for protein localisation.

From the abstract: The localisation data in SUBA encompasses 10 distinct subcellular locations, >6743 non-redundant proteins and represents the proteins encoded in the transcripts responsible for 51% of Arabidopsis expressed sequence tags. The SUBA database provides a powerful means by which to assess protein subcellular localisation in Arabidopsis.

The Plant Proteome Database

PPDB is a Plant Proteome DataBase for Arabidopsis thaliana and maize (Zea mays) allowing users to search protein-encoding gene models in Arabidopsis, maize and rice. Every predicted protein in all species can be searched for experimental and other information (even if not experimentally identified).

From the abstract: Experimental identification is based on in-house mass spectrometry (MS) of cell type-specific proteomes (maize), or specific subcellular proteomes (e.g. chloroplasts, thylakoids, nucleoids) and total leaf proteome samples (maize and Arabidopsis). So far more than 5000 accessions both in maize and Arabidopsis have been identified. In addition, more than 80 published Arabidopsis proteome datasets from subcellular compartments or organs are stored in PPDB and linked to each locus.

The Predicted Arabidopsis Interactome Resource (PAIR)

A database of Arabidopsis protein-protein interactions, predicted and experimentally reported, collected from the major interaction databases.

From the abstract: The predicted Arabidopsis interactome resource comprises 5990 experimentally reported molecular interactions in Arabidopsis thaliana together with 145,494 predicted interactions. PAIR predicts interactions by a fine-tuned support vector machine model that integrates indirect evidences for interaction, such as gene co-expressions, domain interactions, shared GO annotations, co-localizations, phylogenetic profile similarities and homologous interactions in other organisms (interologs). These predictions were expected to cover 24% of the entire Arabidopsis interactome, and their reliability was estimated to be 44%. PAIR features a user-friendly query interface, providing rich annotation on the relationships between two proteins. A graphical interaction network browser has also been integrated into the PAIR web interface to facilitate mining of specific pathways.

UniProt Knowledgebase

The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information. As well as the protein knowledgebase, there is a tool to search for sequence clusters and a sequence archive.