Applied Bioinformatics Group



Protein Subcellular Localization Prediction

Assigning subcellular localization to a protein is an important step towards elucidating its interaction partners, function, and potential role(s) in the cellular machinery. Computational tools offer an attractive complement to time-consuming and laborious experimental methods.

We have designed several systems for predicting the subcellular localization of eukaryotic proteins from the amino acid sequence:

 

YLoc (2010)

Our most recent subcellular localization predictor is YLoc. YLoc offers interpretable predictions, including a textual explanation of why each prediction was made. In addition, a confidence score estimates how reliable a prediction is. YLoc+, a special version of YLoc, specializes in predicting proteins with multiple localizations. Despite its simple architecture and interpretable predictions, YLoc performs on par with state-of-the-art predictors. Users can access YLoc via the web service or a SOAP interface.

 

SherLoc2 (2009)

SherLoc2 combines amino acid information, knowledge of domains, and phylogenetic profiles with text-based information. It predicts 11 locations in eukaryotic cells with very high accuracy. In addition, it allows users to include background knowledge by describing their protein.

 

MultiLoc2 (2009)

MultiLoc2 is a high-accuracy subcellular localization predictor. Like its predecessor, it combines information on amino acid composition and N-terminal targeting signals. In addition, it uses knowledge of domains in the form of GO terms, together with phylogenetic profiles, to achieve higher prediction performance. It is available in two versions: a low-resolution version specialized in predicting the location of globular proteins, and a high-resolution version that predicts all 11 main locations of eukaryotic cells.

 

SherLoc (2006)

SherLoc combines the information obtained from MultiLoc, such as amino acid composition and N-terminal sorting signals, with text-based information from PubMed abstracts. It supports 11 eukaryotic localizations.

 

MultiLoc / TargetLoc (2006)

MultiLoc integrates several sources of relevant sequence-based information, i.e. N-terminal targeting sequences, amino acid composition, and sequence motifs, in order to assign subcellular localization and provide reliable predictions on a proteome-wide scale. It supports 11 eukaryotic localizations and is based on support vector machines (SVMs). TargetLoc, the low-resolution version of MultiLoc, was constructed to distinguish globular proteins and supports four localizations for plant proteins and three for non-plant proteins.

 

Lokero (2005)

The oldest of our methods, Lokero, predicts four different localizations: cytoplasm, mitochondrion, extracellular space, and nucleus. It is based on the overall amino acid composition and is available as an SVM-based or a nearest-neighbor-based predictor.
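To illustrate the general idea of a composition-based nearest-neighbor predictor, here is a minimal sketch. It is not Lokero's actual implementation: the function names, the Euclidean distance, and the two toy training sequences (with made-up labels) are assumptions chosen only for illustration.

```python
from collections import Counter
import math

# The 20 standard amino acids, in a fixed order for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the relative frequency of each standard amino acid."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_neighbor(query, labeled):
    """Return the label of the training sequence whose composition is
    closest to that of the query (1-nearest-neighbor)."""
    q = composition(query)
    _, label = min(labeled, key=lambda item: euclidean(q, composition(item[0])))
    return label


# Hypothetical toy data: sequences and labels are invented for this
# example and carry no biological meaning.
TOY_TRAINING = [
    ("KKKRRKKRKR", "nucleus"),
    ("LLLAAVVLLA", "extracellular space"),
]

if __name__ == "__main__":
    print(nearest_neighbor("RKRKKKRR", TOY_TRAINING))
```

A real predictor would be trained on a large set of experimentally annotated proteins; the same composition vectors could equally serve as input to an SVM.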

Cooperations

We are currently cooperating with Prof. Dr. Klaus Harter's group (Center for Plant Molecular Biology ZMBP, Tübingen), Prof. Dr. Alfred Meixner's group (Department for Physical and Theoretical Chemistry, Tübingen), and Prof. Dr. Thilo Stehle's group (Department for Biochemistry, Tübingen) on the subcellular localization, structure, intracellular dynamics, and function of sensor histidine kinases in plants.
This cooperation is supported by LGFG funding for the Promotionsverbund "Pflanzliche Sensorhistidinkinasen: Struktur, intrazelluläre Dynamik und Funktion" ("Plant sensor histidine kinases: structure, intracellular dynamics, and function").

People working in this area

 

Torsten Blum, Sebastian Briesemeister

References