mTM-align

Help

About the server

The server consists of two related modules: fast search of a structure database and multiple protein structure alignment. The database search is accelerated by a heuristic algorithm and a hierarchical organization of the structures in the database. The multiple protein structure alignment is performed using the fast and accurate mTM-align algorithm.


The database search consists of three steps:

  • To speed up the database search, the sequence of the query structure is first searched with PSI-BLAST against DOM50, a non-redundant structure domain database (about 40,000 structures with pairwise sequence identity <50%). If hits with e-value <0.001 are found, an iterative method named Walk, which considers the pairwise TM-scores of the structures in DOM50, is used to expand these hits to more structures in DOM50. If PSI-BLAST finds no hits, the query structure is compared against the DOM50 database using fTM-align, a fast version of TM-align that is 5-10 times faster than TM-align with similar accuracy.
  • The top 200 templates from DOM50 are then expanded to the whole domain database (about 500,000 structures) to find all templates with TM-score >0.5 to the query structure.
  • An automated multiple structure alignment is then performed using the top 10 templates. Users can also select other templates to build a multiple structure alignment.
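
The control flow of the three search steps above can be sketched as follows. This is only an illustration of the decision logic under stated assumptions, not the server's code: the function `search_templates` and the callables `walk`, `structural_scan` and `expand` are hypothetical stand-ins for Walk, the fTM-align scan of DOM50 and the expansion to the whole domain database. The cutoffs (e-value 0.001, top 200, TM-score 0.5, top 10) are the ones quoted above.

```python
from typing import Callable, Dict, List, Tuple

Scores = Dict[str, float]  # template name -> TM-score to the query


def search_templates(
    psiblast_hits: Dict[str, float],        # DOM50 name -> PSI-BLAST e-value
    walk: Callable[[List[str]], Scores],    # hypothetical: expands seed hits within DOM50
    structural_scan: Callable[[], Scores],  # hypothetical: fTM-align scan of DOM50
    expand: Callable[[List[str]], Scores],  # hypothetical: DOM50 -> whole domain database
    evalue_cutoff: float = 0.001,
    top_dom50: int = 200,
    tm_cutoff: float = 0.5,
    top_msa: int = 10,
) -> Tuple[List[str], List[str]]:
    """Return (all templates above the TM-score cutoff, templates for the automated alignment)."""
    # Step 1: keep significant sequence hits; fall back to a structural scan if there are none.
    seeds = [name for name, evalue in psiblast_hits.items() if evalue < evalue_cutoff]
    dom50_scores = walk(seeds) if seeds else structural_scan()

    # Step 2: expand the best DOM50 templates to the whole domain database.
    top = sorted(dom50_scores, key=dom50_scores.get, reverse=True)[:top_dom50]
    full_scores = expand(top)
    ranked = sorted(
        (name for name, score in full_scores.items() if score > tm_cutoff),
        key=full_scores.get,
        reverse=True,
    )

    # Step 3: the automated multiple structure alignment uses the top-ranked templates.
    return ranked, ranked[:top_msa]
```

Passing the three stages in as callables keeps the sketch independent of how PSI-BLAST, Walk and fTM-align are actually invoked, which this page does not describe.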

The multiple structure alignment is built in three steps:

  • All pairwise structure alignments of the input structures are generated with TM-align.
  • A structure-based phylogenetic tree is constructed using the UPGMA algorithm.
  • The multiple protein structure alignment is constructed progressively, following the branching order of the phylogenetic tree.

    Figure 1. The flowchart of the mTM-align server.
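
The tree-building step of the multiple structure alignment can be illustrated with a small UPGMA sketch. It assumes a symmetric pairwise TM-score for every pair of structures and uses 1 - TM-score as the distance. Both are assumptions made for this example, since the exact distance used by mTM-align is not stated on this page. The function name `upgma_order` is hypothetical. The returned merge list is the branching order that a progressive alignment would follow.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Sequence, Tuple

Cluster = Tuple[str, ...]


def upgma_order(
    names: Sequence[str],
    tm_score: Dict[FrozenSet[str], float],
) -> List[Tuple[Cluster, Cluster]]:
    """Return the sequence of cluster merges produced by UPGMA.

    tm_score maps frozenset({a, b}) to the pairwise TM-score of a and b.
    """
    clusters: List[Cluster] = [(name,) for name in names]

    def distance(c1: Cluster, c2: Cluster) -> float:
        # UPGMA: unweighted average of all member-to-member distances.
        total = sum(1.0 - tm_score[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    merges: List[Tuple[Cluster, Cluster]] = []
    while len(clusters) > 1:
        # Merge the two closest clusters and record the merge.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: distance(*pair))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
        merges.append((c1, c2))
    return merges
```

For three structures where A and B are the most similar pair, the first merge joins A and B, and the second merge joins the remaining structure to that pair.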



    Figure 2. The Navigation bar.


    For fast database search, you can either paste your structure into the text box or upload it as a PDB file (Figure 3). If you choose "Advanced Option" and select 'Yes' for the option "Split or not", your structure will be split into domains by PDP (Protein Domain Parser).


    Figure 3. The submission page of Fast Search of Structure Database.


    Domain Split

    If the structure is split, you can choose one of the resulting domains to start the search (Figure 4).


    Figure 4. The domain selection page of Fast Search of Structure Database.


    Input of Multiple Protein Structure Alignment

    For Multiple Protein Structure Alignment, you should upload a tarball containing at least two structures (Figure 5).
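
A tarball of this kind can be prepared with Python's standard `tarfile` module, as in the minimal sketch below. The function name `make_tarball`, the default output name and the choice of a gzip-compressed archive are assumptions made for this example. This page does not state whether the server expects a plain or a compressed tarball.

```python
import tarfile
from pathlib import Path
from typing import Iterable


def make_tarball(pdb_paths: Iterable[str], out_path: str = "structures.tar.gz") -> str:
    """Pack the given PDB files into a gzip-compressed tarball and return its path."""
    paths = [Path(p) for p in pdb_paths]
    if len(paths) < 2:
        # The submission page asks for at least two structures.
        raise ValueError("at least two structures are required")
    with tarfile.open(out_path, "w:gz") as tar:
        for path in paths:
            # Store each file under its bare name, without directory components.
            tar.add(str(path), arcname=path.name)
    return out_path
```

For example, `make_tarball(["1abc.pdb", "2xyz.pdb"])` (hypothetical file names) writes `structures.tar.gz` to the current directory.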


    Figure 5. The submission page of Multiple Protein Structure Alignment.


    The output of Fast Search of Structure Database is a list of templates (Figure 6). Click "View the automated MSTA with the top 10 templates" to view the alignment built with the top 10 templates. Click a template name to download the PDB file of that template, or click "link" to open its entry on the RCSB PDB website. You can also select the templates that interest you and perform a multiple structure alignment with them.


    Figure 6. The search result page of Fast Search of Structure Database.


    Output of Multiple Protein Structure Alignment

    The output of Multiple Protein Structure Alignment is shown in Figure 7. You can view the alignment or download it in FASTA format. For visualization of the alignment, you can choose to view all structures or only the common core region.
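
Once downloaded, the FASTA alignment can be inspected with a few lines of Python. The sketch below parses an aligned FASTA text and lists the columns in which no structure has a gap. The function names and the use of "-" as the gap character are assumptions made for this example. Gap-free columns are only a rough proxy for the common core. The server's own definition of the core may apply additional criteria that are not described on this page.

```python
from typing import Dict, List


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into a dict of record name -> sequence."""
    records: Dict[str, List[str]] = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def gap_free_columns(alignment: Dict[str, str], gap: str = "-") -> List[int]:
    """Return 0-based indices of alignment columns that contain no gap."""
    rows = list(alignment.values())
    if len({len(row) for row in rows}) != 1:
        raise ValueError("expected a non-empty alignment with rows of equal length")
    return [i for i, column in enumerate(zip(*rows)) if gap not in column]
```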


    Figure 7. The result page of Multiple Protein Structure Alignment.


    The Difference in the Performance of Search with the Chain and the Domain Databases

    The difference in performance between searching the chain and the domain databases is shown in Figure 8, assessed on a dataset of 500 structures from the SCOPe database. A structure in the top n of the result list is defined as a true positive (TP) if its SCOPe fold is the same as that of the query. The mean precision p(n) and recall r(n) are used to measure this difference.
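
The two measures can be made concrete with a short sketch. It assumes the usual definitions, which this page does not spell out: precision at n is TP divided by n, and recall at n is TP divided by the number of structures in the database that share the query's fold. The function names and the fold labels in the example are hypothetical.

```python
from typing import List, Sequence, Tuple


def precision_recall_at_n(
    ranked_folds: Sequence[str],  # SCOPe fold of each hit, best hit first
    query_fold: str,
    n: int,
    total_same_fold: int,         # database structures sharing the query's fold
) -> Tuple[float, float]:
    """Return (precision, recall) for the top n hits of one query."""
    if n <= 0 or total_same_fold <= 0:
        raise ValueError("n and total_same_fold must be positive")
    true_positives = sum(1 for fold in ranked_folds[:n] if fold == query_fold)
    return true_positives / n, true_positives / total_same_fold


def mean_precision_recall(per_query: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Average (precision, recall) pairs over all queries."""
    count = len(per_query)
    if count == 0:
        raise ValueError("no queries given")
    mean_p = sum(p for p, _ in per_query) / count
    mean_r = sum(r for _, r in per_query) / count
    return mean_p, mean_r
```

For a query of fold "b.1" with hits of folds "b.1", "b.1", "a.4", "b.1" and three same-fold structures in the database, the top 2 hits give a precision of 1.0 and a recall of 2/3.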


    Figure 8. The difference in the performance of the search against the chain (PDBC) and the domain (DOM) databases.


    Reference

  • R Dong, S Pan, Z Peng, Y Zhang, J Yang, mTM-align: a server for fast protein structure database search and multiple protein structure alignment, Nucleic Acids Research, 46: W380–W386 (2018).
  • R Dong, Z Peng, Y Zhang, J Yang, mTM-align: an algorithm for fast and accurate multiple protein structure alignment, Bioinformatics, 34: 1719–1725 (2018).

    Need more help?

    If you have more questions or comments about the server, please email yangjynankai.edu.cn.