UniDoc: Unified Domain Cutter

The UniDoc domain prediction results are summarized on a webpage. The link to this page is sent to the user once the decomposition is completed (UniDoc structure-based domain parsing output or UniDoc sequence-based domain recognition output). This page explains the data listed on the UniDoc output page.

About UniDoc

The input to UniDoc is either a protein sequence or a 3D structure. As shown in Figure 1, UniDoc works as follows.
(1) When a protein structure is submitted, the inter-residue distance matrix is extracted from the 3D structure. Hierarchical clustering is then used to decompose the protein into domains.
(2) When a protein sequence is submitted, the inter-residue distance matrix is predicted by our recent deep learning-based structure prediction algorithm, trRosetta. Hierarchical clustering is then used to decompose the protein into domains.
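
The general idea behind these two steps (build a distance matrix, then cluster residues hierarchically) can be illustrated with a minimal Python sketch. It is not the UniDoc implementation: the average-linkage criterion, the fixed number of domains, the toy coordinates, and the function names distance_matrix, average_linkage and cluster_residues are all assumptions made for illustration. The clustering procedure actually used by UniDoc is described in the reference at the end of this page.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates (e.g. C-alpha atoms)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def average_linkage(dist, a, b):
    """Mean distance over all residue pairs drawn from clusters a and b."""
    return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))


def cluster_residues(dist, n_domains):
    """Generic agglomerative clustering (an assumption, not UniDoc's own scheme):
    repeatedly merge the two closest clusters until n_domains clusters remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_domains:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_linkage(dist, clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return sorted(clusters)


# Hypothetical toy input: two well-separated groups of three residues each.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
          (50.0, 0.0, 0.0), (53.8, 0.0, 0.0), (57.6, 0.0, 0.0)]
dist = distance_matrix(coords)
domains = cluster_residues(dist, n_domains=2)
print(domains)  # [[0, 1, 2], [3, 4, 5]]
```

For a sequence-only submission, the same clustering step would operate on a predicted distance matrix in place of the one computed from coordinates.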

Figure 1. The flowchart of the UniDoc algorithm.

Structure-based domain parsing result

Figure 2. The structure-based domain parsing result.

Sequence-based domain recognition result

Figure 3. The sequence-based domain recognition result.

Need more help?

If you have more questions or comments about the server, please email yangjy@sdu.edu.cn.

Reference

Zhu et al., A unified approach to protein domain parsing with inter-residue distance matrix, Bioinformatics, 39: btad070 (2023).