Available multiple sequence alignment formats
The trRosettaRNA server now accepts a multiple sequence alignment (MSA) as input. The input MSA can be supplied in any of the following formats:
A3M/AFA format
The A3M/AFA format is aligned FASTA in which inserts are shown as lower-case characters, matches as upper-case characters, deletions as '-', and gaps aligned to inserts as '.'.
Note that gaps aligned to inserts can be omitted in the A3M/AFA format.
In the standard A3M format, each sequence is preceded by a header line that starts with '>'. See an example:
>example
CCGCCGCGCCATGCCTGTGGCGGAAACCGCCGCGCCATGCCTGTGGCGG
>1
------------GCCCGTGGCGGAAACCGCCGCGCC-------------
>2
CCGCCGCGCCGTGCCTGTG-----------CGCGCCGTGCCTGTGGCGC
>3
CCGCCGCGCTGTGGCTGTGGCGGAA-CCGCCGCGGCATGCTCACG-CGG
>4
GCGCCGCGCCGTGCCTGTGGCGGAA-CCGCCGGGACGTGCCCGT--CGC
>5
GCGCCGCGCCGCG--CGCCGCG-----CGCCGCGCCTTGCCTGTGGCGC
>6
CCGCCGCGCCATGCGGT---TGGCCACCTCCGCGCCATGCCAGCGGCGC
The '>' header lines can be omitted. In this case, each sequence must be written on a single line. See an example:
CCGCCGCGCCATGCCTGTGGCGGAAACCGCCGCGCCATGCCTGTGGCGG
------------GCCCGTGGCGGAAACCGCCGCGCC-------------
CCGCCGCGCCGTGCCTGTG-----------CGCGCCGTGCCTGTGGCGC
CCGCCGCGCTGTGGCTGTGGCGGAA-CCGCCGCGGCATGCTCACG-CGG
GCGCCGCGCCGTGCCTGTGGCGGAA-CCGCCGGGACGTGCCCGT--CGC
GCGCCGCGCCGCG--CGCCGCG-----CGCCGCGCCTTGCCTGTGGCGC
CCGCCGCGCCATGCGGT---TGGCCACCTCCGCGCCATGCCAGCGGCGC
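The A3M conventions above can be illustrated with a short Python sketch. It is not the server's own parser; the function name `read_a3m` and the choice to name header-less sequences by their index are assumptions made for this example. The sketch reads both layouts (with and without '>' header lines) and removes lower-case inserts and '.' characters, so that only the match columns remain.

```python
import string

# Translation table that deletes lower-case letters (inserts) and '.'
# (gaps aligned to inserts), leaving match columns and '-' deletions.
_DROP_INSERTS = str.maketrans("", "", string.ascii_lowercase + ".")


def read_a3m(text):
    """Return a list of (name, match_columns) pairs parsed from A3M text.

    Hypothetical helper for illustration only. It accepts records
    introduced by '>' header lines as well as header-less input with
    one sequence per line.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is None:
            # Header-less layout: one sequence per line, named by index.
            records.append((str(len(records)), line))
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return [(n, s.translate(_DROP_INSERTS)) for n, s in records]
```

After the inserts are removed, every row of a valid A3M alignment should have the same length, which is a convenient sanity check before uploading a file.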
FASTA format
The FASTA format is aligned FASTA in which lower-case and upper-case characters are equivalent, and '.' and '-' both denote a gap.
In the standard FASTA format, each sequence is preceded by a header line that starts with '>'. See an example:
>example
CCGCCGCGCCATGCCTGTGGCGGAAACCGCCGCGCCATGCCTGTGGCGG
>1
------------GCCCGTGGCGGAAACCGCCGCGCC-------------
>2
CCGCCGCGCCGTGCCTGTG-----------CGCGCCGTGCCTGTGGCGC
>3
CCGCCGCGCTGTGGCTGTGGCGGAA-CCGCCGCGGCATGCTCACG-CGG
>4
GCGCCGCGCCGTGCCTGTGGCGGAA-CCGCCGGGACGTGCCCGT--CGC
>5
GCGCCGCGCCGCG--CGCCGCG-----CGCCGCGCCTTGCCTGTGGCGC
>6
CCGCCGCGCCATGCGGT---TGGCCACCTCCGCGCCATGCCAGCGGCGC
The '>' header lines can be omitted. In this case, each sequence must be written on a single line. See an example:
CCGCCGCGCCATGCCTGTGGCGGAAACCGCCGCGCCATGCCTGTGGCGG
------------GCCCGTGGCGGAAACCGCCGCGCC-------------
CCGCCGCGCCGTGCCTGTG-----------CGCGCCGTGCCTGTGGCGC
CCGCCGCGCTGTGGCTGTGGCGGAA-CCGCCGCGGCATGCTCACG-CGG
GCGCCGCGCCGTGCCTGTGGCGGAA-CCGCCGGGACGTGCCCGT--CGC
GCGCCGCGCCGCG--CGCCGCG-----CGCCGCGCCTTGCCTGTGGCGC
CCGCCGCGCCATGCGGT---TGGCCACCTCCGCGCCATGCCAGCGGCGC
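Because case and the two gap characters carry no meaning in this format, an aligned FASTA file can be checked locally by normalizing each row and comparing lengths. The following Python sketch shows one way to do that; the function name `normalize_aligned_fasta` is hypothetical, and how the server normalizes its input internally is not stated here.

```python
def normalize_aligned_fasta(rows):
    """Upper-case every row, turn '.' into '-', and check the row lengths.

    Hypothetical helper for illustration only. `rows` is a list of
    aligned sequence strings (header lines already removed).
    """
    normalized = [
        row.strip().upper().replace(".", "-") for row in rows if row.strip()
    ]
    # In aligned FASTA every row must span the same number of columns.
    lengths = {len(row) for row in normalized}
    if len(lengths) > 1:
        raise ValueError(f"rows differ in length: {sorted(lengths)}")
    return normalized
```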
A2M format
The A2M format is aligned FASTA in which inserts are shown as lower-case characters, matches as upper-case characters, deletions as '-', and gaps aligned to inserts as '.'.
In the standard A2M format, each sequence is preceded by a header line that starts with '>'. See an example:
>example
CCGCCGCGCCATGCC.TGTGGCGGAAACCGCCGCGCCATGCCTGTGGCGG
>1
------------GCCcCGTGGCGGAAACCGCCGCGCC-------------
>2
CCGCCGCGCCGTGCC.TGTG-----------CGCGCCGTGCCTGTGGCGC
>3
CCGCCGCGCTGTGGC.TGTGGCGGAA-CCGCCGCGGCATGCTCACG-CGG
>4
GCGCCGCGCCGTGCC.TGTGGCGGAA-CCGCCGGGACGTGCCCGT--CGC
>5
GCGCCGCGCCGCG--gCGCCGCG-----CGCCGCGCCTTGCCTGTGGCGC
>6
CCGCCGCGCCATGCGaGT---TGGCCACCTCCGCGCCATGCCAGCGGCGC
The '>' header lines can be omitted. In this case, each sequence must be written on a single line. See an example:
CCGCCGCGCCATGCC.TGTGGCGGAAACCGCCGCGCCATGCCTGTGGCGG
------------GCCcCGTGGCGGAAACCGCCGCGCC-------------
CCGCCGCGCCGTGCC.TGTG-----------CGCGCCGTGCCTGTGGCGC
CCGCCGCGCTGTGGC.TGTGGCGGAA-CCGCCGCGGCATGCTCACG-CGG
GCGCCGCGCCGTGCC.TGTGGCGGAA-CCGCCGGGACGTGCCCGT--CGC
GCGCCGCGCCGCG--gCGCCGCG-----CGCCGCGCCTTGCCTGTGGCGC
CCGCCGCGCCATGCGaGT---TGGCCACCTCCGCGCCATGCCAGCGGCGC
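In A2M, the '.' characters pad every row to the same width, so all rows have equal length; dropping the '.' characters gives the corresponding A3M rows. The Python sketch below illustrates this relationship. The function name `a2m_to_a3m` is hypothetical, and the server does not require this conversion, since it accepts both formats.

```python
def a2m_to_a3m(rows):
    """Convert A2M rows to A3M rows by removing '.' characters.

    Hypothetical helper for illustration only. `rows` is a list of
    aligned sequence strings (header lines already removed).
    """
    # Every A2M row spans the same number of columns, inserts included.
    lengths = {len(row) for row in rows}
    if len(lengths) > 1:
        raise ValueError(f"A2M rows differ in length: {sorted(lengths)}")
    # '.' only pads insert columns, so A3M simply leaves it out.
    return [row.replace(".", "") for row in rows]
```

The length check also catches a common editing mistake, where a residue is replaced by '.' instead of the '.' being added next to it.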
STO format
The STO (Stockholm) format consists of a header line with a format and version identifier; mark-up lines starting with "#=GF", "#=GC", "#=GS", or "#=GR"; alignment lines containing the sequence name followed by the aligned sequence; and a "//" line marking the end of the alignment.
Alignments are shown with inserts as lower case characters, matches as upper case characters, and gaps as ' . ' or ' - '.
See an example:
# STOCKHOLM 1.0
#=GF ID raiA
#=GF AC RF03072
#=GF DE raiA RNA
#=GF AU Weinberg Z; 0000-0002-6681-3624
#=GF GA 101.3
#=GF NC 87.7
#=GF TC 101.4
#=GF SE Weinberg Z
#=GF SS Published; PMID:28977401;
#=GF TP Cis-reg; riboswitch;
#=GF BM cmbuild -F CM SEED
#=GF CB cmcalibrate --mpi CM
#=GF SM cmsearch --cpu 4 --verbose --nohmmonly -T 30.00 -Z 2958934 CM SEQDB
#=GF DR SO; 0000035; riboswitch;
#=GF RN [1]
#=GF RM 28977401
#=GF RT Detection of 224 candidate structured RNAs by comparative analysis of
#=GF RT specific subsets of intergenic regions.
#=GF RA Weinberg Z, Lunse CE, Corbino KA, Ames TD, Nelson JW, Roth A, Perkins KR,
#=GF RA Sherlock ME, Breaker RR
#=GF RL Nucleic Acids Res. 2017;45:10811-10823.
#=GF CC Actinobacteria, Firmicutes. Lineage (Negativicutes)
#=GF WK RaiA_RNA_motif
#=GF SQ 488
URS0000D6CD03_12908/1-217 GCAAAUCUCCCA--GUAG-----GCCGGU-GUGGG-GUCAAA-AACCAG-GUCAGCUA--G-GCGAA
URS0000D687FA_12908/1-217 GCAAACCUCCCA--GUAG-----GCAGGU-GUGGG-GUCAAAAAAUCAG-GUCAGCUA--A-GUAAA
URS0000D67AD2_12908/1-205 GCGAGACCCGCA--GCAG-----GCGAGU-GUGGG-GGAAAA-GACCAG-GUCAGCCG--G-AUAAC
#=GC seq_cons             gcgAAagccCCa..Gcag.....GCGAGU.gUGGG.GuCAAA.aaCCAG.GUCAGccg..g.gcggg
//
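The Stockholm layout described above can be read with a few lines of Python. This is a minimal sketch rather than a complete Stockholm parser: the function name `read_stockholm` is hypothetical, all mark-up lines are skipped instead of interpreted, and only the first alignment in the text is read.

```python
def read_stockholm(text):
    """Return {sequence name: aligned sequence} from Stockholm text.

    Hypothetical helper for illustration only. Mark-up lines ('#=GF',
    '#=GC', '#=GS', '#=GR') are ignored; reading stops at '//'.
    """
    lines = text.strip().splitlines()
    if not lines or not lines[0].startswith("# STOCKHOLM"):
        raise ValueError("missing '# STOCKHOLM' header line")
    sequences = {}
    for line in lines[1:]:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or mark-up line
        if line == "//":
            break  # end of the alignment
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"unexpected alignment line: {line!r}")
        name, aligned = parts
        # Interleaved files repeat names; join their blocks in order.
        sequences[name] = sequences.get(name, "") + aligned
    return sequences
```

Name and sequence are separated by whitespace of any width, so the sketch works whether or not the columns are padded to line up.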
Need more help?
If you have further questions or comments about the server, please email yangjy@sdu.edu.cn.