trRosettaRNA: RNA structure prediction with transform-restrained Rosetta

Available multiple sequence alignment formats

The trRosettaRNA server now supports using a multiple sequence alignment as input. Available formats for the input multiple sequence alignment are:

A3M/AFA
FASTA
A2M
STO

A3M/AFA format

The A3M/AFA format is an aligned FASTA format in which inserts are shown as lowercase characters, matches as uppercase characters, deletions as '-', and gaps aligned to inserts as '.'. Gaps aligned to inserts can be omitted in the A3M/AFA format.

In the standard A3M format, each sequence is preceded by a header line that starts with '>'. See an example:

>example
CCGCCGCGCCATGCCTGTGGCGGAAACCGCCGCGCCATGCCTGTGGCGG
>1
------------GCCCGTGGCGGAAACCGCCGCGCC-------------
>2
CCGCCGCGCCGTGCCTGTG-----------CGCGCCGTGCCTGTGGCGC
>3
CCGCCGCGCTGTGGCTGTGGCGGAA-CCGCCGCGGCATGCTCACG-CGG
>4
GCGCCGCGCCGTGCCTGTGGCGGAA-CCGCCGGGACGTGCCCGT--CGC
>5
GCGCCGCGCCGCG--CGCCGCG-----CGCCGCGCCTTGCCTGTGGCGC
>6
CCGCCGCGCCATGCGGT---TGGCCACCTCCGCGCCATGCCAGCGGCGC

The header lines can be omitted. In this case, each sequence must be written on a single line. See an example:

CCGCCGCGCCATGCCTGTGGCGGAAACCGCCGCGCCATGCCTGTGGCGG
------------GCCCGTGGCGGAAACCGCCGCGCC-------------
CCGCCGCGCCGTGCCTGTG-----------CGCGCCGTGCCTGTGGCGC
CCGCCGCGCTGTGGCTGTGGCGGAA-CCGCCGCGGCATGCTCACG-CGG
GCGCCGCGCCGTGCCTGTGGCGGAA-CCGCCGGGACGTGCCCGT--CGC
GCGCCGCGCCGCG--CGCCGCG-----CGCCGCGCCTTGCCTGTGGCGC
CCGCCGCGCCATGCGGT---TGGCCACCTCCGCGCCATGCCAGCGGCGC
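The two A3M layouts described above (with header lines, or one sequence per line) can be read by the same small routine. The following is a minimal Python sketch, not the server's own parser. The function names read_alignment and a3m_match_columns are hypothetical, and the toy sequences are made up for illustration.

```python
def read_alignment(text):
    """Return the aligned rows of an A3M-style text.

    Assumption: if any line starts with '>', the text uses header lines and a
    sequence may span several lines; otherwise every non-empty line is one row.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    has_headers = any(line.startswith(">") for line in lines)
    if not has_headers:
        return lines

    rows, current = [], []
    for line in lines:
        if line.startswith(">"):
            # A new header closes the previous sequence, if there was one.
            if current:
                rows.append("".join(current))
            current = []
        else:
            current.append(line)
    if current:
        rows.append("".join(current))
    return rows


def a3m_match_columns(row):
    """Drop lowercase inserts and '.' so only matches and deletions remain."""
    return "".join(c for c in row if not c.islower() and c != ".")


# Toy input (hypothetical data), once with and once without header lines.
with_headers = ">query\nACGU\n>1\nA-GaU\n"
without_headers = "ACGU\nA-GaU\n"
print(read_alignment(with_headers) == read_alignment(without_headers))
print(a3m_match_columns("A-GaU"))
```

After inserts are removed, every row of a well-formed A3M alignment should have the same number of columns, which makes a3m_match_columns a convenient sanity check before submitting a file.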

FASTA format

The FASTA format is aligned FASTA in which lowercase and uppercase characters are equivalent, and '.' and '-' are equivalent.

In the standard FASTA format, each sequence is preceded by a header line that starts with '>'.

The header lines can be omitted. In this case, each sequence must be written on a single line.
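Because case and the two gap characters are interchangeable in aligned FASTA, rows can be brought to a single canonical spelling before further processing. The sketch below is a hedged Python illustration. The function name normalize_fasta_rows is hypothetical, and the equal-length check is an assumption about what a well-formed aligned FASTA should satisfy, not a documented server rule.

```python
def normalize_fasta_rows(rows):
    """Uppercase every row and write all gaps as '-'.

    Assumption: rows of an aligned FASTA must all have the same length,
    so a mismatch is reported as an error.
    """
    normalized = [row.strip().upper().replace(".", "-") for row in rows]
    lengths = {len(row) for row in normalized}
    if len(lengths) > 1:
        raise ValueError(f"rows have different lengths: {sorted(lengths)}")
    return normalized


# Toy rows (hypothetical data) that differ only in case and gap character.
print(normalize_fasta_rows(["acg.u", "A-GCU"]))
```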

A2M format

The A2M format is an aligned FASTA format in which inserts are shown as lowercase characters, matches as uppercase characters, deletions as '-', and gaps aligned to inserts as '.'.

In the standard A2M format, each sequence is preceded by a header line that starts with '>'. See an example:

>example
CCGCCGCGCCATGCC.GTGGCGGAAACCGCCGCGCCATGCCTGTGGCGG
>1
------------GCCcCGTGGCGGAAACCGCCGCGCC-------------
>2
CCGCCGCGCCGTGCC.TGTG-----------CGCGCCGTGCCTGTGGCGC
>3
CCGCCGCGCTGTGGC.TGTGGCGGAA-CCGCCGCGGCATGCTCACG-CGG
>4
GCGCCGCGCCGTGCC.TGTGGCGGAA-CCGCCGGGACGTGCCCGT--CGC
>5
GCGCCGCGCCGCG--gCGCCGCG-----CGCCGCGCCTTGCCTGTGGCGC
>6
CCGCCGCGCCATGCGaGT---TGGCCACCTCCGCGCCATGCCAGCGGCGC

The header lines can be omitted. In this case, each sequence must be written on a single line. See an example:

CCGCCGCGCCATGCC.GTGGCGGAAACCGCCGCGCCATGCCTGTGGCGG
------------GCCcCGTGGCGGAAACCGCCGCGCC-------------
CCGCCGCGCCGTGCC.TGTG-----------CGCGCCGTGCCTGTGGCGC
CCGCCGCGCTGTGGC.TGTGGCGGAA-CCGCCGCGGCATGCTCACG-CGG
GCGCCGCGCCGTGCC.TGTGGCGGAA-CCGCCGGGACGTGCCCGT--CGC
GCGCCGCGCCGCG--gCGCCGCG-----CGCCGCGCCTTGCCTGTGGCGC
CCGCCGCGCCATGCGaGT---TGGCCACCTCCGCGCCATGCCAGCGGCGC
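A2M and A3M use the same symbols; the difference is that A2M writes out the '.' gaps aligned to inserts, which A3M may omit. Under that reading, an A2M row can be turned into an A3M row by deleting the '.' characters. The Python sketch below illustrates this with hypothetical function names (a2m_to_a3m, match_column_count) and made-up toy rows.

```python
def a2m_to_a3m(rows):
    """Convert A2M rows to A3M rows by dropping gaps aligned to inserts ('.')."""
    return [row.replace(".", "") for row in rows]


def match_column_count(row):
    """Count match columns: uppercase residues plus '-' deletions."""
    return sum(1 for c in row if c.isupper() or c == "-")


# Toy A2M rows (hypothetical data): the second row has an insert 'a',
# so the first row carries a '.' in that column.
a2m_rows = ["ACG.U", "A-GaU"]
a3m_rows = a2m_to_a3m(a2m_rows)
print(a3m_rows)
print([match_column_count(row) for row in a3m_rows])
```

The conversion leaves the number of match columns unchanged, so comparing match_column_count across rows is a quick consistency check in either format.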

STO format

The STO (Stockholm) format consists of a header line with a format and version identifier; markup lines starting with "#=GF", "#=GC", "#=GS", or "#=GR"; alignment lines with the sequence name and aligned sequence; and a "//" line indicating the end of the alignment. Inserts are shown as lowercase characters, matches as uppercase characters, and gaps as '.' or '-'.

See an example:

# STOCKHOLM 1.0
#=GF ID   raiA
#=GF AC   RF03072
#=GF DE   raiA RNA
#=GF AU   Weinberg Z; 0000-0002-6681-3624
#=GF GA   101.3
#=GF NC   87.7
#=GF TC   101.4
#=GF SE   Weinberg Z
#=GF SS   Published; PMID:28977401;
#=GF TP   Cis-reg; riboswitch;
#=GF BM   cmbuild -F CM SEED
#=GF CB   cmcalibrate --mpi CM
#=GF SM   cmsearch --cpu 4 --verbose --nohmmonly -T 30.00 -Z 2958934 CM SEQDB
#=GF DR   SO; 0000035; riboswitch;
#=GF RN   [1]
#=GF RM   28977401
#=GF RT   Detection of 224 candidate structured RNAs by comparative analysis of
#=GF RT   specific subsets of intergenic regions.
#=GF RA   Weinberg Z, Lunse CE, Corbino KA, Ames TD, Nelson JW, Roth A, Perkins KR,
#=GF RA   Sherlock ME, Breaker RR
#=GF RL   Nucleic Acids Res. 2017;45:10811-10823.
#=GF CC   Actinobacteria, Firmicutes. Lineage (Negativicutes)
#=GF WK   RaiA_RNA_motif
#=GF SQ   488
URS0000D6CD03_12908/1-217  GCAAAUCUCCCA--GUAG-----GCCGGU-GUGGG-GUCAAA-AACCAG-GUCAGCUA--G-GCGAA
URS0000D687FA_12908/1-217  GCAAACCUCCCA--GUAG-----GCAGGU-GUGGG-GUCAAAAAAUCAG-GUCAGCUA--A-GUAAA
URS0000D67AD2_12908/1-205  GCGAGACCCGCA--GCAG-----GCGAGU-GUGGG-GGAAAA-GACCAG-GUCAGCCG--G-AUAAC
#=GC seq_cons              gcgAAagccCCa..Gcag.....GCGAGU.gUGGG.GuCAAA.aaCCAG.GUCAGccg..g.gcggg
//
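The line types listed above suggest a simple way to pull the aligned sequences out of a Stockholm file: skip every line that starts with '#', stop at '//', and split each remaining line into a name and an aligned sequence. The Python sketch below follows that reading. The function name parse_stockholm is hypothetical, the toy input is made up, and the server's own parser may behave differently.

```python
def parse_stockholm(text):
    """Return {sequence name: aligned sequence} from Stockholm-formatted text.

    Markup lines (starting with '#', which covers '#=GF', '#=GC', '#=GS',
    '#=GR' and the '# STOCKHOLM' header) are skipped. Reading stops at '//'.
    Rows that repeat a name, as in interleaved files, are concatenated.
    Assumption: every alignment line holds a name, whitespace, and a sequence.
    """
    sequences = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line == "//":
            break
        name, aligned = line.split(None, 1)
        sequences[name] = sequences.get(name, "") + aligned.replace(" ", "")
    return sequences


# Toy Stockholm text (hypothetical data).
toy = """# STOCKHOLM 1.0
#=GF ID toy
seq1  GCAA-U
seq2  GC.AGU
#=GC seq_cons GCaAgU
//
"""
for name, aligned in parse_stockholm(toy).items():
    print(name, aligned)
```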

Need more help?

If you have more questions or comments about the server, please email yangjy [at] sdu.edu.cn.