Question: How to compare assembly from Trinity and Velvet/Oases
4.1 years ago by
Paul1.3k
European Union
Paul1.3k wrote:

Dear all,

I am trying to compare my results from different de novo transcriptome assemblers, but I am not sure how to do that. If I understand correctly, Trinity does not produce scaffolds, so I can only compare the contig lengths from my outputs.

However, the default Trinity output (Trinity.fasta) contains transcript isoforms and gene isoforms together. Should I extract only the transcript isoforms from Trinity.fasta and compute statistics (average, min, max contig length) to compare with Velvet/Oases? Or can I run the awk script below on the whole Trinity.fasta (transcripts and gene isoforms together) to compare the assemblers?

The awk script is:

awk 'BEGIN {
    flag = 0
    print "Contig ID\tContig Length\tA\tT\tG\tC\tN\tOtherCharacters"
}
# Header line: print the counts of the previous contig, then reset them.
/^>/ {
    if (flag == 1) {
        tot = aCount + tCount + gCount + cCount + nCount + xCount
        print id "\t" tot "\t" aCount "\t" tCount "\t" gCount "\t" cCount "\t" nCount "\t" xCount
    }
    id = $0; flag = 1
    aCount = tCount = gCount = cCount = nCount = xCount = 0
    next
}
# Sequence line: gsub returns the number of replacements, which is the base count.
{
    aCount += gsub(/[aA]/, "A")
    tCount += gsub(/[tT]/, "T")
    gCount += gsub(/[gG]/, "G")
    cCount += gsub(/[cC]/, "C")
    nCount += gsub(/[nN]/, "N")
    xCount += gsub(/[^ATGCNatgcn]/, "X")
}
# Last contig: print only if at least one header was seen.
END {
    if (flag == 1) {
        tot = aCount + tCount + gCount + cCount + nCount + xCount
        print id "\t" tot "\t" aCount "\t" tCount "\t" gCount "\t" cCount "\t" nCount "\t" xCount
    }
}' Trinity.fasta

Thank you for any explanation of how to compare the outputs.

 


Please reformat the script to make it more readable.

Written 4.1 years ago by Biomonika (Noolean)
4.1 years ago by
vahapel160
Turkey
vahapel160 wrote:

Hi,

As far as I know, if you only want basic descriptive statistics such as average, minimum, and maximum contig length, you don't need to remove the isoforms from Trinity.fasta. Trinity and Oases each group related transcripts in their own way: Trinity into components and Oases into loci.
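If it helps, here is a minimal Python sketch of that comparison. It computes the same length statistics twice: once over every sequence, and once keeping only the longest sequence per component or locus, so you can see how much the isoforms change the numbers. The function names, the choice to also report N50, and the header patterns (comp0_c0_seq1 for older Trinity, TRINITY_DN1_c0_g1_i1 for newer Trinity, Locus_1_Transcript_1/3_... for Oases) are my assumptions, not something taken from your files, so check them against your actual FASTA headers.

```python
import re
from statistics import mean


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def n50(lengths):
    """Length L such that sequences of length >= L hold half of all bases."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0


def length_stats(lengths):
    """Basic descriptive statistics for a non-empty list of lengths."""
    return {
        "count": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": mean(lengths),
        "n50": n50(lengths),
    }


# Assumed header layouts; adjust to what your assembler versions write.
GROUP_PATTERNS = [
    re.compile(r"^(comp\d+_c\d+)_seq\d+"),     # older Trinity component
    re.compile(r"^(.+_g\d+)_i\d+"),            # newer Trinity gene
    re.compile(r"^(Locus_\d+)_Transcript_"),   # Oases locus
]


def group_id(header):
    """Return the component/locus part of a header, or the full ID if unknown."""
    name = header.split()[0]
    for pattern in GROUP_PATTERNS:
        match = pattern.match(name)
        if match:
            return match.group(1)
    return name


def longest_per_group(records):
    """Keep only the length of the longest sequence in each group."""
    best = {}
    for header, seq in records:
        key = group_id(header)
        best[key] = max(best.get(key, 0), len(seq))
    return list(best.values())


def summarize(path):
    """Statistics over all sequences and over the longest one per group."""
    with open(path) as handle:
        records = list(read_fasta(handle))
    return {
        "all": length_stats([len(seq) for _, seq in records]),
        "longest_per_group": length_stats(longest_per_group(records)),
    }
```

Calling summarize("Trinity.fasta") and summarize on the Oases transcripts file gives two dictionaries per assembler that can be compared side by side. If the "all" and "longest_per_group" numbers differ a lot, the isoforms are affecting the statistics and it is worth reporting both.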
