Downstream Analysis Of Assembled Transcripts (Cufflinks, Trinity)
10.4 years ago
Ian Fiddes ▴ 70

Are there any good programs or scripts for analyzing assembled transcripts? I imagine something like a script that runs blastn, blastx and tblastx on each transcript and reports the best hit. That wouldn't be too hard to write, but I don't want to reinvent the wheel. I am also concerned that the highest-scoring hit reported by BLAST sometimes has a lot of gaps and is not the right result, while a shorter, lower-scoring hit is more likely correct. The only way I can think of to tell these apart reliably is by manual inspection.

The genome I am interested in is poorly annotated, and the annotation is particularly bad in my region of interest, so just using a reference GTF with my Cufflinks transcripts would not be very helpful.
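
The worry about gapped top hits can be made concrete with a small filter over BLAST+ tabular output (`-outfmt 6`, default column order). The sketch below is only one possible heuristic: it drops hits whose gap openings per aligned column or percent identity fall outside thresholds, then keeps the highest bitscore per query. The function names and both threshold values (`max_gap_fraction`, `min_pident`) are assumptions made for illustration, not recommended settings.

```python
import csv
import io
from collections import namedtuple

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
Hit = namedtuple("Hit", FIELDS)


def parse_hits(handle):
    """Yield one Hit per non-empty, non-comment line of tabular BLAST output."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        yield Hit(row[0], row[1], float(row[2]), int(row[3]), int(row[4]),
                  int(row[5]), int(row[6]), int(row[7]), int(row[8]),
                  int(row[9]), float(row[10]), float(row[11]))


def best_hits(handle, max_gap_fraction=0.05, min_pident=90.0):
    """Return {query id: Hit}: the top-bitscore hit among those passing filters.

    Both thresholds are illustrative assumptions; tune them for your data.
    """
    best = {}
    for hit in parse_hits(handle):
        if hit.length == 0:
            continue
        gap_fraction = hit.gapopen / hit.length  # gap openings per aligned column
        if gap_fraction > max_gap_fraction or hit.pident < min_pident:
            continue
        current = best.get(hit.qseqid)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.qseqid] = hit
    return best


# Made-up example: the first hit scores higher but is gappy and less identical.
example = (
    "TCONS_00000001\tgeneA\t88.0\t1000\t60\t60\t1\t950\t1\t960\t0.0\t900\n"
    "TCONS_00000001\tgeneB\t98.5\t400\t5\t1\t10\t409\t1\t400\t1e-180\t650\n"
)
chosen = best_hits(io.StringIO(example))
print(chosen["TCONS_00000001"].sseqid)
```

With these made-up rows, the gappy higher-scoring hit is filtered out and the shorter, cleaner hit is reported. Whether that is the biologically correct choice still depends on the thresholds, so spot-checking a few transcripts by hand remains worthwhile.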

cufflinks trinity blast analysis • 5.2k views
10.4 years ago
Lhl ▴ 730

Have you figured this out? I am in the same situation and would be happy to hear of any solutions.


I forgot to check back here, sorry. Someone on SEQanswers pointed me to Blast2GO, which does what I want but is unfortunately quite slow and cumbersome. I am currently working on a (crappy) Python script to do this for me, for now with Cuffdiff output. My approach is:

1) Take only the significant hits in isoform_sig.diff and convert them to a BED file, treating each TCONS as a gene (one line per transcript).
2) Use fastaFromBed from BEDTools to create a FASTA file.
3) Run that through my script, which uses wwwblast/qblast in Biopython to slowly BLAST against the NCBI database of my choice.
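
Step 1 might look roughly like the sketch below. It assumes a tab-separated Cuffdiff-style table with a header row containing `test_id`, `locus` (formatted `chrom:start-end`) and `significant` (`yes`/`no`) columns. The function name is hypothetical, and the coordinates are copied through unchanged, so check whether they need adjusting to BED's 0-based start before relying on the output.

```python
import csv
import io


def sig_isoforms_to_bed(diff_handle):
    """Yield one BED line (chrom, start, end, name) per significant row.

    Assumes columns named test_id, locus and significant; coordinates are
    passed through as-is, with no 0-based/1-based conversion.
    """
    reader = csv.DictReader(diff_handle, delimiter="\t")
    for row in reader:
        if row["significant"] != "yes":
            continue
        # rpartition tolerates chromosome names that themselves contain ':'.
        chrom, _, span = row["locus"].rpartition(":")
        start, _, end = span.partition("-")
        yield "\t".join([chrom, start, end, row["test_id"]])


# Made-up two-row table with only the columns this sketch needs.
example_diff = (
    "test_id\tlocus\tsignificant\n"
    "TCONS_00000001\tchr1:1000-2000\tyes\n"
    "TCONS_00000002\tchr2:500-900\tno\n"
)
for line in sig_isoforms_to_bed(io.StringIO(example_diff)):
    print(line)
```

Note that a single interval per transcript covers the whole genomic span, introns included, so the extracted FASTA sequence is genomic rather than the spliced transcript.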

10.4 years ago
Rt ▴ 90

The following downstream analyses are supported as part of Trinity:

  • Aligning the RNA-seq reads back to the Trinity transcripts for visualization in IGV and abundance estimation using RSEM.
  • Using edgeR and Bioconductor to analyze differentially expressed transcripts.
  • Extracting likely protein-coding regions from Trinity transcripts.

In my experience, extracting the likely protein-coding regions from the Trinity transcripts first has greatly reduced blastp runtime. You may also try Blast2GO for further analysis.
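
To illustrate what "extracting likely coding regions" means at its simplest, here is a naive longest-ORF finder. It is a sketch only and not the method Trinity's tooling uses: it scans all six reading frames for the longest ATG-to-stop stretch and ignores everything a real predictor would consider (coding potential, partial ORFs, minimum length). The function names are assumptions made for this example.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Return the longest ATG...stop ORF (stop codon included) on either strand.

    Returns an empty string if no complete ORF is found.
    """
    best = ""
    forward = seq.upper()
    for strand in (forward, reverse_complement(forward)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOPS:
                    if i + 3 - start > len(best):
                        best = strand[start:i + 3]
                    start = None  # look for the next ORF in this frame
    return best


print(longest_orf("CCATGAAATTTGGGTAACC"))  # prints ATGAAATTTGGGTAA
```

Translating only such candidate regions and searching them with blastp means far fewer and shorter queries than running blastx on every full transcript, which is where the runtime saving comes from.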
