Question: How to transform colorspace data obtained from SRA into a format suitable for the TopHat RNA-Seq pipeline?
7.0 years ago by
United States
dfernan660 wrote:


I need help analyzing RNA-Seq data from an ABI 5500 SOLiD sequencer.

I need to develop the following pipeline:

  1. Download the data from GEO (GSE39860). Done with ascp, no help needed here. The data was downloaded from here.
  2. Use the SRA toolkit to convert the data into colorspace FASTA plus qualities, or some other file format that TopHat can read. This is where I have spent most of my time, without luck. Which SRA toolkit command produces output in a suitable format for TopHat?
  3. How do I properly run TopHat on the resulting data?

For step 3, once I have a working command for step 2, I plan to run the following TopHat command. Does it look right?

module load bio/bowtie-0.12.7
module load bio/tophat-2.0.4.Linux_x86_64
tophat2 --color --quals --library-type fr-secondstrand -G ucsc_mm9.gtf -o <pathtohere>/bowtie-indexes/mm9/mm9c <pathtohere>/fq/SRR534610.csfasta <pathtohere>/rnaseq/fq/SRR534610.qual

If anyone has advice, please let me know. I'd be very glad to hear suggestions!


geo sra tophat bowtie rna-seq • 4.9k views
modified 7.0 years ago • written 7.0 years ago by dfernan660

So you got several errors - don't you think that posting these might help people diagnose your problem? Are you sure your genome indexes are from a colorspace build?

written 7.0 years ago by Daniel Swan13k
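
For reference, a minimal sketch of what a colorspace build of the index might look like, assuming Bowtie 1's `bowtie-build` is on the PATH. The function name `colorspace_index_cmd`, the FASTA name `mm9.fa`, and the prefix `mm9c` are illustrative assumptions, not details confirmed in this thread. The function only prints the command so it can be inspected before being run.

```shell
#!/usr/bin/env bash
# Sketch only: print the bowtie-build command for a colorspace index.
# The FASTA path and index prefix are assumed names for illustration.
colorspace_index_cmd() {
    local genome_fasta="$1"   # reference genome FASTA (assumed path)
    local index_prefix="$2"   # basename for the resulting .ebwt files
    # -C tells Bowtie 1's bowtie-build to build a colorspace index
    printf 'bowtie-build -C %s %s\n' "$genome_fasta" "$index_prefix"
}

# Show the command; paste it into a shell (or pipe to bash) to run it.
colorspace_index_cmd mm9.fa mm9c
```

An index built without `-C` is a nucleotide-space index and will not work for colorspace alignment.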

@daniel Thanks. Yes, I did get that error as well; I then tried the colorspace index and still got problems. Posting all the trials and errors would, I think, turn this into a debugging question. What I am looking for is experience and suggestions, mainly on how to use the SRA toolkit to convert SRA data to csfasta and qual format (step 2 of the analysis), not the alternative Python approach I tried because I did not know the SRA toolkit.

modified 7.0 years ago • written 7.0 years ago by dfernan660

It is still not clear what you are asking. What do your csfasta and quality files contain? If they are the colorspace data, then there is nothing else the SRA tools can do for you, and involving them only confuses the question. If your question is how to run TopHat on colorspace reads, make sure to read the manual, especially where it says that you must invoke bowtie1 in the pipeline. Other issues may also apply.

written 7.0 years ago by Istvan Albert ♦♦ 82k
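
A minimal sketch of the kind of TopHat invocation this comment points toward, assuming TopHat 2 with Bowtie 1 available on the PATH. The function name `tophat_colorspace_cmd` and the relative paths in the usage line are hypothetical placeholders. Two things are worth checking against the manual: `-o` takes an output directory, and the index basename is the first positional argument, followed by the csfasta file and then the qual file.

```shell
#!/usr/bin/env bash
# Sketch only: print a TopHat 2 command for colorspace reads.
# All paths passed in are placeholders chosen for illustration.
tophat_colorspace_cmd() {
    local out_dir="$1"      # output directory for -o
    local gtf="$2"          # gene annotation for -G
    local index_base="$3"   # basename of a colorspace Bowtie 1 index
    local csfasta="$4"      # colorspace reads
    local qual="$5"         # matching quality file
    # --bowtie1 is needed because Bowtie 2 does not align colorspace reads
    printf 'tophat2 --bowtie1 --color --quals --library-type fr-secondstrand -G %s -o %s %s %s %s\n' \
        "$gtf" "$out_dir" "$index_base" "$csfasta" "$qual"
}

# Show the command so the argument order can be checked before running it.
tophat_colorspace_cmd tophat_out ucsc_mm9.gtf bowtie-indexes/mm9/mm9c \
    fq/SRR534610.csfasta fq/SRR534610.qual
```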

@Istvan Thanks! OK, I tried to clarify the question without listing what I tried before. I think it should be better now.

modified 7.0 years ago • written 7.0 years ago by dfernan660

@Istvan Albert Basically, I have SRA files, not csfasta and qual files. I looked into the SRA documentation, but it is not very detailed, and it is not clear to me which of the SRA tools I should use to convert the data properly for the bowtie/tophat suite.

modified 7.0 years ago • written 7.0 years ago by dfernan660
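
One possible starting point for the conversion step, offered as a sketch rather than a confirmed recipe: the SRA toolkit ships an `abi-dump` tool intended for SOLiD runs, which writes csfasta and qual files. The function name `abi_dump_cmd`, the input file name, and the output directory `fq` are assumptions for illustration. The function only prints the command; the exact names of the files it produces should be checked in the output directory rather than assumed.

```shell
#!/usr/bin/env bash
# Sketch only: print an abi-dump command for one SRA run.
# The .sra file name and output directory are assumed placeholders.
abi_dump_cmd() {
    local sra_file="$1"   # downloaded .sra file (assumed name)
    local out_dir="$2"    # directory to receive csfasta/qual output
    # -O sets the output directory for the dumped files
    printf 'abi-dump -O %s %s\n' "$out_dir" "$sra_file"
}

# Show the command; inspect the output directory after running it.
abi_dump_cmd SRR534610.sra fq
```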
7.0 years ago by
Dan510 wrote:

Try Google

written 7.0 years ago by Dan510