How To Transform The Colorspace Data Obtained From SRA Into A Format Suitable For The TopHat RNA-Seq Pipeline?
9.5 years ago
dfernan ▴ 710

Hi,

I need some help working out how to analyze RNA-Seq data from an ABI 5500 SOLiD sequencer.

I need to develop the following pipeline:

  1. Download the data from GEO (GSE39860) - done using ascp, no help needed here. The data was downloaded from here.
  2. Use the SRA Toolkit to convert the data into colorspace FASTA plus qualities, or some other file format that TopHat can read - this is where I have spent most of my time, without luck! Any ideas on which SRA Toolkit command to use to get the data into a good format for TopHat?
  3. How do I properly run TopHat on the resulting data?

For step 3, once I have a good command for the SRA part (step 2), I plan to run the following TopHat command. Does it look good?

module load bio/bowtie-0.12.7
module load bio/tophat-2.0.4.Linux_x86_64
tophat2 --color --quals --library-type fr-secondstrand -G ucsc_mm9.gtf -o <pathtohere>/bowtie-indexes/mm9/mm9c <pathtohere>/fq/SRR534610.csfasta <pathtohere>/rnaseq/fq/SRR534610.qual

If anyone has some advice, please let me know. I'd be very glad to hear suggestions!

Thanks!

tophat sra geo bowtie rna-seq • 5.5k views

So you got several errors - don't you think that posting these might help people diagnose your problem? Are you sure your genome indexes are from a colorspace build?

@daniel Thanks. Yes, I did get that error as well; I then tried the colorspace index and still got problems. I think posting all the trials and errors would make this more of a debugging question. What I am looking for is experience and suggestions, mainly on how to use the SRA Toolkit to convert SRA data to csfasta and qual format (step 2 in the analysis), not the alternative Python stuff I tried out of ignorance about the SRA Toolkit step.

Still not clear what you are asking. What do your csfasta and quality files contain? If these are the colorspace data, then there is nothing else that the SRA tools can do for you, and involving them only confounds the question. If your question is how to run TopHat on colorspace reads, then make sure to read the manual, especially where it says that you must invoke bowtie1 in the pipeline; other issues may also apply.
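
As a rough illustration of the bowtie1 point, here is a minimal dry-run sketch of a colorspace TopHat call. It assumes TopHat 2 with its `--bowtie1` switch, a colorspace index built with `bowtie-build -C`, and placeholder relative paths; `tophat_out` is a made-up output directory name, not something from the question. Note that in the command from the question `-o` is followed directly by the index path, so TopHat would take the index path as the output directory.

```shell
# Dry-run sketch: builds and prints the command rather than running TopHat.
# All paths are placeholders; tophat_out is a hypothetical output directory.
INDEX="bowtie-indexes/mm9/mm9c"   # colorspace index prefix (built with bowtie-build -C)
GTF="ucsc_mm9.gtf"
READS="fq/SRR534610.csfasta"
QUALS="fq/SRR534610.qual"
OUT="tophat_out"

# --bowtie1 is needed because Bowtie 2 does not align colorspace reads.
# Argument order: options, then index prefix, then reads, then the quality file.
TOPHAT_CMD="tophat2 --bowtie1 --color --quals --library-type fr-secondstrand -G $GTF -o $OUT $INDEX $READS $QUALS"

echo "$TOPHAT_CMD"
# Once the paths are right, run it with: eval "$TOPHAT_CMD"
```

Printing the command first makes it easy to confirm that the index prefix, the reads, and the quality file land in the positions TopHat expects before committing to a long run.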

@Istvan Thanks! OK, I tried clarifying the question a bit more without specifying what I tried before. I think it should be better now...

@Istvan Albert Basically I have SRA files, not csfasta and qual files. I looked into the SRA documentation, but it is not very detailed, and it is not clear to me which of the SRA tools I should use to convert the data appropriately for the bowtie/tophat suite.
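
One candidate worth trying is `abi-dump` from the SRA Toolkit, which writes SOLiD runs out as csfasta and qual files. A minimal dry-run sketch, assuming the run was saved as `SRR534610.sra` in the current directory and that `fq` is where the output should go (both are placeholders); the `-O` output-directory option should be checked against `abi-dump --help` for the installed toolkit version:

```shell
# Dry-run sketch: builds and prints the command rather than running abi-dump.
ACC="SRR534610"     # run accession taken from the question
OUTDIR="fq"         # hypothetical output directory

# abi-dump converts an SRA run into ABI SOLiD native format (csfasta + qual).
ABI_CMD="abi-dump -O $OUTDIR $ACC.sra"

echo "$ABI_CMD"
# Once checked, run it with: mkdir -p "$OUTDIR" && eval "$ABI_CMD"
```

The resulting file names may carry suffixes such as `_F3`, so it is worth listing the output directory before passing the files to TopHat.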
