Hi All,
I was doing differential gene and transcript expression analysis by following the TopHat protocol (published in Nature Protocols in March 2012), using the same D. melanogaster example that is described in the paper. However, I am confused by an error message at step 1 itself, aligning the RNA-seq reads to the genome with the command:
[root@BIO-DT-415 RNA_SEQ]# tophat -p 8 -G genes.gtf -o C1_R1_thout genome C1_R1_1.fq C1_R1_2.fq
Here, genes.gtf was taken from the iGenomes package (Ensembl_BDGP5.25) for the species. The reference genome "genome.fa" is from the same package.
After executing the command, I get the following error:
[2014-11-25 14:05:25] Beginning TopHat run (v2.0.13)
-----------------------------------------------
[2014-11-25 14:05:25] Checking for Bowtie
Bowtie version: 2.2.4.0
[2014-11-25 14:05:25] Checking for Bowtie index files (genome)..
[2014-11-25 14:05:25] Checking for reference FASTA file
[2014-11-25 14:05:25] Generating SAM header for genome
IOError: [Errno 2] No such file or directory: 'C1_R1_1.fq'
I fail to understand what went wrong. Please let me know how to create 'C1_R1_1.fq'.
I have the following files from the raw data downloaded from GEO: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE32038
NGS/RNA_SEQ/GSE32038_RAW/GSM794483_C1_R1.accepted_hits.bam
NGS/RNA_SEQ/GSE32038_RAW/GSM794483_C1_R1.transcripts.gtf.gz
NGS/RNA_SEQ/GSE32038_RAW/GSM794484_C1_R2.accepted_hits.bam
NGS/RNA_SEQ/GSE32038_RAW/GSM794484_C1_R2.transcripts.gtf.gz
NGS/RNA_SEQ/GSE32038_RAW/GSM794485_C1_R3.accepted_hits.bam
NGS/RNA_SEQ/GSE32038_RAW/GSM794485_C1_R3.transcripts.gtf.gz
NGS/RNA_SEQ/GSE32038_RAW/GSM794486_C2_R1.accepted_hits.bam
NGS/RNA_SEQ/GSE32038_RAW/GSM794486_C2_R1.transcripts.gtf.gz
NGS/RNA_SEQ/GSE32038_RAW/GSM794487_C2_R2.accepted_hits.bam
NGS/RNA_SEQ/GSE32038_RAW/GSM794487_C2_R2.transcripts.gtf.gz
NGS/RNA_SEQ/GSE32038_RAW/GSM794488_C2_R3.accepted_hits.bam
NGS/RNA_SEQ/GSE32038_RAW/GSM794488_C2_R3.transcripts.gtf.gz
I am totally lost on how to get this working. I am completely new to this, so I probably sound very stupid right now, but I am stuck. Please let me know your suggestions. Need your views, guys.
Thanks a ton!
Ateeq
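The IOError above means TopHat could not find 'C1_R1_1.fq' in the directory it was run from, and the GEO listing in the post contains only BAM and GTF files, no FASTQ files. A minimal sketch of a pre-flight check that could be run before starting TopHat is shown here. The function name check_inputs is hypothetical, and the file names are simply the ones from the command in the post.

```shell
# Return 0 if every file given as an argument exists and is non-empty;
# otherwise print the missing ones to stderr and return 1.
# (check_inputs is a hypothetical helper, not part of TopHat.)
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# File names taken from the tophat command in the post.
if check_inputs genes.gtf C1_R1_1.fq C1_R1_2.fq; then
    echo "all inputs present; safe to start tophat"
else
    echo "fix the paths above before running tophat"
fi
```

Running this from the same directory as the tophat command shows immediately which input file is absent, instead of finding out partway through the run.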
Hi Devon!
Thanks a lot! I got the files and it is working perfectly so far. I am sure I will have a few more questions, so I will come back here again.
Thank you so much, it's a great help for me.
Hi Devon,
I tried aligning the RNA-seq reads to the genome using TopHat, with the following command
It's throwing an error:
I am not able to decode this. Can you help me, please?
Thanks a lot.
-Ateeq
Usually that happens when you run out of memory. The only real way to diagnose why it failed is to run that line manually yourself and read the underlying error message it reports. This assumes that all of the temporary files are still there.
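As a rough way to check the memory explanation above, the sketch below prints total and available RAM on a Linux machine. It assumes /proc/meminfo is present (Linux only), and mem_gib is a hypothetical helper name. The optional second argument exists only so the function can be pointed at a different file.

```shell
# Print the value of one /proc/meminfo field in GiB.
# $1: field name such as MemTotal or MemAvailable (values are reported in kB)
# $2: optional path to a meminfo-style file (defaults to /proc/meminfo)
mem_gib() {
    awk -v key="$1:" '$1 == key { printf "%.1f\n", $2 / 1048576 }' "${2:-/proc/meminfo}"
}

# Only report when the file is readable, i.e. on Linux.
if [ -r /proc/meminfo ]; then
    echo "Total RAM:     $(mem_gib MemTotal) GiB"
    echo "Available RAM: $(mem_gib MemAvailable) GiB"
fi
```

If the available figure is small compared with what the alignment needs, running out of memory is a plausible cause, though rerunning the failing command by hand remains the way to confirm it.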
Thanks Devon, thanks for coming to my rescue again! I have increased the RAM to 16 GB, and now it's running fine. Hoping to finish this protocol by this week! I am sure I will be back again to this forum. Thank you so much, it's a great help for me!