Question

Tophat unable to produce accepted_hits.bam file for single end read data

8.2 years ago

nikhilvgbt • 0

I have been working with RNA-Seq data for the last 6 months. I can create the accepted_hits.bam file for paired-end data, but I cannot create it for single-end data using TopHat.

The commands I used for the single-end data are:

tophat -p 12 -G hg19.gtf -o C1_R1_thout genome SRR154213.fastq
tophat -p 12 -G hg19.gtf -o C1_R2_thout genome SRR154214.fastq

After running these commands I got an accepted_hits.bam file of only 600 bytes, which cannot be a correctly aligned file.

Please let me know if I have to change the command line arguments.

Thank you

RNA-Seq Tophat • 2.8k views

ADD COMMENT • link updated 21 months ago by Ram 43k • written 8.2 years ago by nikhilvgbt • 0

You presumably received an error message at some point...

ADD REPLY • link 8.2 years ago by Devon Ryan 104k

There was no error while running TopHat; it ran successfully for 2 hours using the above command.

ADD REPLY • link 8.2 years ago by nikhilvgbt • 0

Even if the BAM wasn't written, a log should have been. What did it contain?

ADD REPLY • link 8.2 years ago by andrew.j.skelton73 6.5k

Hello andrew.j.skelton73 and Devon Ryan sir,

This is the content of my log file. It reports successful completion of the alignment, but accepted_hits.bam is still only about 1 kB:

[2016-01-22 00:57:24] Beginning TopHat run (v2.0.9)
-----------------------------------------------
[2016-01-22 00:57:24] Checking for Bowtie
          Bowtie version:     2.1.0.0
[2016-01-22 00:57:24] Checking for Samtools
        Samtools version:     0.1.19.0
[2016-01-22 00:57:24] Checking for Bowtie index files (genome)..
[2016-01-22 00:57:24] Checking for reference FASTA file
[2016-01-22 00:57:24] Generating SAM header for genome
    format:         fastq
    quality scale:     phred33 (default)
[2016-01-22 00:57:36] Reading known junctions from GTF file
[2016-01-22 00:57:40] Preparing reads
     left reads: min. length=49, max. length=49, 11503394 kept reads (146 discarded)
[2016-01-22 00:59:30] Building transcriptome data files..
[2016-01-22 00:59:50] Building Bowtie index from genes.fa
[2016-01-22 01:14:33] Mapping left_kept_reads to transcriptome genes with Bowtie2
[2016-01-22 01:16:33] Resuming TopHat pipeline with unmapped reads
[2016-01-22 01:16:33] Mapping left_kept_reads.m2g_um to genome genome with Bowtie2
[2016-01-22 01:24:32] Mapping left_kept_reads.m2g_um_seg1 to genome genome with Bowtie2 (1/2)
[2016-01-22 01:28:55] Mapping left_kept_reads.m2g_um_seg2 to genome genome with Bowtie2 (2/2)
[2016-01-22 01:31:12] Searching for junctions via segment mapping
[2016-01-22 01:39:45] Retrieving sequences for splices
[2016-01-22 01:41:38] Indexing splices
[2016-01-22 01:42:18] Mapping left_kept_reads.m2g_um_seg1 to genome segment_juncs with Bowtie2 (1/2)
[2016-01-22 01:43:40] Mapping left_kept_reads.m2g_um_seg2 to genome segment_juncs with Bowtie2 (2/2)
[2016-01-22 01:45:57] Joining segment hits
[2016-01-22 01:49:19] Reporting output tracks
-----------------------------------------------
[2016-01-22 01:53:16] A summary of the alignment counts can be found in C1_R1_tophout/align_summary.txt
[2016-01-22 01:53:16] Run complete: 00:55:52 elapsed

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 8.2 years ago by nikhilvgbt • 0

What are the contents of "C1_R1_tophout/align_summary.txt"? If it says that you should have alignments then try producing an unsorted SAM file rather than the sorted BAM file (i.e., use --no-convert-bam). That unsorted SAM file is actually produced first anyway, so this should be faster and allow you to start narrowing down where the error is occurring.

ADD REPLY • link 8.2 years ago by Devon Ryan 104k
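
The checks suggested above could be sketched roughly as follows. This is only an illustration, not a tested fix: check_bam_size is a hypothetical helper, its 10 kB threshold is an arbitrary assumption, and the output directory name C1_R1_thout_sam is made up for the example. The example commands reuse the C1_R1_thout directory from the original command line; the log in this thread refers to C1_R1_tophout, so substitute whichever directory actually exists.

```shell
# Hypothetical helper: flag an output file that is too small to hold real alignments.
# The default 10 kB threshold is an arbitrary assumption for illustration only.
check_bam_size() {
    local bam="$1"
    local min_bytes="${2:-10240}"
    if [ ! -f "$bam" ]; then
        echo "missing: $bam"
        return 2
    fi
    local size
    # wc -c can pad its output with spaces on some systems, so strip them.
    size=$(wc -c < "$bam" | tr -d ' ')
    if [ "$size" -lt "$min_bytes" ]; then
        echo "suspicious: $bam is only $size bytes"
        return 1
    fi
    echo "ok: $bam is $size bytes"
}

# Example session (not run here; adjust the paths to your own output directory):
#   cat C1_R1_thout/align_summary.txt                 # what TopHat reports it aligned
#   check_bam_size C1_R1_thout/accepted_hits.bam      # is the BAM implausibly small?
#   samtools view -c C1_R1_thout/accepted_hits.bam    # number of records in the BAM
#   # Re-run keeping the unsorted SAM instead of converting to sorted BAM:
#   tophat -p 12 -G hg19.gtf --no-convert-bam -o C1_R1_thout_sam genome SRR154213.fastq
```

If align_summary.txt reports mapped reads but the BAM holds none, the problem most likely lies in the final conversion step rather than in the alignment itself.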