BAM to fastq with Tophat aligned files
7.3 years ago • frameshift ▴ 10

Hi all,

I have paired-end data previously aligned with Tophat; for each sample there is a BAM file of aligned reads and a BAM file of unmapped reads. I want to convert these back to FASTQ so that I can rerun the alignment with STAR.

I have tried a couple of things, but wanted to know if there are better suggestions.

1) Merge the aligned and unmapped BAM files (samtools merge), sort by read name, and then run samtools fastq to produce the separate FASTQ files for the new alignment (see the sketch below).

This seemed to run OK, but STAR gave errors saying that the two FASTQ files were inconsistent in length.

That error seemed to stem from some read names appearing three times in the output. I think this leaves the fq1 and fq2 files with unequal numbers of reads.

2) The other way I was going to try was to exclude any improper mates with samtools (flag -F 2) before running the conversion. I think this might be appropriate, but I'm not sure I want to discard reads that may still map.
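
For reference, here is a rough sketch of what I mean for option 1, with the filter from option 2 noted in a comment. This assumes a recent samtools 1.x, that Tophat flagged the extra hits as secondary alignments, and the file names are just placeholders for the per-sample Tophat outputs. (Note that keeping only properly paired reads would be samtools view -f 2, which requires the flag, rather than -F 2, which excludes it.)

    # Merge the Tophat aligned and unmapped BAMs (file names are placeholders)
    samtools merge merged.bam accepted_hits.bam unmapped.bam

    # Drop secondary/supplementary records so each read name appears at most once per mate;
    # this targets the "three reads with the same name" problem. To keep only properly
    # paired reads instead (option 2), the filter would be -f 2 rather than -F 2.
    samtools view -b -F 0x900 merged.bam > primary.bam

    # Name-sort so mates are adjacent, then write one FASTQ per mate
    samtools sort -n -o namesorted.bam primary.bam
    samtools fastq -1 sample_1.fastq -2 sample_2.fastq -0 /dev/null -s /dev/null namesorted.bam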

Does that seem like the best approach, or does anyone have better insight? That would be very helpful!

Darryl

RNA-Seq
7.3 years ago • mastal511 ★ 2.1k

I used bedtools bamtofastq, and I found that reads that Tophat aligned more than once ended up present more than once in the FASTQ file. I solved that problem by using Picard tools SamToFastq. However, my dataset was single-end reads.

With paired-end reads, I would merge the bam files first, as you have done.
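
In case a concrete example helps, a minimal paired-end sketch along those lines might look like the following. The file names are placeholders, and if I remember correctly SamToFastq ignores non-primary alignments unless INCLUDE_NON_PRIMARY_ALIGNMENTS is set, which is presumably why the duplicated reads went away.

    # Merge the aligned and unmapped BAMs, then convert back to paired FASTQ
    # (file names are placeholders)
    java -jar picard.jar MergeSamFiles I=accepted_hits.bam I=unmapped.bam O=merged.bam
    java -jar picard.jar SamToFastq I=merged.bam FASTQ=sample_1.fastq SECOND_END_FASTQ=sample_2.fastq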

7.3 years ago • Jeffin Rockey ★ 1.3k

Please try BamUtil bam2FastQ

http://genome.sph.umich.edu/wiki/BamUtil:_bam2FastQ
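
On a merged per-sample BAM it would be roughly the line below; the option names are from my reading of that page, so please double-check them against your bamUtil version.

    # Placeholder file names; --outBase should write <base>_1.fastq, <base>_2.fastq
    # plus an unpaired FASTQ (verify the exact options for your bamUtil version)
    bam bam2FastQ --in merged.bam --outBase sample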


Thanks! I tried both of your suggestions, BamUtil and bedtools, and both seemed to remove the incorrect alignments.
