Question: Which Aligner Should I Use For Mapping Bacterial RNA-Seq Data?
8.2 years ago
Kssr110 wrote:

The reads are paired-end, 90 bp. I have tried Bowtie and TopHat. Bowtie gave around 85% alignment, while TopHat gave around 92%. Given that TopHat calls Bowtie, any ideas why the alignment percentage is higher with TopHat?

As bacteria do not have splice junctions, does running TopHat affect the alignment in any way? How does running TopHat with a GTF file affect the alignment?

I prefer using TopHat since I would like to do differential expression analysis on its output with Cufflinks. Any ideas would be appreciated.

written 8.2 years ago by Kssr110
8.2 years ago
Ryan Thompson
TSRI, La Jolla, CA
Ryan Thompson wrote:

You are probably running Bowtie in paired-end mode, which means that Bowtie requires a pair to either map together or not at all. TopHat maps each read singly because it is looking for splicing events, so it cannot assume that the two reads of a pair will map near each other: the pair might span an intron.

So the difference in mapping percentage is probably just because TopHat allows reads to map singly. You can do the same thing with Bowtie by first running in paired-end mode with the --un option to collect the unaligned pairs, and then mapping the reads from those pairs in single-end mode.
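A minimal sketch of the two-pass approach described above, written in Python so the command lines can be inspected before anything is run. It assumes Bowtie 1, whose `--un <filename>` option writes unaligned paired-end mates to two files with `_1` and `_2` inserted before the extension. The index name, FASTQ names, and output names are placeholders, and the helper `two_pass_commands` is hypothetical.

```python
import os


def two_pass_commands(index, reads_1, reads_2, prefix):
    """Build Bowtie 1 command lines for a paired-end pass followed by a
    single-end pass over the mates that failed to align as pairs.

    All file names are placeholders; nothing is executed here.
    """
    unaligned = prefix + ".unaligned.fq"
    # Assumption: for paired-end input, Bowtie 1 inserts _1/_2 before the
    # extension of the --un file name, giving two mate files.
    root, ext = os.path.splitext(unaligned)
    un_1, un_2 = root + "_1" + ext, root + "_2" + ext

    # First pass: mates must align together as a pair; failures go to --un.
    paired = ["bowtie", "-S", "--un", unaligned, index,
              "-1", reads_1, "-2", reads_2, prefix + ".paired.sam"]
    # Second pass: treat each leftover mate as an independent single-end read.
    single = ["bowtie", "-S", index, un_1 + "," + un_2, prefix + ".single.sam"]
    return paired, single


if __name__ == "__main__":
    for cmd in two_pass_commands("genome_index", "reads_1.fq", "reads_2.fq", "sample"):
        print(" ".join(cmd))
```

Once the placeholder names are replaced, each list can be handed to `subprocess.run(cmd, check=True)`. The overall mapping percentage then has to be computed from both SAM files together.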

Also, I believe Cufflinks accepts any BAM file as input. It is not dependent on TopHat.

written 8.2 years ago by Ryan Thompson

Cufflinks does accept any BAM file as input, but there is one restriction: spliced alignments must carry XS tags.
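As a rough way to check this restriction, the sketch below scans SAM text for spliced alignments (an `N` operation in the CIGAR string) that have no `XS:A:` strand tag. It assumes the BAM file has already been converted to SAM text (for example with `samtools view -h`). The function name `spliced_without_xs` and the example records are made up for illustration.

```python
def spliced_without_xs(sam_lines):
    """Return the names of spliced alignments (CIGAR contains 'N')
    that do not carry an XS:A strand tag."""
    missing = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        cigar = fields[5]
        tags = fields[11:]  # optional TAG:TYPE:VALUE fields
        if "N" in cigar and not any(t.startswith("XS:A:") for t in tags):
            missing.append(fields[0])
    return missing


if __name__ == "__main__":
    # Hypothetical records; SEQ and QUAL are left as '*' for brevity.
    example = [
        "@HD\tVN:1.0",
        "r1\t0\tchr1\t100\t50\t20M100N20M\t*\t0\t0\t*\t*\tXS:A:+",
        "r2\t0\tchr1\t300\t50\t20M100N20M\t*\t0\t0\t*\t*\tNM:i:0",
        "r3\t0\tchr1\t500\t50\t40M\t*\t0\t0\t*\t*",
    ]
    print(spliced_without_xs(example))
```

For an unspliced aligner on a bacterial genome the returned list should simply be empty, since no CIGAR string will contain `N`.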

written 8.2 years ago by Mikael Huss

Well, Bowtie doesn't do spliced alignment, and bacteria don't splice their RNA, so I don't think that should be a problem in this case, either technically or biologically. But good point regardless, for anyone working in eukaryotes.

written 8.2 years ago by Ryan Thompson
6.4 years ago
Josh Herr
University of Nebraska
Josh Herr wrote:

This is a problem the authors of the Tuxedo suite have touched on, hence this newish paper: EDGE-pro: Estimated Degree of Gene Expression in Prokaryotic Genomes

written 6.4 years ago by Josh Herr
8.2 years ago
Swbarnes2 wrote:

You could probably just align to a RefSeq file of transcripts with any aligner. Counting reads per transcript from the resulting .bam file would then be easy.
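A minimal sketch of that counting step, assuming the reads were aligned to a transcript FASTA so that each reference name is a transcript, and that the BAM file has been converted to SAM text (for example with `samtools view -h`). The function name `count_reads_per_transcript` and the example records are hypothetical. It counts alignment records, so each mate of a pair is counted separately.

```python
from collections import Counter


def count_reads_per_transcript(sam_lines):
    """Count primary, mapped alignment records per reference name."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        if flag & (0x100 | 0x800):
            continue  # secondary or supplementary alignment
        counts[fields[2]] += 1  # column 3 is the reference (transcript) name
    return counts


if __name__ == "__main__":
    # Hypothetical records; SEQ and QUAL are left as '*' for brevity.
    example = [
        "@SQ\tSN:geneA\tLN:900",
        "r1\t0\tgeneA\t10\t255\t90M\t*\t0\t0\t*\t*",
        "r2\t16\tgeneA\t50\t255\t90M\t*\t0\t0\t*\t*",
        "r3\t0\tgeneB\t5\t255\t90M\t*\t0\t0\t*\t*",
        "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    ]
    for name, n in sorted(count_reads_per_transcript(example).items()):
        print(name, n)
```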

written 8.2 years ago by Swbarnes2
