Question: Can TopHat Be Used for Mapping DNA-Seq (Whole Genome) Data?
KS wrote (6.3 years ago):

Hello,

Can TopHat be used for mapping DNA-Seq (whole genome) data?

Most of the articles I have read mention TopHat for RNA-seq data and BWA or Bowtie for DNA-Seq (whole genome) mapping. Is it OK to use TopHat for mapping DNA-Seq reads?

Thanks

Suz

Tags: tophat, bwa

JC's answer is the straight answer, but please also refer to this thread: "Is tophat the only mapper to consider for RNA-seq data?"

In contrast to DNA-sequence alignment, RNA-seq mapping algorithms have two additional challenges. First, because genes in eukaryotic genomes contain introns, and because reads sequenced from mature mRNA transcripts do not include these introns, any RNA-seq alignment program must be able to handle gapped (or spliced) alignment with very large gaps.

(Source: TopHat2)

written 5.5 years ago by Medhat

A few questions, following up with the thread:

If TopHat is not to be used for DNA-seq, would Bowtie2 be preferred over Bowtie?

Also, other than using IGV to see where in the genome the reads map, is there a 'tuxedo' pipeline for DNA-seq? For RNA-seq there is a plethora of R packages (edgeR, DESeq2, ...), but I cannot seem to find something that guides me through DNA-seq analysis. I'd like to assess the distribution of significantly enriched genomic areas when comparing mutants vs. wild type.

Thanks.

written 2.9 years ago by ghv80

Please don't add new questions using SUBMIT ANSWER to old threads. You should post this as a new question/post.

That said, the question at the end of your post does not make much sense. Do you mean enriched genomic areas as in copy number?

written 2.9 years ago by genomax

JC (Mexico) wrote (6.3 years ago):

No.

TopHat is designed to map reads to a reference while allowing for splicing. In your case the reads are not spliced because they are genomic, so don't waste your time and resources: use Bowtie/BWA directly.
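
For illustration, here is a minimal sketch of what "use BWA directly" might look like when driven from Python. It assumes `bwa` is installed and on the PATH; the file names and the helper names `bwa_commands` and `run_alignment` are hypothetical and not part of the thread. It only builds the standard `bwa index` and `bwa mem` invocations and runs them if the tool is present.

```python
import shutil
import subprocess


def bwa_commands(reference, reads_1, reads_2=None, threads=4):
    """Build argv lists for indexing a reference and aligning reads with BWA-MEM.

    All file names are placeholders supplied by the caller.
    """
    index_cmd = ["bwa", "index", reference]
    mem_cmd = ["bwa", "mem", "-t", str(threads), reference, reads_1]
    if reads_2 is not None:
        # Second FASTQ file for paired-end data.
        mem_cmd.append(reads_2)
    return index_cmd, mem_cmd


def run_alignment(reference, reads_1, reads_2=None, out_sam="aligned.sam"):
    """Index the reference, then write BWA-MEM alignments (SAM) to out_sam."""
    if shutil.which("bwa") is None:
        raise RuntimeError("bwa not found on PATH")
    index_cmd, mem_cmd = bwa_commands(reference, reads_1, reads_2)
    subprocess.run(index_cmd, check=True)
    # bwa mem writes SAM to standard output, so redirect it to a file.
    with open(out_sam, "w") as sam:
        subprocess.run(mem_cmd, stdout=sam, check=True)
```

A call such as `run_alignment("genome.fa", "sample_R1.fastq", "sample_R2.fastq")` would be the typical entry point; no splice-aware step is involved anywhere.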

seidel (United States) wrote (6.3 years ago):

My answer is a little different from JC's, although I actually agree with him. If you look at the TopHat paper (Trapnell, Pachter & Salzberg, 2009), you'll see that TopHat is a gapped read aligner that works in stages: it first uses Bowtie to map reads to the genome, then uses the resulting read pile-ups to build a database of potential splice junctions, and finally takes all the reads that did not align the first time and checks whether any of them align when split between read piles. Thus, if you simply want to map DNA-Seq reads to a whole genome, TopHat would be pointless, as JC points out. However, if you had reason to believe that your reference genome contains lots of gaps, it seems TopHat could potentially be used to detect gap differences between your sequenced genome and your reference genome. There is some evidence that some genomes (e.g. flatworm) contain many locations with small gaps, and that these gaps are highly heterogeneous between individuals.
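
To make the "map whole reads first, then retry the leftovers as split reads" idea concrete, here is a toy Python sketch. It is not TopHat's actual algorithm: it uses exact string matching instead of Bowtie, and it skips the pile-up and junction-database step by simply trying every split point. The function names and the `min_anchor` parameter are made up for illustration.

```python
def find_all(genome, seq):
    """Return every 0-based start position of seq in genome."""
    hits = []
    start = genome.find(seq)
    while start != -1:
        hits.append(start)
        start = genome.find(seq, start + 1)
    return hits


def two_pass_align(genome, reads, min_anchor=4):
    """Toy two-pass mapping.

    Pass 1 places reads that match the genome contiguously.
    Pass 2 retries the remaining reads as two pieces separated by a gap.
    """
    placed, leftover = {}, []
    for read in reads:
        hits = find_all(genome, read)
        if hits:
            placed[read] = ("contiguous", hits[0])
        else:
            leftover.append(read)

    for read in leftover:
        # Try each split point that leaves at least min_anchor bases per side.
        for cut in range(min_anchor, len(read) - min_anchor + 1):
            left, right = read[:cut], read[cut:]
            for left_start in find_all(genome, left):
                left_end = left_start + cut
                # The right piece must start downstream of the left piece.
                right_hits = [r for r in find_all(genome, right) if r > left_end]
                if right_hits:
                    placed[read] = ("split", left_start, left_end, right_hits[0])
                    break
            if read in placed:
                break
    return placed
```

In this sketch a split result is reported as `("split", left_start, left_end, right_start)`, so `genome[left_end:right_start]` is the gap that the read skips over.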


Tophat still would not be appropriate, because it is not looking for just any gap. It is looking for introns. That means that it is looking for the gap to start and end with splice donor/acceptor sites. For gapped alignment of DNA-seq reads it would be better to find a gapped aligner that makes no assumptions about the nucleotide content at gap edges.

written 6.3 years ago by Ryan Thompson
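
As a rough illustration of the point about gap edges, the check below tests whether a candidate gap begins and ends with the canonical GT...AG intron motif. This is a simplified sketch, not TopHat code: it looks only at the forward strand (on the reverse strand the same motif appears as CT...AC), and the function name and coordinates are hypothetical.

```python
def has_canonical_splice_motif(genome, gap_start, gap_end):
    """Return True if genome[gap_start:gap_end] starts with GT and ends with AG.

    Coordinates are 0-based and half-open; forward strand only.
    """
    gap = genome[gap_start:gap_end].upper()
    # A gap shorter than 4 bases cannot hold both dinucleotides.
    return len(gap) >= 4 and gap.startswith("GT") and gap.endswith("AG")
```

A gap caused by an ordinary genomic deletion has no reason to satisfy this test, which is why a splice-aware aligner is a poor fit for detecting such gaps.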

I didn't realize TopHat examined splice donor/acceptor sites. Thanks for the info.

written 6.3 years ago by seidel

I agree with JC in that TopHat is not a gapped read aligner. In TopHat the computational segmentation of the unaligned reads precedes a second round of alignment guided by known splice junctions. In other words, my understanding is that TopHat aims at aligning reads across gaps created by splicing events, whereas gapped read aligners like BWA attempt to align across gaps derived from INDELs.

written 6.3 years ago by GPR320

KS wrote (6.3 years ago):
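
The two kinds of gap are also distinguishable in the alignment output. In the SAM format, the CIGAR operation `N` marks a region skipped from the reference (an intron, for mRNA-to-genome alignments), while `I` and `D` mark insertions and deletions. The small Python sketch below tallies these per CIGAR string; the function name and the example CIGAR strings are made up for illustration.

```python
import re

# One CIGAR element is a length followed by a single operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def gap_summary(cigar):
    """Count inserted, deleted, and skipped (spliced) bases in a CIGAR string."""
    summary = {"insertion": 0, "deletion": 0, "skipped": 0}
    for length, op in CIGAR_OP.findall(cigar):
        if op == "I":
            summary["insertion"] += int(length)
        elif op == "D":
            summary["deletion"] += int(length)
        elif op == "N":
            summary["skipped"] += int(length)
    return summary
```

A read with a small INDEL would show up with non-zero `insertion` or `deletion` counts, whereas a read aligned across a splice junction would show a large `skipped` count.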

Thanks for the info. It really helped me a lot.
