Question: Tophat 1.4 Output (Accepted_Hits.Bam)
gravatar for thecuriousbiologist
6.7 years ago by
United States
thecuriousbiologist430 wrote:

I have read the Tophat 1.4 manual, however I do not seem to understand the outputs very well.

I used a reference genome for my alignment with Tophat and I got a file called "accepted_hits.bam"

This file's cigar strings don't contain any soft clips (S). Can someone explain me why this is happening ?

Also, this was for the reference genome. If I wanted to align to a junction database, should I just use the junction reference instead of the genome reference ? Or does Tophat align to junctions automatically ?

Very confused with the documentation.

tophat bam output • 3.0k views
ADD COMMENTlink modified 4.8 years ago by Biostar ♦♦ 20 • written 6.7 years ago by thecuriousbiologist430
gravatar for Ido Tamir
6.7 years ago by
Ido Tamir5.0k
Ido Tamir5.0k wrote:
  • tophat splits the reads, but still requires everything to match. You could have mismatches in the beginning or end, but these would be scored as M in the CIGAR string. You have to look in the MD tag for mismatches. Old tophat versions dont have the optional MD tag. You can generate it with samtools calmd.

  • You always need a reference sequence. You can use a junction file or a GTF file in addition to this. Tophat automatically tries to align reads that can not be mapped completely by aligning them in a split fashion, thus discovering junctions. You can turn this off, modify it with parameters or supply your own junctions. But you should also have a junctions.bed file in your output and see some reads with a CIGAR like 20M500N50M.

ADD COMMENTlink modified 6.7 years ago • written 6.7 years ago by Ido Tamir5.0k

Can the 500N part of the CIGAR contain soft clips ?

ADD REPLYlink written 6.7 years ago by thecuriousbiologist430

500N is a gap tophat is marking an intron

ADD REPLYlink written 5.2 years ago by karl.stamm3.5k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1529 users visited in the last hour