Question: extract unique sequences on bam file
0
gravatar for SeaStar
5 weeks ago by
SeaStar10
Ocean
SeaStar10 wrote:

I ve aligned a list of sequence to a transcript to observe if this sequences are present in the transcript. I produced a BAM file. This is the result of my Alignement.

Left reads:
          Input     :  27849344
           Mapped   :     16442 ( 0.1% of input)
            of these:      7259 (44.1%) have multiple alignments (56 have >100)
Right reads:
          Input     :  27849344
           Mapped   :     15211 ( 0.1% of input)
            of these:      8390 (55.2%) have multiple alignments (56 have >100)
 0.1% overall read mapping rate.

Aligned pairs:      5975
     of these:      3776 (63.2%) have multiple alignments
                     194 ( 3.2%) are discordant alignments
 0.0% concordant pair alignment rate.
~

Now I have to see that I don't count the same read twice, because I have changed the multiple alignment parameters. If a read maps on two different sequences, it is always the same read. I have to count the reads as unique. So how can I made this check?

bam align sequence R • 200 views
ADD COMMENTlink modified 5 weeks ago • written 5 weeks ago by SeaStar10
4

Now I have to see that I don't count the same read twice

Before doing that, I would worry about the 0.1% overall read mapping rate.... Do you think it is normal to have such a low alignment rate ?

ADD REPLYlink written 5 weeks ago by Carlo Yague4.6k
1

I ve aligned a list of sequence to a transcript

This is likely because full dataset is being aligned to a transcript. Not generally recommended but probable explanation for that 0.1% figure in this case.

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by genomax70k

I'VE aligned a set of transposon to a transcript to observe if these transposable elements are presents. It is a experimental test. I've used tophat

ADD REPLYlink written 5 weeks ago by SeaStar10

What exactly are you trying to show? Is the following interpretation correct.

  1. Threre are reads that align to the transcript of interest.
  2. In case where 1 is true the read aligns only once?
ADD REPLYlink written 5 weeks ago by genomax70k

Yes, it is correct. I want to check if the reads don't align 2 or more times the same sequence

ADD REPLYlink written 5 weeks ago by SeaStar10

Depends on the alignment tool you used and whether or it includes such tags.

Also please use the search function.

ADD REPLYlink written 5 weeks ago by benformatics1.1k

I'VE used tophat. I would make this analysis on R

ADD REPLYlink written 5 weeks ago by SeaStar10

I'VE used tophat. I would make this analysis on R

There is no logical overlap between those two. One is a splice-aware alignment program and other is a statistical programming environment.

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by genomax70k

Sorry, I would to analize the bam file produced with tophat in r environment

ADD REPLYlink written 5 weeks ago by SeaStar10

What kind of analysis?

ADD REPLYlink written 5 weeks ago by genomax70k

A kind of analysis that allows me to count the repetitive elements in the trascripts. Anyway I ve tried a solution. I opened the bam with samtools, and I made unique() to count the unique reads that drop in some elements of interest. For example I made unique to all the reads presents in SINEs

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by SeaStar10
1
gravatar for genomax
5 weeks ago by
genomax70k
United States
genomax70k wrote:

I assume you used tophat2 for this alignment. Refer to this post which explains mapping qualities for tophat2: Tophat2 Mapping Qualities

I suggest that you try to see how many reads show up with following commands (you are just getting counts):

$ samtools view -c -q 30 your.bam
$ samtools view -c -q 40 your.bam
$ samtools view -c -q 45 your.bam
$ samtools view -c -q 50 your.bam

That last command will likely show reads that are uniquely mapped. You can remove -c if you want to capture those to a file.

ADD COMMENTlink modified 5 weeks ago • written 5 weeks ago by genomax70k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1687 users visited in the last hour