Question: get identical number of read1 and read2 aligned
0
gravatar for oghzzang
5 weeks ago by
oghzzang10
oghzzang10 wrote:

Dear Biostar users.
This is my samtools flagstat output.

39219750 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
39219750 + 0 mapped (100.00% : N/A)
39219750 + 0 paired in sequencing
19658242 + 0 read1
19561508 + 0 read2
39219750 + 0 properly paired (100.00% : N/A)
39219750 + 0 with itself and mate mapped
0 + 0 singletons (0.00% : N/A)
1 + 0 with mate mapped to a different chr
1 + 0 with mate mapped to a different chr (mapQ>=5)

In this situation, why are not read1 and read2 read identical?

ADD) I want to get paired read and properly mapped read. so I run this command in linux

samtools view -b -f 3 -F 780 input.bam > properly_ex_qual_unmap_1_mapped.bam and flagstat result (above) is from this bam.

I can't understand this result. Because in flagstat result all reads were properly mapped and all reads were mapped.

Thanks for your help.

dna fastq • 186 views
ADD COMMENTlink modified 4 weeks ago • written 5 weeks ago by oghzzang10
1

Did you use bwa mem? If so, the difference is due to supplemental alignments.

ADD REPLYlink written 5 weeks ago by Devon Ryan81k

Thank you Ryan.

I checked my bamfile Header. This bam file is maybe made of bwa aln and samse.

I can't understand this situation...

Because in the flagstat output, they are all properly paired and all mapped and all paired in sequencing . :(

Then, how can I extract exactly identical read1 and read2?

Thank you for your reply.

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by oghzzang10
1

what do you mean by identical reads?

ADD REPLYlink written 5 weeks ago by Antonio R. Franco3.7k

I'm sorry for my confused word.

I mean

I want to get only paired reads that have identical header name.

If so, the number of read1 and read2 will be identical.

Isn't it?

Thanks

ADD REPLYlink written 5 weeks ago by oghzzang10

Go to this page Meaning of sam flags Every mapped read containing the 2 value in their flag, is a read mapped in a proper pair. So by using

samtools view -f 2 -b your.bam > proper_pairs.bam

you theoretically can get your properly mapped reads

ADD REPLYlink written 5 weeks ago by Antonio R. Franco3.7k
2

alternatively, to extract the mapped reads whose mates are also mapped, u can use

samtools view -F 12
ADD REPLYlink written 5 weeks ago by Carlo Yague4.0k
1

Check the number of lines in the two original fastq files you had. Perhaps one has a few extra reads.

ADD REPLYlink written 5 weeks ago by Devon Ryan81k

Thanks for Franco, Yague, Ryan.

1st. According to Yague and Franco advice, I run samtools view and flagstat

1) samtools view -b -f 2 -F 12 input.bam > f2F12.bam

39219750 + 0 in total (QC-passed reads + QC-failed reads)

0 + 0 secondary

0 + 0 supplementary

0 + 0 duplicates

39219750 + 0 mapped (100.00% : N/A)

39219750 + 0 paired in sequencing

19658242 + 0 read1

19561508 + 0 read2

39219750 + 0 properly paired (100.00% : N/A)

39219750 + 0 with itself and mate mapped

0 + 0 singletons (0.00% : N/A)

1 + 0 with mate mapped to a different chr

1 + 0 with mate mapped to a different chr (mapQ>=5)

2) samtools view -b -F 12 input.bam > F12.bam

40294316 + 0 in total (QC-passed reads + QC-failed reads)

0 + 0 secondary

0 + 0 supplementary

0 + 0 duplicates

40294316 + 0 mapped (100.00% : N/A)

39486133 + 0 paired in sequencing

19793464 + 0 read1

19692669 + 0 read2

39219750 + 0 properly paired (99.33% : N/A)

39486133 + 0 with itself and mate mapped

0 + 0 singletons (0.00% : N/A)

219437 + 0 with mate mapped to a different chr

219437 + 0 with mate mapped to a different chr (mapQ>=5)

this result and my 1st question's output are same. The reason why my 1st question's output is from bam (samtools view -b -f 3 -F 780 input.bam > properly_ex_qual_unmap_1_mapped.bam ).

2nd. According to Ryan's advice, I checked my raw fastq files. But this bam file is too old, so raw fastq file was deleted.

3rd. In Ryan's post (link: https://www.biostars.org/p/233441/#233455), different number of # read1 and # read2 is related to singletons. But my flagstat result seems like "no singleton".

I need only paired reads ( identical number of read1 and read2 ). But I can't find method.

Your all answer is helped me. Thanks.

ADD REPLYlink modified 4 weeks ago • written 4 weeks ago by oghzzang10

Hello.

I think I found a problem.

In f2F12.bam file ( this bam file was made of "samtools view -f 2 -F 12 input.bam > f2F12.bam") [samtools -f 2 -F 2 input.bam > f2F2.bam --> mistake],

a read's flag value is 99.

According to https://broadinstitute.github.io/picard/explain-flags.html, 99 mean follwing 4things.

read paired (0x1)
read mapped in proper pair (0x2)
mate reverse strand (0x20)
first in pair (0x40)

But in f2F12.bam file, the read have no pair reads......

I think other reads are same reason.

Can I get only exactly and truly paired reads using program or command? delete unmatched read?

I think my bam file is too bad.

Thanks

ADD REPLYlink modified 4 weeks ago • written 4 weeks ago by oghzzang10

Did you actually do -f 2 -F 2 or did you mean -f 2 -F 12? -f 2 and -F 2 are opposites, so it's unclear what the results of that would be.

ADD REPLYlink modified 4 weeks ago • written 4 weeks ago by Devon Ryan81k

oh I'm sorry for my mistake.

I do "samtools view -f 2 -F 12".

ADD REPLYlink written 4 weeks ago by oghzzang10

You might need to just post the BAM file somewhere. Then we can have a look at what's going on. My guess is that this file was filtered at some point, which is why there's a discrepancy in the mates.

ADD REPLYlink written 4 weeks ago by Devon Ryan81k
1

You can also try to use samtools fixmate to update mate-related flags before running flagstats.

ADD REPLYlink written 4 weeks ago by Carlo Yague4.0k
1
gravatar for Antonio R. Franco
5 weeks ago by
Spain. Universidad de Córdoba
Antonio R. Franco3.7k wrote:

Simply because some reads did not mapped under the conditions indicated by the program

ADD COMMENTlink written 5 weeks ago by Antonio R. Franco3.7k

Thank you Franco.

If so, can I extract identical read1(forward) and read2(reverse) from this bam?

Thank you for your reply.

ADD REPLYlink modified 5 weeks ago • written 5 weeks ago by oghzzang10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1562 users visited in the last hour