Question: Counting paired-end sequencing mapped reads
23 months ago by
heir_of_isildur88 wrote:

Hi all,

I have a very basic question here. With a paired-end sequencing, how do we count the number of mapped reads?

I ran flagstat on my file, which I had already filtered on flags 83 & 163 to keep only mapped and properly paired reads.

The flagstat results are as below:

69640 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
69640 + 0 mapped (100.00%:-nan%)
69640 + 0 paired in sequencing
34820 + 0 read1
34820 + 0 read2
69640 + 0 properly paired (100.00%:-nan%)
69640 + 0 with itself and mate mapped
0 + 0 singletons (0.00%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

So, from these results, does 69640 mean that 69640 individual reads mapped, or that 69640 read pairs mapped (as in R2---R1 is counted as 1 read)? Should we divide the number by 2 if each count represents only one end of a pair?

A bit confused here. Hope someone can help. Thank you very much.
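
For readers unsure what the flags 83 and 163 mentioned in the question encode, here is a minimal sketch that decodes a SAM FLAG value into its bits. The bit values follow the SAM specification; the names `SAM_FLAG_BITS` and `decode_flag` are made up for this illustration and are not part of samtools.

```python
# Bit meanings of the SAM FLAG field (per the SAM specification).
# The dictionary and function names are illustrative, not a samtools API.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


for flag in (83, 163):
    print(flag, decode_flag(flag))
```

Both flags have the proper-pair bit set and the unmapped bit clear, which is consistent with flagstat reporting 100% mapped and 100% properly paired. Flag 83 marks R1 on the reverse strand and 163 marks its R2 mate; flags 99 and 147 describe the mirror-image orientation, and the question does not say whether those were meant to be excluded.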

23 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum wrote:
  • 69640 reads were mapped
  • 69640 reads were mapped and were part of a paired-end experiment/assay
  • 34820 came from the R1/forward fastq
  • 34820 came from the R2/reverse fastq
  • 69640 reads are correctly mapped with their mate (paired-end experiment, good distance, same contig...)
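
The reading of the flagstat lines above can be checked programmatically. The following is a rough sketch that parses the flagstat text pasted in the question into a dictionary of QC-passed counts. It assumes the line layout shown in this thread (other samtools versions print additional lines), and `parse_flagstat` is a hypothetical helper, not a samtools function.

```python
import re

# flagstat output copied from the question in this thread.
FLAGSTAT_TEXT = """\
69640 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
69640 + 0 mapped (100.00%:-nan%)
69640 + 0 paired in sequencing
34820 + 0 read1
34820 + 0 read2
69640 + 0 properly paired (100.00%:-nan%)
69640 + 0 with itself and mate mapped
0 + 0 singletons (0.00%:-nan%)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
"""

# "<passed> + <failed> <label>", with an optional trailing "(..%..)" percentage.
LINE_RE = re.compile(r"^(\d+) \+ (\d+) (.+?)(?: \([^)]*%[^)]*\))?$")


def parse_flagstat(text):
    """Map each flagstat label to its QC-passed count (hypothetical helper)."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group(3)] = int(match.group(1))
    return counts


counts = parse_flagstat(FLAGSTAT_TEXT)
# Every count is per individual read: read1 + read2 adds up to the paired total.
print(counts["read1"] + counts["read2"] == counts["paired in sequencing"])
```

Only percentage suffixes are stripped from the labels, so the two "different chr" lines keep distinct keys.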

Thanks Pierre for the reply.

But I still don't quite understand. I attached a picture to make my query clearer.

[Image: Picture1]

In the picture, flagstat shows 6 reads: 3 from R1 & 3 from R2. I understand that. If you view the SAM file in a genome viewer like IGV, it looks like the first image, but if I view the reads as pairs, it looks like the second one. So in the second image, we actually have only 3 fragments of the gene mapped, with each fragment made from a pair of reads. So shouldn't we divide the number of reads by 2 to get the total number of fragments mapped? The number given in flagstat gives the number of individual reads, without taking into consideration the pair, right?

modified 23 months ago • written 23 months ago by heir_of_isildur88

"So shouldn't we divide the number of reads by 2 to get the total number of fragments mapped?"

Yes, and this number would be "properly paired" / 2.

"The number given in flagstat gives the number of individual reads, without taking into consideration the pair, right?"

Again, the number of 'correct fragments' would be "properly paired" / 2.

written 23 months ago by Pierre Lindenbaum
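
Applied to the numbers from the question, the "properly paired" / 2 rule works out as in the short sketch below. The function name `fragments_from_properly_paired` is invented for this example; it simply halves the count and reports any remainder.

```python
def fragments_from_properly_paired(properly_paired_reads):
    """Return (fragments, leftover_reads) for a flagstat 'properly paired' count."""
    return divmod(properly_paired_reads, 2)


# "properly paired" count from the flagstat output in the question.
fragments, leftover = fragments_from_properly_paired(69640)
print(fragments, leftover)  # 34820 0
```

The result, 34820, matches the read1 and read2 counts in the same flagstat output, as expected when every fragment contributes exactly one R1 and one R2.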

OK, that clarifies a lot of things.

But then, what about when the "properly paired" number is odd? It's not always an even number, right? For example, if R1 can map to 2 different R2 fragments, how do you count those?

I am quite new to sequencing, so I would like to know what the general consensus is on reporting the number of reads in paired-end sequencing: the total number of reads, or that number divided by 2, i.e. the 'correct fragments'?

written 23 months ago by heir_of_isildur88
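
The thread does not answer the odd-count question. One possible approach, sketched below under stated assumptions, is to count fragments by read name instead of halving a total. The sketch assumes an odd count can arise when one mate of a pair was removed by an earlier filter, and it skips secondary and supplementary records so that a read aligned to several places is counted once. `count_fragments` and the toy SAM records are hypothetical.

```python
def count_fragments(sam_lines):
    """Return (complete_pairs, incomplete_pairs) among properly paired primary records."""
    mates_seen = {}  # read name -> set of mates observed ("R1" and/or "R2")
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if not flag & 0x2:  # proper-pair bit not set
            continue
        if flag & (0x100 | 0x800):  # secondary or supplementary alignment
            continue
        mate = "R1" if flag & 0x40 else "R2"
        mates_seen.setdefault(qname, set()).add(mate)
    complete = sum(1 for mates in mates_seen.values() if len(mates) == 2)
    incomplete = len(mates_seen) - complete
    return complete, incomplete


# Toy records (made up): fragA has both mates, fragB has lost its R2.
toy_sam = [
    "@HD\tVN:1.6",
    "fragA\t83\tchr1\t300\t60\t50M\t=\t100\t-250\t*\t*",
    "fragA\t163\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",
    "fragB\t83\tchr1\t900\t60\t50M\t=\t700\t-250\t*\t*",
]
print(count_fragments(toy_sam))
```

Here three records carry the proper-pair bit, an odd number, yet only one fragment has both mates present. Whether to report reads or fragments is a convention choice; stating which unit is used avoids ambiguity.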