Question: featureCounts for paired bam files and dealing with multimapping reads?
bioinfo8 wrote, 2.8 years ago:

Hi,

I have multiple paired-end BAM files from RNA-seq experiments (aligned with HISAT2). I extracted the properly paired reads, sorted and indexed the files, and ran featureCounts with the following command (as per http://bioinf.wehi.edu.au/featureCounts/):

featureCounts -p -t exon -g gene_id -a species.gtf -o bam.featureCounts *.bam

While it is still running, I can see the following in the log file:

||                 Threads : 1                                                ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||         Strand specific : no                                               ||
||      Multimapping reads : not counted                                      ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : counted                                          ||
||        Both ends mapped : not required                                     ||

I am unsure whether I should count multimapping reads, and how doing so would affect the differential gene expression analysis. I have gone through the post "A: Dealing with multimapping reads in featureCounts", but would like more opinions.

Also, the user guide says: "Due to the mapping ambiguity, it is recommended that multi-mapping reads should be excluded from read counting (default behavior of featureCounts program) to produce as accurate counts as possible."

Thanks!

Devon Ryan (Freiburg, Germany) wrote, 2.8 years ago:

You should not count multimapping alignments; the default featureCounts settings are appropriate.

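For comparison, here is a rough sketch of how the default differs on the command line from the two multimapper options. `-M` and `--fraction` are featureCounts options; the output file names and the dry-run helper `build_cmd` are made up for illustration, and the script only prints the commands rather than running them.

```shell
# Dry-run sketch: prints featureCounts command lines instead of executing them.
# species.gtf and *.bam come from the question; the output names are hypothetical.

build_cmd() {
    # $1 = output file name; any remaining arguments = extra featureCounts options
    out=$1
    shift
    echo featureCounts -p -t exon -g gene_id -a species.gtf -o "$out" "$@" '*.bam'
}

# Default: multimapping reads are not counted (the setting recommended above).
default_cmd=$(build_cmd counts.default.txt)

# -M: every reported alignment of a multimapping read is counted.
multi_cmd=$(build_cmd counts.multi.txt -M)

# -M --fraction: each reported alignment carries 1/x of a count, where x is the
# number of alignments reported for that read, so the counts are no longer integers.
fraction_cmd=$(build_cmd counts.fraction.txt -M --fraction)

printf '%s\n' "$default_cmd" "$multi_cmd" "$fraction_cmd"
```

featureCounts recognises multimapping reads by the NH tag in the BAM file, so all three variants depend on the aligner having set that tag.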

A read from a gene can map both to the parent gene and to a similar pseudogene. Removing ambiguous reads will under-represent the parent gene even though it was expressed, while counting them will over-represent the pseudogene. What do we do in such a situation?

written 19 days ago by Arindam Ghosh

We still ignore multimappers if your tool needs integers.

written 19 days ago by Devon Ryan
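To gauge how much the default actually discards, one option is to look at the `Unassigned_MultiMapping` row of the summary file that featureCounts writes next to its output (the `-o` name plus `.summary`). The sketch below is only an illustration: the helper name `multimap_fraction` is made up, and it assumes the summary is tab-delimited with one column per BAM file and that each read or fragment falls into exactly one status row.

```shell
# Sketch: per BAM file, the fraction of reads/fragments left uncounted
# because they were flagged as multimapping.

multimap_fraction() {
    # $1 = path to a featureCounts .summary file (tab-delimited, header row first)
    awk -F'\t' '
        NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; n = NF; next }
        { for (i = 2; i <= NF; i++) total[i] += $i }          # sum over all status rows
        $1 == "Unassigned_MultiMapping" { for (i = 2; i <= NF; i++) multi[i] = $i }
        END {
            for (i = 2; i <= n; i++)
                printf "%s\t%.3f\n", name[i], (total[i] > 0 ? multi[i] / total[i] : 0)
        }
    ' "$1"
}
```

For the command in the question this would be called as `multimap_fraction bam.featureCounts.summary`. A small fraction suggests that ignoring multimappers changes little overall, although individual genes with close paralogs or pseudogenes can still be affected, as raised in the comment above.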