Question

Run featureCounts with Gencode GTF

6.1 years ago

tianleivv ▴ 50

I want to extract transcriptomic reads from a BAM file. I do this using featureCounts with the Gencode GTF annotation, which includes annotations from both ENSEMBL and HAVANA. Will this bias my results? For example, HIST1H2BK has two exons annotated by ENSEMBL and one exon annotated by HAVANA, as follows:

1) chr6 ENSEMBL exon 27114188 27114619
2) chr6 ENSEMBL exon 27106073 27106460
3) chr6 HAVANA exon 27114197 27114577

Exon 1 and exon 3 overlap almost entirely and could be the same exon in the two annotation sources. featureCounts counted reads in all 3 exons, which would mean the overlapping exon is counted twice:

HIST1H2BK chr6;chr6;chr6 27106073;27114188;27114197 27106460;27114619;27114577 -;-;-

Will this bias the results? Should I use only one of ENSEMBL and HAVANA?

Thanks! Leo

RNA-Seq • 2.2k views

score 1 · Accepted Answer · 2018-03-08

I read through the featureCounts documentation and found this:

"A read is said to overlap a feature if at least one read base is found to overlap the feature. For paired-end data, a fragment (or template) is said to overlap a feature if any of the two reads from that fragment is found to overlap the feature. By default, featureCounts does not count reads overlapping with more than one feature (or more than one meta-feature when summarizing at meta-feature level). Users can use the -O option to instruct featureCounts to count such reads (they will be assigned to all their overlapping features or meta-features). Note that, when counting at the meta-feature level, reads that overlap multiple features of the same meta-feature are always counted exactly once for that meta-feature, provided there is no overlap with any other meta-feature. For example, an exon-spanning read will be counted only once for the corresponding gene even if it overlaps with more than one exon."

I believe this means it will not bias the results. When summarizing at the meta-feature (gene) level, as in the output above, a read that overlaps both the ENSEMBL and the HAVANA version of the exon is still counted only once for HIST1H2BK.
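To make the quoted rule concrete, here is a minimal Python sketch of gene-level (meta-feature) counting using the three exons from the question. It is only an illustration of the rule, not featureCounts' actual implementation: it ignores strand, chromosome, paired-end fragments, and the -O option, and the read coordinates and function names are made up for the example.

```python
from collections import Counter

# Exons from the question (1-based, inclusive coordinates), all on chr6.
EXONS = [
    ("HIST1H2BK", 27114188, 27114619),  # ENSEMBL
    ("HIST1H2BK", 27106073, 27106460),  # ENSEMBL
    ("HIST1H2BK", 27114197, 27114577),  # HAVANA
]


def genes_hit(read_start, read_end, exons=EXONS):
    """Return the set of genes with at least one exon sharing a base with the read."""
    return {
        gene
        for gene, start, end in exons
        if read_start <= end and read_end >= start
    }


def count_by_gene(reads, exons=EXONS):
    """Count each read once per gene; skip reads hitting zero or several genes."""
    counts = Counter()
    for read_start, read_end in reads:
        hit = genes_hit(read_start, read_end, exons)
        if len(hit) == 1:  # unambiguous: exactly one meta-feature
            counts[next(iter(hit))] += 1
    return counts


# Hypothetical reads (start, end), chosen only for illustration.
reads = [
    (27114200, 27114300),  # overlaps both the ENSEMBL and HAVANA copies of the exon
    (27106100, 27106200),  # overlaps the other ENSEMBL exon
    (27110000, 27110100),  # intronic, overlaps no exon
]

counts = count_by_gene(reads)
print(counts["HIST1H2BK"])  # prints 2
```

The first read overlaps two exon records, but because both belong to the same gene it contributes a single count, which is the behaviour the documentation describes for meta-feature summarization.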