Question: mistakenly ran featureCounts in paired-end mode on single-read data
clboozy wrote, 18 months ago:

I am looking through an old pipeline that was run over a year ago in preparation for submitting data to GEO. I discovered that although the sequencing in this experiment was single-read (vs. paired-end), I had run featureCounts in paired-end mode (with a parameter of -p). According to the featureCounts documentation, the -p flag has the following definition: "If specified, fragments (or templates) will be counted instead of reads. This option is only applicable for paired-end reads." Did adding this parameter by mistake affect the run at all? Or did it not matter as all samples were single-read anyway?

rna-seq featurecounts • 1.1k views

As far as I know, using -p on SE data doesn't affect the results. But you could quickly check by running featureCounts on any BAM file you have, with and without -p.

written 18 months ago by geek_y

Indeed, the scientist within you should run it with and without, and then cross-compare results.

written 18 months ago by Kevin Blighe

Agreed! That's what I would have done if I still had access to the bam files... unfortunately, I do not.

written 18 months ago by clboozy

Cool. I would have hoped that featureCounts at least issued a warning message (?). Keep in mind that these counting methods are fairly rudimentary: one can also count reads per feature using BEDTools or custom scripts, if one wishes.

written 18 months ago by Kevin Blighe

Except that featureCounts is blazingly fast and comes with tons of options.

written 18 months ago by Friederike

Thanks! Unfortunately, I don't have access to any SE bam files currently.

written 18 months ago by clboozy
h.mon (Brazil) wrote, 18 months ago:

As geek_y and Kevin Blighe said, test for yourself.

Even if you do not have access to the original BAMs, you can easily grab single-end FASTQ files (in which case you will have to align them) and/or BAM files, and run featureCounts twice, with and without -p. The result of this test will tell you whether your original counts are correct.
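For anyone wanting to automate that comparison, here is a minimal sketch in Python. It assumes the usual featureCounts output layout: a leading comment line starting with "#", then a tab-separated table with Geneid, Chr, Start, End, Strand and Length, followed by one count column per BAM. The function names are made up for illustration, and it assumes integer counts.

```python
import csv

def load_counts(lines):
    """Parse a featureCounts table from an iterable of lines (e.g. an open file).

    Lines starting with '#' are skipped. Returns {Geneid: [count, ...]}.
    Assumes integer counts in every column after the sixth.
    """
    rows = (line for line in lines if not line.startswith("#"))
    reader = csv.reader(rows, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader)  # skip the header row (Geneid, Chr, Start, End, Strand, Length, ...)
    return {row[0]: [int(x) for x in row[6:]] for row in reader if row}

def diff_counts(with_p, without_p):
    """Return {gene: (counts_with_p, counts_without_p)} for genes that differ."""
    genes = sorted(set(with_p) | set(without_p))
    return {
        g: (with_p.get(g), without_p.get(g))
        for g in genes
        if with_p.get(g) != without_p.get(g)
    }
```

Open the two output files, pass each to load_counts, and hand the resulting dicts to diff_counts. An empty dict means the two runs produced identical counts for that BAM.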

You don't want to submit potentially bogus results to GEO (with your name on it) based on "but this internet guy told me my counts were fine", do you?

swbarnes2 (United States) wrote, 18 months ago:

I doubt it will matter. When you have paired-end reads, you need the software to understand that if it sees the same read name twice, once as read 1 and once as read 2, both aligning to the same gene, it must not count them as two separate reads, since they came from one fragment.

That won't be a problem with a single end dataset.
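To make that reasoning concrete, here is a toy model in Python. It is only an illustration of the read-versus-fragment idea, not how featureCounts is actually implemented: alignments are represented as hypothetical (read_name, gene) pairs, and multi-mapping is ignored.

```python
from collections import Counter

def count_reads(alignments):
    """Count every alignment record once per gene."""
    return Counter(gene for _name, gene in alignments)

def count_fragments(alignments):
    """Count each distinct read name once per gene, so two mates
    sharing a name contribute a single fragment."""
    return Counter(gene for _name, gene in set(alignments))

# Paired-end: frag1 appears twice (read 1 and read 2) on the same gene.
paired = [("frag1", "geneA"), ("frag1", "geneA"), ("frag2", "geneA")]

# Single-end: every read name appears exactly once.
single = [("read1", "geneA"), ("read2", "geneA"), ("read3", "geneB")]
```

In this toy model the two counting functions disagree on the paired list but agree on the single list, because with single-end data there are no mates to collapse.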
