Question: featurecounts output changes the underscores in the sample name into dots
27 days ago, wangdp123 (Oxford) wrote:

Hi there,

With the latest version of the Rsubread package, I ran featureCounts on a few samples, but the output changed the underscores in the sample names into dots, which is very inconvenient.

For example,

the original bam file name: sample_1.bam

the new sample name: sample.1.bam

Is there any way to avoid this conversion?

Many thanks,

Tom


featureCounts() of Rsubread does not generate bam files but uses them as input.

You are probably using the align() function from the same package. If so, check the output_file argument: change the default output_file = paste(readfile1,"subread",output_format,sep=".") to output_file = paste(readfile1,"subread",output_format,sep="_").

written 27 days ago by Haci

The OP is using featureCounts and the issue is real: it replaces underscores with dots. The column names in the count matrix are taken from the names of the BAM files. I will move this to a comment.

written 27 days ago by ATpoint
27 days ago, Gordon Smyth (Australia) wrote:

No, you can't avoid the conversion. featureCounts is trying to protect the names from systems that can't handle punctuation in variable names, but I agree it is not necessary here.

Of course you had to input the file names to featureCounts in the first place:

fc <- featureCounts(files, ...)

so you can easily put them back at the end:

colnames(fc$counts) <- files

I personally like to use

colnames(fc$counts) <- limma::removeExt(basename(files))
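
The renaming step above can be tried without running featureCounts at all. What follows is only a minimal sketch: the file paths and counts are made up, and `fc` is a hand-built stand-in for the real result object. The order check assumes that underscores are the only characters that get converted, as reported in the question, and tools::file_path_sans_ext() is used as a base-R alternative to limma::removeExt().

```r
# Hypothetical input paths; in practice this is the vector passed to featureCounts().
files <- c("/data/run1/sample_1.bam", "/data/run1/sample_2.bam")

# Mock of the relevant part of a featureCounts() result, with the dotted
# column names described in the question (made-up counts, not real output).
fc <- list(counts = matrix(c(10L, 0L, 5L, 7L), nrow = 2,
                           dimnames = list(c("geneA", "geneB"),
                                           c("sample.1.bam", "sample.2.bam"))))

# Sanity check: the columns should be in the same order as `files`.
# Assumes underscores are the only characters that were converted.
expected <- gsub("_", ".", basename(files), fixed = TRUE)
stopifnot(identical(colnames(fc$counts), expected))

# Restore the original sample names, dropping the directory and the extension.
# tools::file_path_sans_ext() ships with R and needs no extra package.
colnames(fc$counts) <- tools::file_path_sans_ext(basename(files))
print(colnames(fc$counts))
```

Checking the order before overwriting the column names is cheap and guards against attaching the wrong label to a sample if the file vector was changed between the call and the renaming.
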
27 days ago, ATpoint (Germany) wrote:

Why don't you simply grep the sample names from disk and then put them as colnames back into the output? Something like this:

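library(Rsubread)

# List the BAM files; the anchored regex avoids matching index files such as .bam.bai
tmp.files <- list.files(path = "~/Desktop/", pattern = "\\.bam$", full.names = TRUE)

# Keep only the file name (without the directory) to use as column names
tmp.names <- basename(tmp.files)

countmatrix <- featureCounts(files = tmp.files,
                             annot.ext = "~/Desktop/test.saf")

# Put the original names back; the first column of $stat is "Status"
colnames(countmatrix$counts) <- tmp.names
colnames(countmatrix$stat)   <- c("Status", tmp.names)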
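
One detail worth knowing when collecting file names this way: list.files() treats its pattern argument as a regular expression, not a literal string or a glob. The sketch below shows the difference between an unanchored and an anchored pattern in a throwaway directory. The directory and file names are invented for the demonstration.

```r
# Create a throwaway directory with a BAM file, its index, and an unrelated file.
# All names here are made up for illustration.
demo.dir <- file.path(tempdir(), "bam_pattern_demo")
dir.create(demo.dir, showWarnings = FALSE)
invisible(file.create(file.path(demo.dir,
                                c("sample_1.bam", "sample_1.bam.bai", "bambam.txt"))))

# Unanchored pattern: "." matches any character and nothing ties it to the end
# of the name, so the index file and "bambam.txt" are picked up as well.
loose <- list.files(demo.dir, pattern = ".bam")

# Escaped dot plus "$" anchor: only names that end in ".bam".
strict <- list.files(demo.dir, pattern = "\\.bam$")

print(loose)
print(strict)

# Clean up the demo directory.
unlink(demo.dir, recursive = TRUE)
```

With index files sitting next to the BAM files, the anchored form keeps the file vector, and therefore the column names, limited to the alignments themselves.
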