R removes 1st column (gene-id) from featureCounts count.txt table
13 months ago
Pegasus ▴ 100

Hi all,

I generated a count.txt from sorted.bam files using featureCounts on Linux, following the usual RNA-seq data analysis steps.

1- I opened the count.txt file in a text editor and found the following columns:

geneid  Chr Start   End Strand  Length  sample1 sample2 etc

2- However, when I opened the file in R, the first column name (geneid) was gone:

(EMPTY) Chr Start   End Strand  Length  sample1 sample2 etc
  1. Neither colnames() nor rownames() showed the geneid title.

  2. I tried renaming or adding a name for the 1st column, but R changed the 2nd column name instead, replacing Chr with geneid.

So, why did R remove the geneid name, and how can I add it back so that I can move on to edgeR?

Any help you can provide is greatly appreciated.

featureCounts RNA-Seq R • 955 views

Not an R expert.

It sounds as if the first column is being used as an index (the row names), and is therefore not named. It may help others if you paste the output of head count.txt and show the exact RStudio command used to open the file.
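A minimal sketch of the behaviour described above, using made-up toy tables rather than the poster's actual file (the gene names, columns, and temporary files are all assumptions for illustration). It shows two ways read.table() can end up using the first column as row names, so that its title disappears from colnames().

```r
# Case 1: the header has one fewer field than the data rows.
# read.table() then silently uses the first column as row names.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Chr\tLength\tsample1",
  "gene1\tchr1\t1200\t10",
  "gene2\tchr2\t800\t0"
), tmp)
implicit <- read.table(tmp, header = TRUE, sep = "\t")
colnames(implicit)   # "Chr" "Length" "sample1"
rownames(implicit)   # "gene1" "gene2"

# Case 2: the header is complete, but row.names = 1 is passed.
# The first column again becomes the row names and loses its title.
tmp2 <- tempfile(fileext = ".txt")
writeLines(c(
  "Geneid\tChr\tLength\tsample1",
  "gene1\tchr1\t1200\t10",
  "gene2\tchr2\t800\t0"
), tmp2)
indexed <- read.table(tmp2, header = TRUE, sep = "\t", row.names = 1)
colnames(indexed)    # "Chr" "Length" "sample1"

# Without row.names, the complete header keeps the first column as a normal column.
kept <- read.table(tmp2, header = TRUE, sep = "\t")
colnames(kept)       # "Geneid" "Chr" "Length" "sample1"
```

In both of the first two cases, assigning a new name to colnames(x)[1] would rename Chr, which matches the symptom in the question.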


That's unusual. Please show the command to read the file.

13 months ago
Pegasus ▴ 100

Thanks for the replies. I was able to fix it using the steps below. I will keep the script here in case anyone else faces the same issue.

  1. Read in the count table without specifying row names: counts <- read.table("feature.counts", header = TRUE, check.names = FALSE)

  2. Add a new first column with the gene IDs, which R had stored as the row names: counts <- cbind(geneid = rownames(counts), counts)

  3. Remove the row names, so the rows are simply numbered again: rownames(counts) <- NULL
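For anyone who wants to go the other way and end up with gene IDs as row names on a pure count matrix, here is a rough sketch. It assumes a tab-separated featureCounts-style table with a leading "#" comment line and six annotation columns before the sample columns; the mock file, gene names, and BAM paths below are invented for illustration and are not the poster's data.

```r
# Build a small mock table in the featureCounts layout (illustrative values only).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "# Program:featureCounts; Command:mock line for illustration",
  "Geneid\tChr\tStart\tEnd\tStrand\tLength\taln/sample1.sorted.bam\taln/sample2.sorted.bam",
  "geneA\tchr1\t100\t900\t+\t801\t12\t30",
  "geneB\tchr2\t500\t1500\t-\t1001\t0\t7"
), path)

# The "#" line is skipped by read.table()'s default comment.char;
# check.names = FALSE keeps the sample column names exactly as written.
counts <- read.table(path, header = TRUE, sep = "\t",
                     check.names = FALSE, stringsAsFactors = FALSE)

# Split the six annotation columns from the per-sample counts.
annot <- counts[, 1:6]
mat <- as.matrix(counts[, -(1:6), drop = FALSE])

# Use the gene IDs (first column) as row names of the count matrix.
rownames(mat) <- counts[[1]]
mat
```

A numeric matrix like mat, with genes as rows and samples as columns, is the shape that edgeR's DGEList() takes as its counts argument, and annot can be kept alongside it as gene annotation.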
