Convert STAR ReadsPerGene.out.tab to DESeq2 count table
7.1 years ago
jonessara770 ▴ 240

Hello,

How can I convert STAR ReadsPerGene.out.tab files from a stranded RNA-seq library into the count matrix format that DESeq2 expects? My file looks like this:

N_unmapped 663817 663817 663817
N_multimapping 1232697 1232697 1232697
N_noFeature 5408215 8176752 5438049
N_ambiguous 108068 315 106321
NM_214429 0 0 0
NM_214220 0 0 0
NM_001143697 51 0 51
NM_001164649 1 0 1
NM_001044603 2 0 2
NM_001244242 479 1 478
NM_001244241 286 1 285
NM_001177907 428 0 428

thanks

RNA-Seq • 7.4k views
7.1 years ago
h.mon 35k

I use the following code snippet:

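number <- 4  # example value; use 2, 3 or 4 depending on library strandedness
# List the STAR count files; "pattern" is a regular expression, not a shell glob
ff <- list.files( path = "./counts", pattern = "ReadsPerGene[.]out[.]tab$", full.names = TRUE )
# Skip the four N_* summary rows at the top of each file
counts.files <- lapply( ff, read.table, skip = 4 )
counts <- as.data.frame( sapply( counts.files, function(x) x[ , number ] ) )
# Strip the file suffix and the directory to get sample names
ff <- gsub( "[.]ReadsPerGene[.]out[.]tab", "", ff )
ff <- gsub( "[.]/counts/", "", ff )
colnames(counts) <- ff
row.names(counts) <- counts.files[[1]]$V1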

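Depending on the strandedness of your library, number may be 2, 3 or 4: column 2 holds the counts for unstranded libraries, column 3 the counts for reads on the forward strand (equivalent to htseq-count -s yes), and column 4 the counts for reads on the reverse strand (equivalent to htseq-count -s reverse).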
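If the strandedness is not known in advance, comparing the column totals of one file can hint at which column to use. The sketch below runs on the example rows from the question; the column names and the 0.5 threshold are assumptions made for illustration, not part of STAR or DESeq2.

```r
# Example rows copied from the question (tab-separated, as STAR writes them)
lines <- c(
  "N_unmapped\t663817\t663817\t663817",
  "N_multimapping\t1232697\t1232697\t1232697",
  "N_noFeature\t5408215\t8176752\t5438049",
  "N_ambiguous\t108068\t315\t106321",
  "NM_214429\t0\t0\t0",
  "NM_214220\t0\t0\t0",
  "NM_001143697\t51\t0\t51",
  "NM_001164649\t1\t0\t1",
  "NM_001044603\t2\t0\t2",
  "NM_001244242\t479\t1\t478",
  "NM_001244241\t286\t1\t285",
  "NM_001177907\t428\t0\t428"
)

# Skip the four N_* summary rows; the column names are illustrative labels
tab <- read.table(text = lines, sep = "\t", skip = 4,
                  col.names = c("gene", "unstranded", "forward", "reverse"))

# Total counts per column across all genes
totals <- colSums(tab[, -1])
print(totals)

# Heuristic (assumption): if forward and reverse totals are of similar size,
# treat the library as unstranded; otherwise pick the larger stranded column.
stranded <- totals[c("forward", "reverse")]
if (min(stranded) > 0.5 * max(stranded)) {
  number <- 2
} else {
  number <- unname(which.max(stranded)) + 2  # 3 = forward, 4 = reverse
}
print(number)
```

For these rows nearly all reads fall in the fourth column, so the heuristic picks number = 4. On real data it is worth running the check on the full file rather than a handful of genes, and confirming against the library preparation protocol.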


Hello! Sorry for bringing up this old thread, but I have also been trying to convert my ReadsPerGene.out.tab files to a count matrix for DESeq2.

Everything works fine until the colnames(counts) <- ff step, where I get this error: Error in names(x) <- value : 'names' attribute [27] must be the same length as the vector [0]

I assume this is because I have missing values in my dataset, but everything looks fine after briefly looking through my ReadsPerGene.out.tab files. Has anyone come across this error as well, and what can I do to solve it? Thank you!
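The error message says that 27 names were supplied for an object of length 0, which suggests that counts ended up with no columns even though ff listed 27 files. A possible way to narrow this down is to check each intermediate object before assigning names. The sketch below is an illustration under assumptions: it writes two small mock files (the names sample1 and sample2 and the genes geneA and geneB are invented) into a temporary directory so that it can run on its own.

```r
# Create two small mock STAR outputs in a temporary directory (illustration only)
count_dir <- file.path(tempdir(), "counts")
dir.create(count_dir, showWarnings = FALSE)
mock <- c("N_unmapped\t10\t10\t10", "N_multimapping\t5\t5\t5",
          "N_noFeature\t3\t4\t3", "N_ambiguous\t1\t0\t1",
          "geneA\t51\t0\t51", "geneB\t479\t1\t478")
for (s in c("sample1", "sample2")) {
  writeLines(mock, file.path(count_dir, paste0(s, ".ReadsPerGene.out.tab")))
}

number <- 4  # column to extract; must be a single value: 2, 3 or 4
ff <- list.files(count_dir, pattern = "ReadsPerGene[.]out[.]tab$",
                 full.names = TRUE)
counts.files <- lapply(ff, read.table, skip = 4)

# Check 1: at least one file was found
stopifnot(length(ff) > 0)

# Check 2: every file has the same number of rows and enough columns
dims <- t(sapply(counts.files, dim))
print(dims)
stopifnot(length(unique(dims[, 1])) == 1, all(dims[, 2] >= number))

counts <- as.data.frame(sapply(counts.files, function(x) x[, number]))

# Check 3: one column per input file before names are assigned
stopifnot(ncol(counts) == length(ff))

colnames(counts) <- sub("[.]ReadsPerGene[.]out[.]tab$", "", basename(ff))
row.names(counts) <- counts.files[[1]]$V1
print(counts)
```

Whichever stopifnot() fails first points to the step that went wrong: no files matched, the files differ in shape, or the column extraction returned something other than one vector per file.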
