Combine RNA-seq read counts and transcript lengths
4.6 years ago
mikysyc2016 ▴ 120

I have two text files. One contains transcript IDs and gene lengths; the other contains transcript IDs and the read counts for each sample. I want to combine them into a single table with transcript ID, read counts, and length. How can I do this? I know VLOOKUP can do it, but it does not work well for large data. Thanks!

RNA-Seq • 951 views

Give an example of both files and the intended output please.


They look like this. The first file is:

Transcript    KO1  KO2  KO3  WT1  WT2  WT3  78   79   81   66   68   70   27   28   29
NM_001011874  3    0    0    2    3    0    1    0    0    1    0    0    1    3    2
NM_001195662  0    0    0    2    1    0    0    0    0    0    0    0    0    0    0
NM_011283     0    0    0    2    1    0    0    0    0    0    0    0    0    0    0
NM_011441     769  153  314  871  158  399  289  224  888  275  270  1031 285  1360 821

... and the other one is:

Transcript    length
NR_040439     1687
NM_013715     1239
NM_026493     4354
NM_001164233  2195
NM_027584     2328
NM_001102430  7042
NM_172851     5120

4.6 years ago
c.chakraborty ▴ 170

You can load both files into R and then create a data frame with transcript ID, gene.length, and read.counts together. Use reshape2 and plyr for merging the files. See this related thread: "Help with 2 list in R, comparing gene ID to get refined information from both".
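For readers who want to see the idea of a key-based merge spelled out, here is a minimal sketch. It uses plain Python rather than the R packages suggested above, so it is a swapped-in illustration and not the method from this answer. The function names (`read_table`, `merge_counts_lengths`) and the toy transcript IDs and values are made up for the example; it assumes both files are whitespace-delimited with the transcript ID in the first column and a header row.

```python
import io


def read_table(handle):
    """Read a whitespace-delimited table with a header row.

    Returns (header, rows) where rows maps the first column (the ID)
    to the list of remaining fields on that line.
    """
    lines = (line.split() for line in handle if line.strip())
    header = next(lines)
    return header, {fields[0]: fields[1:] for fields in lines}


def merge_counts_lengths(counts_handle, lengths_handle):
    """Join counts and lengths on transcript ID (IDs present in both files only)."""
    counts_header, counts = read_table(counts_handle)
    _, lengths = read_table(lengths_handle)
    merged = [counts_header + ["length"]]
    for transcript_id, values in counts.items():
        # Lookup is by ID, so the row order of the two files does not matter.
        if transcript_id in lengths:
            merged.append([transcript_id] + values + lengths[transcript_id])
    return merged


# Toy data (hypothetical IDs and values, not taken from the thread).
counts_txt = """Transcript KO1 WT1
tx_A 3 2
tx_B 0 1
tx_C 769 871
"""
lengths_txt = """Transcript length
tx_C 1200
tx_D 900
tx_A 1500
"""

merged = merge_counts_lengths(io.StringIO(counts_txt), io.StringIO(lengths_txt))
for row in merged:
    print("\t".join(row))
```

With real files, the two `io.StringIO` objects would be replaced by `open(...)` handles. This sketch keeps only transcripts found in both files; whether to keep unmatched transcripts as well is a choice that depends on the downstream analysis.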


Thank you for your reply. My case is a little different: the order of the transcript IDs differs between the two files, and one file has about 20,000 IDs while the other has about 30,000.
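When the two files list different numbers of IDs in different orders, it can help to check how many IDs they share before merging anything. The sketch below is one way to do that check in Python with set operations. The helper name `id_overlap` and the toy IDs are hypothetical, chosen only for illustration.

```python
def id_overlap(first_ids, second_ids):
    """Compare two collections of IDs, ignoring order and duplicates."""
    first, second = set(first_ids), set(second_ids)
    return {
        "shared": first & second,        # IDs present in both files
        "only_first": first - second,    # IDs missing from the second file
        "only_second": second - first,   # IDs missing from the first file
    }


# Toy IDs (hypothetical), deliberately in a different order in each list.
count_ids = ["tx_A", "tx_B", "tx_C"]
length_ids = ["tx_C", "tx_D", "tx_A"]

overlap = id_overlap(count_ids, length_ids)
for name, ids in overlap.items():
    print(name, len(ids), sorted(ids))
```

If most of the IDs in the smaller file turn up under "shared", a merge keyed on the transcript ID should work regardless of row order. A large "only_first" or "only_second" group may point to a mismatch in annotation versions or ID formats.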
