A script for extracting information related to a list of gene names from a file
5.8 years ago
ahmad.iut ▴ 90

Dear Biostars,

I have a text file containing several rows and columns like this:

"Gene Name"  "Gene Id"   "description"     "GO"
A 1 phosphatase GO:001256
B 2 synthesize GO:013154
C 3 methylase GO:000054
D 4 kinase GO:001254
E 5 oxigenase GO:001354
F 6 synthesize GO:001254

In addition, I have another text file just containing one column and several rows like this:

Gene Name
A
D
C
B

I need to extract the rows of file 1 whose gene names are listed in file 2.

Does anybody have any idea how to do that?

PS: I know how to do this in Excel, but it does not cope with a very large number of rows.

Thank you

RNA-Seq gene script data mining • 3.1k views
5.8 years ago
Benn 8.2k

You can do it in R; the subset function works pretty intuitively for this.


In R:

file1<-read.table("file1.txt", sep="\t", header=T)
file2<-read.table("file2.txt", sep="\t", header=T)
Selection<-file1[file1$"Gene name" %in% file2$"Gene Name",]

You don't even have to use the subset function.


Dear Nota,

Thank you for your answer. These commands in R returned only the headers:

Gene.Name   Gene.Id     description GO

OK, R replaces the spaces in the header names with dots when it reads the file.

So you can use:

Selection<-file1[file1$Gene.Name %in% file2$Gene.Name,]

Thank you so much, Nota. It worked well. The problem was the spaces in the headers (like Gene Name).

5.8 years ago

Using Linux:

join -1 1 -2 1 <(sort -k1,1 file1.txt) <(sort -k1,1 file2.txt)  > joined.txt
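If you want to keep the header line and the original row order of file 1, an awk-based lookup is one alternative to sorting and joining. The following is only a minimal sketch, assuming both files are tab-separated and each has a single header line (the tab separator is taken from the R answer above). The file names `demo_file1.txt`, `demo_file2.txt`, and `selected.txt` are hypothetical, and the demo data just mirrors the example in the question.

```shell
# Create small tab-separated demo files (hypothetical names, data copied from the question).
printf 'Gene Name\tGene Id\tdescription\tGO\n'  > demo_file1.txt
printf 'A\t1\tphosphatase\tGO:001256\n'        >> demo_file1.txt
printf 'B\t2\tsynthesize\tGO:013154\n'         >> demo_file1.txt
printf 'C\t3\tmethylase\tGO:000054\n'          >> demo_file1.txt
printf 'D\t4\tkinase\tGO:001254\n'             >> demo_file1.txt
printf 'E\t5\toxigenase\tGO:001354\n'          >> demo_file1.txt
printf 'F\t6\tsynthesize\tGO:001254\n'         >> demo_file1.txt
printf 'Gene Name\nA\nD\nC\nB\n'                > demo_file2.txt

# Pass 1 (NR==FNR): read the gene list, skip its header, remember each name.
# Pass 2: print the header of file 1 plus every row whose first column was remembered.
awk -F'\t' '
    NR == FNR { if (FNR > 1) wanted[$1] = 1; next }
    FNR == 1 || ($1 in wanted)
' demo_file2.txt demo_file1.txt > selected.txt

cat selected.txt
```

Because the gene list is held in memory and file 1 is streamed once, this should cope with a large file 1 as long as the gene list itself fits in memory. Splitting on tabs with `-F'\t'` also keeps a header such as "Gene Name" in one field.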

Using KNIME (knime.org):

Load both files (Read File) into two tables and join them using a "Join" node: https://www.knime.org/files/nodedetails/_manipulation_column_column_split_combine_Joiner.html


Dear Lindenbaum,

The command worked perfectly. Thank you very much
