Question

Removing the decimal from Ensembl gene IDs in DESeq2 output

1

6.6 years ago

1769mkc ★ 1.2k

I want to remove the decimal part (the version number) from the Ensembl gene IDs. Because the IDs contain a decimal point, it is difficult to map them to gene names.

gene                   "nH1.bam"    "nH2.bam"   "nH3.bam"              "nH4.bam"
"ENSG00000238164.4" -0.6534833425   -0.6404869759   -0.5898568965   -0.586357257
"ENSG00000049249.6" 1.0589150487    0.2235087421    0.5028436068    0.5201173416

I want my gene field to contain "ENSG00000049249" instead of "ENSG00000049249.6".

I tried awk '{gsub(/\..*$/,$1)}1', but it seems to mess up the data frame, and I'm not sure what I'm doing wrong.

Any help or suggestions would be highly appreciated.

R ensembl • 5.5k views

ADD COMMENT • link updated 6.6 years ago by Emily 23k • written 6.6 years ago by 1769mkc ★ 1.2k

score 8 · Accepted Answer · 2017-09-15

6.6 years ago

Pierre Lindenbaum 161k

sed 's/\(ENSG[0-9]*\)\.[0-9]*/\1/g' input.txt

ADD COMMENT • link 6.6 years ago by Pierre Lindenbaum 161k
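
A possible reason the awk attempt in the question mangles the table: with two arguments, gsub() works on the whole line ($0) and treats the second argument as the replacement text. So gsub(/\..*$/,$1) replaces everything from the first dot to the end of the line with the contents of the first field, and the numeric columns are lost. Below is a minimal sketch of an awk variant that edits only the first column. It assumes the file is tab-separated (the question does not say), and the function name strip_first_column_version is hypothetical.

```shell
# Hypothetical helper: remove ".<digits>" from the first column only.
# Assumption: the input is tab-separated. Setting FS and OFS to a tab
# keeps the tabs intact when awk rebuilds the line after $1 is modified.
strip_first_column_version() {
  awk 'BEGIN { FS = OFS = "\t" }
       { sub(/\.[0-9]+/, "", $1) }   # third argument limits the edit to field 1
       1' "$@"                       # "1" prints every (possibly modified) line
}

# Usage (input.txt as in the answer above; output.txt is a placeholder name):
# strip_first_column_version input.txt > output.txt
```

Because only $1 is edited, decimal points in the expression values and in header names such as "nH1.bam" are left alone.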

0

Thank you very much.

ADD REPLY • link 6.6 years ago by 1769mkc ★ 1.2k

0

Hi Pierre, your solution worked very well, but would you mind explaining the regular expression?

For example, I am not sure where the version number gets replaced with nothing. I understand that "\1" writes the matched group back to the output and that the trailing "g" means global, but where exactly does the substitution happen?

Thanks.

ADD REPLY • link 6.1 years ago by r.t.greenblatt • 0
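
One way to read the expression, offered as a sketch and not as Pierre's own explanation: nothing is replaced with a blank explicitly. The pattern matches the ID together with its version, and the replacement writes back only the captured ID, so the version simply is not reinserted. The variable names below are illustrative.

```shell
# s/PATTERN/REPLACEMENT/g, piece by piece:
#   \(ENSG[0-9]*\)  group 1: "ENSG" followed by digits (the ID itself)
#   \.[0-9]*        a literal dot plus the version digits; matched, not captured
#   \1              the replacement: group 1 only, so the ".4" is left out
#   g               repeat for every match on the line
id_line='"ENSG00000238164.4" -0.6534833425'
stripped=$(printf '%s\n' "$id_line" | sed 's/\(ENSG[0-9]*\)\.[0-9]*/\1/g')
printf '%s\n' "$stripped"
# prints: "ENSG00000238164" -0.6534833425
```

The expression value keeps its decimal point because the pattern only matches a dot that directly follows ENSG and its digits.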

0

How would you alter the command if there are two digits after the decimal point, e.g. ENSG00000000460.15? I am not able to remove the numbers after the decimal point in such cases using the above command.

ADD REPLY • link 4.1 years ago by fawazfebin ▴ 100
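
For what it is worth, [0-9]* in the accepted command matches any number of digits, so a two-digit version such as .15 should already be handled, as the first command below illustrates. If it still fails, one guess is that the IDs do not start with ENSG (for example transcript IDs, or IDs from another species). The second command is a hypothetical generalisation for that case. The function name strip_ensembl_version and the broader ENS[A-Z]* prefix are assumptions, not part of the original answer.

```shell
# The accepted command applied to a two-digit version.
two_digit=$(echo 'ENSG00000000460.15' | sed 's/\(ENSG[0-9]*\)\.[0-9]*/\1/g')
echo "$two_digit"
# prints: ENSG00000000460

# Hypothetical generalisation: accept any uppercase letters after "ENS"
# (e.g. ENST..., ENSMUSG...) before the digits.
strip_ensembl_version() {
  sed 's/\(ENS[A-Z]*[0-9]*\)\.[0-9]*/\1/g' "$@"
}

mouse=$(echo 'ENSMUSG00000000001.4' | strip_ensembl_version)
echo "$mouse"
# prints: ENSMUSG00000000001
```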