Question: Splitting NCBI Accession ID
1
gravatar for Promi
2.7 years ago by
Promi10
Promi10 wrote:

Hi,

I have a file 'ids.txt' containing IDs like this:

gi|78062356|ref|YP_372264.1|

gi|206563435|ref|YP_002234198.1|

gi|402568881|ref|YP_006618225.1|

gi|54024439|ref|YP_118681.1|

gi|146275970|ref|YP_001166130.1|

How can I split them into two columns, one containing the GI numbers and the other containing the ref accessions, as shown below?

78062356       YP_372264.1

206563435     YP_002234198.1

402568881     YP_006618225.1

54024439       YP_118681.1

146275970     YP_001166130.1

Preferably in R or Python.

Tags: python, blast, accessionid, R, ncbi

Sej Modha wrote (2.7 years ago):

Python version:

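# Print the GI number and accession from each 'gi|<gi>|ref|<acc>|' line.
with open('ids.txt') as f1:
    for line in f1:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        old = line.split('|')
        gi = old[1]
        acc = old[3]
        print(gi + '\t' + acc)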
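
If the file might contain lines that do not follow the gi|...|ref|...| layout, the same split can be wrapped in a small function that checks the format first. This is only a sketch: the function name split_ncbi_id is made up for illustration, and the check assumes every valid line starts with "gi".

```python
def split_ncbi_id(line):
    """Return (gi, accession) from a 'gi|<gi>|<db>|<accession>|' string."""
    fields = line.strip().split("|")
    # Expected layout: ['gi', '<gi>', '<db>', '<accession>', ...]
    if len(fields) < 4 or fields[0] != "gi":
        raise ValueError("unexpected ID format: %r" % line)
    return fields[1], fields[3]


# Two of the IDs from the question, used as sample input.
sample = [
    "gi|78062356|ref|YP_372264.1|",
    "gi|206563435|ref|YP_002234198.1|",
]
for entry in sample:
    gi, acc = split_ncbi_id(entry)
    print(gi + "\t" + acc)
```

Raising an error on a malformed line makes bad input visible instead of silently printing the wrong fields.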

h.mon wrote (2.7 years ago):

cut -d'|' -f2,4 ids.txt | tr '|' '\t'
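
Since the question asked for Python, a rough equivalent of this field selection can be sketched with the standard csv module, reading '|'-delimited rows and writing tab-delimited ones. The function name convert and the output file name in the usage note are assumptions for illustration only.

```python
import csv


def convert(in_path, out_path):
    """Copy fields 2 and 4 of a '|'-delimited file to a tab-delimited file."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter="|")
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        for row in reader:
            if len(row) < 4:
                continue  # skip blank or malformed lines
            writer.writerow([row[1], row[3]])
```

It could be called as convert("ids.txt", "ids.tsv"), where ids.tsv is a hypothetical output name.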

Promi replied (2.7 years ago):

Worked perfectly. Thanks!

genomax wrote (2.7 years ago):

"Preferably in R or Python."

When you specify a requirement like this, you should also say whether it is an assignment question. If not, this can easily be done in the shell (awk -F '|' '{print $2"\t"$4}' your_file > new_file).

Promi replied (2.7 years ago):

Worked perfectly! Thanks!

Sorry, I didn't understand what you meant about the assignment question.


genomax replied (2.7 years ago):

People sometimes ask for solutions in a specific language if they are looking for answers to assignment/homework questions.

Sej Modha wrote (2.7 years ago):

Simple sed solution:

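sed -e 's/^gi|//;s/|[a-z]*|/\t/;s/|$//' inputfile.txt

(The last expression removes the trailing '|', which would otherwise be left on each line; \t in the replacement is a GNU sed extension.)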
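
The same pattern-based idea can be sketched in Python with the re module. The pattern below assumes the GI is all digits and the database tag (such as ref) is lowercase letters; the names ID_PATTERN and parse_id are made up for this example.

```python
import re

# gi|<digits>|<db tag>|<accession>| with the trailing pipe optional
ID_PATTERN = re.compile(r"^gi\|(\d+)\|[a-z]+\|([^|]+)\|?$")


def parse_id(line):
    """Return 'gi<TAB>accession', or None if the line does not match."""
    match = ID_PATTERN.match(line.strip())
    if match is None:
        return None
    return match.group(1) + "\t" + match.group(2)


# One of the IDs from the question, used as sample input.
print(parse_id("gi|146275970|ref|YP_001166130.1|"))
```

Returning None for non-matching lines lets the caller decide whether to skip them or report them.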