Question

Efetch: extracting many FASTA sequences by position

0

3.9 years ago

bengoaluoni • 0

Hello, I am new to bioinformatics and I need a short command line to help me extract FASTA sequences. I installed EDirect on Ubuntu and I have read a lot about the efetch command.

I tried running this line:

efetch -db nuccore -format fasta -id NC_035437.1 -chr_start 214621161 -chr_stop 214618066

It works, but I need to run about 200 of these lines. How can I do it?

My input table looks something like this:

NC_035437.1 214621161   214618066
NC_035437.1 209015121   209019563
NC_035437.1 208791830   208794856
NC_035437.1 194797143   194795212
NC_035437.1 187148585   187150444
NC_035437.1 167068722   167071843
NC_035433.1 131712739   131714461

Thanks ;)

Efetch • 687 views

2

3.9 years ago

GenoMax 141k

Try this (it assumes the coordinates in your input file are separated by whitespace):

$ awk -F ' ' '{print $1,$2,$3}' input_coord_file | xargs -n 3 sh -c 'efetch -db nuccore -format fasta -id $0 -chr_start $1 -chr_stop $2'

0

Thank you so much!!! It works ;)

3.9 years ago by bengoaluoni • 0

0

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they all work.

3.9 years ago by Ram 43k

score 1 · Accepted Answer · 2020-05-20

1

3.9 years ago

Pierre Lindenbaum 161k

cat input.txt | while read A B E; do efetch -db nuccore -format fasta -id "${A}" -chr_start "${B}" -chr_stop "${E}" ; done
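
For around 200 regions it may be handy to wrap the loop in a small function and send everything to one FASTA file. The following is only a sketch under a few assumptions: the input has three whitespace-separated columns (accession, start, stop) as in the question; `fetch_regions` and the `EFETCH_CMD` variable are hypothetical names introduced here (the latter only so the loop can be dry-run without contacting NCBI); and the `</dev/null` redirect is a defensive habit in `while read` loops, in case the called program reads standard input and swallows the remaining lines.

```shell
# Sketch: run efetch once per line of a coordinate table.
# fetch_regions and EFETCH_CMD are illustrative names, not part of EDirect.
fetch_regions() {
    coords=$1
    efetch_cmd=${EFETCH_CMD:-efetch}   # defaults to the real efetch
    # The "|| [ -n "$acc" ]" part keeps a final line that has no trailing newline.
    while read -r acc start stop || [ -n "$acc" ]; do
        # Skip blank lines and lines starting with '#'.
        case $acc in ''|'#'*) continue ;; esac
        # </dev/null stops the called command from consuming the loop's input.
        $efetch_cmd -db nuccore -format fasta -id "$acc" \
            -chr_start "$start" -chr_stop "$stop" </dev/null
    done < "$coords"
}
```

Usage would look like `fetch_regions input.txt > regions.fasta`. To preview the commands first, something like `EFETCH_CMD=echo fetch_regions input.txt` prints the arguments instead of running efetch. With many requests in a row, a short `sleep` inside the loop may help stay within NCBI's request rate limits.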