FASTA extraction from BED file
5.2 years ago
baurumon ▴ 30

Hello,

How can I extract FASTA sequences in reverse orientation? I have a BED file in which the start position is greater than the end position for some records, which could mean they come from the reverse strand. How can I extract the sequence for such intervals?

NC_037130.1 12295912 12286289

Please help me.

Thanks in advance

alignment • 2.7k views
5.2 years ago
alex.zaccaron ▴ 470

It sounds like you want to extract the sequences in the correct orientation, and your 3-column BED file has reversed coordinates when the sequence is on the reverse strand. Modifying ATpoint's suggestion, you could still use bedtools getfasta to extract the sequence in the correct orientation with:

awk 'BEGIN {OFS="\t"} {if ($2 > $3) print $1, $3, $2, ".", ".", "-"; else print $1, $2, $3, ".", ".", "+"}' file.bed | bedtools getfasta -s -fi ref.fasta -bed -
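To check what the awk step above produces before piping it into bedtools, it can help to run it on the single interval from the question. This is only a minimal sketch: the function name `normalize_bed` is hypothetical, and it assumes whitespace-separated three-column input.

```shell
# normalize_bed (hypothetical helper name): reads 3-column intervals on stdin
# and writes 6-column BED with start < end plus a strand column.
normalize_bed() {
  awk 'BEGIN { OFS = "\t" }
       {
         if ($2 > $3) {
           # reversed coordinates: swap them and mark the minus strand
           print $1, $3, $2, ".", ".", "-"
         } else {
           # already ordered: keep them and mark the plus strand
           print $1, $2, $3, ".", ".", "+"
         }
       }'
}

# The interval from the question
printf 'NC_037130.1 12295912 12286289\n' | normalize_bed
# prints (tab-separated): NC_037130.1  12286289  12295912  .  .  -
```

With `-s`, bedtools getfasta then reverse-complements the records marked `-`. Note that BED start coordinates are 0-based; if the reversed coordinates come from a 1-based alignment report, the start would also need to be reduced by one. The thread does not say which applies.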

Thank you very much.


Hello baurumon,

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.

5.2 years ago
ATpoint 85k
awk 'BEGIN {OFS="\t"} {print $1, $3, $2}' your.file | bedtools getfasta (...)

Thanks,

But will it be the same position that I want?

As I understand it, this awk command will print the coordinates as 12286289 12295912 and the FASTA will then be extracted from that interval. After alignment I found some coordinates in reverse order, so I moved them into a separate BED file in order to extract those positions.


As I understand it, the convention in genomics is that the genome itself (in a bioinformatics context) is unstranded, because all positions refer to the top strand. If you want something from the minus strand, this is indicated by a - in the strand column; the tool then extracts the DNA sequence from the top strand and reverse-complements it because, again by convention, sequences are always written 5'->3'. It is therefore odd that you have a record with $2 > $3 at all; this should probably not happen. Where did you get that file from?
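To make the reverse-complement step concrete, here is a small sketch on a toy sequence. It is not what bedtools runs internally, only an illustration of the operation; the helper name `revcomp` is hypothetical, and it assumes the `rev` utility is available and that the sequence contains only A, C, G and T.

```shell
# revcomp (hypothetical helper name): reverse-complement DNA read from stdin.
# rev reverses each line; tr swaps each base for its complement.
revcomp() {
  rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Top-strand sequence of a toy interval
printf 'ACCGT\n' | revcomp
# prints: ACGGT
```

The output is the minus-strand sequence of the same interval, written in the conventional 5' to 3' direction.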
