How to extract genomic sequence using genomic coordinates
6.5 years ago
1769mkc ★ 1.2k

This is a very small subset of my sequences:

chr1_6237786_6251176_F
chr1_10150615_10150781_R
chr1_12911118_12934193_R
chr1_13142230_13142459_R
chr1_13640475_13640801_R
chr1_13640480_13640801_R

I want to extract the sequences that lie within the coordinates listed above. How do I do that? Should I use a reference such as the hg38 chromosome sequences to look up the coordinates? Any help or suggestions would be highly appreciated.

RNA-Seq • 5.4k views

This question has been asked many times here, so please search the site, e.g. "Extract User Defined Region From An Fasta File".


Thank you, I will look into it.

6.5 years ago

Hi, convert your coordinates to BED format and use "getfasta" from bedtools. For details, see http://bedtools.readthedocs.io/en/latest/content/tools/getfasta.html.

Example BED file for your sample:

chr1    6237786     6251176     x   y   +
chr1    10150615    10150781    x   y   -
chr1    12911118    12934193    x   y   - 
chr1    13142230    13142459    x   y   -
chr1    13640475    13640801    x   y   -
chr1    13640480    13640801    x   y   -

Best
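
As a rough sketch of how the getfasta step might be run once the BED file exists, the snippet below builds the command line from Python. The file names (hg38.fa, regions.bed, regions.fa) and the helper name getfasta_command are placeholders chosen for illustration, not something given in this thread. The -s flag asks bedtools to reverse-complement intervals on the minus strand.

```python
import os
import shutil
import subprocess

def getfasta_command(genome_fa, bed_path, out_fa, stranded=True):
    """Build the argument list for `bedtools getfasta` (hypothetical helper)."""
    cmd = ["bedtools", "getfasta", "-fi", genome_fa, "-bed", bed_path, "-fo", out_fa]
    if stranded:
        cmd.append("-s")  # reverse-complement intervals whose strand column is '-'
    return cmd

# Placeholder file names; replace them with your own paths.
cmd = getfasta_command("hg38.fa", "regions.bed", "regions.fa")
print(" ".join(cmd))

# Run only when bedtools is installed and both input files are present.
if shutil.which("bedtools") and os.path.exists("hg38.fa") and os.path.exists("regions.bed"):
    subprocess.run(cmd, check=True)
```

The genome FASTA has to use the same chromosome names as the BED file (chr1 here), otherwise getfasta will not find the intervals.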


So I have to convert my FASTA file to a BED file and then get the coordinates, is that what you are saying?


No.

You need to convert your coordinates to BED format. I assumed that your sample file is a coordinate file.


Well, my input is what I put in my question. That is all the information I have.


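@krushnach: do this on your input file

awk -F'_' 'BEGIN{OFS="\t"} {print $1, $2, $3, $0, 0, ($4=="F" ? "+" : "-")}' file.bed > modified_file.bed

This splits each ID on "_" and writes six tab-separated BED columns, with F/R converted to +/- in the strand column (a plain replacement of "_" by tabs would leave F/R in column 4, where bedtools does not read strand). It assumes the chromosome names themselves contain no underscores. Then use this file with bedtools.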
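
The conversion from IDs to BED lines could also be sketched in Python, which makes it easy to map F/R to +/- and to keep contig names that contain underscores intact. The function name id_to_bed6 is made up for this example, and the snippet assumes the start coordinates are already 0-based as BED expects; if yours are 1-based, subtract 1 from the start.

```python
def id_to_bed6(region_id):
    """Turn 'chr1_6237786_6251176_F' into a tab-separated BED6 line."""
    region_id = region_id.strip()
    # Split from the right so chromosome names containing '_' stay in one piece.
    chrom, start, end, orient = region_id.rsplit("_", 3)
    strand = {"F": "+", "R": "-"}[orient]
    # Assumption: start is already 0-based (BED convention); use int(start) - 1 otherwise.
    return "\t".join([chrom, str(int(start)), str(int(end)), region_id, "0", strand])

ids = ["chr1_6237786_6251176_F", "chr1_10150615_10150781_R"]
for rid in ids:
    print(id_to_bed6(rid))
```

The original ID is kept in the name column, so it can be carried into the FASTA headers if getfasta is run with -name.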


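If you do not need strand information, GNU sed also works (other sed implementations may not understand \t in the replacement):

$ sed 's/_/\t/g' old.bed > new.bed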

6.5 years ago
Joe 21k

You can use Python's slicing notation to extract sub-sequences. I implemented it in this code; look at lines 83-89.
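
The linked code is not reproduced here, so the following is only a minimal sketch of the slicing idea, not the code referred to above. It assumes the chromosome is already loaded as a plain string and that coordinates are 0-based and end-exclusive, as in BED; the function name extract and the toy sequence are illustrative.

```python
# Translation table for complementing DNA; N stays N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def extract(seq, start, end, strand="+"):
    """Return seq[start:end]; reverse-complement it when strand is '-'."""
    sub = seq[start:end]
    if strand == "-":
        sub = sub.translate(COMPLEMENT)[::-1]
    return sub

chrom = "ACGTTGCAAC"  # toy stand-in for a chromosome sequence
print(extract(chrom, 1, 5))       # CGTT
print(extract(chrom, 1, 5, "-"))  # AACG
```

For a whole genome, loading each chromosome as a string this way uses a lot of memory, which is one reason the indexed approach of bedtools getfasta is usually preferred.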
