Question: Is There A Tool To Extract The Reference Genomic Sequence From A Given Coordinate?
7.6 years ago by
bibb7780
Chile
bibb7780 wrote:

Hello everyone, this is my problem:

I have a list of SNP coordinates in this format (chr1 through chrY, ~580k SNPs), all on build GRCh37:

chr1:108681808
chr1:109440678
chr1:109479801
chr1:110655430
chr1:11193226
chr1:113933669
chr1:115258741
chr1:115527488
...

I also have the build 37 reference genome in .fa format:

>chr1
NNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNN....gctttatacaatctat
ttgttactttttattctattttgcatttt
gttcctttgcctgaataattcactttggt
ctgcaatggctaattgcaatagattt...

Is there a tool to obtain the reference base at each given coordinate? The output I am after looks like this:

chr1:108681808  A
chr1:109440678  C
chr1:109479801  T
chr1:110655430  G
chr1:11193226    A
chr1:113933669  T
chr1:115258741  C
chr1:115527488  C

I hope you can help me.

vcf coordinates reference • 3.7k views
ADD COMMENTlink modified 7.5 years ago by Gabriel R.2.8k • written 7.6 years ago by bibb7780

exact duplicate of

getting sequence based on chromosome no and coordinates from whole genome fasta file

ADD REPLYlink written 7.6 years ago by Pierre Lindenbaum134k
7.5 years ago by
Gabriel R.2.8k
Danmarks Tekniske Universitet
Gabriel R.2.8k wrote:

You have a relatively small number of positions, so I would just do the following:

  1. Put all your FASTA files into one and index the reference with samtools faidx.
  2. Turn each chr1:108681808 into a region of the form chr1:108681808-108681808; a quick awk should work.
  3. For each region, run samtools faidx on the reference:

    for i in $(awk 'BEGIN{FS=":"}{print $1":"$2"-"$2}' file.pos)
    do
        # print the region followed by a tab, without a trailing newline
        printf '%s\t' "$i"
        # drop the ">" header line so only the base itself is printed
        samtools faidx reference.fa "$i" | grep -v '^>'
    done
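
With ~580k positions, one samtools call per position may get slow. As a rough alternative, here is a minimal single-pass sketch in plain awk that needs no index. It assumes the positions file has one chr:pos entry per line, that positions are 1-based, and that the FASTA header names match the names in the positions file. The function name extract_bases is made up for illustration, and the loop is not tuned for speed.

```shell
# extract_bases POSITIONS FASTA   (hypothetical helper name)
# Prints "chr:pos<TAB>base" for each requested position found in the FASTA.
# Output follows the order of the FASTA file, not the order of POSITIONS.
extract_bases() {
    awk '
        # First file: remember every requested chr:pos pair.
        NR == FNR { split($1, a, ":"); want[a[1], a[2]] = 1; next }
        # FASTA header: switch chromosome and reset the running offset.
        /^>/ { chrom = substr($1, 2); offset = 0; next }
        # Sequence line: report any wanted 1-based positions that fall on it.
        {
            len = length($0)
            for (i = 1; i <= len; i++)
                if ((chrom, offset + i) in want)
                    print chrom ":" (offset + i) "\t" substr($0, i, 1)
            offset += len
        }
    ' "$1" "$2"
}
```

It could be called as extract_bases file.pos reference.fa > bases.tsv. Bases are reported exactly as stored in the FASTA, so soft-masked regions come out in lowercase.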

ADD COMMENTlink modified 7.5 years ago • written 7.5 years ago by Gabriel R.2.8k
7.5 years ago by
SRKR180
Visakhapatnam
SRKR180 wrote:

What is the size of the genome you are working with, i.e. the sequence in the .fa file? If it's reasonably small, I can create an online tool for the purpose.

ADD COMMENTlink modified 7.5 years ago • written 7.5 years ago by SRKR180