8 months ago by
Seattle, WA USA
If your FASTA record header contains metadata giving the sequence's location on the genome, you can use that location directly to map SNPs to it with BEDOPS:
$ echo -e 'chr2\t2241384\t2244383' | bedmap --echo --echo-map-id-uniq --delim '\t' - <(vcf2bed < snps.vcf) > answer.bed
Replace chr2 with 2, depending on the format of the chromosome name in your snps.vcf file. This will most likely be either UCSC (chr2) or Ensembl (2) style.
The file answer.bed will contain the 3k nt interval and a list of all unique SNP ID values that map to (that is, overlap) that interval.
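As a rough sketch of pulling the interval out of a FASTA header instead of typing it by hand: the example below assumes a header of the form >chr2:2241384-2244383, which is only one possible layout and may not match your file. The function name header_to_bed is made up for illustration, and the coordinates are passed through unchanged, so it also assumes they are already in the form you would give to echo above.

```shell
# Hypothetical helper: turn FASTA headers like ">chr2:2241384-2244383"
# into tab-delimited BED lines. The header layout is an assumption.
header_to_bed() {
  awk '/^>/ {
         sub(/^>/, "", $1)                 # drop the leading ">"
         n = split($1, parts, /[:-]/)      # chrom, start, end
         # Skip headers that do not split into exactly three parts.
         if (n == 3) print parts[1] "\t" parts[2] "\t" parts[3]
       }'
}

# Made-up record for illustration only.
printf '>chr2:2241384-2244383\nACGT\n' | header_to_bed
```

The output line can be piped into bedmap in place of the echo statement. If the header coordinates are 1-based, subtract 1 from the start before using them as BED.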
If you don't have metadata in the FASTA header that tells you where the sequence is located, you could run a BLAST search with the sequence to get back its location in your genome of interest.
Then you run the command above, again replacing what goes into echo with whatever region comes out of the BLAST search.
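If the BLAST search is run with tabular output (-outfmt 6), the hit can be converted to a BED interval with a little awk rather than copied by hand. This is a minimal sketch: the function name blast6_to_bed and the sample hit are made up for illustration. It relies on the default tabular column order, in which columns 2, 9 and 10 are the subject ID, subject start and subject end, and it assumes the subject IDs in your BLAST database match the chromosome names in snps.vcf.

```shell
# Hypothetical helper: convert BLAST tabular hits (-outfmt 6) to BED.
# Columns 2, 9, 10 are sseqid, sstart, send (1-based, inclusive).
# For minus-strand hits send < sstart, so order the pair first.
blast6_to_bed() {
  awk 'BEGIN { FS = OFS = "\t" }
       {
         s = $9 + 0; e = $10 + 0           # force numeric comparison
         start = (s < e) ? s : e
         end   = (s < e) ? e : s
         print $2, start - 1, end          # BED: 0-based start, exclusive end
       }'
}

# Made-up hit for illustration only (not real BLAST output).
printf 'query1\tchr2\t100.0\t3000\t0\t0\t1\t3000\t2241384\t2244383\t0.0\t5541\n' | blast6_to_bed
```

With the hits saved to a file (here a hypothetical hits.tsv), the result can go straight into the same bedmap call, using sort-bed from BEDOPS to make sure the intervals are sorted:

```shell
$ blast6_to_bed < hits.tsv | sort-bed - | bedmap --echo --echo-map-id-uniq --delim '\t' - <(vcf2bed < snps.vcf) > answer.bed
```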