Extract sequences flanking SNPs from organism without reference genome
7.6 years ago

Hi, I'm new to working with DNA sequences and I'm looking for help. I have FASTQ and VCF files from the sequencing of my samples. The plant species I work on (Lagenaria siceraria) has no reference genome. What I want to do is extract the sequences flanking my SNPs and BLAST them against Plant RefSeq to determine the putative functions of my SNPs. All the posts I've read deal with model organisms. Could someone help me, please?

Thanks

SNP blast • 1.9k views

Do you also have a GTF file? If so, you could try the Variant Effect Predictor (VEP). It can annotate the variants if you have the FASTA and the GTF. Check its help on caches and databases.

Thanks. No, I don't have a GTF file.

How was your VCF generated without a reference? Contig assembly then alignment? Alignment to related species reference?

Thanks Harold. It is GBS, so the DNA was digested with ApeKI and the final libraries were sequenced on Illumina. Alignment was done against a partially sequenced genome of my species (made of contigs, so not annotated).

Thanks a lot for your help.

7.6 years ago

You can use BEDtools 'slop' to create a BED file of the flanking coordinates, then 'getfasta' to extract the sequences.

Note: this will require the reference contigs to which the reads were aligned.
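To make the logic of the slop-then-getfasta approach concrete, here is a minimal pure-Python sketch of what those two steps compute. It is an illustration of the coordinate arithmetic, not BEDtools itself: the function names (`read_fasta`, `flanking_windows`, `to_fasta`), the default flank of 100 bp, and the toy contig are all assumptions made for the example. It assumes single-base SNPs, a tab-separated VCF with 1-based positions, and contig names that match the FASTA headers.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {contig_name: sequence}."""
    chunks: Dict[str, list] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the contig name.
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {k: "".join(v) for k, v in chunks.items()}


def flanking_windows(
    vcf_lines: Iterable[str], contigs: Dict[str, str], flank: int = 100
) -> Iterator[Tuple[str, int, int, str]]:
    """Yield (contig, start, end, sequence) for each SNP.

    start/end are 0-based, half-open (BED style) and are clipped to the
    contig boundaries, which is the job 'slop' does with a genome file.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip VCF header lines and blanks
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])  # VCF POS is 1-based
        if chrom not in contigs:
            continue  # SNP on a contig missing from the FASTA
        seq = contigs[chrom]
        start = max(0, pos - 1 - flank)
        end = min(len(seq), pos + flank)
        yield chrom, start, end, seq[start:end]


def to_fasta(windows: Iterable[Tuple[str, int, int, str]]) -> str:
    """Format windows as FASTA records named contig:start-end."""
    return "".join(f">{c}:{s}-{e}\n{seq}\n" for c, s, e, seq in windows)


if __name__ == "__main__":
    # Toy data standing in for the real contigs and VCF.
    contigs = read_fasta([">contig1 toy example", "ACGTACGTAC"])
    vcf = ["##fileformat=VCFv4.2", "contig1\t5\t.\tA\tG\t.\t.\t."]
    print(to_fasta(flanking_windows(vcf, contigs, flank=2)), end="")
```

The resulting FASTA records can then be used as BLAST queries. For real data BEDtools will be much faster; note that 'slop' needs a genome file listing each contig name and its length so it can clip windows at contig ends.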

Thanks a lot Harold, I will try and keep you informed!

Thanks a lot Harold, it works very well!
