Combine REF/ALT alleles from VCF with extracted flanking sequences
4.0 years ago
wrengs • 0

Hi,

I have recently used bedtools flank in combination with getfasta to extract sequences flanking some structural variants, using a VCF file and genome file.

Link to the VCF file: ftp://ftp.solgenomics.net/genomes/tomato100/March_02_2020_sv_landscape/variants/LYC1969.ont.v1.0.s.vcf.gz

Link for the SL4.0 genome fasta: ftp://ftp.solgenomics.net/genomes/Solanumlycopersicum/assembly/build4.00/

For the first structural variant (ID = 261_0_1), 20bp flanking sequences were extracted:

Info from VCF file:

POS ID REF ALT
19623 261_0_1 ATATATATATATATATATATATATATATATATATA A

Output from bedtools flank:

SL4.0ch01 19602 19622

SL4.0ch01 19658 19678

Output from bedtools getfasta:

SL4.0ch01:19602-19622 GAATGTATTCATATATATAT

SL4.0ch01:19658-19678 TAAAATTCTAACTTGAGAAA

I was wondering whether the extracted flanking sequences could somehow be combined with the REF and ALT alleles from the VCF, e.g. using a tool, to produce output in the following format:

261_0_1 GAATGTATTCATATATATAT[ATATATATATATATATATATATATATATATATATA/A]TAAAATTCTAACTTGAGAAA

Of course, for a single structural variant I could do this manually. However, I intend to do this for several thousand structural variants across multiple VCF files.

Many thanks!
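
One way the requested output could be assembled is sketched below in Python. This is only a sketch under several assumptions, not a tested pipeline: it assumes bedtools getfasta was run with -tab (name and sequence separated by a tab), that bedtools flank produced exactly two intervals per variant (left, then right) in the same order as the VCF records, that no flank was dropped or clipped at a chromosome end, and that each record has a single ALT allele. The function names (read_vcf_alleles, read_flank_seqs, combine) are made up for illustration.

```python
def read_vcf_alleles(vcf_lines):
    """Yield (ID, REF, ALT) for every data line of a VCF (header lines skipped)."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Fixed VCF columns: CHROM POS ID REF ALT ...
        yield fields[2], fields[3], fields[4]


def read_flank_seqs(tab_lines):
    """Yield the sequence column of tab-delimited getfasta output."""
    for line in tab_lines:
        if line.strip():
            yield line.rstrip("\n").split("\t")[1]


def combine(vcf_lines, tab_lines):
    """Yield 'ID LEFT[REF/ALT]RIGHT' strings.

    Assumption: flank sequences come in (left, right) pairs, in VCF order.
    """
    alleles = list(read_vcf_alleles(vcf_lines))
    seqs = list(read_flank_seqs(tab_lines))
    if len(seqs) != 2 * len(alleles):
        raise ValueError(
            f"expected {2 * len(alleles)} flank sequences, got {len(seqs)}"
        )
    for i, (var_id, ref, alt) in enumerate(alleles):
        left, right = seqs[2 * i], seqs[2 * i + 1]
        yield f"{var_id} {left}[{ref}/{alt}]{right}"


# Example using the values from the post (only the first five VCF columns).
ref_allele = "ATATATATATATATATATATATATATATATATATA"
example_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\n",
    f"SL4.0ch01\t19623\t261_0_1\t{ref_allele}\tA\n",
]
example_tab = [
    "SL4.0ch01:19602-19622\tGAATGTATTCATATATATAT\n",
    "SL4.0ch01:19658-19678\tTAAAATTCTAACTTGAGAAA\n",
]
result = list(combine(example_vcf, example_tab))
for row in result:
    print(row)
```

For real files, the two lists would be replaced by open file handles (gzip.open(path, "rt") for a .vcf.gz). The sketch takes the flank sequences as given and does not check coordinates; since VCF POS is 1-based while BED intervals are 0-based and half-open, it may be worth confirming that both flanks directly abut the REF allele.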

bedtools vcf structural variants sequence • 964 views