FastaAlternateReferenceMaker creates different size fasta.
0
0
Entering edit mode
6.3 years ago

I am using FastaAlternateReferenceMaker to insert all SNPs from a VCF file into a reference to create a new fasta file of 41 bps (20 bps on flanking side). My end goal is to get the fasta sequence in which reference base will be replaced with the SNP in the VCF file. For this purpose, I tried the FastaAlternateReferenceMaker tool and used the following command:

java -jar GenomeAnalysisTK.jar \
-T FastaAlternateReferenceMaker \
-R hg38.fa \
-o myalt.fasta \
-L my.intervals \
-V myinput.vcf

The command run without error, however, the fasta sequence have different lengths. Although, my interval file has exactly the same length of 40 bps (in interval file it is 42 bps).

chr1    13261   13302   +       1_1
chr1    13282   13323   +       2_1
chr1    13398   13439   +       3_1
chr1    13398   13439   +       4_1

I also tried with the following command:

java -Xmx4g -cp lib/SVToolkit.jar:/lib/gatk/GenomeAnalysisTK.jar org.broadinstitute.sv.apps.GenerateAltAlleleFasta -I myinput.vcf -O myalt.fasta -R hg38.fa -flankLength 20

It gave me exactly 41 bps sequence fasta file. But when I cross-checked with vcf file, it has alt allele at a different position (See below). According to vcf file, alt allele must be present at chr1:13281 with G allele but in fasta file, the alt allele is located at chr1:13282 position.

java -Xmx4g -cp lib/SVToolkit.jar:/lib/gatk/GenomeAnalysisTK.jar org.broadinstitute.sv.apps.GenerateAltAlleleFasta -I myinput.vcf -O myalt.fasta -R hg38.fa -flankLength 20

>1_1 L:chr1:13262-13281:1-20|R:chr1:13283-13302:22-41|LENGTH:41
CTCCTGGACCAGTGATACACGCGGCACCCTGTCCTGGACAC

.............

My expected output should be:

>1_1 L:chr1:13261-13280:1-20|R:chr1:13282-13301:22-41|LENGTH:41
GCTCCTGGACCAGTGATACAGGCGGCACCCTGTCCTGGACA

............

What am I doing wrong?

Thanks,

Zaib.

FastaAlternateReferenceMaker SNP • 2.7k views
ADD COMMENT
0
Entering edit mode

Seems like the simple thing to try is to slide your coordinates over by one. At first glance, it looks like a discrepancy between 0-based coordinates and 1-based coordinates.

ADD REPLY
0
Entering edit mode

It is because of 0-based coordinates and 1-based coordinates. How to solve this?

ADD REPLY
0
Entering edit mode
ADD REPLY
0
Entering edit mode

I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button. When you compose or edit a post that button is in your toolbar, see image below:

101010 Button

ADD REPLY

Login before adding your answer.

Traffic: 2654 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6