Question: FastaAlternateReferenceMaker creates different size fasta.
14 days ago, zaibunnisa.t wrote:

I am using FastaAlternateReferenceMaker to insert all SNPs from a VCF file into a reference and create a new FASTA file of 41 bp sequences (the SNP plus 20 bp of flanking sequence on each side). My end goal is a FASTA sequence in which the reference base is replaced with the SNP allele from the VCF file. For this purpose, I tried the FastaAlternateReferenceMaker tool with the following command:

java -jar GenomeAnalysisTK.jar \
-T FastaAlternateReferenceMaker \
-R hg38.fa \
-o myalt.fasta \
-L my.intervals \
-V myinput.vcf

The command ran without error; however, the FASTA sequences have different lengths, although every interval in my interval file has exactly the same length of 40 bp (in the interval file it is 42 bp).

chr1    13261   13302   +       1_1
chr1    13282   13323   +       2_1
chr1    13398   13439   +       3_1
chr1    13398   13439   +       4_1

I also tried with the following command:

java -Xmx4g -cp lib/SVToolkit.jar:/lib/gatk/GenomeAnalysisTK.jar org.broadinstitute.sv.apps.GenerateAltAlleleFasta -I myinput.vcf -O myalt.fasta -R hg38.fa -flankLength 20

This gave me a FASTA file with sequences of exactly 41 bp. But when I cross-checked it against the VCF file, the alt allele is at a different position (see below). According to the VCF file, the alt allele (G) should be at chr1:13281, but in the FASTA file it is located at chr1:13282.

>1_1 L:chr1:13262-13281:1-20|R:chr1:13283-13302:22-41|LENGTH:41
CTCCTGGACCAGTGATACACGCGGCACCCTGTCCTGGACAC

.............

My expected output should be:

>1_1 L:chr1:13261-13280:1-20|R:chr1:13282-13301:22-41|LENGTH:41
GCTCCTGGACCAGTGATACAGGCGGCACCCTGTCCTGGACA

............

What am I doing wrong?

Thanks,

Zaib.

modified 5 days ago by WouterDeCoster • written 14 days ago by zaibunnisa.t

Seems like the simple thing to try is to slide your coordinates over by one. At first glance, it looks like a discrepancy between 0-based coordinates and 1-based coordinates.
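As a rough illustration of that off-by-one, here is a small sketch that computes the 20 bp flanks around a SNP using 1-based inclusive coordinates. `flank_window` is a hypothetical helper written for this post, not part of GATK or SVToolkit, and it assumes the VCF position (13281) is 1-based, as VCF positions are. Shifting the site by one reproduces the flank coordinates in the header you got, while the unshifted site reproduces the header you expected.

```python
def flank_window(pos_1based, flank=20):
    """Return (left, right) flanks as 1-based inclusive (start, end) pairs.

    Hypothetical helper for illustration only; the SNP itself sits
    between the two flanks at pos_1based.
    """
    left = (pos_1based - flank, pos_1based - 1)
    right = (pos_1based + 1, pos_1based + flank)
    return left, right


snp_pos = 13281  # 1-based position of the SNP, taken from the question

expected = flank_window(snp_pos)      # site used as-is
shifted = flank_window(snp_pos + 1)   # site shifted right by one base

print(expected)  # ((13261, 13280), (13282, 13301))
print(shifted)   # ((13262, 13281), (13283, 13302))
```

This only shows that the two headers differ by exactly one base; it does not say which of the inputs (VCF, intervals, or the tool's own convention) introduced the shift.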

written 13 days ago by swbarnes2

It is indeed because of 0-based versus 1-based coordinates. How can I solve this?

written 10 days ago by zaibunnisa.t

Have you seen this? Cheat Sheet For One-Based Vs Zero-Based Coordinate Systems
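To make the two conventions concrete, here is a minimal sketch of the conversions and of how the length of the same start/end pair differs between them. The function names are made up for this example. It assumes the usual definitions: BED-style coordinates are 0-based and half-open, while VCF positions and `chr:start-end` style coordinates are 1-based and inclusive. Which convention a given tool assumes for your interval file is something to check in that tool's documentation.

```python
def bed_to_one_based(start0, end0):
    """0-based half-open [start0, end0) -> 1-based inclusive [start1, end1]."""
    return start0 + 1, end0


def one_based_to_bed(start1, end1):
    """1-based inclusive [start1, end1] -> 0-based half-open [start0, end0)."""
    return start1 - 1, end1


def length_one_based(start1, end1):
    """Number of bases in a 1-based inclusive interval."""
    return end1 - start1 + 1


def length_bed(start0, end0):
    """Number of bases in a 0-based half-open interval."""
    return end0 - start0


row = (13261, 13302)  # first row of the interval file in the question

print(length_one_based(*row))   # 42 if the row is read as 1-based inclusive
print(length_bed(*row))         # 41 if the row is read as 0-based half-open
print(bed_to_one_based(*row))   # (13262, 13302)
print(one_based_to_bed(*row))   # (13260, 13302)
```

So the same two numbers describe a 42 bp or a 41 bp region depending on the convention, which is one possible source of the length differences you see.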

written 5 days ago by WouterDeCoster

I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.

written 5 days ago by WouterDeCoster