VCF to FASTA conversion
7.9 years ago
john.doyle • 0

Hi,

I am running some sequencing data through the Galaxy pipeline and generated a VCF (I think; see below). I am trying to convert the VCF to a FASTA file and was wondering whether there is a tool in the Galaxy toolbox for the conversion. I tried VCF to Tab Delimited and then Tabular to FASTA and got basically nothing. Backtracking to see where my workflow went wrong, I looked at my results from the Bowtie alignment. I am not sure Bowtie produced any result, but I did not receive an error message, so I assumed it worked. Any guidance would be greatly appreciated, and thank you in advance for your help.

Best,

John

VCF output:

##fileformat=VCFv4.1
##fileDate=20171103
##source=Naive Variant Caller version 0.0.2
##reference=file:///galaxy-repl/main/files/022/009/dataset_22009740.dat
##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=AC,Number=.,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
##FORMAT=<ID=AF,Number=.,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">
##FORMAT=<ID=NC,Number=.,Type=String,Description="Nucleotide and indel counts">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT __NONE__
VCF FASTA Conversion • 5.4k views

This is just the header of a VCF file. Was this all there was in the file? If so, your bowtie run must not have produced any valid data, as you suspect.
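
One quick way to confirm that a VCF is header-only is to count the lines that do not start with "#". A minimal Python sketch, assuming the file has been read into a list of lines (the function name `count_vcf_records` is made up for illustration):

```python
def count_vcf_records(lines):
    """Count VCF data lines: non-empty lines that do not start with '#'."""
    return sum(1 for line in lines if line.strip() and not line.startswith("#"))


# Two lines taken from the header posted above, with tabs restored in the column line.
header_only = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t__NONE__",
]
print(count_vcf_records(header_only))
```

A count of zero means no variants were written, so any downstream conversion has nothing to work with.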


I believe you are correct. As an update, I went back and checked all the files prior to bowtie, and they contain data that I can view using the eyeball icon. When I do this for the bowtie output, the file is blank; however, I received no error message when it ran. Does that mean it is not aligning my sequences, or simply that bowtie didn't work and should be run again?


It could be many things. If you click on the "bowtie" step and then on the "i" icon, what do you see?


Nothing; it is just a blank field, but all the steps before bowtie contain data.


Try re-running the bowtie job then. Make sure you select all the right options. What kind of data is this? You may want to choose bowtie2 if it is available, since it does gapped alignments.


Sorry, I thought you meant when I clicked on the eyeball. I just realized what you were saying; here is what is under the "i" icon:

Bowtie2
Dataset Information
Number: 84
Name:   Bowtie2 on data 65, data 81, and data 83: aligned reads (sorted BAM)
Created:    Fri 03 Nov 2017 04:03:52 PM (UTC)
Filesize:   92 bytes
Dbkey:  ?
Format: bam
Job Information
Galaxy Tool ID: toolshed.g2.bx.psu.edu/repos/devteam/bowtie2/bowtie2/2.3.2.2
Galaxy Tool Version:    2.3.2.2
Tool Version:   /jetstream/scratch0/main/conda/envs/mulled-v1-cf272fa72b0572012c68ee2cbf0c8f909a02f29be46918c2a23283da1d3d76b5/bin/bowtie2-align-s version 2.3.2 64-bit Built on testing-gce-da37cb75-082c-4905-af1b-2c2c1374f1a2 Mon May 8 18:36:38 UTC 2017 Compiler: gcc version 4.8.5 (GCC) Options: -O3 -m64 -msse2 -funroll-loops -g3 -I/jetstream/scratch0/main/conda/envs/mulled-v1-cf272fa72b0572012c68ee2cbf0c8f909a02f29be46918c2a23283da1d3d76b5/include -L/jetstream/scratch0/main/conda/envs/mulled-v1-cf272fa72b0572012c68ee2cbf0c8f909a02f29be46918c2a23283da1d3d76b5/lib -DPOPCNT_CAPABILITY -DWITH_TBB -DNO_SPINLOCK -DWITH_QUEUELOCK=1 Sizeof {int, long, long long, void*, size_t, off_t}: {4, 8, 8, 8, 8, 8}
Tool Standard Output:   stdout

Tool Standard Error:    stderr

Tool Exit Code: 0
History Content API ID: bbd44e69cb8906b543f02b7a124297a2
Job API ID: bbd44e69cb8906b55e334082643803a2
History API ID: 6e83db7cb784a01d
UUID:   524573d7-1e93-4ed6-b660-3a50bf748ee9
Tool Parameters
Input Parameter: Value
Is this single or paired library: paired
FASTA/Q file #1: 83: FASTQ Quality Trimmer on data 67
FASTA/Q file #2: 81: FASTQ Quality Trimmer on data 63
Write unaligned reads (in fastq format) to separate file(s): False
Write aligned reads (in fastq format) to separate file(s): False
Do you want to set paired-end options?: no
Will you select a reference genome from your history or use a built-in index?: history
Select reference genome: 65: NC_002081.1[5209..7225].fa
Set read groups information?: do_not_set
Select analysis mode: simple
Do you want to use presets?: No, just use defaults
Save the bowtie2 mapping statistics to the history: False
Job Resource Parameters: no
Inheritance Chain
Bowtie2 on data 65, data 81, and data 83: aligned reads (sorted BAM)
Job Dependencies
Dependency  Dependency Type Version
bowtie2 conda   2.3.2
samtools    conda   1.3.1

This Galaxy instance (are you using the public Galaxy at PSU or an internal mirror?) may be set to produce a sorted BAM file directly. In that case you would not be able to view anything, since this would be a binary file. Do you see a file size when you click on the name of the step? Example below.

300.0 Mb
format bam
database ce6
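
Since a BAM cannot be inspected by eye, one way to check whether it holds any alignments is to read past its header and see whether anything follows. The 92-byte file size reported earlier in the thread suggests there is little or nothing beyond a header. A rough Python sketch, assuming the BAM has been downloaded locally and relying on the fact that BAM's BGZF compression is readable with the standard `gzip` module (the function name `bam_has_alignments` is made up for illustration):

```python
import gzip
import struct


def bam_has_alignments(path):
    """Return True if the BAM file at `path` has at least one alignment record."""
    with gzip.open(path, "rb") as fh:
        if fh.read(4) != b"BAM\x01":            # magic bytes of the BAM format
            raise ValueError("not a BAM file")
        (l_text,) = struct.unpack("<i", fh.read(4))
        fh.read(l_text)                          # skip the SAM-style header text
        (n_ref,) = struct.unpack("<i", fh.read(4))
        for _ in range(n_ref):
            (l_name,) = struct.unpack("<i", fh.read(4))
            fh.read(l_name + 4)                  # reference name plus its 4-byte length
        # Each alignment record starts with a 4-byte block size.
        return len(fh.read(4)) == 4


# Hypothetical usage with a locally downloaded file:
# print(bam_has_alignments("bowtie2_output.bam"))
```

If this returns False, the aligner wrote a header but no reads, which would explain an empty VCF downstream.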

I am not sure whether such a tool is available in Galaxy. You can try the GATK tool FastaAlternateReferenceMaker if you have access to the BAM files locally: https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_gatk_tools_walkers_fasta_FastaAlternateReferenceMaker.php
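
To illustrate the general idea of turning a VCF into a FASTA, here is a minimal Python sketch that substitutes SNP alleles into a reference sequence. It is not how GATK implements this. It assumes a single reference sequence (the CHROM column is ignored), uses only the first ALT allele, skips indels and symbolic alleles, and ignores genotypes. The names `apply_snps` and `to_fasta` are made up for illustration.

```python
def apply_snps(ref_seq, vcf_lines):
    """Return ref_seq with single-base substitutions from the VCF lines applied."""
    seq = list(ref_seq)
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue                              # skip blank and header lines
        fields = line.rstrip("\n").split("\t")
        pos = int(fields[1])                      # VCF positions are 1-based
        ref = fields[3].upper()
        alt = fields[4].split(",")[0].upper()     # first ALT allele only
        if len(ref) != 1 or alt not in ("A", "C", "G", "T"):
            continue                              # skip indels, symbolic alleles, "."
        if seq[pos - 1].upper() != ref:
            raise ValueError(f"REF mismatch at position {pos}")
        seq[pos - 1] = alt
    return "".join(seq)


def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record with lines of `width` bases."""
    rows = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(rows) + "\n"


# Toy example: the G at position 3 of the reference becomes T.
reference = "ACGTACGT"
records = ["#CHROM\tPOS\tID\tREF\tALT", "ref\t3\t.\tG\tT\t.\t.\t."]
print(to_fasta("sample", apply_snps(reference, records)), end="")
```

With a header-only VCF like the one posted, this would simply return the reference unchanged, so the alignment step needs to be fixed first.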
