Question: Is it possible to get a VCF from two FASTQ files without a reference genome?
genebow150 wrote:

I'd like to know whether I can call variants directly from two or more FASTQ read files without a reference genome. I need to compare SNPs between the two FASTQ files.

snp next-gen • 829 views
modified 19 months ago by Dan D6.7k • written 19 months ago by genebow150

You can certainly assemble them both de novo, map the reads of each to the other's assembly, and get variant calls. Without a reference for coordinates, I'm not sure how useful that would be.
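As a rough sketch of the assemble-and-cross-map idea: the Python below only builds the command lines for one direction (one sample's reads mapped to a draft assembly of the other). The reply names no tools, so SPAdes, BWA, samtools and bcftools are stand-ins chosen for illustration, and the function names, file names and single-end input are assumptions too. Check every flag against the versions you have installed before running anything.

```python
from pathlib import Path
import subprocess


def cross_mapping_commands(ref_reads, query_reads, label, workdir="work"):
    """Build commands that call variants in `query_reads` against a
    de novo assembly of `ref_reads`. All tool choices are assumptions."""
    work = Path(workdir)
    asm_dir = work / f"asm_{label}"
    contigs = asm_dir / "contigs.fasta"  # SPAdes writes its contigs under this name
    sam = work / f"{label}.sam"
    bam = work / f"{label}.sorted.bam"
    bcf = work / f"{label}.pileup.bcf"
    vcf = work / f"{label}.vcf"
    steps = [
        # 1. De novo assemble the sample that will serve as the draft reference.
        ["spades.py", "-s", ref_reads, "-o", asm_dir],
        # 2. Index the draft assembly and map the other sample's reads to it.
        ["bwa", "index", contigs],
        ["bwa", "mem", "-o", sam, contigs, query_reads],
        # 3. Coordinate-sort and index the alignments.
        ["samtools", "sort", "-o", bam, sam],
        ["samtools", "index", bam],
        # 4. Pile up and call variants (haploid, since the thread is about bacteria).
        ["bcftools", "mpileup", "-f", contigs, "-Ou", "-o", bcf, bam],
        ["bcftools", "call", "-mv", "--ploidy", "1", "-Ov", "-o", vcf, bcf],
    ]
    # Convert Path objects to plain strings so the lists can go to subprocess.
    return [[str(arg) for arg in step] for step in steps]


def run_all(commands, workdir="work"):
    """Run the commands in order, stopping at the first failure."""
    Path(workdir).mkdir(parents=True, exist_ok=True)
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `cross_mapping_commands` a second time with the two read files swapped gives the other direction. The coordinates in each resulting VCF refer to contigs of that particular draft assembly, which is exactly the usefulness caveat raised in this reply.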

written 19 months ago by Brian Bushnell16k

Thanks for answering my question. I was just hoping that a VCF comparing the two test genomes could be obtained without a reference genome.

written 19 months ago by genebow150

You can obtain VCF files using the method that I described. However, as Dan notes, they will not necessarily be useful.

I think it would be helpful if you explained what organisms you are working with, what kind of data you have (the complete experimental setup), and what you are trying to accomplish. Blinded questions rarely yield useful results.

written 19 months ago by Brian Bushnell16k

Yes, thanks for the suggestion. The goal of the variant calling is to find recombinants between two bacterial genomes, which were sequenced and are available as raw FASTQ files. Since a large number of genomes will be compared to find recombinants, it would be convenient to get the recombinant information directly from variant calling instead of assembling whole genomes.

modified 19 months ago • written 19 months ago by genebow150

So, I'm confused as to why you have only 2 FASTQ files if you have a large number of recombinants. Is this one species? Two species? Are you combining lots of samples in a single library?

written 19 months ago by Brian Bushnell16k

Each of the two FASTQ files contains the sequencing reads for one sample, covering that sample's whole genome. Thanks!

written 19 months ago by genebow150
Dan D6.7k wrote:

No. Variant calls are made relative to a reference genome sequence. Technically you could assemble the raw reads into a draft reference and call variants against that, but either way you need some kind of reference before you can perform variant calling.

EDIT: I may be incorrect if you're working with a well-characterized bacterial species and are looking for specific features. Check out this paper describing the KVarQ program. I imagine the same would be true if you're working with something like mitochondrial data.
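To make the "specific features" idea concrete, here is a toy sketch of scanning FASTQ reads directly for known allele sequences, with no assembly or mapping step. It is not KVarQ's implementation, only an illustration of the general approach. The function names, the marker dictionary layout and the exact-match rule are all assumptions made for the example.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fastq_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip().upper()


def count_marker_hits(handle, markers):
    """Count reads containing each known allele sequence on either strand.

    `markers` is a hypothetical layout: {marker_name: {allele_label: sequence}},
    where each sequence is the allele plus enough flanking bases to be unique.
    """
    probes = []
    for name, alleles in markers.items():
        for allele, seq in alleles.items():
            seq = seq.upper()
            probes.append((name, allele, seq, revcomp(seq)))

    hits = Counter()
    for read in read_fastq_sequences(handle):
        for name, allele, fwd, rev in probes:
            # Exact substring match only; sequencing errors are not tolerated.
            if fwd in read or rev in read:
                hits[(name, allele)] += 1
    return hits
```

Running this on each sample's FASTQ and comparing the per-allele counts gives a reference-free comparison, but only at positions you already know about. It cannot discover new variants, which is why it suits well-characterized species.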

modified 19 months ago • written 19 months ago by Dan D6.7k

Thank you for your suggestions!

written 19 months ago by genebow150