Question: vcf to MAF to fasta
natasha.sernova (3.4k) wrote, 4.8 years ago:

Dear all,

A strange idea came to my mind recently.

There is no direct way to convert vcf into fasta. But I've read that through Galaxy I can convert vcf to maf (multiple alignment), and then it may be possible to convert maf to fasta.

How much information from the original vcf file will I lose this way, and is it possible to avoid the loss?

THANK YOU!

Natasha

Tags: galaxy, vcf
Pierre Lindenbaum (120k), France/Nantes/Institut du Thorax - INSERM UMR1087, wrote 4.8 years ago:

"There is no direct way to convert vcf into fasta": there is. GATK provides a tool named FastaAlternateReferenceMaker: http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_fasta_FastaAlternateReferenceMaker.html

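
For readers wondering what such a conversion amounts to, here is a minimal Python sketch of the general idea (substituting ALT alleles into the reference sequence). It is not GATK's implementation. It assumes a single contig held in memory and applies only simple SNPs; indels, symbolic alleles and genotypes are ignored. The names `apply_snps` and `to_fasta` are hypothetical, made up for this illustration.

```python
def apply_snps(reference, vcf_lines, contig):
    """Return `reference` with SNP ALT alleles from `vcf_lines` substituted in.

    Assumption: only one-base REF / one-base ALT records on `contig` are used.
    """
    seq = list(reference)
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta-information and the header line
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        alt = alt.split(",")[0]  # keep the first ALT allele only
        if chrom != contig or len(ref) != 1:
            continue  # other contigs, deletions and multi-base REF are ignored
        if alt.upper() not in ("A", "C", "G", "T", "N"):
            continue  # insertions, "." and symbolic alleles are ignored
        index = int(pos) - 1  # VCF positions are 1-based
        if seq[index].upper() != ref.upper():
            raise ValueError(f"REF mismatch at {chrom}:{pos}")
        seq[index] = alt
    return "".join(seq)


def to_fasta(name, sequence, width=60):
    """Format one sequence as a FASTA record wrapped at `width` columns."""
    rows = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(rows) + "\n"
```

Calling `to_fasta("chr1_alt", apply_snps(ref_seq, vcf_lines, "chr1"))` would give the altered sequence as FASTA text. For real data, the GATK tool mentioned above is the safer route.
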
Yes, that's true. I've even used it once.

But this option killed me:

-L input.intervals \

Given a particular vcf file, I still don't know how and where
to find the corresponding interval coordinates.

You surely know. Please help me!

Thank you very much indeed!

Natasha

written 4.8 years ago by natasha.sernova (3.4k)

Is -L required?

written 4.8 years ago by Pierre Lindenbaum (120k)

Dear Pierre,

this is an optional parameter. But when I omit it, I just get the full fasta file for the corresponding chromosome, nothing else. I thought that GATK would let me cut out the fragment corresponding to the vcf file. But I have to give it "one or more genomic intervals over which to operate". It seems much more complicated. How do I find these genomic intervals?

I have vcf files, references and chromosome sequences. I made the fai and dict files, but that still doesn't give me any hint toward the fasta alignment. What else should be done?

Many thanks for your help!

Natasha

written 4.8 years ago by natasha.sernova (3.4k)

"one or more genomic intervals over which to operate": I don't understand your problem. You can provide a BED file or even a VCF file: http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_CommandLineGATK.html#--intervals

written 4.8 years ago by Pierre Lindenbaum (120k)
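
Since -L can take the VCF itself, no conversion should be needed. For anyone who still wants to see which intervals their records correspond to, a small Python sketch along these lines produces one BED line per VCF record. It assumes tab-separated VCF lines and that each interval should span the REF allele; `vcf_to_bed` is a hypothetical helper written for this illustration, not part of GATK.

```python
def vcf_to_bed(vcf_lines):
    """Return BED lines (chrom, start, end) covering the REF allele of each record."""
    rows = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta-information and the header line
        chrom, pos, _id, ref = line.rstrip("\n").split("\t")[:4]
        start = int(pos) - 1    # BED starts are 0-based, VCF positions 1-based
        end = start + len(ref)  # BED ends are exclusive
        rows.append(f"{chrom}\t{start}\t{end}")
    return rows
```

Writing the result with `"\n".join(vcf_to_bed(open("file.vcf")))` would give a file usable as an interval list, under the assumptions above.
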

Great. And this will work? "In order to perform the analysis at specific positions based on the records present in the file (e.g. -L file.vcf)": this is exactly what I need. I will try it; that's a miracle.

I was very poor at reading the manual... THANK YOU, Pierre!

written 4.8 years ago by natasha.sernova (3.4k)

I guess I missed the goal of the OP's question; I just was not sure what they wanted in the fasta file. I presumed that they didn't have a preexisting fasta file with their reference sequences.

written 4.8 years ago by pld (4.8k)
pld (4.8k), United States, wrote 4.8 years ago:

MAF is Mutation Annotation Format, not a multiple alignment: https://wiki.nci.nih.gov/display/TCGA/Mutation+Annotation+Format+%28MAF%29+Specification+-+v2.4

Fasta is really a flat sequence data format. You might be able to store the variants found in a sequence in the fasta header, but I'm not sure that what you're asking for makes sense.


The Multiple Alignment Format (MAF) has been used by UCSC, Ensembl, etc. for ten years or so. It is one of the most widely used multiple-alignment formats. The first spec for the Mutation Annotation Format was only released two years ago.
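
To illustrate the multiple-alignment flavour of MAF: each alignment block starts with an "a" line, and each "s" line carries the source, start, size, strand, source size and aligned text. Below is a minimal Python sketch, assuming that layout and ignoring every other line type, that turns each block into aligned fasta records. `maf_blocks` and `block_to_fasta` are hypothetical names chosen for this example.

```python
def maf_blocks(maf_lines):
    """Return a list of blocks; each block is a list of (name, aligned_text)."""
    blocks, current = [], []
    for line in maf_lines:
        fields = line.split()
        if not fields or fields[0] == "a":
            # a blank line or a new "a" line closes the block in progress
            if current:
                blocks.append(current)
                current = []
            continue
        if fields[0] == "s":
            src, start, size, strand, _src_size, text = fields[1:7]
            current.append((f"{src}:{start}:{size}:{strand}", text))
        # comment lines ("#...") and other line types are ignored
    if current:
        blocks.append(current)
    return blocks


def block_to_fasta(block):
    """Format one alignment block as fasta text, keeping the gap characters."""
    return "".join(f">{name}\n{text}\n" for name, text in block)
```

Gaps ("-") are kept, so the output is an aligned fasta; stripping them would give plain sequences instead.
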

written 4.8 years ago by lh3 (31k)

https://wiki.nci.nih.gov/dosearchsite.action?queryString=Multiple+alignment+format
So my question doesn't make sense at all. OK, thank you.

written 4.8 years ago by natasha.sernova (3.4k)

That link just leads me to Mutation Annotation Format.

written 4.8 years ago by pld (4.8k)