Question: GATK SelectVariants and Separation of Variants
basalganglia30 (England) wrote:

Hello cheerful bioinformaticians :)

I have a problem with the GATK SelectVariants tool. I have a VCF file containing many samples, and I want to split certain samples out into separate files.

I downloaded GATK (GenomeAnalysisTK.jar) and used the following command.

My sample names look like 14-254, 14-345, ..., so I wrote 14-282.141202 in place of SAMPLE_A_PARC.

Select two samples out of a VCF with many samples:
 java -Xmx2g -jar GenomeAnalysisTK.jar \
   -R ref.fasta \
   -T SelectVariants \
   --variant input.vcf \
   -o output.vcf \
   -sn SAMPLE_A_PARC \
   -sn SAMPLE_B_ACTG

 

But I received this error message:

##### ERROR A USER ERROR has occurred (version 3.3-0-g37228af):
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: Invalid argument value 'R' at position 0.
##### ERROR Invalid argument value 'ref.fasta' at position 1.
##### ERROR Invalid argument value 'T' at position 2.
##### ERROR Invalid argument value 'SelectVariants' at position 3.
##### ERROR Invalid argument value 'TR-ETM-59.vcf' at position 6.
##### ERROR Invalid argument value 'sn' at position 7.
##### ERROR Invalid argument value '14-282.141202' at position 8.
##### ERROR --------------------------------------------------------------------

Thanks and have a nice bioinformatics !!!! :)

 

 

modified 4.1 years ago • written 4.1 years ago by basalganglia30
RamRS20k (Houston, TX) wrote:

Check your command - it seems you might be using just R instead of -R
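
For illustration, here is a minimal dry-run sketch that prints one SelectVariants command per sample, with every option carrying its leading dash. The sample names are the ones mentioned in the question; `input.vcf` and the `<sample>.vcf` output naming are placeholders assumed here, not your actual files. It only prints the commands, so you can inspect them before running anything.

```shell
#!/bin/sh
# Dry run: print one GATK 3 SelectVariants command per sample.
# REF, IN and the sample names are placeholders (assumptions); edit them.
REF=ref.fasta
IN=input.vcf

build_cmd() {
    # $1 = sample name; the output file is named after the sample.
    # Every option starts with a dash: -R, -T, --variant, -o, -sn.
    printf 'java -Xmx2g -jar GenomeAnalysisTK.jar -R %s -T SelectVariants --variant %s -o %s.vcf -sn %s\n' \
        "$REF" "$IN" "$1" "$1"
}

for sample in 14-254 14-345; do
    build_cmd "$sample"
done
```

If the printed lines look right, they can be pasted into the terminal one at a time. Typing the dashes by hand rather than copying them from a web page or document helps, since a dash that was converted to a different character on the way is one possible cause of this kind of error.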

written 4.1 years ago by RamRS20k

Thank you so much,

but now I receive this error message:

 ERROR MESSAGE: The fasta file you specified (/home/bio/IGBAM/ref.fasta) does not exist

So do I need to download a ref.fasta file?

written 4.1 years ago by basalganglia30

Yes, you do. Download the appropriate reference file (ucsc.hg19/GRCh37).
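
A small check along these lines may save a round of error messages. It is only a sketch and assumes the usual GATK layout, where the FASTA sits next to its index (`ref.fasta.fai`) and its sequence dictionary (`ref.dict`); the path is the one from your error message. The reference should also be the same build your VCF was called against.

```shell
#!/bin/sh
# Report which reference files are missing next to a FASTA.
# Assumes the usual GATK layout: ref.fasta, ref.fasta.fai, ref.dict.
check_ref() {
    ref=$1
    dict="${ref%.*}.dict"      # ref.fasta -> ref.dict
    missing=0
    for f in "$ref" "$ref.fai" "$dict"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return $missing
}

check_ref /home/bio/IGBAM/ref.fasta || echo "fix the files listed above before running GATK"
```

If only the index or the dictionary is missing, they can normally be generated from the FASTA with `samtools faidx` and Picard's `CreateSequenceDictionary`, respectively.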

written 4.1 years ago by RamRS20k

Thank you for your kind reply.

Can I use my samples' BAM or FASTQ files instead of the ref.fasta file?

written 4.1 years ago by basalganglia30

No, ref.fasta is the reference sequence. Your samples' BAM file is the alignment of your reads to that reference, and the FASTQ files are the raw reads.

FASTQ and ref.fa are the two base pieces of information on which the entire process is built, so you cannot make do without either of them.

written 4.1 years ago by RamRS20k