Question: GATK: Remove known variants from the sample vcf file
5.2 years ago, ravast (Netherlands) wrote:

I have a test sample VCF file from which I have to select the unique variants. As an initial step, I want to remove the known variants from my sample. I used the SelectVariants walker from GATK and got this error:

Input files /home///Mouse_ref/mgp.v3.snps.rsIDdbSNPv137.vcf and reference have incompatible contigs: Relative ordering of overlapping contigs differs, which is unsafe.

##### ERROR   /home/Mouse_ref/mgp.v3.snps.rsIDdbSNPv137.vcf contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, X]

##### ERROR   reference contigs = [1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 3, 4, 5, 6, 7, 8, 9, MT, X, Y, JH584295.1, JH584292.1, GL456368.1, GL456396.1, GL456359.1, GL456382.1, GL456392.1, GL456394.1, GL456390.1, GL456387.1, GL456381.1, GL456370.1, GL456372.1, GL456389.1, GL456378.1, GL456360.1, GL456385.1, GL456383.1, GL456213.1, GL456239.1, GL456367.1, GL456366.1, GL456393.1, GL456216.1, GL456379.1, JH584304.1, GL456212.1, JH584302.1, JH584303.1, GL456210.1, GL456219.1, JH584300.1, JH584298.1, JH584294.1, GL456354.1, JH584296.1, JH584297.1, GL456221.1, JH584293.1, GL456350.1, GL456211.1, JH584301.1, GL456233.1, JH584299.1]


Thank you, Ashutosh.

Reply written 5.2 years ago by ravast

This should be a comment rather than an answer :-) Also, if the answer solved your problem, you should "accept" it so that your question is stored as solved.

Reply written 5.2 years ago by Biomonika (Noolean)
5.2 years ago, Ashutosh Pandey (Philadelphia) wrote:

This is a pretty common error; a quick search of this forum or the web would have turned up the answer. GATK requires the chromosomes to be in the same order in both files. This is described in detail here: http://gatkforums.broadinstitute.org/discussion/1204/what-input-files-does-the-gatk-accept-require. I think you used a cat chr*.fa command to concatenate the individual FASTA files (chromosomes) into the reference file, and the shell's alphabetical expansion of chr*.fa (1, 10, 11, ..., 19, 2, ...) messed up the order. You did nothing wrong; this is just how GATK works.
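To see why the two contig lists in the error clash, here is a minimal Python sketch. It only approximates the check as the error message words it (the relative order of the contigs the two files share); it is not GATK's own code, and the function names are made up for illustration.

```python
def shared_in_order(contigs, other):
    """Contigs from `contigs` that also occur in `other`, in their original order."""
    other_set = set(other)
    return [c for c in contigs if c in other_set]


def relative_order_matches(vcf_contigs, ref_contigs):
    """True if the contigs common to both lists appear in the same relative order."""
    return shared_in_order(vcf_contigs, ref_contigs) == shared_in_order(ref_contigs, vcf_contigs)


# Contig lists taken from the error message (unplaced scaffolds left out for brevity).
vcf_contigs = [str(i) for i in range(1, 20)] + ["X"]
ref_contigs = (["1"] + [str(i) for i in range(10, 20)]
               + [str(i) for i in range(2, 10)] + ["MT", "X", "Y"])

print(relative_order_matches(vcf_contigs, ref_contigs))  # False
```

The extra contigs in the reference (MT, Y, the scaffolds) do not affect this comparison; only the order of the shared ones does.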


Ashutosh, could you please let me know how I should rectify the error?

Reply written 5.2 years ago by ravast

See "Karyotypically Ordered Hg19". You need to sort your reference FASTA file or create a new reference FASTA from scratch. Make sure it has all the chromosomes present in your VCF file, in the same order, i.e. 1, 2, 3, 4, ..., X.
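One way to do the re-sorting is sketched below in plain Python. It is only an illustration under assumptions: the function names and file names are hypothetical, the contig name is taken to be the first word of each FASTA header, and contigs not listed in the preferred order are appended in their original order. Any index or sequence dictionary built from the old FASTA would need to be regenerated afterwards.

```python
def index_fasta(path):
    """Map each contig name to the (start, end) byte offsets of its record."""
    offsets = {}
    name, start, pos = None, 0, 0
    with open(path, "rb") as fh:
        for line in fh:
            if line.startswith(b">"):
                if name is not None:
                    offsets[name] = (start, pos)
                # Contig name = first word of the header line (an assumption).
                name = line[1:].split()[0].decode()
                start = pos
            pos += len(line)
    if name is not None:
        offsets[name] = (start, pos)
    return offsets


def reorder_fasta(in_path, out_path, preferred):
    """Write records in `preferred` order, then the remaining ones as they were."""
    offsets = index_fasta(in_path)
    wanted = set(preferred)
    order = [c for c in preferred if c in offsets]
    order += [c for c in offsets if c not in wanted]
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        for contig in order:
            start, end = offsets[contig]
            src.seek(start)
            chunk = src.read(end - start)  # one whole record held in memory
            dst.write(chunk)
            if not chunk.endswith(b"\n"):
                dst.write(b"\n")  # keep the next header on its own line
    return order


# Hypothetical usage: order 1..19, X to match the VCF in the question.
# karyotypic = [str(i) for i in range(1, 20)] + ["X"]
# reorder_fasta("reference.fa", "reference.sorted.fa", karyotypic)
```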

Reply written 5.2 years ago by Ashutosh Pandey
3.5 years ago, norhanmahfouz (Germany) wrote:

I didn't quite get whether the problem was solved. I get the same error as ravast, and in the case of the mouse genome it is not about the order: those chromosomes/contigs are absent altogether from the header of the VCF file of known SNPs compiled by the Sanger Institute. I tried to circumvent it by modifying the header (adding the missing contigs), but I get the same error. :(
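Since the error text speaks of the relative ordering of overlapping contigs, it may be worth checking the order of the shared contigs rather than the missing ones. As an alternative to re-sorting the reference, the sketch below reorders the VCF instead: it puts the ##contig header lines and the records into the reference's contig order. This is an assumption-laden illustration, not a GATK utility; the function name is hypothetical, records within each contig are assumed to be position-sorted already, and everything is held in memory, which may not suit a very large VCF.

```python
import re

# Matches header lines such as ##contig=<ID=1,length=...>
CONTIG_ID = re.compile(r"^##contig=<ID=([^,>]+)")


def reorder_vcf_lines(lines, ref_order):
    """Return VCF lines with contig header lines and records in `ref_order`."""
    rank = {name: i for i, name in enumerate(ref_order)}

    def contig_rank(name):
        # Contigs unknown to the reference sort after all known ones.
        return rank.get(name, len(rank))

    meta, contig_meta, column_header, records = [], [], [], []
    for line in lines:
        if line.startswith("##"):
            (contig_meta if CONTIG_ID.match(line) else meta).append(line)
        elif line.startswith("#"):
            column_header.append(line)
        elif line.strip():
            records.append(line)

    # list.sort is stable, so the order within each contig is preserved.
    contig_meta.sort(key=lambda l: contig_rank(CONTIG_ID.match(l).group(1)))
    records.sort(key=lambda l: contig_rank(l.split("\t", 1)[0]))
    return meta + contig_meta + column_header + records
```

The reference order passed as `ref_order` could be read from the names in the reference's index or sequence dictionary.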
