Question: Error when trying to fix the contig order in the reference and VCF for FastaAlternateReferenceMaker
Asked 4.1 years ago by tiago211287 (USA):

I am trying to use a VCF containing SNP variants to modify the mouse reference (GRCm38, C57BL/6J) with BALB/cJ SNPs.

After running this command:

java -jar ~/programs/GenomeAnalysisTK.jar -T FastaAlternateReferenceMaker -R ~/genome/mouse_GRCm38.p4/GRCm38.primary_assembly/GRCm38.primary_assembly.fa -o ~/BALBcJ.snp.primary.fa -V ~/BALB_cJ.snps.vcf

The following ERROR shows up:

ERROR MESSAGE: Input files /home/tiagocastro/BALB_cJ.snps.vcf and reference have incompatible contigs: The contig order in /home/tiagocastro/BALB_cJ.snps.vcf and referenceis not the same; to fix this please see: (https://www.broadinstitute.org/gatk/guide/article?id=1328), which describes reordering contigs in BAM and VCF files..

ERROR /home/tiagocastro/BALB_cJ.snps.vcf contigs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, X, Y]

ERROR reference contigs = [1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 3, 4, 5, 6, 7, 8, 9, MT, X, Y, JH584299.1, GL456233.1, JH584301.1, GL456211.1, GL456350.1, JH584293.1, GL456221.1, JH584297.1, JH584296.1, GL456354.1, JH584294.1, JH584298.1, JH584300.1, GL456219.1, GL456210.1, JH584303.1, JH584302.1, GL456212.1, JH584304.1, GL456379.1, GL456216.1, GL456393.1, GL456366.1, GL456367.1, GL456239.1, GL456213.1, GL456383.1, GL456385.1, GL456360.1, GL456378.1, GL456389.1, GL456372.1, GL456370.1, GL456381.1, GL456387.1, GL456390.1, GL456394.1, GL456392.1, GL456382.1, GL456359.1, GL456396.1, GL456368.1, JH584292.1, JH584295.1]
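
The two contig orders reported in the error can also be inspected by hand. The following is a minimal sketch, not part of GATK: the function names `vcf_contig_order` and `fai_contig_order` are made up for illustration, and it assumes an uncompressed, tab-delimited VCF and a `.fai` index such as the one `samtools faidx` produces. It lists contigs in the order they first appear in the VCF records, which may differ from what GATK reads from the VCF header or index.

```shell
# Print each contig name once, in the order it first appears in the VCF records.
vcf_contig_order() {
    grep -v '^#' "$1" | cut -f1 | awk '!seen[$0]++'
}

# Print contig names in reference order (first column of the .fai index).
fai_contig_order() {
    cut -f1 "$1"
}
```

Running `vcf_contig_order BALB_cJ.snps.vcf` and `fai_contig_order GRCm38.primary_assembly.fa.fai` side by side should make the mismatch (numeric versus lexicographic chromosome order) visible.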

To fix this, I used the Perl script from the link to sort the VCF so that it matches the contig order of the reference.

I did this:

./sortByRef.pl ~/BALB_cJ.snps.vcf /home/tiagocastro/genome/mouse_GRCm38.p4/GRCm38.primary_assembly/GRCm38.primary_assembly.fa.fai > ~/BALB_cJ.snps_sorted.vcf

Using the new VCF file, a new error is shown:

ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file '/home/tiagocastro/BALB_cJ.snps_sorted.vcf' could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types:

ERROR Name FeatureType Documentation

ERROR BCF2 VariantContext (this is an external codec and is not documented within GATK)

ERROR VCF VariantContext (this is an external codec and is not documented within GATK)

ERROR VCF3 VariantContext (this is an external codec and is not documented within GATK)

Looking at the head of each file, the sorted VCF and the original one, I can see that they differ slightly: the new file does not have the header.
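
A quick way to confirm that the header is missing is to count the lines that start with `#`. This is only a sketch; `count_header_lines` is a hypothetical helper name, and it assumes an uncompressed VCF.

```shell
# Count header lines (## meta lines plus the #CHROM line) in a VCF.
# Prints 0 when the file has no header at all.
count_header_lines() {
    grep -c '^#' "$1"
}
```

If the original VCF reports a non-zero count and the sorted one reports 0, the sort step dropped the header, which would explain why GATK can no longer detect the file type.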

Can someone help me?

Comment by Ashutosh Pandey (4.1 years ago):
Try copying the header of the original VCF file into the new VCF file and run again.

Answer by tiago211287 (USA), 4.1 years ago:

I fixed the problem by doing what Ashutosh Pandey suggested.

I copied the header to the sorted file and it solved all errors.

To copy it, I used this little bash command:

{ head -n69 original.vcf; cat sorted.vcf; } >tmp$$ && mv tmp$$ sorted.vcf
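
The command above hard-codes 69, presumably the number of header lines in that particular VCF. A variant that does not depend on the line count could look like the sketch below. `reheader_vcf` is a hypothetical helper name, the file names are placeholders, and it assumes uncompressed VCFs in which every header line starts with `#`.

```shell
# Write the header of the original VCF followed by the records of the sorted VCF.
# Usage: reheader_vcf original.vcf sorted.vcf output.vcf
reheader_vcf() {
    original=$1   # VCF that still has its header
    sorted=$2     # sorted VCF (with or without a header)
    out=$3        # new file; must differ from both inputs
    {
        grep '^#' "$original"     # all header lines, however many there are
        grep -v '^#' "$sorted"    # records only, so a header is never duplicated
    } > "$out"
}
```

For example, `reheader_vcf original.vcf sorted.vcf sorted_with_header.vcf`. Writing to a separate output file avoids truncating an input while it is still being read.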

 


Comment by Ashutosh Pandey (4.1 years ago):
Great. Keep going.