Question: Beagle imputation - duplication position error
16 months ago by
valopes30
valopes30 wrote:

Hi all,

So, after I figured out how to extract a .vcf from Illumina data [C: Getting a .vcf file from a Illumina SNPChip results (.bsc file)], I am now having problems with filtering and imputation.

As I have never done this before, I've tried a lot of options without success. So, let me explain... It is a long story...

From a .vcf file, I used the command to filter:

vcftools --vcf input.vcf --remove-filtered-all --max-missing 0.2 --maf 0.05 --mac 1 --min-alleles 2 --max-alleles 2 --recode --out output_filtered.vcf

After that, I've tried to do the imputation using Beagle (beagle.16May18.771.jar):

java -Xmx25000m -jar beagle.16May18.771.jar gt=input.vcf out=output_imputed

but I got an error:

No genetic map is specified: using 1 cM = 1 Mb
Exception in thread "main" java.lang.IllegalArgumentException: Duplicate marker: 0 1556719 Gm20_1556719_C_T G A
    at vcf.Markers.markerSet(Markers.java:175)
    at vcf.Markers.<init>(Markers.java:92)
    at vcf.Markers.create(Markers.java:69)
    at vcf.TargetData.extractMarkers(TargetData.java:130)
    at vcf.TargetData.advanceWindowCm(TargetData.java:120)
    at vcf.TargetData.targetData(TargetData.java:76)
    at main.Main.data(Main.java:143)
    at main.Main.main(Main.java:115)

So, I thought that I should create a .map file for the filtered .vcf, using PLINK:

plink --vcf input_filtered.vcf --recode --out output_plink_files

Then I ran Beagle again:

java -Xmx25000m -jar beagle.16May18.771.jar gt=input.vcf map=input_vcf.map out=output_imputed

and I got:

Exception in thread "main" java.lang.IllegalArgumentException: duplication position: 0 Gm20_1556719_C_T 0 1556719
    at vcf.PlinkGenMap.fillMapPositions(PlinkGenMap.java:76)
    at vcf.PlinkGenMap.<init>(PlinkGenMap.java:53)
    at vcf.PlinkGenMap.fromPlinkMapFile(PlinkGenMap.java:117)
    at vcf.GeneticMap.geneticMap(GeneticMap.java:120)
    at vcf.TargetData.targetData(TargetData.java:71)
    at main.Main.data(Main.java:143)
    at main.Main.main(Main.java:115)

Well, from what I have read, the genetic map is not the problem, but I cannot figure out the "duplication position" error. I am quite lost here. Could someone help me?

Oh, I also found this post [Can someone help me with imputation of missing SNPs using beagle 4?], but it still didn't work for me...

Thanks in advance!

modified 16 months ago by chrchang5235.5k • written 16 months ago by valopes30

Could you search in the vcf for that position Gm20_1556719_C_T, for example using grep? I'm not sure how the SNP and chromosome identifiers are in your vcf file, you may have to search for 1556719 separately.

modified 16 months ago • written 16 months ago by WouterDeCoster40k

Yes, I have already searched! And it looks duplicated...

Line 1762:
0   1556719 Gm05_1556719_C_T    G   A   .   .   .   GT  0/0 0/0 0/0 1/1 1/1 0/0 
Line 1763:
0   1556719 Gm20_1556719_C_T    G   A   .   .   .   GT  0/1 0/0 0/1 0/1 0/0 0/1

and I know this is not the only duplicated position.
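
To see how many positions are affected, every CHROM/POS pair that occurs more than once can be listed together with its marker IDs. The following is only a sketch: it builds a three-record example VCF (the first two records are modelled on the lines quoted above, the third record and the sample columns S1 and S2 are made up for illustration) and assumes a tab-separated file.

```shell
# Tiny tab-separated example VCF; the third record and the sample names are invented.
vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n' >> "$vcf"
printf '0\t1556719\tGm05_1556719_C_T\tG\tA\t.\t.\t.\tGT\t0/0\t1/1\n' >> "$vcf"
printf '0\t1556719\tGm20_1556719_C_T\tG\tA\t.\t.\t.\tGT\t0/1\t0/0\n' >> "$vcf"
printf '0\t2000000\tGm01_2000000_A_G\tA\tG\t.\t.\t.\tGT\t0/0\t0/1\n' >> "$vcf"

# Group records by CHROM:POS, remember the IDs seen for each key,
# and report only the keys that occur more than once.
dups=$(awk -F'\t' '
  /^#/ { next }                      # skip header lines
  {
    key = $1 ":" $2
    if (key in ids) ids[key] = ids[key] "," $3
    else            ids[key] = $3
    n[key]++
  }
  END { for (k in n) if (n[k] > 1) print k "\t" n[k] "\t" ids[k] }
' "$vcf")

echo "$dups"
```

On a real file, replace `$vcf` with the path to the filtered VCF; each output line gives the CHROM:POS key, the number of records at that key, and their IDs.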

modified 16 months ago by WouterDeCoster40k • written 16 months ago by valopes30

I added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.

It looked like you pasted the same thing twice; is the output correct as I formatted it?

modified 16 months ago • written 16 months ago by WouterDeCoster40k
16 months ago by
chrchang5235.5k
United States
chrchang5235.5k wrote:

plink's --list-duplicate-vars flag was created specifically to address this Beagle 4 issue.
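
One way this flag might be combined with --exclude is sketched below, assuming PLINK 1.9. The function name `dedupe_with_plink` and the file names are hypothetical. The `ids-only` modifier is used so that the .dupvar report contains only variant IDs and can be passed to --exclude, and `suppress-first` leaves the first ID of each duplicate group out of the report so that one record per group is kept. The flag groups variants by position and allele codes; whether it also groups records on chromosome code 0, as in this VCF, is not confirmed here and is worth checking in the .dupvar output.

```shell
# Hypothetical helper; nothing here is taken from the thread except the flag names.
dedupe_with_plink() {
  in_vcf=$1      # VCF to clean, e.g. the filtered file
  out_prefix=$2  # prefix for the report and the cleaned output

  # Step 1: write <out_prefix>.dupvar, containing IDs only and omitting
  # the first member of each duplicate group.
  plink --vcf "$in_vcf" \
        --list-duplicate-vars ids-only suppress-first \
        --out "$out_prefix" || return 1

  # Step 2: drop the listed IDs and write a new VCF for Beagle.
  plink --vcf "$in_vcf" \
        --exclude "$out_prefix.dupvar" \
        --recode vcf-iid \
        --out "$out_prefix.nodup"
}

# Only report whether plink is available; the function is not run automatically.
if command -v plink >/dev/null 2>&1; then
  echo "plink found; example call: dedupe_with_plink input_filtered.vcf dedup"
else
  echo "plink not found on PATH; nothing was run"
fi
```

Note that this keeps an arbitrary first record per group and discards the others, which is only appropriate if the duplicated records really describe the same locus.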

written 16 months ago by chrchang5235.5k

Okay I've tried this:

 --list-duplicate-vars

and then

 --exclude

It didn't work.

So, I did:

 --write-snplist

cat input.snplist | sort | uniq -d > output_new.snplist

--exclude

It is still not working...

Exception in thread "main" java.lang.IllegalArgumentException: Duplicate marker: 0 1556719 Gm20_1556719_C_T G A
    at vcf.Markers.markerSet(Markers.java:175)
    at vcf.Markers.<init>(Markers.java:92)
    at vcf.Markers.create(Markers.java:69)
    at vcf.TargetData.extractMarkers(TargetData.java:130)
    at vcf.TargetData.advanceWindowCm(TargetData.java:120)
    at vcf.TargetData.targetData(TargetData.java:76)
    at main.Main.data(Main.java:143)
    at main.Main.main(Main.java:115)
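
A list built from `sort | uniq -d` on variant IDs cannot flag the two records quoted earlier in the thread, because their IDs differ (Gm05_1556719_C_T and Gm20_1556719_C_T); only CHROM (0) and POS coincide. One possible reading, which is an assumption and not something confirmed in the thread, is that the chromosome was lost when the VCF was created and that the ID prefix still carries it. Under that assumption, the CHROM column could be rebuilt from the ID and the records re-sorted, as in this sketch on a made-up three-record file:

```shell
# Tiny tab-separated example VCF; the third record and the sample names are invented.
vcf=$(mktemp)
fixed=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n' >> "$vcf"
printf '0\t1556719\tGm05_1556719_C_T\tG\tA\t.\t.\t.\tGT\t0/0\t1/1\n' >> "$vcf"
printf '0\t1556719\tGm20_1556719_C_T\tG\tA\t.\t.\t.\tGT\t0/1\t0/0\n' >> "$vcf"
printf '0\t2000000\tGm01_2000000_A_G\tA\tG\t.\t.\t.\tGT\t0/0\t0/1\n' >> "$vcf"

# Keep the header as is. For data lines, use the text before the first "_"
# in the ID as the chromosome name (ASSUMPTION about the ID scheme), then
# sort by chromosome and numeric position so each chromosome is contiguous.
{
  grep '^#' "$vcf"
  grep -v '^#' "$vcf" |
    awk -F'\t' -v OFS='\t' '{ split($3, p, "_"); $1 = p[1]; print }' |
    sort -t "$(printf '\t')" -k1,1 -k2,2n
} > "$fixed"

# CHROM/POS pairs that still occur more than once after the rewrite.
remaining=$(grep -v '^#' "$fixed" | cut -f1,2 | sort | uniq -d)
chroms=$(grep -v '^#' "$fixed" | cut -f1 | tr '\n' ' ')

echo "chromosomes: $chroms"
echo "remaining duplicates: ${remaining:-none}"
```

If an ID contains no underscore, this sketch would copy the whole ID into CHROM, so the result should be inspected before it is passed to Beagle.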

written 16 months ago by valopes30