bedtools intersect -u not giving unique records
3 months ago

These are the steps I'm following. First, I extract the sample using a BED file (the input BED file converted to Hg38):

tabix -h -R Hg19_to_Hg38_sorted.bed.gz gnomad.genomes.v{g_version}.hgdp_tgp.chr{chr}.vcf.bgz | perl {vcftools} -c {sample_name} > {sample_name}_out.vcf

Output ({sample_name}_out.vcf):
chr2    113982416   rs56177103  TATAAAATAAAATAAA    T   .   PASS    .   GT:AAD:DAD:DAF:ADF  0/1:25519,4077:25519,4077:0.13776:0.13776   
chr2    113982416   rs56177103  TATAAAATAAAATAAA    T   .   PASS    .   GT:AAD:DAD:DAF:ADF  0/1:25519,4077:25519,4077:0.13776:0.13776   
chr2    113982416   rs56177103  TATAAAATAAAATAAA    T   .   PASS    .   GT:AAD:DAD:DAF:ADF  0/1:25519,4077:25519,4077:0.13776:0.13776   

Because my output file contained repeated records, I tried to extract the unique ones by running intersectBed with the same input BED file, but I still cannot get unique records; it returns the same repeated results. Why is that? This is the command I used:

bedtools/intersectBed -u -a {sample_name}_out.vcf -b bed_filename > output.vcf
vcftools bedtools vcf intersectbed tabix • 303 views

I was also wondering whether sort | uniq would give the same result?
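
A plain sort | uniq would also reorder the file and mix the VCF header lines into the records. As a rough sketch of the same idea that leaves the header alone, the awk filter below passes lines starting with '#' through untouched and prints each data line only the first time it appears. The function name dedup_vcf and the demo file are made up for illustration, and the comparison is on the whole line, so records that differ in any column are kept as separate records.

```shell
#!/bin/sh
# Hypothetical helper (not part of bedtools or vcftools):
# drop repeated VCF records while keeping the header lines.
dedup_vcf() {
    # Header lines (starting with '#') are printed as-is.
    # A data line is printed only the first time it is seen.
    awk '/^#/ { print; next } !seen[$0]++' "$1"
}

# Tiny demonstration on a made-up file (not real gnomAD data).
dedup_tmp=$(mktemp -d)
{
    printf '##fileformat=VCFv4.2\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\n'
    printf 'chr2\t100\trsA\tT\tC\n'
    printf 'chr2\t100\trsA\tT\tC\n'
    printf 'chr2\t200\trsB\tG\tA\n'
} > "$dedup_tmp/in.vcf"

dedup_vcf "$dedup_tmp/in.vcf" > "$dedup_tmp/out.vcf"
```

On the real data this would be called as dedup_vcf {sample_name}_out.vcf > output.vcf. Input order is preserved, so the file does not need to be sorted again afterwards.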

3 months ago

Another option is to pipe BED data to sort-bed:

$ ... | sort-bed --unique - > answer.bed

Ref.: https://bedops.readthedocs.io/en/latest/content/reference/file-management/sorting/sort-bed.html


But one of my files is a VCF file.
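
Even with a VCF on one side, the BED file itself can still be cleaned before it is passed to tabix. One possible cause of the repeats, which is an assumption and not something confirmed in this thread, is that the lifted-over BED file contains duplicated or overlapping intervals, so the same record is returned once per matching region. A minimal sketch that collapses such intervals with standard tools is below; merge_bed and the demo coordinates are hypothetical, and only the first three BED columns are kept.

```shell
#!/bin/sh
# Hypothetical helper: collapse duplicated, overlapping or directly
# adjacent BED intervals into single intervals (columns 1-3 only).
merge_bed() {
    # Sort by chromosome, then numerically by start position.
    sort -k1,1 -k2,2n "$1" | awk 'BEGIN { OFS = "\t" }
        NR == 1 { cur_chrom = $1; cur_start = $2; cur_end = $3; next }
        # Same chromosome and start not past the current end: extend.
        $1 == cur_chrom && $2 <= cur_end { if ($3 > cur_end) cur_end = $3; next }
        # Otherwise emit the finished interval and start a new one.
        { print cur_chrom, cur_start, cur_end
          cur_chrom = $1; cur_start = $2; cur_end = $3 }
        END { if (NR > 0) print cur_chrom, cur_start, cur_end }'
}

# Tiny demonstration with made-up coordinates.
merge_tmp=$(mktemp -d)
{
    printf 'chr2\t100\t200\n'
    printf 'chr2\t150\t300\n'
    printf 'chr2\t100\t200\n'
    printf 'chr3\t10\t20\n'
} > "$merge_tmp/regions.bed"

merge_bed "$merge_tmp/regions.bed" > "$merge_tmp/merged.bed"
```

The merged file would then need to be compressed and used in place of the original BED file in the tabix -R step. If the BED file turns out to have no duplicated or overlapping intervals, this step changes nothing and the repeats come from somewhere else.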
