Tutorial: Reorder VCF Files
10.9 years ago
chris.mit7 ▴ 60

Here's a quick bash script to reorder a VCF file. It's currently set up to reorder a VCF into karyotypic ordering. I've seen enough threads about this, and been annoyed by it myself enough times, that hopefully someone else finds it useful.

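# Save the header lines (those starting with "#").
grep -P "^#" "$1" > "${1}.header"
# Split the records into one file per chromosome.
for i in {1..22} M X Y
do
    grep -P "^$i\t" "$1" > "${1}.chr$i"
done
# Build the list of files in karyotypic order: header, M, 1-22, X, Y.
toCat="${1}.header"
toCat="$toCat ${1}.chrM"
for i in {1..22}
do
    toCat="$toCat ${1}.chr$i"
done
toCat="$toCat ${1}.chrX"
toCat="$toCat ${1}.chrY"
# $toCat is left unquoted on purpose so it splits into separate file names.
cat $toCat > "${1}.karo.vcf"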

Usage: bash script.sh filename.vcf

It'll make a new file, filename.vcf.karo.vcf, and leave the intermediate files (filename.vcf.header and filename.vcf.chr*) in place.
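One thing to watch: the script only pulls out contigs named exactly 1-22, M, X and Y, so records on anything else (MT, chr-prefixed names, unplaced scaffolds) don't make it into the output. Here's a rough sketch of two checks you could run afterwards. The function names contig_order and same_record_count are made up for this example, and it assumes a tab-delimited, uncompressed VCF.

```bash
# Hypothetical helper: print the contigs of a VCF in the order they appear,
# one name per line (consecutive duplicates collapsed).
contig_order() {
    grep -v '^#' "$1" | cut -f1 | uniq
}

# Hypothetical helper: succeed only if both files hold the same number of
# non-header records. "|| true" keeps grep's exit status of 1 (zero matching
# lines) from aborting scripts that run under "set -e".
same_record_count() {
    local a b
    a=$(grep -vc '^#' "$1" || true)
    b=$(grep -vc '^#' "$2" || true)
    [ "$a" -eq "$b" ]
}
```

For example: contig_order filename.vcf.karo.vcf, then same_record_count filename.vcf filename.vcf.karo.vcf || echo "some records were dropped".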

I have posted this before: alphanumeric GNU sort. It has a new option "-N" that sorts, for example, "chrM, chr10, chr2" to "chr2, chr10, chrM". For VCF sorting: (grep ^# in.vcf; grep -v ^# in.vcf|sort -k1,1N -k2,2n) > out.vcf. This works for any TAB-delimited format. It is more general, though slower on a VCF that is already sorted.
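If the patched sort with "-N" isn't at hand, stock GNU coreutils sort has a version-sort key option (V) that also orders embedded numbers naturally. The following is only a sketch under that assumption (GNU sort, uncompressed tab-delimited VCF); it is not the "-N" implementation, and the function name sort_vcf_natural is made up here. With version sort the numbered contigs should come out in numeric order ahead of M, X and Y, which is not necessarily the same order "-N" gives for non-numeric names.

```bash
# Hypothetical helper: header first, then records sorted by contig
# (version sort) and position (numeric). Writes to stdout.
sort_vcf_natural() {
    local in_vcf tab
    in_vcf=$1
    tab=$(printf '\t')
    # Header lines keep their original order.
    grep '^#' "$in_vcf"
    # LC_ALL=C keeps the comparison independent of the user's locale.
    grep -v '^#' "$in_vcf" | LC_ALL=C sort -t "$tab" -k1,1V -k2,2n
}
```

Usage would be along the lines of: sort_vcf_natural in.vcf > out.vcf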

10.9 years ago
Isaac Turner ▴ 50

Another solution to the problem: vcf_sort.pl

Usage: ./vcf_sort.pl <order> [in.vcf]
  Order is a comma-separated list of chroms
  Prints output to STDOUT, prints stats to STDERR

e.g.

perl vcf_sort.pl chr1,chr2A,chr2B,chr3 chimpanzee.vcf > chimp.reordered.vcf
perl vcf_sort.pl `echo chr{{1..22},X,Y,M} | tr ' ' ','` human.vcf > human.reordered.vcf

Requires the Unix 'sort' command. Creates and then removes a temporary directory in the current directory.
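For anyone who wants the same idea (an explicit, comma-separated contig order) without Perl, here is a rough shell sketch. To be clear, this is not how vcf_sort.pl is implemented; it's an illustration using awk, sort and cut. The function name sort_vcf_by_order is made up, and sending contigs missing from the list to the end is my own assumption, not something the script above is documented to do.

```bash
# Hypothetical helper, not the vcf_sort.pl implementation.
# $1: comma-separated contig order, $2: input VCF. Writes to stdout.
sort_vcf_by_order() {
    local order in_vcf tab
    order=$1
    in_vcf=$2
    tab=$(printf '\t')
    # Header lines keep their original order.
    grep '^#' "$in_vcf"
    grep -v '^#' "$in_vcf" |
        awk -F'\t' -v order="$order" '
            BEGIN {
                # Map each contig name to its position in the requested order.
                n = split(order, names, ",")
                for (i = 1; i <= n; i++) rank[names[i]] = i
            }
            # Prefix each record with its rank; unlisted contigs get n + 1.
            { r = ($1 in rank) ? rank[$1] : n + 1; print r "\t" $0 }
        ' |
        # Sort by rank, then contig name (groups unlisted contigs), then position.
        LC_ALL=C sort -t "$tab" -k1,1n -k2,2 -k3,3n |
        # Drop the temporary rank column.
        cut -f2-
}
```

Usage mirrors the examples above, e.g.: sort_vcf_by_order chr1,chr2A,chr2B,chr3 chimpanzee.vcf > chimp.reordered.vcf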
