Linux command to edit CHROM column in VCF file
14 months ago

The contents of the VCF file look like this:

I want to replace the value in the CHROM column with "chr" followed by the chromosome number, e.g. 1 becomes chr1.

After running the command, a line of the VCF file should look like this:

CHROM  POS  ID  REF  ALT  QUAL  FILTER  INFO   FORMAT    801
chr1   711  .   T    C    40    PASS    DP=15  GT:GQ:DP  1|1:40:15
vcf linux • 1.0k views

You can do it in R like this, if you remove the header lines first.

In RStudio:

install.packages("dplyr")  # only needed once
library(dplyr)

# VCF_original: data frame holding the VCF body (header lines removed),
# with the chromosome column named X.chr
VCF_recoded <- VCF_original %>% mutate(X.chr = recode(X.chr,
  '1' = "chr1",
  '2' = "chr2",
  '3' = "chr3",
  '4' = "chr4",
  '5' = "chr5",
  '6' = "chr6",
  '7' = "chr7",
  '8' = "chr8",
  '9' = "chr9",
  '10' = "chr10",
  '11' = "chr11",
  '12' = "chr12",
  '13' = "chr13",
  '14' = "chr14",
  '15' = "chr15",
  '16' = "chr16",
  '17' = "chr17",
  '18' = "chr18",
  '19' = "chr19",
  '20' = "chr20",
  '21' = "chr21",
  '22' = "chr22",
  '23' = "chrX",
  '24' = "chrY"))

View(VCF_recoded)  # inspect the result
write.table(VCF_recoded, "VCF_recoded.cov", row.names=FALSE, sep=" ", quote=FALSE)

There are probably a few different ways to do this. Sorry if this isn't helpful.


Thanks for the suggestion! I want to use a Linux command so that I can apply it to multiple files.


Please do not paste screenshots of plain-text content; it is counterproductive. You can copy and paste the content directly here (using the code formatting option), or use a GitHub Gist if the content exceeds the length allowed here.

14 months ago
ATpoint 81k

This has been asked before, e.g. "VCF files: Change Chromosome Notation".

Untested:

awk 'BEGIN {FS=OFS="\t"} /^#/ {print; next} {$1="chr"$1; print}' < in.vcf
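
Since the question mentions multiple files, here is a minimal sketch of how the awk command above could be wrapped in a loop. The function name `add_chr_prefix` and the `.chr.vcf` output suffix are made up for illustration, and it assumes uncompressed, tab-separated VCF files in the current directory. Treat it as a starting point rather than a tested pipeline.

```shell
#!/bin/sh
# Prefix "chr" to the CHROM column of one VCF read from stdin.
# Header lines (starting with '#') are passed through unchanged.
# add_chr_prefix is a hypothetical helper name, not a standard tool.
add_chr_prefix() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/  { print; next }
               { $1 = "chr" $1; print }'
}

# Apply it to every *.vcf file in the current directory, writing
# <name>.chr.vcf next to each input (assumed naming scheme).
for f in *.vcf; do
    [ -e "$f" ] || continue                    # glob matched nothing
    case "$f" in *.chr.vcf) continue ;; esac   # do not reprocess outputs
    add_chr_prefix < "$f" > "${f%.vcf}.chr.vcf"
done
```

Two caveats: header lines are left as they are, so any `##contig` entries will still carry the old chromosome names, and a file that already uses `chr` names would end up with `chrchr1`.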

Thank you for the help!! It worked :). I can understand the command easily.
