Question: Select a subset of variants from a larger vcf file
paraskevopou wrote (2.1 years ago):

Hi all! I have a large VCF file and I want to create a subset of it according to the #CHROM field, using a txt file (a list containing the #CHROM IDs of interest). I would like to keep the headers and the VCF format. Any ideas on how to do that? Thanks a lot! :)

snp rna-seq • 2.0k views
2.1 years ago by
genomax84k
United States
genomax84k wrote:

How to extract specific chromosome from vcf file
Extract Sub-Set Of Regions From Vcf File
https://bioinformatics.stackexchange.com/questions/3401/how-to-subset-a-vcf-by-chromosome-and-keep-the-header


Thanks a lot for the comment. My VCF file contains SNPs called from transcriptomes, so the #CHROM field holds around 26,000 different "genes". From these I want to extract around 5,000 according to #CHROM. This is why I asked whether it can be done by providing a txt file listing the desired #CHROM names. The names in the #CHROM field look like the example below. Moreover, the numbers in these names are not constant but random.

TRINITY_DN6643_c0_g2
Reply written 2.1 years ago by paraskevopou
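
For readers who want to see the idea spelled out, here is a minimal Python sketch of filtering by a list of #CHROM names while keeping the header. It assumes an uncompressed VCF and a plain-text list with one name per line; the function names `load_names` and `subset_vcf_by_chrom` are hypothetical and not taken from any of the linked answers.

```python
def load_names(list_path):
    """Read one #CHROM name per line, skipping blank lines."""
    with open(list_path) as handle:
        return {line.strip() for line in handle if line.strip()}


def subset_vcf_by_chrom(vcf_path, list_path, out_path):
    """Copy all header lines, plus only the records whose #CHROM is listed.

    Assumes an uncompressed, tab-delimited VCF (an assumption of this sketch).
    Returns the number of variant records written.
    """
    wanted = load_names(list_path)
    kept = 0
    with open(vcf_path) as vcf, open(out_path, "w") as out:
        for line in vcf:
            if line.startswith("#"):
                # Meta-information lines (##...) and the #CHROM column header.
                out.write(line)
                continue
            chrom = line.split("\t", 1)[0]  # first column is #CHROM
            if chrom in wanted:
                out.write(line)
                kept += 1
    return kept
```

Calling `subset_vcf_by_chrom("all.vcf", "names.txt", "subset.vcf")` (file names made up here) would write the header plus the matching records. Names are held in a set, so a list of 5,000 entries is checked quickly for each record.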

You should be able to use a regions file with bcftools (see gringer's answer in the last link above).

Reply written 2.1 years ago by genomax

Thanks a lot. The bcftools filter command with the -R <file.txt> option worked perfectly.

Reply written 2.1 years ago by paraskevopou
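
The thread does not show what the regions file looked like, so the following is only a rough sketch of one way to build it from a plain list of names. It assumes the tab-delimited three-column layout (CHROM, BEG, END, 1-based) described in the bcftools documentation, and it assumes the VCF header carries `##contig=<ID=...,length=...>` lines from which END can be taken. The function names `contig_lengths` and `write_regions` are hypothetical.

```python
def contig_lengths(vcf_path):
    """Collect {contig ID: length} from ##contig header lines.

    Assumes an uncompressed VCF and simple key=value fields without
    quoted commas (both are assumptions of this sketch).
    """
    lengths = {}
    with open(vcf_path) as vcf:
        for line in vcf:
            if not line.startswith("##"):
                break  # meta-information lines end at the #CHROM line
            if line.startswith("##contig=<"):
                body = line.strip()[len("##contig=<"):].rstrip(">")
                fields = dict(
                    item.split("=", 1) for item in body.split(",") if "=" in item
                )
                if "ID" in fields and "length" in fields:
                    lengths[fields["ID"]] = int(fields["length"])
    return lengths


def write_regions(names, lengths, out_path):
    """Write CHROM<TAB>1<TAB>length for each name with a known length.

    Returns the names that had no ##contig length, so they can be checked.
    """
    missing = []
    with open(out_path, "w") as out:
        for name in names:
            if name in lengths:
                out.write(f"{name}\t1\t{lengths[name]}\n")
            else:
                missing.append(name)
    return missing
```

The resulting file could then be passed to the `-R` option mentioned above. As far as I know, the bcftools region options expect a bgzip-compressed and indexed VCF, so that is worth checking before running it.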