Retain file names as sample names while making pileup
6.7 years ago
bioinfo8 ▴ 230

Hi,

I have two exome datasets, a and b, each containing 5-10 individuals.

1) I used the following commands for variant calling:

 >samtools mpileup -ugf ref.fa a*_sorted.bam > a.bcf          # pileup of 5 individuals (5 bam)
 >bcftools call -vmO v a.bcf > a.vcf 
 >vcfutils.pl varFilter -Q 10 -d 10 -D 200 a.vcf > a_filtered.vcf

Similarly, b_filtered.vcf was also generated.

2) I have a list of 10 genes in which I want to find variants from these two datasets (a and b), so I used bcftools to annotate the variants with gene names:

>bgzip genes_10sorted.bed
>tabix -p bed genes_10sorted.bed.gz
>bgzip -c a_filtered.vcf > a_filtered.vcf.gz     # compress the filtered VCF, keeping the original
>bcftools annotate -a genes_10sorted.bed.gz -c CHROM,FROM,TO,GENE -h <(echo '##INFO=<ID=GENE,Number=1,Type=String,Description="Gene name">') a_filtered.vcf.gz > a_filtered_ann10.vcf

3) Now I can see the gene names in the filtered and annotated VCF file a_filtered_ann10.vcf, but I can't tell which sample is which, because the sample columns are labelled ERS561518, ERS561535, ERS561560, ERS561566 and ERS561638 rather than with the file names.

How can I retain the file names as sample names when making the pileup and keep them throughout?

Any guidance in this regard would be appreciated.

Thanks!

vcf variant calling bcftools gene samtools • 1.7k views

I edited the title to make it more specific. I think you need to modify the read groups of your BAM files.


Yes, the @RG line of my first sample has an SM tag set to ERS561518.

Should I edit it manually, or is there an automatic way?
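
An automatic route looks possible. Here is a minimal sketch, assuming each BAM holds a single sample (one SM value), that the file names contain no whitespace, and that samtools and bcftools are on the PATH. It reads the SM value from each BAM header, pairs it with the file name, and passes the pairs to bcftools reheader -s, which accepts "old_name new_name" lines. The helper names sm_from_header and build_rename_map, and the files rename_map.txt and a_filtered_ann10.renamed.vcf, are made up for illustration.

```shell
#!/bin/sh
# Print the first SM (sample) value found on an @RG line of a SAM header read from stdin.
sm_from_header() {
    awk -F '\t' '/^@RG/ { for (i = 2; i <= NF; i++) if ($i ~ /^SM:/) { print substr($i, 4); exit } }'
}

# Print one "old_name new_name" pair per BAM given as argument.
# old_name is the SM tag in the header; new_name is the file name without ".bam".
build_rename_map() {
    for bam in "$@"; do
        [ -e "$bam" ] || continue            # skip unmatched globs and missing files
        old=$(samtools view -H "$bam" | sm_from_header)
        new=$(basename "$bam" .bam)
        if [ -n "$old" ]; then
            printf '%s %s\n' "$old" "$new"
        fi
    done
}

# Only attempt the renaming when both tools are installed.
if command -v samtools >/dev/null 2>&1 && command -v bcftools >/dev/null 2>&1; then
    build_rename_map a*_sorted.bam > rename_map.txt   # hypothetical map file name
    if [ -s rename_map.txt ] && [ -e a_filtered_ann10.vcf ]; then
        # Rewrite only the sample names in the VCF header; the records are unchanged.
        bcftools reheader -s rename_map.txt -o a_filtered_ann10.renamed.vcf a_filtered_ann10.vcf
    fi
fi
```

Renaming in the VCF header leaves the BAM files untouched. The alternative is to rewrite the SM tag in the BAMs themselves (for example with samtools addreplacerg) before running mpileup, so that the desired names are carried through every later step.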
