Renaming sample names in g.vcf files fails [Error: Segmentation fault (core dumped)]
4.0 years ago
Alewa ▴ 170

Dear Biostars,

My aim is to rename the sample name in g.vcf files (generated by GATK HaplotypeCaller, with each g.vcf file containing the variants of one individual), but my script breaks with the error Segmentation fault (core dumped). Question: any suggestions on how I could resolve this?

The g.vcf files are saved in the directory /sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/

The g.vcf files are named like this: CC_0001.hg38.g.vcf.gz CC_0002.hg38.g.vcf.gz CC_0003.hg38.g.vcf.gz ....... CC_nnnn.hg38.g.vcf.gz

gvcfs="/sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/*.hg38.g.vcf.gz"
gvcfsSampleNames="/sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/sampleNames_dir/*.txt"

for i in $gvcfs
    do
        sampleName=`basename -s .hg38.g.vcf.gz $i` ###grab basename of gvcf to use for replacement
        echo "$sampleName" > sampleNames_dir/$sampleName.txt ###same basename/name in a file to set as input for bcftools reheader --samples
        for j in $gvcfsSampleNames; do
            bcftools reheader --samples $j -o renamed_gvcfs2/renamed_SM.$i $i
        done
    done

error logs

Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
renamed_gvcfs2/renamed_SM./sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/NEW_samples_CC10008_Germline_edited.hg38.g.vcf.gz: No such file or directory
renamed_gvcfs2/renamed_SM./sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/NEW_samples_CC10008_Germline_edited.hg38.g.vcf.gz: No such file or directory
renamed_gvcfs2/renamed_SM./sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/NEW_samples_CC10008_Germline_edited.hg38.g.vcf.gz: No such file or directory
renamed_gvcfs2/renamed_SM./sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/NEW_samples_CC10008_Germline_edited.hg38.g.vcf.gz: No such file or directory
renamed_gvcfs2/renamed_SM./sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/NEW_samples_CC10008_Germline_edited.hg38.g.vcf.gz: No such file or directory
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
renamed_gvcfs2/renamed_SM./sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/NEW_SM3_CC10008_Germline_edited.hg38.g.vcf.gz: No such file or directory
renamed_gvcfs2/renamed_SM./sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/NEW_SM3_CC10008_Germline_edited.hg38.g.vcf.gz: No such file or directory
renamed_gvcfs2/renamed_SM./sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/NEW_SM3_CC10008_Germline_edited.hg38.g.vcf.gz: No such file or directory
renamed_gvcfs2/renamed_SM./sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/NEW_SM3_CC10008_Germline_edited.hg38.g.vcf.gz: No such file or directory
renamed_gvcfs2/renamed_SM./sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/NEW_SM3_CC10008_Germline_edited.hg38.g.vcf.gz: No such file or directory
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
Segmentation fault (core dumped)
SNP bash vcf bcftools gatk • 1.8k views

How many files do you have? You may be running out of file descriptors on your system (i.e. too many files opened)
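A quick way to check this suggestion, as a minimal sketch: `ulimit -n` is a bash builtin that prints the soft limit on open file descriptors for the current shell. The variable name `limit` is just for illustration.

```shell
#!/usr/bin/env bash
# Print the per-process soft limit on open file descriptors.
limit=$(ulimit -n)
echo "open-file soft limit: $limit"
```

If the loop handles one file at a time, a few dozen g.vcf files would not normally come close to a typical limit, so this mainly helps to rule the idea out.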


@Jean-Karim Heriche, I have only 41 g.vcf files.


I tried this second approach, but the results are printed to the screen. I'm not sure what I'm doing wrong.

gvcfs="/sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_sm_gvcfs/*.hg38.g.vcf.gz"
for i in $gvcfs
    do
        sampleName=`basename -s .hg38.g.vcf.gz $i` ###grab basename of gvcf to use for replacement
        savedFileName=`echo "$sampleName" > ${sampleName}.txt`
        bcftools reheader --samples $savedFileName -o ${i}_edited.g.vcf.gz $i
    done

What is this coded in? Bash? If you want the sample names to be in a file, you probably need to use the --samples-file option of bcftools; the --samples option takes a comma-separated list of names. In the latter case, you could just do --samples $sampleName
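One thing worth checking in the second attempt, independent of which option is used: the line that sets `savedFileName` redirects the output of `echo` into a file, so the command substitution has nothing left to capture and the variable ends up empty. The sketch below reproduces only that shell behaviour; the sample name `CC_0001` and the scratch directory are made up for the illustration, and bcftools is not called.

```shell
#!/usr/bin/env bash
# Hypothetical sample name, used only for this illustration.
sampleName="CC_0001"
workdir=$(mktemp -d)   # scratch directory so nothing is written next to real data

# Same pattern as the second attempt: the redirect sends echo's output to the
# file, so the command substitution captures nothing.
savedFileName=$(echo "$sampleName" > "$workdir/${sampleName}.txt")
echo "captured: '${savedFileName}'"        # prints: captured: ''

# Alternative: keep the file name in its own variable and write the file separately.
sampleFile="$workdir/${sampleName}.txt"
echo "$sampleName" > "$sampleFile"
echo "file holds: $(cat "$sampleFile")"    # prints: file holds: CC_0001
```

With an empty, unquoted `$savedFileName`, the `--samples` option is left without a file name after it, which may be why the command did not behave as intended.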

3.9 years ago
Alewa ▴ 170

Ouch! I forgot to post my solution, so here it is in case someone else runs into a similar problem. bcftools reheader --samples requires a file with sample names, so I passed the file in the local directory correctly, quoted as "${sampleName}.txt", to make it work.

Thanks, everyone, for the support.

S

#### real code for all 41 WES T/N samples; script written in bash (Bourne Again Shell)
gvcfs="/sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/*.hg38.g.vcf.gz"
NamePrefix="CC_renamed_" ## add this prefix to all newly created gvcf files
for i in $gvcfs ## left unquoted on purpose so that the glob expands
    do
        sampleName=$(basename -s .hg38.g.vcf.gz "$i") ### grab basename of gvcf to be used as sample name (SM) in the gvcf file
        Outgvcf_file=$(basename "$i") ### prepare desired output gvcf file name
        #newName=$sampleName.txt
        echo "$sampleName" > "${sampleName}.txt" ### save sample name in a txt file for the bcftools reheader command
        bcftools reheader --samples "${sampleName}.txt" -o "${NamePrefix}${Outgvcf_file}" "$i" ### the bcftools rename command to execute
        echo "completed $i" ### report that this file is done
    done


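gvcfs_files_to_index="/sc/hydra/projects/canan/colon_cancer10312019/joint_calling_colon/gvcfs/renamed_SM_gvcfs_colon/*.hg38.g.vcf.gz"
for i in $gvcfs_files_to_index ## left unquoted on purpose so that the glob expands
do
    bcftools index -t "$i" ## write a tabix (.tbi) index for each renamed gvcf
done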
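
A note on output names, as a minimal sketch: the "No such file or directory" lines in the question's log show the full input path pasted after the `renamed_SM.` prefix, which is what happens when `$i` (a full path) is used directly in the output name. Taking `basename` first avoids that. The input path below is hypothetical, and `badOut` and `goodOut` are names invented for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical input path, used only for illustration.
i="/path/to/gvcfs/CC_0001.hg38.g.vcf.gz"

sampleName=$(basename -s .hg38.g.vcf.gz "$i")   # sample name without directory or suffix
fileName=$(basename "$i")                       # file name without directory

# Pattern from the first script: the whole path lands after the prefix.
badOut="renamed_gvcfs2/renamed_SM.$i"
# Using only the file name keeps the output inside renamed_gvcfs2/.
goodOut="renamed_gvcfs2/renamed_SM.${fileName}"

echo "$sampleName"   # prints: CC_0001
echo "$badOut"       # prints: renamed_gvcfs2/renamed_SM./path/to/gvcfs/CC_0001.hg38.g.vcf.gz
echo "$goodOut"      # prints: renamed_gvcfs2/renamed_SM.CC_0001.hg38.g.vcf.gz
```

The output directory itself (here `renamed_gvcfs2/`) would still need to exist before bcftools writes into it.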