Question

Change sequence ID in fastq file generated by bcftools mpileup

0

Entering edit mode

12 months ago

antoine.fauchois92 ▴ 20

Hi everobody !

I'm currently work on a HHV8 genetic study and I face to an issue with my bcftools command. Indeed, I want to generate consensus sequences thanks bcftools mpileup command and bam files. However, all ID get the ID of the reference sequence. You can see bellow the code used to do this job:

for patient in patients:
    input_file = os.path.join("Queries_seq/bam_files", patient + ".bam")
    output_file = os.path.join("Queries_seq/consensus/fastq", patient + ".fastq")
    os.system(f"bcftools mpileup -f Ref_seq/HHV8_ref.fasta {input_file} | \
        bcftools call -c | \
        vcfutils.pl vcf2fq > {output_file}")

So, if I do the linux command bellow on a randomly chosen file, I get:

sed '1,1p' Queries_seq/consensus/fastq/1G_S15.fastq
@NC_009333.1

This is the ID of my reference sequence. Can you help me to fix it ?

Best regards

bcftools • 596 views

ADD COMMENT • link updated 10 months ago by Ram 43k • written 12 months ago by antoine.fauchois92 ▴ 20

2

Entering edit mode

Bcftools annotate has a function rename-chr https://samtools.github.io/bcftools/bcftools.html#annotate.

You can pipe that after the bcftools call. ... bcftools call -c | bcftools annotate --rename-chrs myrenamefile.txt | ...

ADD REPLY • link 12 months ago by barslmn ★ 2.1k