Hi!
I am working with an organism that has two plasmids and one chromosome. I would like to determine the relative copy number of my plasmids, respectively to my chromosome, which is in one copy. I have whole-genome sequencing data for this organism, so I thought of mapping my reads to the reference (3 sequences), determine the coverage of each sequence, and therefore get the values I am seeking. My code is:
bowtie2-build three-seqs.fasta three-seqs.fasta
bowtie2 --threads 10 -x three-seqs.fasta -U reads.fq.gz | samtools sort --threads 10 - -o mapped.bam
samtools depth mapped.bam >mapped.depth
perl calc.coverage.in.bam.depth.mads-albertsen.pl -i mapped.depth
# the last script is from: https://github.com/MadsAlbertsen/multi-metagenome/blob/master/misc.scripts/calc.coverage.in.bam.depth.pl
I get the following output from the perl script:
Name Average.coverage
first 1198.907
second 286.216
chromosome 508.999
By dividing the coverage of each reference by the total coverage would give me the relative coverage, hence relative copy number:
Name Average.coverage % coverage
first 1198.907 60.1
second 286.216 14.4
chromosome 508.999 25.5
Therefore, my first plasmid is roughly twice as abundant as my chromosome, and my second plasmid roughly 0.5x, which would mean it's only in every second cell.
Is this workflow correct? Or am I missing something here, e.g. do I need to take into account the sequence length?
If there were no biases in sampling/lib prep/sequencing then that sounds about right. Sounds like your plasmids don't share much sequence with your chromosome.
Thanks for the hint, I should have checked that! I ran a blast and calculated ANI, and there's only 3 or 4 chunks that are similar due to conserved genes, but they are just 2 kb in total.
However: I now filtered my
mapped.bamby Q30, which removes 6 % of my reads. When I calculate coverages, I get roughly equal coverage for mychromosomeand my plasmidfirst, thesecondplasmid however stays at a relative abundance of 0.5:Why could that be? Is it multimapping reads that are causing this shift?