Question: Performing so-called "SNP confirmation" for reads assigned to their corresponding Antibiotic Resistance Genes (ARGs): suggestions?
gravatar for Macspider
9 days ago by
Vienna - BOKU
Macspider3.1k wrote:

I have a question related to a fairly common meta-genomic and meta-transcriptomic analysis.

I am working on a project whereby, among other things, we are assessing the expression of antibiotic resistance genes (ARGs) in gut microbiota of a species. This is a fairly common procedure for those working in this field.

The procedure works by assigning mRNAseq reads to their corresponding gene through mapping and counting, pretty much like htseq/deseq2. Counts define expression which translates to antibiotic resistance.

Now, one of the key features of ARGs is that some of them require SNP confirmation, which means that only the presence of certain alleles confers resistance, not just the expression of the gene.

My doubts start from here: If I count the reads that contain the SNPs, this will not be linearly proportional to the number of transcripts that have the resistance. This because my reads (single end, 100 nt) are shorter than the transcripts, so I will only find the proportion of reads that confer resistance at each detected SNP.

My approach so far has been to call SNPs, extract the allele fraction of each "resistance SNP", and then average them across each gene to infer a fraction of reads that confirm the resistance through the SNPs. But this workflow is quite dirty and has a lot of assumptions: how would you proceed? what could you suggest to me, if you did it in the past?

(note to experts: I can't use amr++ because it is made for paired-end reads)

ADD COMMENTlink modified 3 days ago by Biostar ♦♦ 20 • written 9 days ago by Macspider3.1k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1148 users visited in the last hour