Question: Variant Calling scRNA-seq data for KRAS mutations
7 days ago, theodore.killian wrote:

I have scRNA-seq data and would like to call variants in a single gene (KRAS) so that I can annotate clusters in the downstream analysis. Past posts show little consensus on which tools to use, but as I understand it the workflow would be roughly:

1) Align reads with STAR to generate a BAM file, then generate a pileup file
2) Run the FreeBayes variant caller to find SNVs

Most tools and workflows for variant calling focus on finding SNPs across the entire genome, whereas I only want to look at one specific gene, KRAS.

A second question: what read depth is appropriate when calling variants in only one gene, as opposed to the entire transcriptome? Would there be a difference?

Tags: snp, rna-seq

What single-cell sequencing method did you use? Most single-cell libraries are biased toward one end of the transcript or the other, so you might have only a fraction of the transcript covered by reads.

Reply written 7 days ago by swbarnes2 (8.6k)

We use 10X. I was thinking of running FreeBayes on the aligned reads, although I know that the SNPs it calls depend strongly on read depth.
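
To make the dependence on read depth concrete, here is a minimal sketch (not from the thread) using a simple binomial model. It assumes a heterozygous mutation whose two alleles are expressed equally (`alt_fraction=0.5`) and a hypothetical rule that at least two supporting reads are needed for a call. Both numbers are illustrative assumptions, and for 10X data "depth" should really be counted in unique UMIs rather than raw reads.

```python
from math import comb


def prob_at_least_k_alt(depth: int, alt_fraction: float, k: int) -> float:
    """P(at least k of `depth` reads carry the alternate allele), binomial model."""
    # Sum the probabilities of seeing 0 .. k-1 alternate reads, then take the complement.
    p_fewer = sum(
        comb(depth, i) * alt_fraction**i * (1 - alt_fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


# Assumed values: equal allelic expression, two supporting reads required.
for depth in (1, 2, 5, 10, 20):
    p = prob_at_least_k_alt(depth, alt_fraction=0.5, k=2)
    print(f"depth={depth:>2}  P(>=2 alt reads)={p:.3f}")
```

Under these assumptions, a cell with only a handful of reads over the site will often show no or too few mutant reads even when the mutation is present, which is why low per-cell depth matters more than the choice of caller.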

Reply written 7 days ago by theodore.killian (0)

As the comment above already indicates, your ability to detect mutations will depend strongly on whether you actually sequenced the part of the KRAS gene that is typically mutated. There is not much wrong with your pipeline in general, but I wouldn't worry about the variant caller before having looked at the BAM file. Even if your mutation lies in a region for which you captured reads, the depth per single cell will most likely be low (I would guess below ten reads), so calling a mutation will probably come down to manual annotation (if you are looking for known mutations).
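
One way to picture that manual annotation is a per-cell allele tally at a single known site. The sketch below is an illustration, not a tested pipeline: it assumes the (cell barcode, UMI, base) observations at the site have already been extracted from the BAM file upstream (for Cell Ranger output these would come from the CB and UB tags), and the function names, labels, and `min_alt` threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def tally_alleles(observations):
    """Count distinct UMIs per base for each cell at one site.

    observations: iterable of (cell_barcode, umi, base) tuples.
    Repeated (barcode, umi, base) combinations are counted once,
    since reads sharing a UMI are copies of the same molecule.
    """
    seen = set()
    counts = defaultdict(Counter)
    for barcode, umi, base in observations:
        key = (barcode, umi, base)
        if key in seen:
            continue
        seen.add(key)
        counts[barcode][base] += 1
    return dict(counts)


def label_cell(counts, ref_base, alt_base, min_alt=1):
    """Label one cell from its base counts (min_alt is an assumed threshold)."""
    if counts.get(alt_base, 0) >= min_alt:
        return "mutant"
    if counts.get(ref_base, 0) > 0:
        # Only reference reads seen; the mutant allele may simply have been missed.
        return "ref_only"
    return "no_call"


# Hypothetical observations at one site (made-up barcodes and UMIs).
obs = [
    ("AAAC", "U1", "G"),
    ("AAAC", "U1", "G"),  # same molecule as above, counted once
    ("AAAC", "U2", "A"),
    ("CCCG", "U3", "G"),
]
for barcode, counts in tally_alleles(obs).items():
    print(barcode, dict(counts), label_cell(counts, ref_base="G", alt_base="A"))
```

The "ref_only" label is deliberately not called wild type: with so few molecules per cell, seeing only the reference base does not rule out the mutation.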

Reply written 6 days ago by Friederike (6.1k)

If this is cancer, most KRAS mutations (~80-90% of KRAS-mutant tumors) occur at just one of two amino acid residues within the protein (G12X or G13X, where X = any amino acid). So it would be entirely feasible to include a manual component.
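
Because only two codons are involved, naming the amino acid change from an observed base is a small lookup. The sketch below is illustrative and assumes the reference codons are GGT (codon 12) and GGC (codon 13) in coding orientation, which should be checked against your own annotation. Note also that KRAS lies on the minus strand, so bases read directly from a genome-aligned BAM file appear as the reverse complement of what is used here. The function name is hypothetical.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumed reference codons in coding (transcript) orientation.
KRAS_HOTSPOT_CODONS = {12: "GGT", 13: "GGC"}


def name_substitution(codon_number: int, pos_in_codon: int, alt_base: str) -> str:
    """Name the protein change for a single-base substitution.

    pos_in_codon is 0-based (0, 1 or 2); alt_base is in coding orientation.
    """
    ref_codon = KRAS_HOTSPOT_CODONS[codon_number]
    alt_codon = ref_codon[:pos_in_codon] + alt_base + ref_codon[pos_in_codon + 1:]
    return f"{CODON_TABLE[ref_codon]}{codon_number}{CODON_TABLE[alt_codon]}"


print(name_substitution(12, 1, "A"))  # GGT -> GAT
print(name_substitution(12, 0, "T"))  # GGT -> TGT
print(name_substitution(13, 1, "A"))  # GGC -> GAC
```

A synonymous change (for example GGT to GGC at codon 12) comes back as G12G, which makes silent substitutions easy to filter out.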

Reply written 6 days ago by Collin (850)

But that means 3'-biased sequencing will never cover those sites.

Reply written 6 days ago by swbarnes2 (8.6k)

Yes, you are right. But there are also oncogenes with many mutations in the middle of a long protein, which might not get any coverage with either strategy. This also illustrates why study design is important, as 5'-biased sequencing would likely cover these sites.

Reply written 6 days ago by Collin (850)