Determine strand info of somatic mutations
2.1 years ago
whb ▴ 60

Hi all,

How do I get the strand (transcribed/non-transcribed) information for somatic mutations? I have called the mutations with Mutect2 and annotated them with SnpEff and SnpSift, but I cannot figure out how to tell from the VCF file which strand a mutation is on.

Thanks

strand DNA • 914 views

Variants (records) in a VCF are always reported on the + strand, i.e. the same strand as the reference genome. Unless the VCF carries explicit strand information, it is difficult to determine the strand of each mutation from the VCF alone.

2.1 years ago
ATpoint 82k

DNA is double-stranded. Hence mutations are double-stranded and not strand-specific.


Thanks for the reply. I am wondering whether I have missed something, because a few papers compare the number of substitution mutations on the transcribed and non-transcribed strands, e.g. Figure 1D in https://pubmed.ncbi.nlm.nih.gov/25999502/. Or did I misunderstand something? Thanks again.


They seem to look at the exact nucleotide change. The position on the chromosome is the same (as DNA is double-stranded), but if a gene is on the top strand the nucleotide would be e.g. A, and if it is on the bottom strand it would be T. You could intersect your mutations with a GTF file, retain those that overlap a gene, and then pull out the strand of the overlapping gene. Next, use something like bedtools getfasta to extract the exact nucleotide for every mutation, based on the top strand, and if the intersection from the previous step indicated the minus strand, convert the base to its complement. Does that make sense to you? What is your coding level: can you write something like this, or do you need an end-to-end tool?
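
The steps above can be sketched in plain Python. This is only a minimal illustration, not an end-to-end tool: it assumes the variants have already been read from the VCF as (chrom, pos, ref, alt) tuples, handles single-nucleotide substitutions only, and takes REF/ALT straight from the VCF (which are already on the + strand) rather than calling bedtools getfasta. The function names `parse_gtf_genes` and `annotate_strand` are made up for this example. The transcribed/non-transcribed label follows a convention commonly used in mutational-signature work (the pyrimidine of the mutated base pair on the gene's coding strand means "non-transcribed"); check that this matches the definition in the paper you are comparing against.

```python
# Complement lookup for single bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def parse_gtf_genes(gtf_lines):
    """Return (chrom, start, end, strand, gene_id) for every 'gene' feature."""
    genes = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        # GTF columns: seqname, source, feature, start, end, score, strand, frame, attributes
        chrom, start, end, strand = fields[0], int(fields[3]), int(fields[4]), fields[6]
        gene_id = "."
        for attr in fields[8].split(";"):
            attr = attr.strip()
            if attr.startswith("gene_id "):
                gene_id = attr.split(" ", 1)[1].strip('"')
        genes.append((chrom, start, end, strand, gene_id))
    return genes


def annotate_strand(variants, genes):
    """Express each SNV on the coding strand of the gene it overlaps.

    variants: iterable of (chrom, pos, ref, alt); pos is 1-based and
    ref/alt are on the + strand, as in a VCF.
    Variants outside genes, or overlapping genes on both strands, are skipped.
    """
    results = []
    for chrom, pos, ref, alt in variants:
        if ref not in COMPLEMENT or alt not in COMPLEMENT:
            continue  # not a simple single-base substitution
        strands = {g[3] for g in genes if g[0] == chrom and g[1] <= pos <= g[2]}
        if strands not in ({"+"}, {"-"}):
            continue  # intergenic or ambiguous
        strand = next(iter(strands))
        if strand == "-":
            # Gene is on the bottom strand: complement both alleles.
            ref_c, alt_c = COMPLEMENT[ref], COMPLEMENT[alt]
        else:
            ref_c, alt_c = ref, alt
        # Assumed convention: pyrimidine (C/T) on the coding strand
        # means the pyrimidine change sits on the non-transcribed strand.
        label = "non-transcribed" if ref_c in "CT" else "transcribed"
        results.append((chrom, pos, strand, f"{ref_c}>{alt_c}", label))
    return results


if __name__ == "__main__":
    # Toy annotation and variants, invented purely for illustration.
    gtf = [
        'chr1\tdemo\tgene\t100\t200\t.\t+\t.\tgene_id "G1";',
        'chr1\tdemo\tgene\t300\t400\t.\t-\t.\tgene_id "G2";',
    ]
    variants = [("chr1", 150, "C", "T"), ("chr1", 350, "G", "A")]
    for row in annotate_strand(variants, parse_gtf_genes(gtf)):
        print(*row, sep="\t")
```

For real data the same logic applies after reading the VCF and GTF from disk; the linear scan over genes here is fine for a toy example but would be slow genome-wide, where an interval-based intersection (for instance with bedtools) is the more practical choice.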


Hi ATpoint, thank you for the reply. I am a beginner in coding, so it would be great if you could suggest some end-to-end tools. From what you have suggested, should I start by intersecting the VCF files with a GTF, or with BAM/FASTQ files? I am also not sure how to extract the strand information.
