Hi there,
I want to find SNVs that show a bias between the transcribed and untranscribed strands.
Here is one SNV from VarScan, with tumor/normal read counts (header line, then data):
chrom position ref var normal_reads1 normal_reads2 normal_var_freq normal_gt tumor_reads1 tumor_reads2 tumor_var_freq tumor_gt somatic_status variant_p_value somatic_p_value tumor_reads1_plus tumor_reads1_minus tumor_reads2_plus tumor_reads2_minus normal_reads1_plus normal_reads1_minus normal_reads2_plus normal_reads2_minus
chr11 1092459 C A 300 30 9.09% C 245 71 22.47% M Somatic 1.0 1.9447764810399614E-6 119 126 51 20 156 144 21 9
I calculated the ratio of plus-strand to minus-strand reads for each allele (ref and var) in tumor and normal:
awk 'BEGIN {OFS="\t"} NR > 1 { print $0, $16/$17, $18/$19, $20/$21, $22/$23 }' filename
chr11 1092459 C A 300 30 9.09% C 245 71 22.47% M Somatic 1.0 1.9447764810399614E-6 119 126 51 20 156 144 21 9 0.944444 2.55 1.08333 2.33333
Tumor ref allele: plus/minus strand ratio is 0.94
Tumor var allele: plus/minus strand ratio is 2.55 [ratio > 1, i.e. more variant reads on the +ve strand]
Normal ref allele: plus/minus strand ratio is 1.08
Normal var allele: plus/minus strand ratio is 2.33 [ratio > 1, i.e. more variant reads on the +ve strand]
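To judge whether a ratio like 2.55 is more than sampling noise, one option is a Fisher's exact test on the 2x2 table of ref/var reads by plus/minus strand. The following is only a minimal sketch using the counts from the row above. The function names `plus_fraction` and `fisher_exact_two_sided` are made up for illustration and are not part of VarScan. It only tests whether the read-strand distribution differs between alleles; it does not by itself say anything about the transcribed versus untranscribed strand.

```python
from math import comb

def plus_fraction(plus, minus):
    """Fraction of reads on the plus strand; None if there are no reads."""
    total = plus + minus
    return plus / total if total else None

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so ties with p_obs are not lost to rounding.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)

# Counts taken from the VarScan row above: (ref_plus, ref_minus, var_plus, var_minus).
samples = {
    "tumor": (119, 126, 51, 20),
    "normal": (156, 144, 21, 9),
}

for name, (ref_plus, ref_minus, var_plus, var_minus) in samples.items():
    p_value = fisher_exact_two_sided(ref_plus, ref_minus, var_plus, var_minus)
    print(
        name,
        f"ref_plus_frac={plus_fraction(ref_plus, ref_minus):.3f}",
        f"var_plus_frac={plus_fraction(var_plus, var_minus):.3f}",
        f"fisher_p={p_value:.4g}",
    )
```

Comparing the var allele against the ref allele at the same site, rather than against an even 50/50 split, uses the ref reads as a control for any strand imbalance in coverage at that position.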
I looked at the position: chr11:1092459 is in the MUC2 gene, which is on the plus strand. Is it correct to say that the variant allele is biased toward the non-transcribed strand?
Any suggestions are welcome !
Thanks !
It doesn't seem like this analysis would be productive in unstranded shotgun DNA data generated from double-stranded DNA (particularly if it is amplified). Can you describe your library preparation in more detail?