Closed: Strelka output: read count of REF and ALT, SNVs
6.7 years ago
noeD ▴ 130

Hello!

I have used Strelka as the variant caller for SNVs, and I'm having trouble with the output. I would like to obtain the REF and ALT read counts for both the Tumor and the Normal sample. I have already read the post "Strelka Indel Allele Counts" and found it useful for indels. Unfortunately, it has no suggestion for SNVs.

For SNVs, this is the header of the VCF file:

##INFO=<ID=QSS,Number=1,Type=Integer,Description="Quality score for any somatic snv, ie. for the ALT allele to be present at a significantly different frequency in the tumor and normal">
##INFO=<ID=TQSS,Number=1,Type=Integer,Description="Data tier used to compute QSS">
##INFO=<ID=NT,Number=1,Type=String,Description="Genotype of the normal in all data tiers, as used to classify somatic variants. One of {ref,het,hom,conflict}.">
##INFO=<ID=QSS_NT,Number=1,Type=Integer,Description="Quality score reflecting the joint probability of a somatic variant and NT">
##INFO=<ID=TQSS_NT,Number=1,Type=Integer,Description="Data tier used to compute QSS_NT">
##INFO=<ID=SGT,Number=1,Type=String,Description="Most likely somatic genotype excluding normal noise states">
##INFO=<ID=SOMATIC,Number=0,Type=Flag,Description="Somatic mutation">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read depth for tier1 (used+filtered)">
##FORMAT=<ID=FDP,Number=1,Type=Integer,Description="Number of basecalls filtered from original read depth for tier1">
##FORMAT=<ID=SDP,Number=1,Type=Integer,Description="Number of reads with deletions spanning this site at tier1">
##FORMAT=<ID=SUBDP,Number=1,Type=Integer,Description="Number of reads below tier1 mapping quality threshold aligned across this site">
##FORMAT=<ID=AU,Number=2,Type=Integer,Description="Number of 'A' alleles used in tiers 1,2">
##FORMAT=<ID=CU,Number=2,Type=Integer,Description="Number of 'C' alleles used in tiers 1,2">
##FORMAT=<ID=GU,Number=2,Type=Integer,Description="Number of 'G' alleles used in tiers 1,2">
##FORMAT=<ID=TU,Number=2,Type=Integer,Description="Number of 'T' alleles used in tiers 1,2">
##FILTER=<ID=DP,Description="Greater than 3.0x chromosomal mean depth in Normal sample">
##FILTER=<ID=BCNoise,Description="Fraction of basecalls filtered at this site in either sample is at or above 0.4">
##FILTER=<ID=SpanDel,Description="Fraction of reads crossing site with spanning deletions in either sample exceeeds 0.75">
##FILTER=<ID=QSS_ref,Description="Normal sample is not homozygous ref or ssnv Q-score < 15, ie calls with NT!=ref or QSS_NT < 15">

And these are the first lines of my output:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  NORMAL  TUMOR
chr1    19199723        .       G       A       .       PASS    NT=ref;QSS=34;QSS_NT=15;SGT=GG->AA;SOMATIC;TQSS=1;TQSS_NT=1     DP:FDP:SDP:SUBDP:AU:CU:GU:TU    6:0:0:0:0,0:0,0:6,6:0,0 19:0:0:0:19,19:0,0:0,0:0,0
chr1    21738020        .       A       G       .       PASS    NT=ref;QSS=31;QSS_NT=15;SGT=AA->GG;SOMATIC;TQSS=1;TQSS_NT=1     DP:FDP:SDP:SUBDP:AU:CU:GU:TU    6:0:0:0:6,6:0,0:0,0:0,0 18:0:0:0:0,0:0,0:18,18:0,0

For the Normal sample I can extract the read count of the REF allele, and for the Tumor sample the read count of the ALT allele. For example, in the first SNV:

chr1 19199723 . G A . PASS NT=ref;QSS=34;QSS_NT=15;SGT=GG->AA;SOMATIC;TQSS=1;TQSS_NT=1 DP:FDP:SDP:SUBDP:AU:CU:GU:TU 6:0:0:0:0,0:0,0:6,6:0,0 19:0:0:0:19,19:0,0:0,0:0,0

I have the number of G alleles for Normal (6) and the number of A alleles for Tumor (19). I cannot see where the number of A alleles for Normal and the number of G alleles for Tumor are reported...

Could you help me?

Thank you in advance
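
Note: going by the FORMAT descriptions in the header above, each of AU, CU, GU and TU holds two comma-separated counts per sample (tier 1, then tier 2), and every sample column carries all four fields. So the REF and ALT counts for either sample can be looked up by reading the field named after the REF or ALT base. Below is a minimal Python sketch of that lookup. It assumes single-base REF/ALT, the column order NORMAL then TUMOR shown above, and that the tier 1 value is the one wanted. The function name `strelka_snv_counts` is made up for illustration and is not part of Strelka.

```python
def strelka_snv_counts(ref, alt, fmt, sample, tier=1):
    """Return (ref_count, alt_count) for one sample column of a Strelka SNV record.

    Assumes single-base REF/ALT and the AU/CU/GU/TU fields from the header,
    each holding "tier1,tier2" counts.
    """
    # Pair FORMAT keys with this sample's values, e.g. {"DP": "6", "AU": "0,0", ...}
    fields = dict(zip(fmt.split(":"), sample.split(":")))

    def count(base):
        # Base "A" maps to key "AU"; pick the requested tier (1 or 2).
        return int(fields[base + "U"].split(",")[tier - 1])

    return count(ref), count(alt)


# First record from the output above, tab-separated.
record = "\t".join([
    "chr1", "19199723", ".", "G", "A", ".", "PASS",
    "NT=ref;QSS=34;QSS_NT=15;SGT=GG->AA;SOMATIC;TQSS=1;TQSS_NT=1",
    "DP:FDP:SDP:SUBDP:AU:CU:GU:TU",
    "6:0:0:0:0,0:0,0:6,6:0,0",
    "19:0:0:0:19,19:0,0:0,0:0,0",
])
cols = record.split("\t")
ref, alt, fmt, normal, tumor = cols[3], cols[4], cols[8], cols[9], cols[10]

normal_counts = strelka_snv_counts(ref, alt, fmt, normal)
tumor_counts = strelka_snv_counts(ref, alt, fmt, tumor)
print("NORMAL ref,alt:", normal_counts)  # (6, 0)
print("TUMOR  ref,alt:", tumor_counts)   # (0, 19)
```

Read this way, the first record gives 0 A alleles in Normal (AU=0,0) and 0 G alleles in Tumor (GU=0,0), so the counts are present in the record but happen to be zero.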
