Question: Why do rs193922900 and rs59736472 seem to be special for RNAEditor?
Hello, everyone! I want to use RNAEditor. It requires several databases to be prepared first, one of which is a VCF file for dbSNP. This is the command given in the documentation:

wget -qO- |gunzip -c |awk 'BEGIN{FS="\t";OFS="\t"};match($5,/\./){gsub(/\./,"N",$5)};$5 == "" && $1 !~ /^#/ {gsub("","N",$5)};$3 ~ /rs193922900/ {$5="TN"};$3 ~ /rs59736472/ {$5="AN"};$5 ~ /H/ {gsub(/H/,"N",$5)};{print $0}' > dbSNP.vcf

My question is: why does ALT need to be set to TN for rs193922900 and to AN for rs59736472? Why do these two sites seem to be special for RNAEditor?

I have no idea what TN and AN mean, but I have a guess as to why these SNPs are treated separately. They describe short tandem repeats (see rs193922900, rs59736472). The description of these variants does not conform to the VCF format (see the values in the RefSNP Alleles column on the dbSNP site).

According to the help site, this type of variant is excluded from the current VCF version of dbSNP. But it may be that STRs are still included in the old version RNAEditor links to, and that they cause problems there.

Thanks finswimmer! I actually downloaded the newest version (release-95) of the dbSNP file. I checked the VCF file and found that these two sites are still in it.

19      18786034        rs193922900     TGTC    T.,T.   .       .       dbSNP_151;TSA=sequence_alteration;E_Cited;AA=GTC
7       107691504       rs59736472      AATATATATAT     A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.,A.       .       .

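This may explain why they are set to TN and AN: the command replaces every . in ALT with N. That alone would turn T.,T. into TN,TN, so the two explicit rules presumably just collapse the repeated alleles into a single TN and a single AN.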
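To check this guess, here is a minimal sketch that applies only part of the awk command from the question to a shortened copy of the rs193922900 record shown above. The function names `sample_record`, `dots_to_n` and `dots_to_n_with_overrides` are made up for this illustration, and the header guard `$1 !~ /^#/` is a simplification of the original `match($5,/\./)` pattern, so treat it as an approximation of what the documented command does rather than an exact copy.

```shell
# Hypothetical sample: first five columns of the rs193922900 record quoted above.
sample_record() {
    printf '19\t18786034\trs193922900\tTGTC\tT.,T.\n'
}

# Generic rule only: replace every "." in the ALT column (field 5) with "N".
dots_to_n() {
    awk 'BEGIN { FS = "\t"; OFS = "\t" }
         $1 !~ /^#/ { gsub(/\./, "N", $5) }
         { print }'
}

# Generic rule plus the two rsID-specific overrides from the question's command.
dots_to_n_with_overrides() {
    awk 'BEGIN { FS = "\t"; OFS = "\t" }
         $1 !~ /^#/ { gsub(/\./, "N", $5) }
         $3 ~ /rs193922900/ { $5 = "TN" }
         $3 ~ /rs59736472/  { $5 = "AN" }
         { print }'
}

# Print only the ALT column for each variant of the rules.
sample_record | dots_to_n | cut -f5
sample_record | dots_to_n_with_overrides | cut -f5
```

With the generic rule alone, ALT becomes TN,TN, which repeats the same allele. With the overrides it becomes a single TN. The same logic would apply to the long A.,A.,... list of rs59736472.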

