How to call SNPs on sequencing data that contains UMIs
5 months ago


I asked this question on the Galaxy help forum but got no answer, so I'm trying my luck here.

I would like to analyze FASTQ files from an Illumina MiSeq. I want to align the reads to a reference genome (hg38), then detect polymorphisms at known positions (I have a BED file with all the positions) and obtain a VCF file.

I don't know which tools to use on the Galaxy web platform. One of the problems is that the sequencing was done with UMIs, for low-frequency polymorphism detection (fewer PCR and sequencing errors), and I don't know which tools to use to do the alignment while taking the UMIs into account. I have 4 FASTQ files: R1 and R2, because it is paired-end sequencing, and, if necessary, I1 and I2, which contain the index and UMI sequences (I1 = i7 index + UMI, I2 = i5 index).
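
To make the file layout concrete, here is a minimal Python sketch of what I think the first step would look like: taking the UMI from each I1 read and appending it to the name of the matching R1 (or R2) read, so that it survives alignment. Several things here are assumptions and not facts about my data: the 8 bp i7 index length (`I7_INDEX_LEN`), the function names, and the `name_UMI` read-name convention, which I understand is what deduplication tools such as UMI-tools expect by default. It also assumes the files are uncompressed and list the reads in the same order.

```python
import io
from itertools import islice

# ASSUMPTION: the i7 sample index occupies the first 8 bases of each I1 read.
I7_INDEX_LEN = 8


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ handle."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if len(record) < 4:
            return  # end of file (or a truncated final record)
        header, seq, _plus, qual = record
        yield header, seq, qual


def tag_reads_with_umi(read_handle, i1_handle, index_len=I7_INDEX_LEN):
    """Append the UMI from I1 to the read name of the matching R1/R2 record.

    Hypothetical helper: both files must list the reads in the same order.
    """
    pairs = zip(read_fastq(read_handle), read_fastq(i1_handle))
    for (header, seq, qual), (_i_header, i_seq, _i_qual) in pairs:
        umi = i_seq[index_len:]  # bases after the sample index are treated as the UMI
        name, _, comment = header.partition(" ")
        new_header = f"{name}_{umi}" + (f" {comment}" if comment else "")
        yield f"{new_header}\n{seq}\n+\n{qual}\n"
```

The same function would be run once with R1 and once with R2, each time against I1, so that both mates carry the same UMI in their names.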

Thank you

I have changed the title of the post to better reflect the question. Galaxy is not all that relevant here, since it is a platform and not a tool; one still needs to know which tools to use.


