I downloaded BAM "slices" from a database (TCGA). Each slice is a subset of the full alignment, restricted to a small set of genes. I would now like to call variants against the human genome. I am currently using Strelka, and it works fine. However, since I know exactly which regions are of interest (those few genes), is there a more efficient way to do this? Would you recommend subsetting the human reference genome to only those genes, and if so, what tool should I use to do that?
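To make concrete what I mean by restricting the analysis to regions of interest, here is a minimal sketch that turns a list of gene intervals into merged, sorted BED lines. The function names, the `padding` parameter, and the coordinates in the usage comment are all hypothetical placeholders, not real gene positions; the real coordinates would have to come from an annotation matching the reference build used for the alignments.

```python
from typing import Iterable, List, Tuple

# (chromosome, 0-based start, end-exclusive), as in the BED convention
Interval = Tuple[str, int, int]


def merge_intervals(intervals: Iterable[Interval], padding: int = 0) -> List[Interval]:
    """Pad each interval, then merge overlapping or touching ones per chromosome."""
    padded = sorted(
        (chrom, max(0, start - padding), end + padding)
        for chrom, start, end in intervals
    )
    merged: List[Interval] = []
    for chrom, start, end in padded:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps (or touches) the previous interval: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def to_bed(intervals: Iterable[Interval]) -> str:
    """Render intervals as tab-separated BED lines (chrom, start, end)."""
    return "".join(f"{chrom}\t{start}\t{end}\n" for chrom, start, end in intervals)


# Hypothetical usage with made-up coordinates:
# regions = merge_intervals([("chr1", 100, 200), ("chr1", 150, 300)], padding=10)
# with open("regions.bed", "w") as handle:
#     handle.write(to_bed(regions))
```

As far as I understand the Strelka2 documentation, a BED file like this, once bgzip-compressed and tabix-indexed, can be passed to the configuration script via `--callRegions`. I have not verified whether that is actually faster than running on the slices as they are.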
- Correct biomart version for TCGA data
- Where to find an older transcript version?
- Finding unique regions in a sequence