3.4 years ago
Pierre Lindenbaum
Hi all,
I usually work with the human genome, but this is the first time I'm working with Mus musculus (mm10) sequenced on a NovaSeq.
- FastQC's overrepresented sequences module reported that 0.1% of my reads are a sequence which looks like a GSAT_MM (microsat) repeat. Is this a known fact for mm10, or is there anything (wet lab | bioinformatics) that could explain this number? I've got many poly-G reads in one sample too ...
- After mapping with bwa + sambamba rmdup, I get a mean depth of ~20, but the median depth is only 10. I think that's because the GSAT_MM region is grabbing many reads (?)
So again, is there any known problem like this when doing HTS with Mus musculus?
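To put a number on the poly-G observation, one option is to count the reads that end in a long run of G. This is only a rough sketch: the function names and the `min_run=20` threshold are assumptions chosen for illustration, not values from this thread. The background is that NovaSeq uses two-colour chemistry, where a cluster giving no signal is called as G, so reads from short or failed fragments tend to end in G runs.

```python
import re


def has_polyg_tail(seq, min_run=20):
    """Return True if the read ends in at least `min_run` consecutive G bases.

    `min_run` is an illustrative threshold, not a recommended value.
    """
    return re.search(r"G{%d,}$" % min_run, seq) is not None


def polyg_fraction(seqs, min_run=20):
    """Fraction of reads (0.0 to 1.0) that end in a poly-G tail."""
    seqs = list(seqs)
    if not seqs:
        return 0.0
    return sum(has_polyg_tail(s, min_run) for s in seqs) / len(seqs)


def fastq_sequences(handle):
    """Yield the sequence line of each record from an open 4-line FASTQ handle."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()
```

With an uncompressed FASTQ, something like `polyg_fraction(fastq_sequences(open(path)))` would give the per-sample fraction, where `path` is a placeholder for your own file.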
A: What's the name in RepeatMasker for mouse major satellite repeats?
If your depth varies that much, then you may also have other duplicates. Over-amplified sample?
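For what it's worth, a mean of ~20 with a median of 10 is what you would expect if a small fraction of positions pile up a very large number of reads, since the mean is sensitive to outliers and the median is not. Here is a toy illustration with made-up depths (not taken from the data in this thread); `depth_summary` is a hypothetical helper name.

```python
import statistics


def depth_summary(depths):
    """Return (mean, median) for a list of per-position depths."""
    return statistics.mean(depths), statistics.median(depths)


# Invented toy data: 99 positions at depth 10 and one pile-up at depth 1000.
toy_depths = [10] * 99 + [1000]
mean_depth, median_depth = depth_summary(toy_depths)
print(f"mean={mean_depth:.1f} median={median_depth:.1f}")  # mean=19.9 median=10.0
```

Comparing the two statistics again after excluding the satellite regions would show whether they account for the gap on their own, or whether duplicates also contribute.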
Yes, maybe... I asked Illumina support too...