"GSAT_MM" repeats in mm10

4.6 years ago

Pierre Lindenbaum 166k

Hi all,

I'm used to working with the human genome, but this is the first time I'm working with Mus musculus (mm10), sequenced on a NovaSeq.

The FastQC "Overrepresented sequences" module reported that 0.1% of my reads are a sequence which looks like a GSAT_MM (microsat) repeat. Is this a known fact for mm10, or is there anything (wet lab | bioinformatics) that could explain this number? I've got many poly-G reads in one sample too...
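To double-check the FastQC numbers, the fraction of reads carrying a given sequence and the fraction of poly-G reads can be counted directly from the FASTQ. The following is only a minimal sketch: the function names, the 0.8 poly-G threshold, the motif and the toy reads are all assumptions made for illustration, not values from my data. On two-color instruments such as the NovaSeq, a cluster with no signal is called as G, which is a commonly cited source of poly-G reads.

```python
import io


def read_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def is_poly_g(seq, min_fraction=0.8):
    """True if at least min_fraction of the bases are G.

    The 0.8 threshold is an arbitrary, hypothetical cutoff.
    """
    return len(seq) > 0 and seq.count("G") / len(seq) >= min_fraction


def summarize(handle, motif):
    """Return (total reads, reads containing motif, poly-G reads)."""
    total = with_motif = poly_g = 0
    for seq in read_sequences(handle):
        total += 1
        with_motif += motif in seq
        poly_g += is_poly_g(seq)
    return total, with_motif, poly_g


# Toy FASTQ: these reads are invented for illustration only.
toy_fastq = io.StringIO(
    "@r1\nACGTACGTAC\n+\nIIIIIIIIII\n"
    "@r2\nGGGGGGGGGG\n+\nIIIIIIIIII\n"
    "@r3\nTTGACGTCAA\n+\nIIIIIIIIII\n"
    "@r4\nGGGGGGGGGA\n+\nIIIIIIIIII\n"
)

# "GACGT" is a placeholder motif, not the real GSAT_MM sequence.
total, with_motif, poly_g = summarize(toy_fastq, motif="GACGT")
print(total, with_motif, poly_g)
```

For a real run, the `io.StringIO` object would be replaced by an open FASTQ file handle (for example via `gzip.open(path, "rt")`), and the motif by the sequence FastQC reported.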


After mapping with bwa and removing duplicates with sambamba rmdup, I got a mean depth of ~20x, but the median depth is only ~10x. I think that's because the GSAT_MM regions are grabbing many reads (?)
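To illustrate why I suspect the repeat regions: a small number of positions with a huge pile-up is enough to pull the mean well above the median. The sketch below uses entirely synthetic depths (990 positions at 10x and 10 positions at 1000x, invented for this example) laid out as the three tab-separated columns (chromosome, position, depth) that `samtools depth` writes for a single BAM. The helper name `depths_from_lines` is hypothetical.

```python
from statistics import mean, median


def depths_from_lines(lines):
    """Extract the depth column from 'chrom<TAB>pos<TAB>depth' lines."""
    return [int(line.split("\t")[2]) for line in lines if line.strip()]


# Synthetic data, invented for illustration:
# 990 positions covered at 10x and 10 positions covered at 1000x.
lines = [f"chr1\t{pos}\t10" for pos in range(1, 991)]
lines += [f"chr1\t{pos}\t1000" for pos in range(991, 1001)]

depths = depths_from_lines(lines)
mean_depth = mean(depths)      # inflated by the 10 high-depth positions
median_depth = median(depths)  # unaffected by the small pile-up

print(mean_depth, median_depth)
```

Here only 1% of the positions carry the pile-up, yet the mean is roughly double the median, which is the same pattern as my ~20x versus ~10x.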

So again, is there any known problem like this when doing HTS with Mus musculus?

repeat microsattelite mouse musmusculus • 1.5k views


If your depth varies that much, then you may also have other duplicates. Could the sample be over-amplified?

4.6 years ago by GenoMax 152k

Over amplified sample ?

Yes, maybe... I've asked Illumina support too.

4.6 years ago by Pierre Lindenbaum 166k
