It depends on the assembler, but typically they are ignored. For example, if you were assembling with a kmer length of 60, a 101bp read with an N right smack in the middle would yield zero kmers, since it contains no valid 60bp sequence without an N. By contrast, an N-free 101bp read would yield 42 60-mers (101 - 60 + 1), and a 101bp read with an N at the end would yield 41 60-mers.
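To make the arithmetic concrete, here is a minimal Python sketch that counts the N-free kmers in a read. The function name `count_valid_kmers` is made up for illustration, and the sketch assumes the common behavior described above (any kmer containing an N is skipped); it is not taken from any particular assembler.

```python
def count_valid_kmers(read: str, k: int) -> int:
    """Count the k-mers in `read` that contain no 'N' (hypothetical helper)."""
    count = 0
    run = 0  # length of the current N-free stretch ending at this base
    for base in read.upper():
        run = 0 if base == "N" else run + 1
        # Every position where the N-free stretch is at least k long
        # marks the end of exactly one valid k-mer.
        if run >= k:
            count += 1
    return count


clean = "A" * 101                      # no N anywhere
middle_n = "A" * 50 + "N" + "A" * 50   # N at position 51 of 101
end_n = "A" * 100 + "N"                # N as the last base

print(count_valid_kmers(clean, 60))     # 42
print(count_valid_kmers(middle_n, 60))  # 0
print(count_valid_kmers(end_n, 60))     # 41
```

The read with the N in the middle gives zero because neither 50bp flank is long enough to hold a 60-mer.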
Thanks for your reply Brian.
Does that mean that all 101bp reads with an N right smack in the middle are of no use when my kmer size is 60, because they yield zero kmers? Am I right?
What happens to reads that have consecutive N's somewhere other than the middle or the end? Should my assembler ignore the k-mers that contain N's? Should even a k-mer containing a single N be ignored?
Thanks in advance.
It depends on the choices of the assembler author, but generally, all kmers containing even a single N are disregarded.
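For N's at arbitrary positions, one way to picture this is to split the read at every run of N's and take kmers only from the pieces that are at least k long. Below is a rough Python sketch under that assumption; `valid_kmers` is a hypothetical name, and real assemblers may handle this differently.

```python
import re


def valid_kmers(read: str, k: int):
    """Yield every k-mer of `read` that contains no 'N' (hypothetical helper)."""
    # Splitting on runs of N leaves only N-free segments.
    for segment in re.split(r"N+", read.upper()):
        # Segments shorter than k give an empty range, so they yield nothing.
        for i in range(len(segment) - k + 1):
            yield segment[i:i + k]


read = "ACGTACNNGTACGTAC"  # two consecutive N's inside the read
print(list(valid_kmers(read, 4)))
# ['ACGT', 'CGTA', 'GTAC', 'GTAC', 'TACG', 'ACGT', 'CGTA', 'GTAC']
```

So a read with N's is not necessarily useless: each N-free stretch of length L still contributes L - k + 1 kmers whenever L is at least k.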