9.7 years ago
JorgeGonzalez
Hi
Regarding data filtering in GBS, Elshire et al. 2011 describe:
Analyses of the 86 bp sequencing reads were based upon the unfiltered qseq files, since the filtering process that produces fastq files sometimes discarded good reads that aligned perfectly to the reference genome for at least 64 bases. Starting with the qseq files from a flow cell, we first filtered for reads that (1) perfectly matched one of the barcodes and the expected four-base remnant of the ApeKI cut site (CWGC), (2) were not adapter/adapter dimers, and (3) contained no "Ns" in their first 72 bases.
What does "Ns" mean here?
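To make the question concrete, here is a rough Python sketch of how I read the three filters quoted above. It assumes that "N" is the usual placeholder for a base the sequencer could not call (and that raw qseq files may write such a no-call as "."). The barcode list, the adapter prefix, and the function name `passes_filters` are hypothetical values chosen for illustration; they are not taken from the paper or its pipeline.

```python
import re

# Hypothetical example values, not from Elshire et al. 2011.
BARCODES = ["CTCC", "TGCA"]
# CWGC remnant of the ApeKI cut site; W is the IUPAC code for A or T.
CUT_SITE = re.compile(r"C[AT]GC")
# Assumed first bases of the common adapter, used to spot adapter dimers.
ADAPTER_PREFIX = "AGATCGGAAG"


def passes_filters(read, barcodes=BARCODES, n_window=72):
    """Return the matched barcode if the read passes all three filters, else None."""
    read = read.upper()

    # Filter (3): no uncalled bases in the first `n_window` positions.
    # Assumption: an uncalled base appears as "N" (or "." in raw qseq files).
    window = read[:n_window]
    if "N" in window or "." in window:
        return None

    for barcode in barcodes:
        # Filter (1): exact barcode followed by the four-base cut-site remnant.
        if not read.startswith(barcode):
            continue
        site = read[len(barcode):len(barcode) + 4]
        if not CUT_SITE.fullmatch(site):
            continue

        # Filter (2): treat the read as an adapter dimer if the adapter
        # starts immediately after the cut-site remnant (assumed heuristic).
        insert = read[len(barcode) + 4:]
        if insert.startswith(ADAPTER_PREFIX):
            return None
        return barcode

    return None


if __name__ == "__main__":
    good = "CTCC" + "CAGC" + "A" * 78
    has_n = "CTCC" + "CTGC" + "ACGTN" + "A" * 73
    print(passes_filters(good))
    print(passes_filters(has_n))
```

If that reading is right, a read such as `has_n` is dropped only because of the single "N" near its start, while an "N" after position 72 would be tolerated.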