Hi all, here are some questions about paired-end sequencing for NGS:
- What are the main differences between mate-pair sequencing and paired-end sequencing? Should I care about the difference when I use tools like 'samtools', maq, etc.? Should one, and only one, short read be paired with another one (1-1)?
- What does "removing duplicates" mean? Does it mean that a pair of short reads has been mapped at two distinct positions on the genome, or does it mean that a pair matched too many times at one position?
- Knowing that bwa sampe "Generates alignments in the SAM format given paired-end reads. Repetitive read pairs will be placed randomly", is there any need to "remove the duplicates"?
- How does Picard MarkDuplicates work? How can I find the reads that have been 'tagged'? Will it remove the reads from the BAM file?