6.9 years ago
micro32uvas
Hello Everyone!
I am working with paired-end whole-genome resequencing data from an Illumina platform. Here is the workflow I have been following:
- FASTQ --> SAM (BWA-MEM with the -M switch)
- SAM --> BAM (samtools view -b -S with the -F 2308 switch, which did not work for the mate pairs, so I switched to -F 2304)
- Sorted the BAM by coordinate
- Added read groups with Picard's AddOrReplaceReadGroups
- Validated with Picard's ValidateSamFile (after replacing -F 2308 with -F 2304, the BAM file gave no mate-pair errors and validated successfully)
- Removed duplicates with Picard's MarkDuplicates
- Validated again; now it shows:
Error Type              Count
ERROR:MATE_NOT_FOUND    180154
I can't work out what is going wrong here. Any help is appreciated.
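For anyone comparing the two filter values above, here is a minimal Python sketch that decodes them using the standard SAM flag bits. The helper names decode_flag and is_excluded are made up for illustration and are not part of samtools or Picard. It shows that 2308 and 2304 differ only in the "unmapped" bit, which is one plausible reason the first filter left mapped reads without their mates. It does not explain the error that appears after duplicate removal.

```python
# SAM flag bits as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(value):
    """Return the names of all flag bits set in `value` (hypothetical helper)."""
    return [name for bit, name in SAM_FLAGS.items() if value & bit]


def is_excluded(read_flag, exclude_mask):
    """Mimic the -F logic: a read is dropped if it has ANY bit of the mask set."""
    return (read_flag & exclude_mask) != 0


if __name__ == "__main__":
    print(2308, decode_flag(2308))
    print(2304, decode_flag(2304))

    # A pair where read1 is mapped (flag 73) and read2 is unmapped (flag 133).
    for mask in (2308, 2304):
        kept = [f for f in (73, 133) if not is_excluded(f, mask)]
        print("mask", mask, "keeps flags", kept)
```

With the 2308 mask only the mapped read of such a pair survives, so a validator would find a read whose mate is missing; with 2304 both reads are kept.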
Is there any chance the fastq files you started off with were not in sync (i.e. they may have been trimmed separately getting the order of reads in the file out of sync)?
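A rough way to check this, sketched in Python under a few assumptions: each FASTQ record is exactly four lines, mate names may carry a trailing /1 or /2, and anything after the first whitespace in the header is a comment. The function names read_ids and first_mismatch and the file names in the usage note are hypothetical.

```python
from itertools import zip_longest


def read_ids(handle):
    """Yield read names from an iterable of FASTQ lines (4 lines per record)."""
    for i, line in enumerate(handle):
        if i % 4 != 0:
            continue  # only the first line of each record holds the name
        fields = line.split()
        if not fields:
            return  # blank header line, treat as end of file
        name = fields[0]
        if name.endswith(("/1", "/2")):
            name = name[:-2]  # drop the mate suffix before comparing
        yield name


def first_mismatch(handle1, handle2):
    """Return (record_number, name1, name2) for the first out-of-sync record,
    or None if both inputs list the same read names in the same order."""
    pairs = zip_longest(read_ids(handle1), read_ids(handle2))
    for n, (a, b) in enumerate(pairs, start=1):
        if a != b:
            return n, a, b
    return None
```

Usage would be along the lines of opening both files (for example with open("reads_1.fastq") and open("reads_2.fastq"), or gzip.open(..., "rt") for compressed input) and passing the two handles to first_mismatch. A result of None means the read names line up; a tuple points at the first record where they diverge, including the case where one file is shorter than the other.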
I doubt that. I have paired-end data and mapped both reads with 99.14% coverage. Here is the flagstat of the data: