Hi all, I have a weird situation here: I am working with ancient DNA (aDNA) reads, and we had a few sequencing runs in the past weeks. The cycle number was set to between 50 and 100 (each run had its own cycle number), and some samples were sequenced in more than one run (for those I merged the FASTQ files into a single R1 and a single R2 file per sample). So my FASTQ files contain reads of different lengths: some are genuine paired-end reads, while others, because of aDNA fragment-length variation, are the same in the R1 and R2 files.
When I used BWA to align the reads to the reference as PE reads, the results were terrible (the mismatch rate and the read-position shifting in specific regions were enormous). When I tried to cat the barcode/adapter-trimmed FASTQ files and treat them as SE reads, I got a very nice alignment with correct coverage (the mapDamage results were not that nice-looking, but they were more or less okay).
My problem is that, despite the good-looking results, I am not sure my approach is flawless or even acceptable, so I would appreciate some comments or advice about it. Thanks in advance!
PS: I did not use read merging, because it discards a considerable number of reads (10-50%).
What do you mean by that? If you have R1/R2 files you could just use one at a time and treat them as SE reads.
What does that mean?
First question: I concatenated the R1 and R2 reads into a single file and treated it as one file of SE reads.
Second question: when I merge the read pairs, about 10-50% of them are filtered out because they do not have enough overhanging bases. With the 50-cycle run we got too many non-overhang reads, while with the 80-cycle run we got too many identical reads in R1 and R2. That is why I used the file-merging approach.
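One thing that could be checked before pooling R1 and R2 as SE reads is how many pairs carry the same fragment twice, since those would be counted double in the coverage. Below is a minimal Python sketch of such a check, not a tested pipeline step. It assumes plain four-line FASTQ records, adapter-trimmed reads, and R1/R2 files listing the pairs in the same order; the function names are hypothetical. A pair is counted when R1 equals the reverse complement of R2, which is what a fully sequenced short fragment should look like after trimming.

```python
import gzip

# Complement table for DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_sequences(path):
    """Yield the sequence line of each record in a four-line FASTQ file.

    Assumption: records are not line-wrapped; .gz files are read via gzip.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:
                yield line.strip()


def fully_overlapping_fraction(r1_path, r2_path):
    """Fraction of pairs where R1 is the reverse complement of R2.

    Assumption: both files list the pairs in the same order.
    """
    total = 0
    overlapping = 0
    for s1, s2 in zip(read_sequences(r1_path), read_sequences(r2_path)):
        total += 1
        if s1 and s1 == reverse_complement(s2):
            overlapping += 1
    return overlapping / total if total else 0.0
```

Calling `fully_overlapping_fraction("sample_R1.fastq", "sample_R2.fastq")` (placeholder file names) on each run separately would show how much of the 80-cycle data is duplicated between R1 and R2. Exact string matching is strict, so sequencing errors will make the real fraction somewhat higher than the reported one.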