Greetings to all.
I have some samples from exome sequencing (the coding portion of the DNA) of the mosquito Aedes aegypti. I intend to look at SNPs in the coding portion to associate them with a resistant phenotype. For this, I mapped the reads with gsnap because it is SNP-tolerant.
My inquiry: I'm having trouble converting from SAM to BAM. This is the error I got:
I used samtools with this code:
samtools view -b -o HLR1_subsample.bam HLR1_subsample.sam
I've repeated the mapping, reinstalled samtools, and checked the versions of samtools and gsnap. My version of samtools is 1.19.2; GSNAP is version 2024-11-20, called with args: gsnap.avx512.
I keep getting the error "positional data is too large for BAM format", as in the picture.
This is my second time posting a question, so I'm not quite sure what is needed to understand my question.
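For context on the error message: BAM stores POS, PNEXT and TLEN as 32-bit signed integers, so one possible cause is a SAM record whose value in one of those columns does not fit. The following is a minimal sketch, not a confirmed diagnosis, that scans SAM text for such records. The function name find_oversized_records is hypothetical, and the check assumes a plain, uncompressed, tab-separated SAM file.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)

# (field name, 0-based SAM column index, smallest value accepted by this check)
FIELDS = (("POS", 3, 0), ("PNEXT", 7, 0), ("TLEN", 8, INT32_MIN))


def find_oversized_records(sam_lines):
    """Yield (line_number, read_name, field, raw_value) for suspicious records.

    A record is reported when POS, PNEXT or TLEN is not an integer or falls
    outside the 32-bit range, or when it has fewer than 11 columns.
    """
    for lineno, line in enumerate(sam_lines, start=1):
        # Header lines start with "@"; blank lines are ignored.
        if line.startswith("@") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 11:
            yield (lineno, cols[0], "COLUMNS", str(len(cols)))
            continue
        for name, idx, low in FIELDS:
            raw = cols[idx]
            try:
                ok = low <= int(raw) <= INT32_MAX
            except ValueError:
                ok = False
            if not ok:
                yield (lineno, cols[0], name, raw)


# Example use (file name taken from the command above):
# with open("HLR1_subsample.sam") as handle:
#     for hit in find_oversized_records(handle):
#         print(*hit, sep="\t")
```

If this reports nothing for a failing file, the cause is presumably elsewhere, for example in the header.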
Can you share the output of
samtools view -H
for a file where the BAM conversion worked and one where the command failed?
Sure thing:
Failed
Worked
Is this enough?
Please copy and paste the text, select it, and format it as code using the 101010 button in the edit window. Posting screenshots does not provide clear information about the content.
Do you have large datasets, as in number of reads? Searches seem to indicate that another possible cause is too many reads aligning to specific positions in your BAM file.
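One rough way to check the "too many reads at specific positions" idea is to count how many alignments start at each reference position in the SAM file. This is only a sketch under assumptions: the function name top_start_positions is hypothetical, it reads plain SAM text, and it counts alignment start positions rather than true per-base depth.

```python
from collections import Counter


def top_start_positions(sam_lines, n=10):
    """Return the n most common (RNAME, POS) start positions with their counts."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        cols = line.split("\t")
        # Skip truncated lines and unmapped reads (RNAME is "*").
        if len(cols) < 4 or cols[2] == "*":
            continue
        counts[(cols[2], int(cols[3]))] += 1
    return counts.most_common(n)


# Example use:
# with open("HLR1_subsample.sam") as handle:
#     for (chrom, pos), count in top_start_positions(handle):
#         print(chrom, pos, count, sep="\t")
```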
Thank you for your suggestion. I pasted the code as instructed. I do not have large datasets. Both of these datasets contain 300000 reads.
Do the files that fail always fail, or do they work if resubmitted?
Why are you using this option? If your paired-end reads are out of sync your alignments are going to be incorrect.
Yes. The same files consistently fail or consistently work when resubmitted.
We allowed a certain level of mismatch with this option. However, I've rerun the alignments without that option and they still fail to convert from SAM to BAM.
Did you scan/trim your paired end data files independently? If you did then the reads are likely out of sync. If that is the case you need to go back to the original files and scan/trim the paired end files together to make sure that does not happen.
Results you have from out of sync paired-end data are going to have discordant alignments for part of the data.
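Whether two paired-end files are still in sync can be checked directly by comparing read names record by record. The sketch below assumes plain, uncompressed, four-line FASTQ records and read names that differ at most by a trailing /1 or /2 or by the comment after the first whitespace. The function names read_ids and first_mismatch are hypothetical.

```python
from itertools import zip_longest


def read_ids(fastq_lines):
    """Yield the read name from each four-line FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 != 0:
            continue  # only the first line of each record carries the name
        header = line.strip()
        if not header:
            return  # treat a blank header line as end of file
        name = header.split()[0][1:]  # drop the leading "@" and any comment
        if name.endswith("/1") or name.endswith("/2"):
            name = name[:-2]
        yield name


def first_mismatch(r1_lines, r2_lines):
    """Return (record_number, name_r1, name_r2) for the first disagreement,
    or None if both files list the same read names in the same order."""
    pairs = zip_longest(read_ids(r1_lines), read_ids(r2_lines))
    for number, (name1, name2) in enumerate(pairs, start=1):
        if name1 != name2:
            return (number, name1, name2)
    return None


# Example use (file names are placeholders):
# with open("sample_R1.trimmed.fastq") as r1, open("sample_R2.trimmed.fastq") as r2:
#     print(first_mismatch(r1, r2))
```

A result of None means the names line up; a missing name on one side (None in the tuple) means one file has fewer records than the other.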
I ran cutadapt using this code, so they shouldn't be out of sync.
This does not appear to be correct. Are you able to provide multiple input files at the same time to cutadapt (will have to look that up)? In any case you have 3 files, which is not balanced.
Edit: You can't directly provide multiple files of pairs as input to cutadapt. -o also specifies output. So it looks like if you ran the command line exactly as above then the results are not correct.
I've actually done this with other samples, and for some of them the SAM to BAM step works. Additionally, I looked it up in the manual, and according to the paired adapters section (cutadapt manual - pair adapters) it is correct to use -o and -p to produce an output of paired reads.
So I'm still clueless about this problem.
That is a confusing way to provide input options (I am not a cutadapt user), but you are right. I will scratch my comment above.
At this point nothing else comes to mind. If you are willing, you could instead try fastp or bbduk.sh from the BBMap suite as an alternative on one of the failing sample file pairs and see if it helps.
I'm asking other professors at my university for help. I will reply if I find some way around it. Thank you for your help.