6 days ago
Ana
Hi all! I am trying to obtain polyA tail length from my scRNAseq BAM files using scAPAtrap; however, I am running into a problem that doesn't appear in the GitHub issues section. The tool starts running fine:
2025-05-06 13:50:55.790059 findUniqueMap: start.
(...)
2025-05-06 13:51:00.425044 findUniqueMap: finish.
(...)
2025-05-06 13:51:00.425397 dedupByPos
2025-05-06 13:51:34,994 WARNING At least one read is missing UMI and/or cell tag(s): LH00248:49:222HJCLT1:1:1111:6927:24408_GCCTGCGCTACTCTGTAAGGCTTGACT_TCCCTTTG 16 chromosome_01 3254 255 30S63M18689N57M1S * 0 0 GACATCGGCATCGGCCCTGGGAAGAGTTCTCTTTTCCCTTTCGATGGGCAGCGCGAATCGCGCTCTTCACACAGGATTACCCCATCTCTTAGGATCGACTAGAGGCTGTTCACCTTGGAGACCTGATGCGGTTATGAGTACGACTAGTCAG IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII9-IIIIIII999II-III99IIIIII9II9I-9-I9I9IIIIIIIIIII9IIIIII99IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII9III NH:i:1 HI:i:1 AS:i:104 nM:i:2
2025-05-06 13:51:43,995 INFO Reads: Input Reads: 4994533, Read skipped, missing umi and/or cell tag: 4994533
2025-05-06 13:51:43,995 INFO Number of reads out: 0
2025-05-06 13:51:43,995 INFO Total number of positions deduplicated: 0
2025-05-06 13:51:43,995 WARNING The BAM did not contain any valid reads/read pairs for deduplication
# job finished in 9 seconds at Tue May 6 13:51:43 2025 -- 10.08 0.52 7.21 0.12 -- 7b232a75-a3c5-4d93-8dc5-092d9d1ae982
... 2025-05-06 13:51:44.067055 dedupByPos: command done.
>>> 2025-05-06 13:51:44.067314 dedupByPos: finish.
(...)
2025-05-06 13:51:44.068104 separateBamBystrand
2025-05-06 13:51:44.09521 separateBamBystrand: finish.
2025-05-06 13:51:44.095603 findPeaksByStrand
2025-05-06 13:51:44.095745 findPeaksByStrand (+): start.
... 2025-05-06 13:51:44.095864 findPeaksByStrand: loadBpCoverages on 60 chrs: chromosome_01,chromo...
And then I get these messages for all chromosomes:
2025-05-06 13:51:47.221861 fullCoverage: processing chromosome chromosome_01
2025-05-06 13:51:47.228045 loadCoverage: finding chromosome lengths
2025-05-06 13:51:47.245775 loadCoverage: loading BAM file Chlamydomonas_single-cell/Data/scAPAtrap_data/Aligned.sortedByCoord.out.UniqSorted.dedup.forward.bam
2025-05-06 13:51:47.296566 loadCoverage: applying the cutoff to the merged data
2025-05-06 13:51:47.305211 filterData: originally there were 8225636 rows, now there are 8225636 rows. Meaning that 0 percent was filtered.
extendedMapSeqlevels: the 'seqnames' you supplied are currently not supported in GenomeInfoDb. Consider adding your genome by following the information at http://www.bioconductor.org/packages/release/bioc/vignettes/GenomeInfoDb/inst/doc/Accept-organism-for-GenomeInfoDb.pdf
... 2025-05-06 13:51:49.181441 findPeaks: start regionMatrix (L=49, cutoff=10)...
By using totalMapped equal to targetSize, regionMatrix() assumes that you have normalized the data already in fullCoverage(), loadCoverage() or filterData().
2025-05-06 13:51:49.185341 regionMatrix: processing chromosome_01
2025-05-06 13:51:49.185458 filterData: normalizing coverage
2025-05-06 13:51:49.188464 filterData: done normalizing coverage
2025-05-06 13:51:49.199611 filterData: originally there were 8225636 rows, now there are 0 rows. Meaning that 100 percent was filtered.
And finally:
2025-05-06 13:51:49.450268 findPeaks: chr=chromosome_01 find 0 peaks
(...)
Error in sprintf("Warning: 0 peak with width>%dnt (Read Length), please check the readlength (L) parameter!", :
invalid format '%d'; use format %s for character objects
I don't know if this is an issue with my BAM file or if there is a parameter I should be setting but am not. Also, I think it's worth saying that this is neither human nor mouse data. If anyone has suggestions as to why this could be happening or how to solve it, I would really appreciate them. Thank you in advance!
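The relevant bit is the extendedMapSeqlevels message saying that the 'seqnames' you supplied are not currently supported in GenomeInfoDb. Add your genome. The current link for this appears to be: https://www.bioconductor.org/packages/release/bioc/vignettes/GenomeInfoDb/inst/doc/Accept-organism-for-GenomeInfoDb.html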
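Separately, the dedupByPos log in the question reports that all 4994533 input reads were skipped for missing UMI and/or cell tags, and that 0 reads were written out, so it may be worth checking where the barcode and UMI actually live in the BAM. The logged read carries only the NH, HI, AS and nM tags, while its name ends in two underscore-separated sequences. Below is a minimal Python sketch for inspecting a single SAM record. It assumes the deduplication step looks for `CB` (cell) and `UB` (UMI) tags; those tag names and the helper `inspect_sam_record` are assumptions for illustration, not something confirmed from scAPAtrap, so check what the tool actually passes. The record is the read from the log, with SEQ and QUAL replaced by `*` for brevity.

```python
def inspect_sam_record(line, cell_tag="CB", umi_tag="UB"):
    """Report whether a SAM record has cell/UMI tags or read-name suffixes.

    cell_tag and umi_tag are assumed defaults, not confirmed scAPAtrap settings.
    """
    # SAM is tab-delimited; fall back to whitespace for records pasted from a log.
    sep = "\t" if "\t" in line else None
    fields = line.strip().split(sep)
    if len(fields) < 11:
        raise ValueError("a SAM record needs at least 11 mandatory fields")

    # Optional fields follow the 11 mandatory ones, formatted TAG:TYPE:VALUE.
    tags = {}
    for item in fields[11:]:
        parts = item.split(":", 2)
        if len(parts) == 3:
            tags[parts[0]] = parts[2]

    # Anything after the first underscore in the read name may be a barcode or UMI.
    name_parts = fields[0].split("_")
    return {
        "has_cell_tag": cell_tag in tags,
        "has_umi_tag": umi_tag in tags,
        "name_suffixes": name_parts[1:],
        "tags_present": sorted(tags),
    }


# The read from the dedupByPos warning (SEQ and QUAL replaced by "*").
record = "\t".join([
    "LH00248:49:222HJCLT1:1:1111:6927:24408_GCCTGCGCTACTCTGTAAGGCTTGACT_TCCCTTTG",
    "16", "chromosome_01", "3254", "255", "30S63M18689N57M1S",
    "*", "0", "0", "*", "*",
    "NH:i:1", "HI:i:1", "AS:i:104", "nM:i:2",
])

result = inspect_sam_record(record)
for key, value in result.items():
    print(f"{key}: {value}")
```

If both tag checks come back false while `name_suffixes` is non-empty, as for this read, the barcode and UMI appear to be stored in the read name rather than in tags. That would be consistent with every read being skipped at dedupByPos, and with the later steps finding 0 peaks.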