Hi,
I'm trying to update LIONS to use more current software for TE analysis. One of its steps is alignment with TopHat2, which I replaced with HISAT2. I used HISAT2's '--un' parameter to produce two output files, accepted_hits and unmapped, which are then sorted and merged with samtools.
Further downstream I convert the merged BAM to WIG, and I get a slew of errors resembling the following:
Ignoring SAM validation error: ERROR::READ_GROUP_NOT_FOUND:Record [X], Read name [XX], RG ID on SAMRecord not found in header: accepted_hits
Looking at the header, I don't see any @RG fields:
@PG ID:hisat2 PN:hisat2 VN:2.2.1 CL:"[dir]hisat2-align-s --wrapper basic-0 -p 8 --rna-strandness FR --secondary -x [dir] [dir] --passthrough --read-lengths 101"
@PG ID:samtools PN:samtools PP:hisat2 VN:1.15.1 CL:samtools sort -o accepted_hits.bam accepted_hits.sam
@PG ID:samtools-510607B0 PN:samtools VN:1.15.1 CL:samtools sort -o unmapped.bam unmapped.sam
@PG ID:samtools.1 PN:samtools PP:samtools VN:1.15.1 CL:[dir]samtools merge -r me_cfs.bam accepted_hits.bam unmapped.bam
@PG ID:samtools.2 PN:samtools PP:samtools.1 VN:1.15.1 CL:samtools view -h me_cfs.bam
It looks like bam2wig still runs and just ignores the errors, but they make navigating the output an absolute pain. Any ideas on how to modify the header so that the read-group IDs carried in the RG:Z: tags (accepted_hits and unmapped) are represented as @RG lines?
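For reference, here is a rough sketch of the kind of fix I have in mind, not something I have verified on this data. It assumes the RG:Z: tags come from the `-r` option of `samtools merge` (which tags each record with its input file name) and that adding matching @RG lines to the header is enough to satisfy the validator. The helper name `add_rg_lines` and the `SM` value `me_cfs` are made up for illustration.

```shell
#!/bin/sh
# Append one @RG line per read-group ID to a SAM header read from stdin.
# Usage: add_rg_lines SAMPLE ID [ID...]
# SAMPLE is a hypothetical sample name written to the SM field.
add_rg_lines() {
    sample=$1
    shift
    cat    # pass the existing header lines through unchanged
    for id in "$@"; do
        # @RG fields are tab-separated; ID must match the RG:Z: value on the reads
        printf '@RG\tID:%s\tSM:%s\n' "$id" "$sample"
    done
}

# Intended use (requires samtools; file names follow the header above):
#   samtools view -H me_cfs.bam | add_rg_lines me_cfs accepted_hits unmapped > header.sam
#   samtools reheader header.sam me_cfs.bam > me_cfs.rg.bam
```

The function only rewrites header text, so the result can be inspected before `samtools reheader` swaps it into the BAM.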
Thank you so much for any assistance,
Synanth
Thank you so much for the resources!