Hi! I am working on a single-cell ATAC-seq project and am having trouble using samtools to split a BAM file into wild-type and knockout reads. The data came from 10X sequencing and were analyzed with the Cell Ranger pipeline. One of the analyses Cell Ranger ran was t-SNE, which groups cells into clusters based on similarity. Because several cell types went into the ATAC pipeline, similar cell types grouped together regardless of wild-type or knockout status, and a BAM file was produced for each cluster. I would like to split these BAM files by genotype to look at the variation within each cluster. I used samtools to convert the BAM file to SAM and then split it into knockout and wild-type files based on a tag in the file. Now when I try to convert the split SAM files back to BAM, I keep getting this error:
samtools view -bS marrow_Cluster1_KO.sam > marrow_Cluster1_KO.bam
[W::sam_read1] Parse error at line 1
[main_samview] truncated file.
When I look through the entire marrow_Cluster1_KO.sam file, it looks how it should. The head and tail of the file look like this:
head -10 marrow_Cluster1_KO.sam
1112:@RG ID:A2.07,P2.24,A1.03,P1.03 SM:Barcode00086
1113:@RG ID:A2.08,P2.24,A1.10,P1.14 SM:Barcode00152
1114:@RG ID:A2.08,P2.16,A1.03,P1.08 SM:Barcode00191
1115:@RG ID:A2.08,P2.15,A1.09,P1.06 SM:Barcode00199
1116:@RG ID:A2.08,P2.09,A1.03,P1.24 SM:Barcode00248

tail -10 marrow_Cluster1_KO.sam
678439:NB551608:11:HVFMVBGX7:4:23502:4860:495 83 chr9 56881073 42 47M = 56881041 -79 AATCGCTTCCTTCGCGCTTCCGGGTTCCGCCTCGCTCAGAAACGGAC EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAAAAA MD:Z:47 XG:i:0 NM:i:0 XM:i:0 XN:i:0 XO:i:0 AS:i:0 YS:i:0 YT:Z:CP RG:Z:A2.07,P2.15,A1.10,P1.06 PG:Z:MarkDuplicates-6D71E14F
678440:NB551608:11:HVFMVBGX7:1:13105:7658:1885 99 chr9 56881081 42 47M = 56881248 214 CCTTCGCGCTTCCGGGTTCCGCCTCGCTCAGAAACGGACCGACAGAT
What can I do to fix this error?
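
One possible cause worth checking: every line shown above starts with a number and a colon (1112:, 678439:), which is what `grep -n` writes in front of each match. A SAM header line has to start with `@`, so a prefix like that would be enough to trigger a parse error at line 1. Below is a minimal sketch for stripping such a prefix. It assumes the prefix really is in the file and was not added when pasting here; the function name `strip_line_prefix` is made up for illustration.

```shell
#!/bin/sh
# Strip a leading "<digits>:" prefix (as written by `grep -n`) from each line.
# Reads SAM text on stdin and writes the cleaned text to stdout.
# Caution: only run this on a file where EVERY line carries the prefix,
# otherwise a read name that itself starts with digits and a colon
# would be damaged.
strip_line_prefix() {
    sed -E 's/^[0-9]+://'
}

# Tiny illustrative input in the shape of the lines shown above
# (shortened; tab-separated as real SAM records are).
printf '1112:@RG\tID:A2.07,P2.24,A1.03,P1.03\tSM:Barcode00086\n678439:NB551608:11:HVFMVBGX7:4:23502:4860:495\t83\tchr9\n' \
    | strip_line_prefix
```

On the real file this would be run as `strip_line_prefix < marrow_Cluster1_KO.sam > marrow_Cluster1_KO.clean.sam` (the output name is hypothetical), followed by the same `samtools view -bS` command on the cleaned file. The file shown starts at original line 1112 with `@RG` records, so if the split also dropped the `@HD` and `@SQ` header lines, those would probably need to be restored as well before the conversion can succeed.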