Hi
I have a FASTQ file with real data, and I have an issue with some read names, for instance this one: 'A01136:872:HVL25DSXC:1:1239:4842:26788'
grep -n -A 3 A01136:872:HVL25DSXC:1:1239:4842:26788 wgs_S12966Nr2*
wgs_S12966Nr2.1.fastq:8620669:@A01136:872:HVL25DSXC:1:1239:4842:26788 1:N:0:CGTAACAGAA+CCGCTATAGA
wgs_S12966Nr2.1.fastq-8620670-TTCCATTCCAATCGAGTTGATTCCATTCCATTCCATTCCATTCCATTCCACTCCATTCCAGTCCTTTCCATTCCATTCCACTCGGGTTGATTCCAATGTAT
wgs_S12966Nr2.1.fastq-8620671-+
wgs_S12966Nr2.1.fastq-8620672-,FFFFFFFFFFFFFFF:FFF:FFFFFFFF,FFFFFFFFFFFF:F:FFFFFFF:F:FFFFFF:FFFFFFFF:FF,FFFF,FFFFFFFFFFFFFF:FFFFFFF
wgs_S12966Nr2.2.fastq:8620669:@A01136:872:HVL25DSXC:1:1239:4842:26788 2:N:0:CGTAACAGAA+CCGCTATAGA
wgs_S12966Nr2.2.fastq-8620670-AATGGAATGGAATGGAATGCAAAGCAATGGAATCAACTCGATTGCAATGGAATGGAATGGAATGGAAAGGAATACATTGGAATCAACCCGAGTGGAATGGA
wgs_S12966Nr2.2.fastq-8620671-+
wgs_S12966Nr2.2.fastq-8620672-FFFFFFFFFFF:FFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F,FF:FFFFFFFFFFFFFFFFFFF:FFFFFF,FF:FFFFFFFFFFFFFFFF
When I run Parabricks fq2bam:
docker run --gpus all -v /data:/data nvcr.io/nvidia/clara/clara-parabricks:4.1.0-1 pbrun fq2bam \
  --in-fq /data/in/wgs_S12966Nr2.1.fastq.gz /data/in/wgs_S12966Nr2.2.fastq.gz \
  --ref /data/grch38/GRCh38.p13.genome.fa \
I get this result for 'A01136:872:HVL25DSXC:1:1239:4842:26788' (5 lines)
A01136:872:HVL25DSXC:1:1239:4842:26788 2177 chr20 31063116 0 45H56M GL000216.2 26860 0 AATGGAATGGAATGGAATGGAAAGGAATACATTGGAATCAACCCGAGTGGAATGGA FFFF:F,FF:FFFFFFFFFFFFFFFFFFF:FFFFFF,FF:FFFFFFFFFFFFFFFF SA:Z:KI270442.1,92368,+,67M34S,0,5; XA:Z:chr10,+41905845,45S27M5I24M,6;ML143345.1,-18702,22M5I29M45S,6;chr4,-49095042,22M5I29M45S,6;ML143345.1,-20321,22M5I29M45S,6;chr4,-49093423,22M5I29M45S,6; MD:Z:22T5G0G26 PG:Z:MarkDuplicates RG:Z:HVL25DSXC.1 NM:i:3 AS:i:41 XS:i:35
A01136:872:HVL25DSXC:1:1239:4842:26788 2129 chr20 31063146 0 4H32M65H KI270442.1 92368 0 ATTGGAATCAACCCGAGTGGAATGGAATGGAA FFF:FFFFFFFFFFFFFF,FFFF,FF:FFFFF SA:Z:GL000216.2,26860,+,19S61M21S,0,1;ML143354.1,152198,-,68S33M,0,0; MD:Z:32 PG:Z:MarkDuplicates RG:Z:HVL25DSXC.1 NM:i:0 AS:i:32 XS:i:0
A01136:872:HVL25DSXC:1:1239:4842:26788 1089 GL000216.2 26860 0 19S61M21S KI270442.1 92368 0 TTCCATTCCAATCGAGTTGATTCCATTCCATTCCATTCCATTCCATTCCACTCCATTCCAGTCCTTTCCATTCCATTCCACTCGGGTTGATTCCAATGTAT ,FFFFFFFFFFFFFFF:FFF:FFFFFFFF,FFFFFFFFFFFF:F:FFFFFFF:F:FFFFFF:FFFFFFFF:FF,FFFF,FFFFFFFFFFFFFF:FFFFFFF SA:Z:ML143354.1,152198,-,68S33M,0,0;chr20,31063146,-,4S32M65S,0,0; MD:Z:41T19 PG:Z:MarkDuplicates RG:Z:HVL25DSXC.1 NM:i:1 AS:i:56 XS:i:53
A01136:872:HVL25DSXC:1:1239:4842:26788 2129 ML143354.1 152198 0 68H33M KI270442.1 92368 0 GAATGGAATGGAATCAACTCGATTGGAATGGAA FFF,FFFFFFFF:FFF:FFFFFFFFFFFFFFF, SA:Z:GL000216.2,26860,+,19S61M21S,0,1;chr20,31063146,-,4S32M65S,0,0; XA:Z:chr10,-38515370,68S33M,0; MD:Z:33 PG:Z:MarkDuplicates RG:Z:HVL25DSXC.1 NM:i:0 AS:i:33 XS:i:33
A01136:872:HVL25DSXC:1:1239:4842:26788 1153 KI270442.1 92368 0 67M34S GL000216.2 26860 0 AATGGAATGGAATGGAATGCAAAGCAATGGAATCAACTCGATTGCAATGGAATGGAATGGAATGGAAAGGAATACATTGGAATCAACCCGAGTGGAATGGA FFFFFFFFFFF:FFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F,FF:FFFFFFFFFFFFFFFFFFF:FFFFFF,FF:FFFFFFFFFFFFFFFF SA:Z:chr20,31063116,+,45S56M,0,3; MD:Z:14C4G2T1G19G22 PG:Z:MarkDuplicates RG:Z:HVL25DSXC.1 NM:i:5 AS:i:42 XS:i:41
But when I run bowtie2 using this command
bowtie2 -x grch38.p13 -1 wgs_S12966Nr2.1.fastq -2 wgs_S12966Nr2.2.fastq -S wgs_S12966Nr2_aligned.sam
I get this result (2 lines, without the alignment information that pbrun reports):
A01136:872:HVL25DSXC:1:1239:4842:26788 77 * 0 0 * * 0 0 TTCCATTCCAATCGAGTTGATTCCATTCCATTCCATTCCATTCCATTCCACTCCATTCCAGTCCTTTCCATTCCATTCCACTCGGGTTGATTCCAATGTAT ,FFFFFFFFFFFFFFF:FFF:FFFFFFFF,FFFFFFFFFFFF:F:FFFFFFF:F:FFFFFF:FFFFFFFF:FF,FFFF,FFFFFFFFFFFFFF:FFFFFFF YT:Z:UP
A01136:872:HVL25DSXC:1:1239:4842:26788 141 * 0 0 * * 0 0 AATGGAATGGAATGGAATGCAAAGCAATGGAATCAACTCGATTGCAATGGAATGGAATGGAATGGAAAGGAATACATTGGAATCAACCCGAGTGGAATGGA FFFFFFFFFFF:FFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F,FF:FFFFFFFFFFFFFFFFFFF:FFFFFF,FF:FFFFFFFFFFFFFFFF YT:Z:UP
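To make the FLAG column (second field) of the two outputs easier to compare, here is a minimal Python sketch that decodes the bit values defined in the SAM format specification. The names `SAM_FLAG_BITS` and `decode_flag` are my own and purely illustrative; they do not come from bowtie2 or Parabricks.

```python
# SAM FLAG bits as defined in the SAM format specification.
# The short labels are illustrative names chosen for this sketch.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "read1"),
    (0x80, "read2"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the labels of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


if __name__ == "__main__":
    # FLAG values taken from the pbrun and bowtie2 records above.
    for flag in (2177, 2129, 1089, 1153, 77, 141):
        print(flag, ",".join(decode_flag(flag)))
```

Decoded this way, three of the five pbrun lines (2177, 2129, 2129) carry the supplementary bit (2048), and the other two (1089, 1153) are the read 1 and read 2 records with the duplicate bit set. Both bowtie2 lines (77, 141) have the unmapped and mate-unmapped bits set.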
The other read names we checked are equal between the two outputs. The reason we noticed this one is that the Parabricks fq2bam result file contains 5 lines for it.
Is this a bug in bowtie2, or is Parabricks fq2bam making up the results?
Thanks
Lars (new to Bioinformatics)