Question: How to recreate FASTQ from BAM
0
gravatar for simplitia
23 months ago by
simplitia40
simplitia40 wrote:

Hi all I been testing a paired RNA.seq sample. I start with a paired of FASTQ files and do alignment with STAR and run STAR-Fusion. Everything looks good until I try to reverse the BAM files generated from STAR originally to fastq and rerun STAR-Fusion; this resulted in about half the fusions that as originally called. So making a mistake somewhere when converting from BAM to fastq. So far I tried several tools including samtools fastq, picard RevertSam to a ubam and then SamToFastq, I also tried bamtofastq but nothing works. Here is what I did with the original STAR alingment.

STAR --genomeDir /index/hg38.p5/ \
--readFilesIn T1_1.fq.gz T1_2.fq.gz \
--readFilesCommand zcat --outFileNamePrefix T1 --outSAMtype BAM SortedByCoordinate --quantMode GeneCounts --sjdbGTFfile /index/hg38.gtf

this is how the Bam looks like.

samtools flagstat T1Aligned.sortedByCoord.out.bam

121206267 + 0 in total (QC-passed reads + QC-failed reads)
11756519 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
121206267 + 0 mapped (100.00% : N/A)
109449748 + 0 paired in sequencing
54766664 + 0 read1
54683084 + 0 read2
109354332 + 0 properly paired (99.91% : N/A)
109354332 + 0 with itself and mate mapped
95416 + 0 singletons (0.09% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

Here is what I tried with samtools.

samtools sort -n --threads 4 ./T1Aligned.sortedByCoord.out.bam|samtools fastq -1 T1_R1_.fq -2 T1_R2_.fq -0 test.fq -

picard

/picard-tools-2.2.4/picard.jar RevertSam I=T1Aligned.sortedByCoord.out.bam O=T1.ubam QUIET=true VALIDATION_STRINGENCY=SILENT

/picard-tools-2.2.4/picard.jar SamToFastq INPUT=T1.ubam FASTQ=T1_R1_.fq SECOND_END_FASTQ=T2_R2_.fq VALIDATION_STRINGENCY=SILENT

I also tried bamtofastq but nothing seem to be able to convert the bam back to where I was able to call the fusions correctly. Any suggestions? Thanks!

sequencing rna-seq bam • 2.1k views
ADD COMMENTlink modified 23 months ago by ctseto280 • written 23 months ago by simplitia40
2

I rarely use STAR-Fusion, but I'd assume the reconstructed fastqs are missing unmapped reads from the normal alignment, which contain reads that are used to build chimeric and circular alignments. You might need to use --outSAMunmapped Within.

ADD REPLYlink modified 23 months ago • written 23 months ago by Eric Lim1.7k

this looks promising. I'm going try this and report back. Thanks.

ADD REPLYlink written 23 months ago by simplitia40

Cross-posted on SeqAnswers.

What you can try (and what is actually recommended) is to exclude everything that is not a primary alignment, using the flag -F 2304 in samtools fastq.

ADD REPLYlink modified 23 months ago by WouterDeCoster44k • written 23 months ago by ATpoint41k

It's been an eternity, but isn't it -f 64 for R1 and -f 128 for R2 ; per the sam specifications?

ADD REPLYlink written 23 months ago by ctseto280
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1494 users visited in the last hour