Question: How to recreate FASTQ from BAM
0
gravatar for simplitia
4 months ago by
simplitia30
simplitia30 wrote:

Hi all I been testing a paired RNA.seq sample. I start with a paired of FASTQ files and do alignment with STAR and run STAR-Fusion. Everything looks good until I try to reverse the BAM files generated from STAR originally to fastq and rerun STAR-Fusion; this resulted in about half the fusions that as originally called. So making a mistake somewhere when converting from BAM to fastq. So far I tried several tools including samtools fastq, picard RevertSam to a ubam and then SamToFastq, I also tried bamtofastq but nothing works. Here is what I did with the original STAR alingment.

STAR --genomeDir /index/hg38.p5/ \
--readFilesIn T1_1.fq.gz T1_2.fq.gz \
--readFilesCommand zcat --outFileNamePrefix T1 --outSAMtype BAM SortedByCoordinate --quantMode GeneCounts --sjdbGTFfile /index/hg38.gtf

this is how the Bam looks like.

samtools flagstat T1Aligned.sortedByCoord.out.bam

121206267 + 0 in total (QC-passed reads + QC-failed reads)
11756519 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
121206267 + 0 mapped (100.00% : N/A)
109449748 + 0 paired in sequencing
54766664 + 0 read1
54683084 + 0 read2
109354332 + 0 properly paired (99.91% : N/A)
109354332 + 0 with itself and mate mapped
95416 + 0 singletons (0.09% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

Here is what I tried with samtools.

samtools sort -n --threads 4 ./T1Aligned.sortedByCoord.out.bam|samtools fastq -1 T1_R1_.fq -2 T1_R2_.fq -0 test.fq -

picard

/picard-tools-2.2.4/picard.jar RevertSam I=T1Aligned.sortedByCoord.out.bam O=T1.ubam QUIET=true VALIDATION_STRINGENCY=SILENT

/picard-tools-2.2.4/picard.jar SamToFastq INPUT=T1.ubam FASTQ=T1_R1_.fq SECOND_END_FASTQ=T2_R2_.fq VALIDATION_STRINGENCY=SILENT

I also tried bamtofastq but nothing seem to be able to convert the bam back to where I was able to call the fusions correctly. Any suggestions? Thanks!

sequencing rna-seq bam • 421 views
ADD COMMENTlink modified 4 months ago by ctseto50 • written 4 months ago by simplitia30
2

I rarely use STAR-Fusion, but I'd assume the reconstructed fastqs are missing unmapped reads from the normal alignment, which contain reads that are used to build chimeric and circular alignments. You might need to use --outSAMunmapped Within.

ADD REPLYlink modified 4 months ago • written 4 months ago by Eric Lim1.3k

this looks promising. I'm going try this and report back. Thanks.

ADD REPLYlink written 4 months ago by simplitia30

Cross-posted on SeqAnswers.

What you can try (and what is actually recommended) is to exclude everything that is not a primary alignment, using the flag -F 2304 in samtools fastq.

ADD REPLYlink modified 4 months ago by WouterDeCoster38k • written 4 months ago by ATpoint15k

It's been an eternity, but isn't it -f 64 for R1 and -f 128 for R2 ; per the sam specifications?

ADD REPLYlink written 4 months ago by ctseto50
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1401 users visited in the last hour