Question: BAM -> FASTQ Conversion of CCLE data for STAR-Fusion. Filtering Steps?
9 months ago by
denis.k10 wrote:

Hey everyone,

I'm pretty new to RNA sequencing and was wondering if anyone could help me out. I am trying to run a variety of SV callers (STAR-Fusion, etc.) on data from the CCLE (

Most SV callers require FASTQ files, but all the data I have downloaded is in BAM format. Here are some more details:

First, the BAM files are coordinate-sorted. After realizing that they needed to be sorted by name for the paired FASTQ files to be created correctly, I sorted all files by name.

I am using Samtools 1.9.

samtools sort -n -o outfile_sorted.bam infile.bam


samtools fastq -1 outfile_sorted_1.fastq.gz -2 outfile_sorted_2.fastq.gz outfile_sorted.bam
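For reference, the two steps above can be wrapped in a small shell helper. This is only a sketch under stated assumptions: `bam_to_fastq`, `run`, and the `DRY_RUN` variable are hypothetical names, not part of Samtools. It uses the `-o` form of `samtools sort`, and it sends singleton reads (`-s`) and reads with neither or both READ1/READ2 flags set (`-0`) to `/dev/null`, which assumes the downstream caller only wants properly paired mates.

```shell
set -eu

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: name-sort a BAM, then write paired FASTQ files.
#   $1 = input BAM, $2 = output prefix
bam_to_fastq() {
  in_bam="$1"
  prefix="$2"
  # Name-sort so that both mates of a pair are adjacent in the file.
  run samtools sort -n -o "${prefix}.namesorted.bam" "$in_bam"
  # Write mate 1 and mate 2; discard singletons and unflagged reads
  # (assumption: they are not needed by the fusion caller).
  run samtools fastq \
    -1 "${prefix}_1.fastq.gz" -2 "${prefix}_2.fastq.gz" \
    -0 /dev/null -s /dev/null \
    "${prefix}.namesorted.bam"
}

# Example: bam_to_fastq infile.bam outfile
```

Setting `DRY_RUN=1` before calling `bam_to_fastq` prints both commands without touching any files, which is a cheap way to check the file names before launching a long sort.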

Is this process sufficient to feed the FASTQ reads into the SV caller? I figured that if I filtered out any non-primary reads, the reads corresponding to fusions would also be filtered out. I'm seeing a LOT of duplicated sequences in my QC reports, but I figured that wasn't a problem. I just want to make sure that I'm not keeping a bunch of artifacts in my FASTQ files and potentially making my whole project useless.
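On the filtering question, it may help to see which records a FLAG mask actually removes. The Samtools manual lists `0x900` (secondary, `0x100`, plus supplementary, `0x800`) as the default `-F` exclusion for `samtools fastq`; that is worth confirming against the usage text of the installed version. The sketch below is plain shell arithmetic with no Samtools calls, and `is_non_primary` and `classify_flag` are hypothetical names. It shows that a `0x900` mask drops secondary and supplementary records but leaves duplicate-marked (`0x400`) primary records untouched.

```shell
set -eu

# Hypothetical helper: succeeds when FLAG has the secondary (0x100)
# or supplementary (0x800) bit set.
is_non_primary() {
  [ $(( $1 & 0x900 )) -ne 0 ]
}

# Hypothetical helper: report what a 0x900 exclusion mask does to FLAG.
classify_flag() {
  if is_non_primary "$1"; then
    echo "dropped"
  else
    echo "kept"
  fi
}

# 99   = paired, proper pair, mate reverse, first in pair (primary)
# 355  = 99 + 0x100 (secondary)
# 2147 = 99 + 0x800 (supplementary)
# 1123 = 99 + 0x400 (primary, marked as duplicate)
for flag in 99 355 2147 1123; do
  printf '%s\t%s\n' "$flag" "$(classify_flag "$flag")"
done
```

The same check can be applied to the FLAG column of `samtools view` output to count how many records in a given BAM fall into each group before converting.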

rna sequence gene • 294 views

