Problem with generating BAM files with STAR
2.7 years ago
gokberk ▴ 70

Hi all,

I've been generating BAM files using the STAR command below:

for ((i=102; i<=305; i++)); do ./STAR --readFilesIn fastq/E08/SRR2930$i.fastq --outSAMunmapped Within --outSAMtype BAM SortedByCoordinate --outSAMmultNmax 1 --genomeDir STARindex --twopassMode Basic --runThreadN 12 --outFileNamePrefix 2930$i; done

My fastq files are about 180 MB each, but the generated BAM files are only about 75 KB. I checked the BAM files using samtools stats --split RG 2930107Aligned.sortedByCoord.out.bam | grep '^SN' and saw the following output for all of them:

SN  raw total sequences:    0
SN  filtered sequences: 0
SN  sequences:  0
SN  is sorted:  1
SN  1st fragments:  0
SN  last fragments: 0
SN  reads mapped:   0
SN  reads mapped and paired:    0   # paired-end technology bit set + both mates mapped
SN  reads unmapped: 0
SN  reads properly paired:  0   # proper-pair bit set
SN  reads paired:   0   # paired-end technology bit set
SN  reads duplicated:   0   # PCR or optical duplicate bit set
SN  reads MQ0:  0   # mapped and MQ=0
SN  reads QC failed:    0
SN  non-primary alignments: 0
SN  total length:   0   # ignores clipping
SN  bases mapped:   0   # ignores clipping
SN  bases mapped (cigar):   0   # more accurate
SN  bases trimmed:  0
SN  bases duplicated:   0
SN  mismatches: 0   # from NM fields
SN  error rate: 0.000000e+00    # mismatches / bases mapped (cigar)
SN  average length: 0
SN  maximum length: 0
SN  average quality:    0.0
SN  insert size average:    0.0
SN  insert size standard deviation: 0.0
SN  inward oriented pairs:  0
SN  outward oriented pairs: 0
SN  pairs with other orientation:   0
SN  pairs on different chromosomes: 0

Also, when I run samtools view 2930107Aligned.sortedByCoord.out.bam, nothing is printed to the terminal.

So I was wondering what might be causing the problem here: the fastq files themselves, or the command that generates the BAM files? I know the question is quite vague, sorry about that, but I'm not sure where to look to solve the problem.

Edit: Thanks a lot for your help. I ran STAR on a single file from the terminal, and it completed without any errors:

./STAR --readFilesIn fastq/E08/SRR2930160.fastq --outSAMunmapped Within --outSAMtype BAM SortedByCoordinate --outSAMmultNmax 1 --genomeDir STARindex --twopassMode Basic --runThreadN 12 --outFileNamePrefix 2930160
Apr 04 15:55:10 ..... started STAR run
Apr 04 15:55:10 ..... loading genome
Apr 04 15:55:26 ..... started 1st pass mapping
Apr 04 15:55:29 ..... finished 1st pass mapping
Apr 04 15:55:30 ..... inserting junctions into the genome indices
Apr 04 15:57:23 ..... started mapping
Apr 04 15:57:26 ..... finished mapping
Apr 04 15:57:27 ..... started sorting BAM
Apr 04 15:57:27 ..... finished successfully

And here is the log.

Edit 2: I asked the STAR developers about this issue on GitHub. Apparently my data come from a SOLiD sequencer, which STAR does not support; here is a link to the issue. Thanks a lot for all the comments, though. Cheers.

Thanks, Gökberk

STAR RNA-Seq • 2.7k views

My fastq files are about 180mb and generated BAM files are about 75kb

This does not make sense. Are the files 180 MB collectively, or is each file 180 MB? Look through the log files to see what is going on; it should be reasonably obvious.
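A quick way to look through the logs is to pull the headline numbers out of the summary file that STAR writes next to each BAM (the output prefix followed by Log.final.out, so 2930107Log.final.out for the run above). The helper below is only a sketch: star_log_field is a hypothetical name, and it assumes the usual "key |<tab>value" layout of that file.

```shell
#!/bin/sh
# Print the value of one field from a STAR Log.final.out summary.
# Assumption: each statistic is written as "<padded key> |<tab><value>".
# Usage: star_log_field <Log.final.out> "<key>"
star_log_field() {
    awk -F'|' -v key="$2" '
        {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)      # trim padding around the key
            if (name == key) {
                value = $2
                gsub(/^[ \t]+|[ \t]+$/, "", value) # trim the tab before the value
                print value
                exit
            }
        }' "$1"
}
```

For example, star_log_field 2930107Log.final.out "Number of input reads" followed by star_log_field 2930107Log.final.out "Uniquely mapped reads %" shows whether STAR read the input at all and how much of it mapped.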


Each fastq file is about 180 MB and yes, as you said, it doesn't make sense. Could it be that my genome index or something else is problematic, so that these BAM files are all corrupted?


Can you post the sections of the log that look like some sort of error?

Did you make the STAR indexes yourself? Were any errors generated then, or did you not specifically look?


I generated the genome index using STAR and did not receive any errors or warnings while generating it. Here is part of the log:

  STAR version=2.7.0f
    STAR compilation time,server,dir=Thu Mar 28 16:14:02 EDT 2019 vega:/home/dobin/data/STAR/STARcode/STAR.master/source
    ##### DEFAULT parameters:
    versionGenome                     2.7.0d
    parametersFiles                   -
    sysShell                          -
    runMode                           alignReads
    runThreadN                        1
    runDirPerm                        User_RWX
    runRNGseed                        777
    genomeDir                         ./GenomeDir/
    genomeLoad                        NoSharedMemory
    genomeFastaFiles                  -
    genomeChainFiles                  -
    genomeSAindexNbases               14
    genomeChrBinNbits                 18
    genomeSAsparseD                   1
    genomeSuffixLengthMax             18446744073709551615
    genomeFileSizes                   0
    genomeConsensusFile               -
    readFilesType                     Fastx
    readFilesIn                       Read1   Read2
    readFilesPrefix                   -
    readFilesCommand                  -
    readMatesLengthsIn                NotEqual
    readMapNumber                     18446744073709551615
    readNameSeparator                 /
    inputBAMfile                      -
    bamRemoveDuplicatesType           -
    bamRemoveDuplicatesMate2basesN    0
    limitGenomeGenerateRAM            31000000000
    limitIObufferSize                 150000000
    limitOutSAMoneReadBytes           100000
    limitOutSJcollapsed               1000000
    limitOutSJoneRead                 1000
    limitBAMsortRAM                   0
    limitSjdbInsertNsj                1000000
    limitNreadsSoft                   18446744073709551615
    outTmpDir                         -
    outTmpKeep                        None
    outStd                            Log
    outReadsUnmapped                  None
    outQSconversionAdd                0
    outMultimapperOrder               Old_2.4
    outSAMtype                        SAM
    outSAMmode                        Full
    outSAMstrandField                 None
    outSAMattributes                  Standard
    outSAMunmapped                    None
    outSAMorder                       Paired
    outSAMprimaryFlag                 OneBestScore
    outSAMreadID                      Standard
    outSAMmapqUnique                  255
    outSAMflagOR                      0
    outSAMflagAND                     65535
    outSAMattrRGline                  -
    outSAMheaderHD                    -
    outSAMheaderPG                    -
    outSAMheaderCommentFile           -
    outBAMcompression                 1
    outBAMsortingThreadN              0
    outBAMsortingBinsN                50
    outSAMfilter                      None
    outSAMmultNmax                    18446744073709551615
    outSAMattrIHstart                 1
    outSAMtlen                        1
    outSJfilterReads                  All
    outSJfilterCountUniqueMin         3   1   1   1
    outSJfilterCountTotalMin          3   1   1   1
    outSJfilterOverhangMin            30   12   12   12
    outSJfilterDistToOtherSJmin       10   0   5   10
    outSJfilterIntronMaxVsReadN       50000   100000   200000
    outWigType                        None
    outWigStrand                      Stranded
    outWigReferencesPrefix            -
    outWigNorm                        RPM
    outFilterType                     Normal
    outFilterMultimapNmax             10
    outFilterMultimapScoreRange       1
    outFilterScoreMin                 0
    outFilterScoreMinOverLread        0.66
    outFilterMatchNmin                0
    outFilterMatchNminOverLread       0.66
    outFilterMismatchNmax             10
    outFilterMismatchNoverLmax        0.3
    outFilterMismatchNoverReadLmax    1
    outFilterIntronMotifs             None
    outFilterIntronStrands            RemoveInconsistentStrands
    clip5pNbases                      0
    clip3pNbases                      0
    clip3pAfterAdapterNbases          0
    clip3pAdapterSeq                  -
    clip3pAdapterMMp                  0.1

So we will assume that your STAR index was made properly. Can you run a single file against this index and let us see what happens? Make sure you capture the log so we can look through it if the alignment fails. If the log file is large, you can post it elsewhere and paste the link here.


When in doubt, run 1 sample manually. STAR is not at fault, so if the output isn't correct it's because you've done something wrong.
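One way to make "run 1 sample manually" repeatable is to wrap the command from the question in a small function, so a single sample and the full loop go through the same code path and a missing input is reported instead of silently producing an empty BAM. This is a sketch only: run_one_sample and the STAR_BIN override are hypothetical names, and the STAR options are copied unchanged from the question.

```shell
#!/bin/sh
# STAR_BIN is a hypothetical override; STAR_BIN=echo gives a dry run that
# prints the arguments instead of aligning.
STAR_BIN=${STAR_BIN:-./STAR}

# Usage: run_one_sample <fastq> <output prefix>
run_one_sample() {
    # Refuse to start if the input is missing or empty.
    if [ ! -s "$1" ]; then
        echo "missing or empty input: $1" >&2
        return 1
    fi
    # Same options as the loop in the question; only input and prefix vary.
    "$STAR_BIN" --readFilesIn "$1" \
        --outSAMunmapped Within \
        --outSAMtype BAM SortedByCoordinate \
        --outSAMmultNmax 1 \
        --genomeDir STARindex \
        --twopassMode Basic \
        --runThreadN 12 \
        --outFileNamePrefix "$2"
}
```

A single test run would then be run_one_sample fastq/E08/SRR2930160.fastq 2930160, and the loop body can call the same function and stop at the first non-zero exit status.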

2.7 years ago
h.mon 33k

As you have SOLiD reads, you need a colorspace aligner. You should probably use Subread; as far as I know, it is the only currently maintained aligner that supports colorspace mapping. Converting colorspace to basespace is a bad idea; see "Convert colorspace fastq to basespace fastq" and references therein.
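If it is unclear whether a fastq file is in colorspace, the first read usually gives it away. The sketch below is a heuristic under stated assumptions, not a validator: fastq_encoding is a hypothetical helper, and it assumes colorspace reads are written as one primer base followed by the colour calls 0 to 3, with "." for a missing call. It inspects only the first sequence line.

```shell
#!/bin/sh
# Classify a fastq file by its first sequence line (line 2 of the file).
# Usage: fastq_encoding <fastq>
fastq_encoding() {
    first_seq=$(sed -n '2{p;q;}' "$1")
    if printf '%s\n' "$first_seq" | grep -Eq '^[ACGTacgt][0-3.]+$'; then
        echo colorspace     # primer base followed by colour calls
    elif printf '%s\n' "$first_seq" | grep -Eq '^[ACGTNacgtn]+$'; then
        echo basespace      # ordinary nucleotide letters
    else
        echo unknown
    fi
}
```

Running fastq_encoding fastq/E08/SRR2930160.fastq before aligning would flag colorspace input early, before a basespace aligner is run on the whole batch.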

