                    6.9 years ago
        610225668
        
    
    Hi all, this is my first time mapping RNA-seq data. The data were generated from corals. I used STAR with default parameters to map the reads to the reference, but got a terrible result. The final mapping log was:
                             Started job on |       Dec 12 16:02:44
                         Started mapping on |       Dec 12 16:03:13
                                Finished on |       Dec 12 16:12:34
   Mapping speed, Million of reads per hour |       85.99
                      Number of input reads |       13400813
                  Average input read length |       150
                                UNIQUE READS:
               Uniquely mapped reads number |       3114
                    Uniquely mapped reads % |       0.02%
                      Average mapped length |       124.34
                   Number of splices: Total |       41
        Number of splices: Annotated (sjdb) |       2
                   Number of splices: GT/AG |       24
                   Number of splices: GC/AG |       4
                   Number of splices: AT/AC |       0
           Number of splices: Non-canonical |       13
                  Mismatch rate per base, % |       4.13%
                     Deletion rate per base |       0.03%
                    Deletion average length |       1.86
                    Insertion rate per base |       0.01%
                   Insertion average length |       1.47
                         MULTI-MAPPING READS:
    Number of reads mapped to multiple loci |       1505
         % of reads mapped to multiple loci |       0.01%
    Number of reads mapped to too many loci |       47
         % of reads mapped to too many loci |       0.00%
                              UNMAPPED READS:
   % of reads unmapped: too many mismatches |       0.00%
             % of reads unmapped: too short |       99.96%
                 % of reads unmapped: other |       0.00%
                              CHIMERIC READS:
                   Number of chimeric reads |       0
                        % of chimeric reads |       0.00%
Any idea why so many reads are unmapped? I don't understand what the reason 'too short' means. Can somebody explain it? Thanks!
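For anyone comparing several runs (for example one per coral reference), a minimal Python sketch for pulling the unmapped categories out of a STAR Log.final.out is shown here. The helper names `parse_star_log` and `percent` are made up for illustration, and the example text is just a few lines copied from the log above.

```python
def parse_star_log(text):
    """Parse 'key | value' lines of a STAR Log.final.out into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers such as 'UNIQUE READS:' carry no value
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def percent(stats, key):
    """Return a percentage field such as '99.96%' as a float."""
    return float(stats[key].rstrip("%"))


# A few lines copied from the log in the question.
example = """
                      Number of input reads |       13400813
                    Uniquely mapped reads % |       0.02%
         % of reads mapped to multiple loci |       0.01%
   % of reads unmapped: too many mismatches |       0.00%
             % of reads unmapped: too short |       99.96%
                 % of reads unmapped: other |       0.00%
"""

stats = parse_star_log(example)
unmapped = {k: percent(stats, k) for k in stats if k.startswith("% of reads unmapped")}
dominant = max(unmapped, key=unmapped.get)  # category with the largest share
print(dominant, unmapped[dominant])
```

On a real run, the text would come from reading the Log.final.out file written next to the alignment output.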
Could you run your data through a quality-control tool such as FastQC?
What is your read length?
What command line did you use for the alignment?
With STAR, "% of reads unmapped: too short" means that the alignment was too short, not the read itself. Are you sure you are using the right reference?
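To make the "too short" category concrete, here is a simplified Python sketch of the length filter. As far as I know, STAR by default keeps an alignment only if the matched bases reach 0.66 of the read length (`--outFilterMatchNminOverLread`), and for paired-end data the read length is the sum of both mates. The function name and the numbers below are illustrative assumptions, and STAR's real filter also checks the alignment score and does its own rounding.

```python
def passes_length_filter(matched_bases, read_length, min_fraction=0.66):
    """Simplified version of STAR's 'too short' check.

    read_length is the total read length; for paired-end reads this is
    the sum of both mates' lengths. min_fraction mirrors the default of
    --outFilterMatchNminOverLread (an assumption about your STAR setup).
    """
    return matched_bases >= min_fraction * read_length


# Single-end read of 150 bases with 124 matched bases: kept.
print(passes_length_filter(124, 150))

# Pair of 2 x 150 bases where only 140 bases align in total: "too short".
print(passes_length_filter(140, 300))
```

So a read pair in which only one mate aligns well can still be reported as "too short", even though nothing is wrong with the read length itself.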
In fact, I have nine coral species, and I chose five of them to build reference indexes independently. Unfortunately, the results were similar.
Can you elaborate on this? Are you just using short contigs as a reference (please give stats, e.g. from BBMap's stats.sh)? Are you aligning against a single species?
Have you tried bwa-mem or minimap2 to check their mapping rates for a general comparison? Have you ever tried alignments to these references before?
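If BBMap is not at hand, a rough Python sketch like the following gives the basic contig statistics (number of contigs, total length, N50) that tell you whether the reference is very fragmented. This is not a replacement for stats.sh; the function names and the toy FASTA are made up for illustration.

```python
def contig_lengths(fasta_text):
    """Return the length of each sequence in FASTA-formatted text."""
    lengths = []
    current = None  # length of the contig being read, None before the first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length of the contig at which half of the total assembly size is reached."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


# Toy reference with four contigs of 8, 6, 4 and 2 bases.
toy_fasta = ">c1\nACGTACGT\n>c2\nACGTAC\n>c3\nACGT\n>c4\nAC\n"
lengths = contig_lengths(toy_fasta)
print(len(lengths), sum(lengths), n50(lengths))
```

For a real reference, read the FASTA file into a string first (or adapt the loop to iterate over the file line by line for large genomes).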
I've noticed that older STAR versions have issues with PE reads that overlap too much.
If you've got PE data, try only R1 first. Otherwise, check your FastQC reports, as Batien mentioned, for adapter contamination or overrepresented sequences indicating other contamination.
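A minimal sketch of the R1-only test, written as a small Python helper that assembles the STAR command line. The index directory, FASTQ name and output prefix are placeholders, not the poster's actual files; the STAR options themselves (`--runThreadN`, `--genomeDir`, `--readFilesIn`, `--readFilesCommand`, `--outFileNamePrefix`) are standard ones.

```python
import shlex


def star_r1_only_command(genome_dir, r1_fastq, prefix, threads=4):
    """Build a STAR command that aligns only the first mate as single-end data."""
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", r1_fastq,  # only R1 is given, so STAR treats it as single-end
        "--outFileNamePrefix", prefix,
    ]
    if r1_fastq.endswith(".gz"):
        # gzipped input must be decompressed on the fly
        cmd += ["--readFilesCommand", "zcat"]
    return cmd


# Placeholder paths; replace them with your own index and FASTQ file.
cmd = star_r1_only_command("coral_index", "sample_R1.fastq.gz", "r1_only_")
print(shlex.join(cmd))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)`. If the R1-only mapping rate is much higher than the paired-end one, the problem is more likely in the pairing or overlap than in the reference.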