Hi, I'm working on RNA-seq data from the honey bee Apis mellifera mellifera. I trimmed the data with Trimmomatic, and these are the results:
single end https://www.dropbox.com/s/wqqxav4qxy3sk1e/single-end.zip?dl=0
paired end1 https://www.dropbox.com/s/b7rfgqgee3ds9u3/SRR1571716-1.trimmed_fastqc.zip?dl=0
paired end2 https://www.dropbox.com/s/5zqc4sgg3ccrtcr/SRR1571716-2.trimmed_fastqc.zip?dl=0
Because the reverse reads have poor quality, I am also trying the forward reads alone (single-end). The TopHat 2.1.1 commands are:
single : tophat2 -p 4 -G anotation.gtf -o output indexed-refrence.fa trimmed1.fastq.gz
paired : tophat2 -p 4 -r 200 --mate-std-dev 50 -G anotation.gtf -o output indexed-refrence.fa trimmed1.fastq.gz trimmed2.fastq.gz
And the TopHat output:

single :
Reads:
    Input     : 10858264
    Mapped    :  4083998 (37.6% of input)
    of these  :    53299 ( 1.3%) have multiple alignments (16 have >20)
37.6% overall read mapping rate.

paired :
Left reads:
    Input     :  8066646
    Mapped    :  3376613 (41.9% of input)
    of these  :    41967 ( 1.2%) have multiple alignments (61 have >20)
Right reads:
    Input     :  8066646
    Mapped    :  4488196 (55.6% of input)
    of these  :    55841 ( 1.2%) have multiple alignments (66 have >20)
48.7% overall read mapping rate.

Aligned pairs: 3164464
    of these  :    39807 ( 1.3%) have multiple alignments
                    9497 ( 0.3%) are discordant alignments
39.1% concordant pair alignment rate.
Why is the mapping rate so low in both cases?
Any time you start seeing low alignment rates, the first thing to do is take a set of reads that are not aligning and BLAST them at NCBI to make sure there is no issue with the data (e.g., contamination).
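If it helps, here is a minimal Python sketch of the "take a set of unaligned reads" step. It assumes the unaligned reads have already been exported to a FASTQ file (TopHat writes them to unmapped.bam in its output directory, so that file would need converting first). The function names and file names are hypothetical, chosen only for illustration. It draws a random subset of reads and writes them as FASTA, which can then be pasted into the NCBI BLAST web form.

```python
import gzip
import random
from typing import Iterator, List, Tuple


def read_fastq(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, sequence) for each 4-line FASTQ record; handles .gz files."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate stray blank lines between records
            seq = handle.readline().strip()
            handle.readline()  # '+' separator line, ignored
            handle.readline()  # quality line, not needed for BLAST
            # Drop the leading '@' and keep only the first token of the header.
            yield header.strip()[1:].split()[0], seq


def sample_reads(path: str, n: int = 100, seed: int = 0) -> List[Tuple[str, str]]:
    """Reservoir-sample up to n reads so the whole file is never held in memory."""
    rng = random.Random(seed)
    reservoir: List[Tuple[str, str]] = []
    for i, record in enumerate(read_fastq(path)):
        if i < n:
            reservoir.append(record)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = record
    return reservoir


def write_fasta(records: List[Tuple[str, str]], out_path: str) -> None:
    """Write the sampled reads as FASTA, one header line and one sequence line each."""
    with open(out_path, "w") as out:
        for read_id, seq in records:
            out.write(f">{read_id}\n{seq}\n")
```

For example, `write_fasta(sample_reads("unmapped.fastq.gz", n=100), "unmapped_sample.fasta")` would produce a small FASTA file from a hypothetical unmapped.fastq.gz. If most of the sampled reads hit rRNA, bacteria, or another species, that would point to a problem with the data rather than with the aligner settings.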
Thank you for your advice.
May I ask why you are not using HISAT2?
Thank you for your advice.