I'm analyzing RNA-seq data.
I mapped my RNA-seq reads to the reference sequence (hg19) and got a BAM file. To check the mapped reads, I viewed the BAM file in IGV. There seem to be many green reads (about 20-30% of the mapped reads are green).
I know that green reads indicate tandem duplications or translocations. However, I don't understand why there are so many green reads in my data set.
I'm worried that this might be caused by a mistake in how I processed the RNA-seq data. Could you give me some advice?
My RNA-seq pipeline was as follows. (1) Trimmomatic removed adapter sequences and low-quality bases from my FASTQ files. (2) TopHat2 mapped my reads to the reference sequence (hg19).
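
To put a number on the green reads outside IGV, the pair orientation could be tallied directly from the SAM records. The following is a minimal Python sketch and not IGV's actual implementation: the function names `pair_orientation` and `orientation_counts` are hypothetical, a standard forward-reverse paired-end library is assumed, and the tie-break for mates that start at the same position is an arbitrary choice. `RL` is the orientation that IGV shows in green when coloring by pair orientation.

```python
from collections import Counter

# SAM FLAG bits (from the SAM specification)
PAIRED, UNMAPPED, MATE_UNMAPPED = 0x1, 0x4, 0x8
REVERSE, MATE_REVERSE, FIRST_IN_PAIR = 0x10, 0x20, 0x40
SECONDARY, SUPPLEMENTARY = 0x100, 0x800


def pair_orientation(flag, pos, mate_pos):
    """Return 'LR', 'RL', 'LL' or 'RR' for a mapped pair, else None.

    'L' is a forward-strand mate, 'R' a reverse-strand mate, listed in
    left-to-right order along the reference.
    """
    if not flag & PAIRED:
        return None
    if flag & (UNMAPPED | MATE_UNMAPPED | SECONDARY | SUPPLEMENTARY):
        return None
    read_rev = bool(flag & REVERSE)
    mate_rev = bool(flag & MATE_REVERSE)
    # Assumption: when both mates start at the same position, treat the
    # first-in-pair mate as the left one so both records agree.
    read_is_left = pos < mate_pos or (
        pos == mate_pos and bool(flag & FIRST_IN_PAIR)
    )
    if read_is_left:
        left_rev, right_rev = read_rev, mate_rev
    else:
        left_rev, right_rev = mate_rev, read_rev
    return ("R" if left_rev else "L") + ("R" if right_rev else "L")


def orientation_counts(sam_lines):
    """Count pair orientations over SAM text lines (one count per record)."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
        rnext, pnext = fields[6], int(fields[7])
        if rnext not in ("=", rname):
            continue  # mates on different chromosomes are not classified here
        orient = pair_orientation(flag, pos, pnext)
        if orient is not None:
            counts[orient] += 1
    return counts
```

One way to use it would be to pass the text output of `samtools view` for the BAM file as `sam_lines` (for example via `sys.stdin`) and compare `counts["RL"]` with the total. If the RL fraction is close to the 20-30% seen in IGV, the green reads reflect the alignments themselves rather than a display setting.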