Problem mapping with STAR
3.1 years ago

Hello, I'm mapping with three different programs: TopHat, HISAT2, and STAR. I'm using default values for all but STAR (I have short reads after trimming):

#TopHat
tophat -I 300 -i 20 $HOME/Doct2.0/Genomes/Ustilago/BowTie2_index/Ustilago ../ax3_1_paired.fastq ../ax3_2_paired.fastq

#HISAT2
hisat2 -p 18 -x /home/acenbro/Doct2.0/Genomes/Ustilago/HISAT2_index/Ustilago -1 ../ax3_1_paired.fastq -2 ../ax3_2_paired.fastq -S ax3_HISAT2.sam

#STAR
STAR --runThreadN 18 --genomeDir $HOME/Doct2.0/Genomes/Ustilago/STAR_index/ --readFilesIn $HOME/Doct2.0/Data/ax_3/ax3_1_paired.fastq $HOME/Doct2.0/Data/ax_3/ax3_1_paired.fastq --outFilterScoreMinOverLread 0 --outFilterMatchNminOverLread 0 --outFilterMatchNmin 40


When I run samtools flagstat on each .bam file, I get sensible results for TopHat and HISAT2. With STAR, however, I get a huge difference between the number of read 1 and read 2 records and a very small number of properly paired reads:

#TopHat
665978 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
72862137 + 0 mapped (100.00% : N/A)
72196159 + 0 paired in sequencing
66340500 + 0 properly paired (91.89% : N/A)
70583534 + 0 with itself and mate mapped
1612625 + 0 singletons (2.23% : N/A)
1236274 + 0 with mate mapped to a different chr
1211542 + 0 with mate mapped to a different chr (mapQ>=5)

#HISAT2
882162 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
76025120 + 0 mapped (98.71% : N/A)
76134948 + 0 paired in sequencing
64971276 + 0 properly paired (85.34% : N/A)
74832130 + 0 with itself and mate mapped
310828 + 0 singletons (0.41% : N/A)
1615256 + 0 with mate mapped to a different chr

#STAR
39755776 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
77920325 + 0 mapped (100.00% : N/A)
38164549 + 0 paired in sequencing
834538 + 0 properly paired (2.19% : N/A)
834538 + 0 with itself and mate mapped
37330011 + 0 singletons (97.81% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)


I have read the manual and I think I'm missing something. I'm getting a little frustrated, so any help would be great.

STAR TopHat HISAT2 • 1.0k views

You really shouldn't be using TopHat anymore; at the very least use TopHat2, and even that has been superseded by HISAT2.

3.1 years ago

$HOME/Doct2.0/Data/ax_3/ax3_1_paired.fastq $HOME/Doct2.0/Data/ax_3/ax3_1_paired.fastq

You are passing the same FASTQ file (ax3_1_paired.fastq) to STAR twice, so read 1 is being used as its own mate.
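
A small guard can catch this kind of slip before STAR starts. The sketch below is only an illustration: `check_mates` and `run_star` are hypothetical helper names, and the STAR options are copied from the command in the question rather than recommended.

```shell
# Hypothetical helper: fail when both mate arguments are the same path.
check_mates() {
    if [ "$1" = "$2" ]; then
        echo "ERROR: read 1 and read 2 are the same file: $1" >&2
        return 1
    fi
}

# Hypothetical wrapper around the STAR call from the question.
# STAR is only started when the two mate files differ.
run_star() {
    check_mates "$1" "$2" || return 1
    STAR --runThreadN 18 \
         --genomeDir "$HOME/Doct2.0/Genomes/Ustilago/STAR_index/" \
         --readFilesIn "$1" "$2" \
         --outFilterScoreMinOverLread 0 \
         --outFilterMatchNminOverLread 0 \
         --outFilterMatchNmin 40
}
```

It would be called as `run_star "$HOME/Doct2.0/Data/ax_3/ax3_1_paired.fastq" "$HOME/Doct2.0/Data/ax_3/ax3_2_paired.fastq"`, where the read 2 file name is assumed from the TopHat and HISAT2 commands in the question.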


OK, I was too focused on the parameters and made that dumb mistake, thank you! As for TopHat, I know, but it is what I was taught with, even if it is no longer used, and I ran it as a control to check that I was doing things right with the other programs.


Not dumb, don't worry. We have all done it before. Best wishes, and see you later.


Now I have a problem visualizing the data in IGV: I'm getting several reads with huge insert sizes. I'm trying to solve it by setting the --scoreInsOpen and --scoreInsBase parameters to -4. I think this will penalize long inserts, but I'm not sure, because I lack experience in this field and can't find much information in the manual or online. If you can give me a tip on how to control the insert size, that would be great!


It is not good practice to ask unrelated questions in an existing thread. You should create a new question for this.

That said, what do you mean by a tip on how to control the insert size?


OK, I'll open a new thread. I meant how to set a threshold for the maximum insert size.
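
For readers who land here with the same question: in STAR, --scoreInsOpen and --scoreInsBase are the penalties for opening and extending an insertion inside an alignment, so they do not limit the distance between mates. The options usually pointed to for that are --alignMatesGapMax (maximum gap between two mates) and --alignIntronMax (maximum intron size). The sketch below only builds those two arguments; `star_gap_args` is a hypothetical helper name, using one value for both options is an assumption made for brevity, and any number passed to it is a placeholder to be chosen from the expected intron sizes of the organism.

```shell
# Hypothetical helper: print STAR arguments that cap the mate gap and
# the intron size at the same value (a simplifying assumption).
star_gap_args() {
    case "$1" in
        ''|*[!0-9]*)
            echo "ERROR: max gap must be a non-negative integer" >&2
            return 1
            ;;
    esac
    printf '%s\n' "--alignMatesGapMax $1 --alignIntronMax $1"
}
```

The output can be appended to an existing command, for example `STAR ... $(star_gap_args 2000)`, where 2000 is purely illustrative.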