Question: Problem mapping with STAR
5 months ago, alvarocentron91 wrote:

Hello, I'm mapping the same reads with three different programs: TopHat, HISAT2, and STAR. I'm using default values for all of them except STAR (my reads are short after trimming):

#TopHat
tophat -I 300 -i 20 $HOME/Doct2.0/Genomes/Ustilago/BowTie2_index/Ustilago ../ax3_1_paired.fastq ../ax3_2_paired.fastq

#HISAT2
hisat2 -p 18 -x /home/acenbro/Doct2.0/Genomes/Ustilago/HISAT2_index/Ustilago -1 ../ax3_1_paired.fastq -2 ../ax3_2_paired.fastq -S ax3_HISAT2.sam

#STAR
STAR --runThreadN 18 --genomeDir $HOME/Doct2.0/Genomes/Ustilago/STAR_index/ --readFilesIn $HOME/Doct2.0/Data/ax_3/ax3_1_paired.fastq $HOME/Doct2.0/Data/ax_3/ax3_1_paired.fastq --outFilterScoreMinOverLread 0 --outFilterMatchNminOverLread 0 --outFilterMatchNmin 40

When I run samtools flagstat on each .bam file, I get sensible results for TopHat and HISAT2. With STAR, however, I get a huge difference between the number of read 1 and read 2 records, and a very small number of properly paired reads:

#TopHat
72862137 + 0 in total (QC-passed reads + QC-failed reads)
665978 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
72862137 + 0 mapped (100.00% : N/A)
72196159 + 0 paired in sequencing
36254310 + 0 read1
35941849 + 0 read2
66340500 + 0 properly paired (91.89% : N/A)
70583534 + 0 with itself and mate mapped
1612625 + 0 singletons (2.23% : N/A)
1236274 + 0 with mate mapped to a different chr
1211542 + 0 with mate mapped to a different chr (mapQ>=5)

#HISAT2
77017110 + 0 in total (QC-passed reads + QC-failed reads)
882162 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
76025120 + 0 mapped (98.71% : N/A)
76134948 + 0 paired in sequencing
38067474 + 0 read1
38067474 + 0 read2
64971276 + 0 properly paired (85.34% : N/A)
74832130 + 0 with itself and mate mapped
310828 + 0 singletons (0.41% : N/A)
1615256 + 0 with mate mapped to a different chr

#STAR
77920325 + 0 in total (QC-passed reads + QC-failed reads)
39755776 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
77920325 + 0 mapped (100.00% : N/A)
38164549 + 0 paired in sequencing
37728565 + 0 read1
435984 + 0 read2
834538 + 0 properly paired (2.19% : N/A)
834538 + 0 with itself and mate mapped
37330011 + 0 singletons (97.81% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
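
A read1/read2 imbalance like the one in the STAR output can be caught automatically. The following is a minimal sketch, not part of samtools: `mate_balance` is a hypothetical helper that reads `samtools flagstat` text on standard input, and the 2x threshold is an arbitrary value chosen only for illustration.

```shell
# mate_balance is a hypothetical helper, not a samtools command.
# It reads `samtools flagstat` text on stdin, prints the read1 and
# read2 counts, and returns non-zero when one is more than twice the other.
mate_balance() {
    awk '
        $NF == "read1" { r1 = $1 }
        $NF == "read2" { r2 = $1 }
        END {
            printf "read1=%d read2=%d\n", r1, r2
            # Arbitrary 2x threshold, for illustration only.
            if (r1 > 2 * r2 || r2 > 2 * r1) exit 1
        }'
}
```

It could be used as `samtools flagstat aln.bam | mate_balance || echo "check the input FASTQ files"`, where `aln.bam` stands for whichever alignment file is being checked.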

I have read the manual, but I think I'm missing something and I'm getting a bit frustrated, so any help would be great.

thank you in advance.

hisat2 star tophat • 368 views

You really shouldn't be using TopHat anymore; at the very least, use TopHat2. And even that one has been replaced by HISAT2.

written 5 months ago by WouterDeCoster
5 months ago, WouterDeCoster (Belgium) wrote:

$HOME/Doct2.0/Data/ax_3/ax3_1_paired.fastq $HOME/Doct2.0/Data/ax_3/ax3_1_paired.fastq

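You passed the same FASTQ file to STAR twice: both arguments to --readFilesIn are ax3_1_paired.fastq, so the read 2 file was never supplied.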
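
As a sketch of the corrected call, the command below keeps every option from the question and changes only the second --readFilesIn argument. The read 2 path is an assumption, inferred from the `../ax3_2_paired.fastq` file used in the TopHat and HISAT2 commands, and `check_mates` is a hypothetical guard, not something STAR provides.

```shell
# check_mates is a hypothetical helper, not part of STAR.
# It fails when both mate arguments name the same file.
check_mates() {
    if [ "$1" = "$2" ] || [ "$1" -ef "$2" ]; then
        echo "ERROR: read 1 and read 2 are the same file: $1" >&2
        return 1
    fi
}

R1="$HOME/Doct2.0/Data/ax_3/ax3_1_paired.fastq"
# Assumed location of the read 2 file; adjust to the real path.
R2="$HOME/Doct2.0/Data/ax_3/ax3_2_paired.fastq"

# Run STAR only if the guard passes and STAR is on the PATH.
if check_mates "$R1" "$R2" && command -v STAR >/dev/null 2>&1; then
    STAR --runThreadN 18 \
        --genomeDir "$HOME/Doct2.0/Genomes/Ustilago/STAR_index/" \
        --readFilesIn "$R1" "$R2" \
        --outFilterScoreMinOverLread 0 \
        --outFilterMatchNminOverLread 0 \
        --outFilterMatchNmin 40
fi
```

Putting the two paths in variables makes a copy-and-paste slip like this one easier to spot than it is inside a long command line.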


OK, I was too focused on the parameters and made that dumb mistake, thank you! As for TopHat, I know, but it is what I was taught with, even if it is no longer used, and I ran it as a control to check that I had done things correctly with the other programs.

written 5 months ago by alvarocentron91

Not dumb, don't worry. We have all done it before. Best wishes, and see you later.

written 5 months ago by Kevin Blighe

Now I have a problem visualizing the data in IGV: I'm getting several reads with huge insert sizes. I'm trying to solve it by setting the --scoreInsOpen and --scoreInsBase parameters to -4. I think this will penalize long inserts, but I'm not sure, because I lack knowledge in this field and can't find much information in the manual or online. If you can give me a tip on how to control the insert size, that would be great!

written 5 months ago by alvarocentron91

It is not good practice to ask unrelated questions in an existing thread. You should create a new question for this.

That said, what do you mean by a tip on how to control the insert size?

written 5 months ago by genomax

OK, I'll open a new thread. I meant how to set a threshold for the maximum insert size.
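
The thread does not answer this, so the following is only a hedged pointer. In STAR, --scoreInsOpen and --scoreInsBase are penalties for insertions within a read's alignment, not for the distance between mates. The options that bound that distance are --alignIntronMax (maximum intron length) and --alignMatesGapMax (maximum genomic distance between mates). The values below are placeholders: 300 mirrors the `-I 300` given to TopHat in the question, and 1000 is an arbitrary example.

```shell
# Sketch only: extra STAR options that bound intron length and mate distance.
# 300 mirrors the TopHat -I 300 setting from the question; 1000 is a placeholder.
STAR_LIMITS="--alignIntronMax 300 --alignMatesGapMax 1000"

# Print the options so they can be reviewed before appending them to a STAR call.
echo "$STAR_LIMITS"
```

Appending `$STAR_LIMITS` unquoted to an existing STAR command would pass them as four separate arguments.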

written 5 months ago by alvarocentron91