STAR for bacterial genome
4.6 years ago

Hi Biostars,

Is it fine to use STAR for bacterial data (no splicing)? Any comments/suggestions are highly appreciated.

Thanks

STAR prokaryots • 3.7k views
4.6 years ago

It should work, but you need to specify --alignIntronMax 1 so that STAR does not produce spliced alignments.

Also, during the genome index generation step, you should set --genomeSAindexNbases to min(14, log2(GenomeLength)/2 - 1) if your bacterial genome of interest is relatively small. GenomeLength is in base pairs.
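As a rough illustration of the two settings above, here is a minimal Python sketch that evaluates the formula and assembles the corresponding STAR command lines. The function names and the file paths (`star_index`, `genome.fa`, `reads.fastq`) are hypothetical placeholders, and rounding the result down to an integer is an assumption, since the formula itself does not say how to round.

```python
import math
import shlex


def genome_sa_index_nbases(genome_length_bp):
    """Evaluate min(14, log2(GenomeLength)/2 - 1) for a genome length in bp.

    Rounding down to an integer is an assumption made for this sketch.
    """
    if genome_length_bp < 1:
        raise ValueError("genome length must be a positive number of base pairs")
    return min(14, math.floor(math.log2(genome_length_bp) / 2 - 1))


def star_index_cmd(genome_dir, fasta, genome_length_bp):
    """Build the index-generation command with a scaled --genomeSAindexNbases."""
    return [
        "STAR", "--runMode", "genomeGenerate",
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--genomeSAindexNbases", str(genome_sa_index_nbases(genome_length_bp)),
    ]


def star_align_cmd(genome_dir, reads):
    """Build the alignment command with spliced alignments disabled."""
    return [
        "STAR", "--runMode", "alignReads",
        "--genomeDir", genome_dir,
        "--readFilesIn", reads,
        "--alignIntronMax", "1",
    ]


if __name__ == "__main__":
    # Hypothetical paths; 4.6 Mb is roughly the size of an E. coli genome.
    print(shlex.join(star_index_cmd("star_index", "genome.fa", 4_600_000)))
    print(shlex.join(star_align_cmd("star_index", "reads.fastq")))
```

For a genome of about 4.6 Mb this gives 10, and for about 1 Mb it gives 8. Treat the result as a starting point and lower it if indexing or mapping misbehaves.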


Thanks, very helpful. Are you aware of any comparison (paper or post) between STAR and, let's say, bowtie2 or bwa for bacterial data?


No, I don't know of any benchmarking for bacterial genomes.


I've made quite a few comparisons, but haven't published them yet.

Briefly, bacterial genomes are quite easy to map to, so the number of uniquely mapped reads doesn't differ much between bwa/bowtie2/STAR/etc. However, STAR and hisat2 are a lot better at reporting multimappers, which could be interesting for highly repetitive genomes like Neisseria.


Do you have any experience with EDGE-pro, since that program is supposed to be designed specifically for bacterial RNA-seq?


EDGE-pro uses bowtie2, reporting up to 10 multimappers.


I think the formula might be over-estimating the value a bit. For something like E. coli or Salmonella, min(14, log2(GenomeLength)/2 - 1) comes out at approximately 10, yet I sometimes get segfaults with this setting. Changing it to 8 fixes everything.


Hi, Nicolas. What do you mean by "relatively small genomes"? I am dealing with genomes that range from 1.02 Mb to 9.56 Mb. I've read publications that treat 2 Mb as a common biological threshold, but I'm guessing the effect of this parameter on the computation doesn't depend on that kind of consideration. Thanks in advance.