STAR for ChIP-seq
3.0 years ago
varsha619 ▴ 90

Hello, is the STAR aligner recommended for use with ChIP-seq data? I am trying to use STAR on ChIP-seq data to obtain reads that map to multiple regions of the genome, with mismatch options, which STAR seems to handle better than Bowtie2. Only around 14% of reads map, and around 80% fall under "% of reads unmapped: too short". Following the suggestions in https://groups.google.com/forum/#!topic/rna-star/E_mKqm9jDm0, I tried the --alignIntronMax 1 option, but the results are similar. Please advise, thank you.

star ChIP-Seq alignment • 4.0k views

around 80% in "% of reads unmapped: too short".

What is the size distribution of reads in that pool (or this data in general)? If the reads are very short (< 30-40 bp, after scan/trim) then it may indeed be difficult to map them.
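One quick way to answer the size-distribution question is a read-length histogram taken straight from the FASTQ. This is a minimal sketch, assuming an uncompressed FASTQ with exactly four lines per record; the function name `read_length_hist` is made up for illustration.

```shell
# Print a read-length histogram for an uncompressed 4-line-per-record FASTQ.
# Output: one "length count" pair per line, sorted by length.
# (read_length_hist is a hypothetical helper name, not part of any tool.)
read_length_hist() {
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (len in n) print len, n[len] }' "$1" | sort -n
}
```

Running `read_length_hist in.fastq` before and after trimming shows whether a large share of reads falls below the 30-40 bp range mentioned above.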


@genomax, the average read size is 50-75 bp.


Then @predeus' answer may not apply. You likely have a different problem. Have you checked a sample of the unmapped reads with BLAST? You could have some sort of contamination in your data.
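To get reads into a form that can be pasted into BLAST, a small FASTQ-to-FASTA converter is enough. The sketch below assumes the unmapped reads have already been written to a 4-line FASTQ (for example with STAR's `--outReadsUnmapped Fastx` or Bowtie2's `--un`); the function name is hypothetical, and it takes the first N records rather than a random sample.

```shell
# Convert the first N records (default 100) of a 4-line FASTQ to FASTA.
# (fastq_head_to_fasta is a hypothetical helper name.)
fastq_head_to_fasta() {
    awk -v n="${2:-100}" '
        NR > 4 * n  { exit }                       # stop after N records
        NR % 4 == 1 { print ">" substr($0, 2) }    # header: swap "@" for ">"
        NR % 4 == 2 { print }                      # sequence line
    ' "$1"
}
```

For example, `fastq_head_to_fasta unmapped.fastq 20 > sample.fa` (both file names are placeholders) gives twenty sequences to check for contamination.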


I concur with genomax. Did you run FastQC on the fastq files? It's likely that only about 18% of your reads are usable if both STAR and bowtie2 agree. Depending on what FastQC says, you may be able to rescue some more reads by adapter trimming.
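Alongside FastQC, a rough adapter check can be done by counting reads that contain the start of the adapter sequence. This sketch assumes the library was built with Illumina TruSeq-style adapters (which begin with AGATCGGAAGAGC); that is an assumption about this dataset, and the function name is hypothetical.

```shell
# Report how many reads in a 4-line FASTQ contain a given adapter prefix.
# Default prefix AGATCGGAAGAGC is the common start of Illumina TruSeq adapters;
# whether this library used them is an assumption.
adapter_hits() {
    awk -v a="${2:-AGATCGGAAGAGC}" '
        NR % 4 == 2 { total++; if (index($0, a) > 0) hits++ }
        END { printf "%d of %d reads contain %s\n", hits, total, a }
    ' "$1"
}
```

A high count from `adapter_hits in.fastq` would suggest that adapter trimming is worth trying before re-aligning; a different adapter can be passed as the second argument.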


I will check this, thank you for your help!


Can you post the entire command you're using and the log file output?


STAR --genomeDir /genomes/dm6/Sequence/STARindex --runThreadN 8 --readFilesIn in.fastq --outSAMtype BAM SortedByCoordinate --outFileNamePrefix star_out

3.0 years ago
predeus ★ 1.6k

"too short" is STAR's euphemism for reads that just fail to align. What's the alignment rate you're getting with bowtie2? Chip-Seq is very tricky experimentally, so it happens quite often that libraries are full of adapter sequences etc. Aligners (as long as you are using a well-supported modern one, like bwa, bowtie2, or STAR) should not matter all that much.

Some ChIP-seq targets (e.g. H3K9me3) are also enriched for multimapping reads because these marks are concentrated in heterochromatin.
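To see how the unmapped reads split across STAR's categories, the percentage lines can be pulled out of STAR's `Log.final.out` summary. This is a minimal sketch, assuming the usual `label |<tab>value` layout of that file; the function name is hypothetical. With `--outFileNamePrefix star_out`, the file should be named `star_outLog.final.out`.

```shell
# Print the mapping-rate lines from a STAR Log.final.out as "label = value".
# (star_mapping_summary is a hypothetical helper name.)
star_mapping_summary() {
    grep -E 'Uniquely mapped reads %|% of reads (mapped to|unmapped)' "$1" |
        sed -E 's/^[[:space:]]+//; s/[[:space:]]*\|[[:space:]]*/ = /'
}
```

As far as I know, "too short" means the best alignment covered too small a fraction of the read (governed by `--outFilterMatchNminOverLread` and `--outFilterScoreMinOverLread`), not that the read itself was short, which fits the description above.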


Bowtie2 also gave me only 18% alignment, but I was confused because the file sizes are not comparable. The BAM file from Bowtie2 (1,035,494,925) is much larger than the one from STAR (275,497,682). P.S. It's the fly genome, hence the smaller sizes.


Size discrepancy could just be due to bowtie2 including unaligned reads in the BAM vs STAR not doing that.


@genomax, does Bowtie2 output unaligned reads by default, even when I don't use the --un option?


Since you did not use --no-unal, your file must contain unaligned reads. --un additionally writes these reads to a separate file.
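Whether an alignment file carries unaligned records can be checked directly: a SAM record is unmapped when bit 0x4 of its FLAG field is set. The sketch below counts mapped and unmapped records in SAM text; the function name is hypothetical, and with samtools installed, `samtools view -c -f 4 file.bam` gives the unmapped count more directly.

```shell
# Count mapped vs. unmapped records in SAM text (file argument or stdin).
# A record is unmapped when bit 0x4 of the FLAG field (column 2) is set.
# (count_unmapped_sam is a hypothetical helper name.)
count_unmapped_sam() {
    awk -F '\t' '
        /^@/ { next }                                       # skip header lines
        { if (int($2 / 4) % 2 == 1) unmapped++; else mapped++ }
        END { printf "mapped=%d unmapped=%d\n", mapped, unmapped }
    ' "${1:--}"
}
```

Comparing the output for the Bowtie2 and STAR files (for a BAM, pipe `samtools view` into the function) should show whether unaligned records explain the size difference.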


Ah I see, thank you!
