Question: STAR for chip-seq
2.5 years ago, varsha619 wrote:

Hello, is the STAR aligner recommended for ChIP-seq data? I am trying to use STAR on ChIP-seq data to recover reads that map to multiple regions of the genome, with mismatches allowed, which STAR seems to handle better than Bowtie2. Only around 14% of reads map, and around 80% fall under "% of reads unmapped: too short". Following the suggestions at https://groups.google.com/forum/#!topic/rna-star/E_mKqm9jDm0, I tried the --alignIntronMax 1 option, but the results are similar. Please advise, thank you.

star chip-seq alignment • 3.3k views
modified 2.5 years ago by predeus • written 2.5 years ago by varsha619

around 80% in "% of reads unmapped: too short".

What is the size distribution of reads in that pool (or this data in general)? If the reads are very short (< 30-40 bp, after scan/trim) then it may indeed be difficult to map them.

written 2.5 years ago by genomax
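
The read-length question above can be checked directly from the FASTQ file. The following is a minimal Python sketch, assuming an uncompressed FASTQ with exactly four lines per record; the function names are illustrative and not taken from any library.

```python
from collections import Counter
from typing import Iterable


def read_length_histogram(fastq_lines: Iterable[str]) -> Counter:
    """Count reads per sequence length in a four-line-per-record FASTQ."""
    lengths = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the second line of each record holds the sequence
            lengths[len(line.strip())] += 1
    return lengths


def fraction_shorter_than(hist: Counter, min_len: int) -> float:
    """Fraction of reads whose length is below min_len (0.0 for empty input)."""
    total = sum(hist.values())
    short = sum(n for length, n in hist.items() if length < min_len)
    return short / total if total else 0.0
```

Passing an open file handle (for example for in.fastq) to read_length_histogram and then calling fraction_shorter_than(hist, 30) gives the share of reads below the 30 bp mark mentioned above.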

@genomax, the average read size is 50-75 bp.

written 2.5 years ago by varsha619

Then @predeus' answer may not apply. You likely have a different problem. Have you BLASTed a sample of the reads that do not map? You could have some sort of contamination in your data.

written 2.5 years ago by genomax
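
One way to get a sample of unmapped reads for BLAST is to draw a small random subset from a FASTQ of unaligned reads and convert it to FASTA. The sketch below assumes such a file already exists (for instance written by bowtie2's --un option, mentioned elsewhere in this thread) and that it is an uncompressed four-line-per-record FASTQ; the function names are hypothetical.

```python
import random
from typing import Iterable, List, Tuple


def sample_reads(fastq_lines: Iterable[str], k: int, seed: int = 0) -> List[Tuple[str, str]]:
    """Reservoir-sample up to k (name, sequence) pairs in a single pass."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    reservoir: List[Tuple[str, str]] = []
    name = ""
    n_seen = 0
    for i, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        if i % 4 == 0:  # header line: keep the read ID without the leading '@'
            parts = line[1:].split()
            name = parts[0] if parts else f"read{i // 4}"
        elif i % 4 == 1:  # sequence line
            n_seen += 1
            if len(reservoir) < k:
                reservoir.append((name, line))
            else:
                j = rng.randrange(n_seen)  # keep each read with probability k / n_seen
                if j < k:
                    reservoir[j] = (name, line)
    return reservoir


def to_fasta(reads: List[Tuple[str, str]]) -> str:
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in reads)
```

A few dozen sequences are usually enough to paste into a BLAST search and see whether the hits point to adapters, another organism, or something else.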

I concur with genomax. Did you run FastQC on the fastq files? It's likely that only about 18% of your reads are usable if both STAR and bowtie2 agree. Depending on what FastQC says, you may be able to rescue some more reads by adapter trimming.

written 2.5 years ago by Friederike
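
As a rough complement to FastQC (not a substitute for it), the share of reads that contain adapter sequence can be estimated with a few lines of Python. This sketch assumes the library was built with TruSeq-style Illumina adapters, whose common prefix is AGATCGGAAGAGC; that is an assumption about this dataset, so substitute the adapter for the kit actually used. It only looks for exact matches in an uncompressed four-line-per-record FASTQ.

```python
from typing import Iterable

# Assumption: TruSeq-style Illumina adapter prefix; replace for other kits.
ILLUMINA_ADAPTER_PREFIX = "AGATCGGAAGAGC"


def adapter_fraction(fastq_lines: Iterable[str],
                     adapter: str = ILLUMINA_ADAPTER_PREFIX) -> float:
    """Fraction of reads containing an exact copy of the adapter string."""
    total = 0
    hits = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # sequence line of each record
            total += 1
            if adapter in line.upper():
                hits += 1
    return hits / total if total else 0.0
```

A high fraction would suggest that trimming could rescue reads; a low one points toward a different cause, such as contamination.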

I will check this, thank you for your help!

written 2.5 years ago by varsha619

Can you post the entire command you're using and the log file output?

written 2.5 years ago by Friederike

STAR --genomeDir /genomes/dm6/Sequence/STARindex --runThreadN 8 --readFilesIn in.fastq --outSAMtype BAM SortedByCoordinate --outFileNamePrefix star_out

written 2.5 years ago by varsha619
2.5 years ago, predeus (Russia) wrote:

"too short" is STAR's euphemism for reads that just fail to align. What's the alignment rate you're getting with bowtie2? ChIP-seq is very tricky experimentally, so it happens quite often that libraries are full of adapter sequences etc. The choice of aligner (as long as it is a well-supported modern one, like bwa, bowtie2, or STAR) should not matter all that much.

Some ChIP targets (e.g. H3K9me3) also yield many multimapping reads because these marks are enriched in repeat-rich heterochromatin.

written 2.5 years ago by predeus
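
To make the "too short" category more concrete: STAR discards an alignment when its matched length or score falls below a fraction of the read length, set by --outFilterMatchNminOverLread and --outFilterScoreMinOverLread (both default to 0.66). The Python sketch below is a simplified model of the matched-length check only. It is not STAR's actual implementation, ignores the score filter and paired-end details, and uses hypothetical function names.

```python
import math

# STAR's documented default for --outFilterMatchNminOverLread.
DEFAULT_MIN_OVER_LREAD = 0.66


def min_matched_bases(read_length: int,
                      min_over_lread: float = DEFAULT_MIN_OVER_LREAD) -> int:
    """Smallest matched-base count that clears the fractional threshold (simplified)."""
    return math.ceil(min_over_lread * read_length)


def passes_length_filter(matched_bases: int, read_length: int,
                         min_over_lread: float = DEFAULT_MIN_OVER_LREAD) -> bool:
    """True if the alignment would not be labelled 'too short' under this simplified model."""
    return matched_bases >= min_over_lread * read_length
```

Under this model a 75 bp read needs about 50 matched bases, so a read that carries 40 bp of adapter can only match over its remaining 35 bp and lands in "too short" regardless of its real length.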

Bowtie2 also gave me only 18% alignment, but I was confused because the file sizes are not comparable: the BAM file from Bowtie2 (1,035,494,925) is much larger than the one from STAR (275,497,682). P.S. It's the fly genome, hence the smaller sizes.

modified 2.5 years ago • written 2.5 years ago by varsha619

The size discrepancy could simply be due to bowtie2 including unaligned reads in the BAM, whereas STAR does not.

written 2.5 years ago by genomax

@genomax, does Bowtie2 output unaligned reads by default, even when I don't use the --un option?

written 2.5 years ago by varsha619

Since you did not use --no-unal, your file must contain unaligned reads. --un writes these reads to a separate file.

written 2.5 years ago by genomax
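
Whether an alignment file carries unaligned reads can be verified by counting records with the "segment unmapped" bit (0x4) set in the SAM FLAG field. The following is a minimal Python sketch that works on SAM text lines, for example the output of samtools view -h on either BAM file; the function name is illustrative.

```python
from typing import Iterable, Tuple

UNMAPPED_FLAG = 0x4        # SAM FLAG bit: segment unmapped
NON_PRIMARY_FLAGS = 0x900  # secondary (0x100) or supplementary (0x800) alignment


def count_mapped_unmapped(sam_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (mapped, unmapped) counts, counting each mapped read once."""
    mapped = unmapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if flag & UNMAPPED_FLAG:
            unmapped += 1
        elif not flag & NON_PRIMARY_FLAGS:
            mapped += 1  # primary alignment of a mapped read
    return mapped, unmapped
```

If the bowtie2 file shows a large unmapped count and the STAR file shows none, the difference in file size is explained without any difference in the alignment rate.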

Ah I see, thank you!

written 2.5 years ago by varsha619