Question: How to analyze CAGE-Seq data?
20 months ago, heir_of_isildur88 wrote:

Hi all,

I'm now 6 months into the field of NGS and the analysis of sequencing data. I have been working on RNA-Seq data and have recently started to venture into CAGE-Seq data.

I wanted to ask how we actually map CAGE-Seq data. We did paired-end sequencing for the CAGE data and got the fastq files. After cleaning, I got the clean reads files for read1 and read2 but both of them are of different size. When I ran them through STAR, it said that mapping could not be done because it reached the end of one read file before the other.

Is this normal for CAGE-Seq data? Or should we just map read1 only, as we are only interested in the TSS, i.e. reads sequenced from the 5' end?

I am a bit confused about how to process CAGE data here.

Please give some guidance & advice. Thank you very much.


After cleaning, I got the clean reads files for read1 and read2 but both of them are of different size.

Can you elaborate on the "cleaning" part?
And do you mean different read lengths or different number of reads in R1 vs R2?

written 20 months ago by WouterDeCoster

Cleaning is where I trimmed 4 base pairs off the reads; these correspond to the index of the sample each read represents.

Yes, I get a different number of reads for R1 and R2.

written 20 months ago by heir_of_isildur88

Please post the names and versions of the programs you used, and also the exact commands. You should clean and map R1+R2 as paired files, i.e., simultaneously and keeping proper pair information.

written 20 months ago by h.mon

Here's the read processing before mapping:

  1. Index identification of samples, using a custom Perl script

read_skipper.pl R1_step1.fq CAC

  2. Trim away the index

fastx_trimmer -f 4 -i R1_step1.fq -o R1_trimmed.fq -Q33

  3. Use a Perl script to remove reads with Q<20

perl ../IndexQuality_CAGE_20.pl R1_trimmed.fq R1_trimmed.fq I.fq R1_20.fq R1_20.2.fq I_20.fq

  4. Read cleaning using QCleaner (I have to check what this cleans, as it's in Japanese)

qcleaner_renew_v3.1.pl --i ./R1_step1_skip.fq --o R1_clean.fastq --log qclog.txt

qcleaner_renew_v3.1.pl --i ./Undetermined_S0_L001_R2_001.fastq --o R2_clean.fastq --log qclog.txt

modified 20 months ago • written 20 months ago by heir_of_isildur88

fastx does not preserve pairing; use Trimmomatic or BBDuk to trim adapters and low-quality bases.
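
To illustrate what "preserving pairing" means, here is a minimal Python sketch that keeps only reads present in both files, in R1 order. It is not how Trimmomatic or BBDuk work internally; the function names are hypothetical, it assumes standard 4-line FASTQ records whose IDs match between mates (ignoring an optional /1 or /2 suffix), and it holds all of R2 in memory, so it only suits small files.

```python
from itertools import islice


def read_fastq(lines):
    """Yield (read_id, record) from an iterable of FASTQ lines (4 lines per record)."""
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if len(record) < 4:
            break
        # Assumption: the read ID is the header up to the first whitespace,
        # without the leading '@' and without a trailing /1 or /2.
        read_id = record[0][1:].split()[0]
        if read_id.endswith(("/1", "/2")):
            read_id = read_id[:-2]
        yield read_id, record


def resync_pairs(r1_lines, r2_lines):
    """Return (R1 record, R2 record) tuples for read IDs found in both inputs."""
    r2_by_id = dict(read_fastq(r2_lines))  # whole of R2 held in memory
    pairs = []
    for read_id, rec1 in read_fastq(r1_lines):
        if read_id in r2_by_id:
            pairs.append((rec1, r2_by_id[read_id]))
    return pairs


if __name__ == "__main__":
    # Tiny made-up example: read "b" was filtered out of R1 only.
    r1 = ["@a 1:N:0", "ACGT", "+", "IIII",
          "@c 1:N:0", "GGGG", "+", "IIII"]
    r2 = ["@a 2:N:0", "TTTT", "+", "IIII",
          "@b 2:N:0", "CCCC", "+", "IIII",
          "@c 2:N:0", "AAAA", "+", "IIII"]
    for rec1, rec2 in resync_pairs(r1, r2):
        print(rec1[0], rec2[0])
```

With real files you would pass open file handles instead of lists and write the two halves of each pair to separate output files. For full-size data a paired-aware trimmer remains the better option.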

written 20 months ago by h.mon

Thank you for your suggestion. I will try it out and see if it works.

written 20 months ago by heir_of_isildur88

20 months ago, Charles Plessy (Japan) wrote:

If your CAGE data is paired-end, then I recommend aligning it as paired-end, and converting it to TSS positions only at the end.

Here is a toy example of how to process CAGE data (the nanoCAGE variant, which can be sequenced paired-end).

And here is a preprint showing more or less the same on a different dataset with a different workflow system.

Recent versions of CAGEr can load paired-end CAGE data in BAM or BED format.

written 20 months ago by Charles Plessy

How do you transform aligned reads to TSS positions?

Thank you very much for your references.

written 20 months ago by heir_of_isildur88

For paired-end data, my favourite approach is to convert paired alignments from BAM format, where each mate is on a separate line, to BED12 format, where each pair is on one line, using the pairedBamToBed12 tool. The 5′ end of each BED entry is the CAGE TSS. CAGEr supports loading data in BAM, BED, and other formats. I recommend reading its vignette.
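
As a rough illustration of "the 5′ end of the BED entry", here is a minimal Python sketch that turns BED lines into TSS positions and counts them. It is not CAGEr's code and the function names are hypothetical. It assumes tab-separated BED with at least six columns, 0-based half-open coordinates, and that the strand column reflects the strand of the transcript.

```python
from collections import Counter


def tss_from_bed(line):
    """Return (chrom, 0-based TSS position, strand) for one BED line."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end, strand = fields[0], int(fields[1]), int(fields[2]), fields[5]
    if strand == "+":
        pos = start      # 5' end is the first base of the interval
    elif strand == "-":
        pos = end - 1    # BED end is exclusive, so the last base is end - 1
    else:
        raise ValueError(f"unexpected strand {strand!r}")
    return chrom, pos, strand


def count_tss(bed_lines):
    """Count how many BED entries (read pairs) start at each TSS position."""
    skip = ("#", "track", "browser")
    return Counter(
        tss_from_bed(line)
        for line in bed_lines
        if line.strip() and not line.startswith(skip)
    )


if __name__ == "__main__":
    # Made-up entries, truncated to six columns for brevity.
    bed = [
        "chr1\t100\t250\tpair1\t60\t+",
        "chr1\t120\t300\tpair2\t60\t-",
        "chr1\t100\t180\tpair3\t60\t+",
    ]
    for (chrom, pos, strand), n in sorted(count_tss(bed).items()):
        print(chrom, pos, strand, n)
```

The resulting counts per position and strand are the kind of table that TSS-level analysis starts from. In practice CAGEr does this step for you when it loads BAM or BED files.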

modified 20 months ago • written 20 months ago by Charles Plessy

Thank you very much! I will try it out and see if it works for my data.

written 20 months ago by heir_of_isildur88