Question: How to align RNA-seq data to a PolyA tail "genome"
23 months ago, mb314 wrote:


I think my RBP of interest binds polyA tails. I have fastq.gz files from a CLIP-seq experiment and want to see whether any of my reads have high polyA content. I used the STAR aligner (version 2.5.3a) to index a very short sequence of A's, which worked:

STAR --runMode genomeGenerate --runThreadN 16 --genomeDir PolyA/star_index --genomeFastaFiles PolyA/PolyA.fasta --genomeSAindexNbases 2 --limitGenomeGenerateRAM 33000000000

I then tried aligning my fastq.gz to the indexed polyA genome. The job ran for 24 hours and then aborted. By comparison, aligning to the human genome finished in less than 20 minutes. The command I used is below:

STAR --runMode alignReads \
--runThreadN 16 \
--genomeDir ../genomes/PolyA/star_index \
--genomeLoad LoadAndRemove \
--readFilesIn pathtomyfile.fastq.gz \
--readFilesCommand zcat \
--outFilterMultimapNmax 20 \
--outFileNamePrefix myfileout.bam \
--outSAMattributes All \
--outSAMtype BAM Unsorted \
--outFilterMismatchNmax 10

Does anyone have suggestions as to why this failed, or other recommendations for measuring how much polyA content I have in my samples?

Thank you!


I do not really see the point in doing that. Polyadenylation is a post-transcriptional modification, meaning that dedicated enzymes add the polyA tail to the pre-mRNA after transcription. The polyA tail is not part of the gene in the genome, so alignment won't help you. I would rather use a dedicated trimming tool such as bbduk, trimmomatic or cutadapt to trim polyA tails (please use the search function), and then check what percentage of the reads contained that pattern.
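
As a rough illustration of the "what percentage of reads contain that pattern" step, here is a minimal sketch that uses only gzip and awk rather than any of the trimming tools named above. The function name `count_polya_reads` and the default threshold of 15 consecutive A's are assumptions made for this example, not values from the thread, and the sketch assumes a standard four-line-per-record FASTQ.

```shell
#!/bin/sh
# Hypothetical helper (name and default threshold are assumptions):
# report how many reads contain a run of at least min_run consecutive A's.
# Assumes a gzip-compressed FASTQ with exactly four lines per record.
count_polya_reads() {
    fastq_gz=$1
    min_run=${2:-15}   # assumed cutoff; tune it to your read length
    gzip -dc "$fastq_gz" | awk -v n="$min_run" '
        # Build the search string once: n copies of "A".
        BEGIN { for (i = 0; i < n; i++) run = run "A" }
        # Line 2 of every 4-line record is the sequence.
        NR % 4 == 2 {
            total++
            if (index(toupper($0), run) > 0) hits++
        }
        # Output: total reads, reads with a polyA run, percentage.
        END {
            pct = (total > 0) ? 100 * hits / total : 0
            printf "%d\t%d\t%.2f\n", total, hits + 0, pct
        }'
}
```

Calling `count_polya_reads pathtomyfile.fastq.gz 15` prints three tab-separated fields: total reads, reads containing the run, and the percentage. This only detects exact, uninterrupted runs of A's. A dedicated trimmer can additionally tolerate mismatches and write out the trimmed reads.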

written 23 months ago by ATpoint