Question: Trimming single end reads for STAR?
caggtaagtat700 wrote:

Hi,

I have just started working with single-end reads that have already been trimmed for adapter sequences and quality. Do I now have to trim the reads to a uniform length, e.g. 100 nt, before mapping them with STAR? Is there a negative effect if I don't?

rna-seq star trimming • 1.5k views
modified 17 months ago by h.mon27k • written 17 months ago by caggtaagtat700
grant.hovhannisyan1.7k wrote:

If the qualities are OK and there are no adapters, you can proceed with mapping. There is a recent paper about trimming of RNA-seq data and its possible consequences for downstream analysis: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4766705/

written 17 months ago by grant.hovhannisyan1.7k

Thank you! I will proceed with the mapping then.

written 17 months ago by caggtaagtat700
h.mon27k wrote:

If they are already trimmed for adapters and quality, don't trim more. Trimming makes sequences shorter, and shorter sequences tend to map to multiple locations more often.

What is the length range of your reads? I generally keep reads only within a certain range, and discard the shorter reads. For example, for a 100bp dataset, I keep reads from 70-100bp after trimming, and discard the rest.
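The length filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `read_fastq` and `filter_by_length` are hypothetical, the input is assumed to be plain 4-line FASTQ records, and the 70 to 100 bp window is just the example range from this answer, not a recommendation for every dataset.

```python
from typing import Iterable, Iterator, Tuple

# One FASTQ record: header, sequence, separator ("+"), quality string.
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield 4-line FASTQ records from an iterable of text lines."""
    it = iter(lines)
    while True:
        header = next(it, None)
        if header is None:
            return  # end of input
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        # Missing lines in a truncated record become empty strings.
        seq = next(it, "").rstrip("\n")
        sep = next(it, "").rstrip("\n")
        qual = next(it, "").rstrip("\n")
        yield (header, seq, sep, qual)


def filter_by_length(
    records: Iterable[FastqRecord], min_len: int = 70, max_len: int = 100
) -> Iterator[FastqRecord]:
    """Keep only records whose sequence length is within [min_len, max_len]."""
    for record in records:
        if min_len <= len(record[1]) <= max_len:
            yield record
```

To use it on a file, something like `filter_by_length(read_fastq(open("reads.fastq")))` would work, where `reads.fastq` is a placeholder file name; gzip-compressed input would need `gzip.open(..., "rt")` instead.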

written 17 months ago by h.mon27k

That makes sense! My reads are 40-155nt long.

Here is a plot of the percentage of reads I would discard versus the minimum read length. Would a minimum length of 80 nt be appropriate?

https://ibb.co/gLZ7q7
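For anyone wanting to produce the numbers behind such a plot, here is a minimal Python sketch. It is not the code used for the linked figure; the function name `percent_discarded` is made up for illustration, and it assumes the read lengths have already been collected into a list of integers.

```python
from bisect import bisect_left
from typing import Dict, Iterable


def percent_discarded(
    read_lengths: Iterable[int], thresholds: Iterable[int]
) -> Dict[int, float]:
    """For each minimum-length threshold, return the percentage of reads
    that are strictly shorter than it and would therefore be discarded."""
    lengths = sorted(read_lengths)
    total = len(lengths)
    if total == 0:
        raise ValueError("no read lengths given")
    result = {}
    for threshold in thresholds:
        # bisect_left counts the reads with length < threshold.
        dropped = bisect_left(lengths, threshold)
        result[threshold] = 100.0 * dropped / total
    return result
```

Calling it with thresholds such as `range(40, 156, 5)` gives one percentage per candidate cutoff, which can then be plotted with any plotting tool.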

modified 17 months ago • written 17 months ago by caggtaagtat700

80 seems reasonable. What is the organism? Also, if you use Trimmomatic for trimming, it has an option (MINLEN) to remove reads that are shorter than a given length after trimming.
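As a rough sketch of what that could look like for single-end data, the Python snippet below only assembles a Trimmomatic command line with a `MINLEN` step. The helper name `trimmomatic_se_minlen_cmd`, the jar path, and the file names are all placeholders chosen for illustration; check the Trimmomatic manual for the options that fit your installation.

```python
from typing import List


def trimmomatic_se_minlen_cmd(
    jar_path: str, fastq_in: str, fastq_out: str, min_len: int
) -> List[str]:
    """Build a single-end Trimmomatic command that drops reads shorter
    than min_len. Paths are supplied by the caller (hypothetical here)."""
    if min_len < 1:
        raise ValueError("min_len must be a positive integer")
    return [
        "java", "-jar", jar_path,
        "SE",                 # single-end mode
        fastq_in, fastq_out,
        f"MINLEN:{min_len}",  # discard reads below this length
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.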

written 17 months ago by grant.hovhannisyan1.7k

OK, thank you. The reads were obtained from human cardiovascular endothelial cells. I was going to use Trimmomatic :)

written 17 months ago by caggtaagtat700

50 bp should be fine for counting applications with the human genome. You may be throwing good data away by being too strict.

written 17 months ago by genomax70k

OK, but since I am analysing alternative splicing, I will stick with a minimum length of 75 nt for now. I read somewhere in this forum that reads should not be shorter than 70 nt for isoform analysis.

written 17 months ago by caggtaagtat700

That sounds reasonable. Curious why you did not choose to do paired-end sequencing to get spatial information in that case.

written 17 months ago by genomax70k

I was told that single-end sequencing would be better for splicing analysis, although I can't remember why. Besides, I was not involved in that decision, and I would guess financial reasons also played a part ;)

written 17 months ago by caggtaagtat700

"Sufficient" makes more sense than "better". The financial angle is always critical :-)

written 17 months ago by genomax70k