Question: Suggestion on Improving mRNA and sRNA mapping
kamel wrote:

Hello friends, I am new to RNA-seq analysis. I have single-end (1x50) FASTQ files of mRNA and sRNA (each file contains reads from two organisms), and I want to study gene expression in each sample.

I mapped the first sample with STAR using the following command:

$ Linux_x86_64/STAR --runThreadN 12 --genomeDir /index --sjdbOverhang 49 --readFilesIn /sample1.fastq.gz --readFilesCommand gunzip -c --outFileNamePrefix /output --outSAMtype BAM SortedByCoordinate

but I got:

Uniquely mapped reads% | 58.83%

% of reads mapped to multiple loci | 38.77%

% of reads mapped to too many loci | 0.46%

I do not understand why I got such a high percentage of reads mapped to multiple loci. Do you have any ideas for improving the mRNA mapping with the STAR command I used?

What is the best way to map sRNA reads to the reference genome?

Thank you in advance

rna-seq alignment • 508 views
modified 7 months ago by ataulhaleem • written 9 months ago by kamel

Start troubleshooting your reads. Check for rRNA contamination, and check which genes have a particularly high number of multi-mappers.

written 9 months ago by h.mon

Can you give me more details on the methodology I should follow?

written 9 months ago by kamel

Examine the alignments with IGV, or filter the multi-mappers with samtools (there are many methods for doing so) and examine the multi-mapper alignments with IGV.
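One way to pull out the multi-mappers for inspection, as a minimal sketch: with its default SAM attributes, STAR records the number of reported alignments for each read in the NH tag, so records with NH greater than 1 are multi-mappers. The function name `multimappers` and the file name `aln.bam` are placeholders, not part of the original command.

```shell
# Keep SAM header lines plus every alignment whose NH tag is greater than 1.
# Reads SAM text on stdin, writes SAM text on stdout.
multimappers() {
  awk -F'\t' '
    /^@/ { print; next }              # pass header lines through
    {
      for (i = 12; i <= NF; i++) {    # optional tags start at column 12
        if ($i ~ /^NH:i:/) {
          split($i, tag, ":")
          if (tag[3] + 0 > 1) print
          break
        }
      }
    }'
}

# Hypothetical usage (aln.bam stands in for the coordinate-sorted STAR output):
#   samtools view -h aln.bam | multimappers | samtools view -b -o multimappers.bam -
#   samtools index multimappers.bam
```

With STAR's default settings, uniquely mapped reads get MAPQ 255, so `samtools view -b -q 255 aln.bam` should give the complementary, uniquely mapped set.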

Use bbduk with the ribokmers.fa.gz file to check for rRNA contamination.
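A sketch of what that check might look like, following the commonly cited BBDuk usage for ribosomal k-mer matching. The function name and the output file names are placeholders, and `ribokmers.fa.gz` is assumed to have been downloaded into the working directory.

```shell
# Split reads into rRNA-matching and non-matching sets with BBDuk.
# $1 = input FASTQ (placeholder path); output names are illustrative only.
check_rrna() {
  bbduk.sh in="$1" \
    ref=ribokmers.fa.gz k=31 \
    outm=ribo.fastq.gz \
    outu=nonribo.fastq.gz \
    stats=ribo_stats.txt
}

# Hypothetical usage:
#   check_rrna sample1.fastq.gz
```

BBDuk reports how many reads matched the reference, which gives a rough estimate of the rRNA fraction in the library.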

written 9 months ago by h.mon

Another question, please. For sRNA mapping, I see that some people select reads of 18-30 nt and others reads of 18-26 nt before mapping. Which size range should I select?

written 9 months ago by kamel

You have mentioned that each file contains reads from two organisms. What are these organisms? If their genomes are similar, then you will get a high number of multi-mapped reads.

written 9 months ago by grant.hovhannisyan

These two organisms are not similar: the samples are human, infected by a pathogenic bacterium, so the sequenced mRNA contains both bacterial and human reads (of course, there are more human reads than bacterial reads). I do not know whether it is normal to have this many multi-mapped reads. Or, for an expression study, should I ask for paired-end sequencing with a read length of more than 50 bp?

written 9 months ago by kamel
kamel wrote:

Another question, please. For sRNA mapping, I see that some people select reads of 18-30 nt and others reads of 18-26 nt before mapping. Which size range should I select?

written 9 months ago by kamel

I don't think you need to select at all. Are your sRNAs from human or bacteria? If they are from bacteria, then they should be long (up to ~150 nt).
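Before deciding on any size selection, it may help to look at the read-length distribution after adapter trimming. A minimal sketch, assuming standard four-line FASTQ records; the function name and the file name in the usage comment are placeholders.

```shell
# Print "length<TAB>count" for every read length seen in a FASTQ stream on stdin.
length_hist() {
  awk 'NR % 4 == 2 { n[length($0)]++ }
       END { for (len in n) printf "%d\t%d\n", len, n[len] }' | sort -n
}

# Hypothetical usage (trimmed.fastq.gz is a placeholder):
#   gunzip -c trimmed.fastq.gz | length_hist
```

A sharp peak around 21-23 nt would point to miRNA-sized human reads, while a broad spread up to the full read length would be more consistent with longer bacterial sRNAs or fragments.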

written 9 months ago by Asaf
ataulhaleem wrote:

sRNAs shorter than 15 bp don't make much sense. Reads shorter than 15 bp are most likely part of the RNA degradome.
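If short reads are to be dropped, a length filter could look like the sketch below. It assumes adapter-trimmed, four-line FASTQ records; the function name, and the 30 nt upper bound in the usage line, are illustrative choices rather than recommendations.

```shell
# Keep only FASTQ records whose sequence length is within [min, max].
# $1 = minimum length, $2 = maximum length (both inclusive).
# Reads FASTQ on stdin, writes FASTQ on stdout.
filter_by_length() {
  awk -v min="$1" -v max="$2" '
    NR % 4 == 1 { header = $0 }       # @read_name line
    NR % 4 == 2 { seq = $0 }          # sequence line
    NR % 4 == 3 { plus = $0 }         # separator line
    NR % 4 == 0 {                     # quality line completes the record
      len = length(seq)
      if (len >= min && len <= max)
        print header "\n" seq "\n" plus "\n" $0
    }'
}

# Hypothetical usage (file names are placeholders):
#   gunzip -c trimmed.fastq.gz | filter_by_length 15 30 | gzip > filtered.fastq.gz
```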

written 7 months ago by ataulhaleem