Question: miRNA fastqc sequence length distribution with UMI
maria2019 wrote (9 months ago):

I have single-end 75 bp miRNA reads (Qiagen miRNA kit) with UMIs.

The FastQC report shows a high peak at 83-84 bp in the sequence length distribution, as well as Illumina universal adapter content.

After removing the adapter (5'-3': AACTGTAGGCACCATCAAT) and discarding reads shorter than 17 bp with cutadapt, the sequence length distribution peaks at 22-23 bp.

I know that miRNAs should be around 18-22 nt long and that the UMI is 12 nt long. Doesn't that mean I should see a peak around 30-34 bp?

The code that I used was:

cutadapt -a AACTGTAGGCACCATCAAT --minimum-length 17 -o tri.fastq sample.fastq
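
One way to see why the peak can land at 22-23 rather than 30-34 is to mimic what a 3' adapter trim does to a single read. cutadapt's `-a` option treats the sequence as a 3' adapter and removes the adapter together with everything that follows it. The sketch below is only an illustration: it uses exact string matching instead of cutadapt's error-tolerant alignment, the read is the third record from the FASTQ excerpt posted further down in this thread, the helper name `trim_3prime` is made up for the example, and labelling the 12 nt after the adapter as the UMI is an assumption based on the UMI length given in the question.

```python
ADAPTER = "AACTGTAGGCACCATCAAT"  # adapter passed to cutadapt -a in the post

# Third read from the FASTQ excerpt in this thread, split by apparent layout
READ = (
    "CGTGGNGAGGAACAATTCTGAG"            # 22 nt before the adapter
    "AACTGTAGGCACCATCAAT"               # the adapter itself
    "GAACTCGAACCC"                      # 12 nt, presumably the UMI (assumption)
    "AGATCGGAAGAGCACACGTCTGAACTCCAGT"   # Illumina adapter read-through
)


def trim_3prime(read: str, adapter: str) -> str:
    """Drop the adapter and everything after it (exact match only)."""
    idx = read.find(adapter)
    return read if idx == -1 else read[:idx]


trimmed = trim_3prime(READ, ADAPTER)
print(len(READ), len(trimmed), trimmed)
# 84 22 CGTGGNGAGGAACAATTCTGAG
```

Under these assumptions the 12 nt after the adapter are discarded along with the adapter, so the trimmed length reflects the miRNA insert alone.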

swbarnes2 (United States) wrote (9 months ago):

How did you process your fastq? bcl2fastq can be configured to remove the UMI from the read and put it in the read name; are you sure that wasn't done?

maria2019 replied:

I believe not. The head of the FASTQ file is as follows:

@NB551007:45:HNKVLBGX5:1:11101:18335:1071 1:N:0:GCCAAT
CTGGANGCGAGCCAACTGTAGGCACCATCAATNCCGTGCCCTCNAGATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAAT
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEE#EAEEEEEEEE#EEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEE
@NB551007:45:HNKVLBGX5:1:11101:5844:1072 1:N:0:GCCAAT
CTGTANGCACCATCAATCGACGTGAACAGAGATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATGCCGT
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEAEEE<AE/AEEEEEEEEEEEEE
@NB551007:45:HNKVLBGX5:1:11101:23470:1072 1:N:0:GCCAAT
CGTGGNGAGGAACAATTCTGAGAACTGTAGGCACCATCAATGAACTCGAACCCAGATCGGAAGAGCACACGTCTGAACTCCAGT
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE
@NB551007:45:HNKVLBGX5:1:11101:12496:1074 1:N:0:GCCAAT
TCGCTNCGATCTATTGAAAGTCGGCCCTCGACACAAGGGTTTGTAACTGTAGGCACCATCAATTCCCTTATTGCCAGATCGGAA
+
AAAAA#EEAEAEEEEE6EEEEE/EEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEE

For downstream analysis I want to use UMI-tools for deduplication, which needs the UMI sequence in the read name. From what I found, it looks like I can use fastp to remove the UMI from the read and move it into the read name.
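
As a rough illustration of what moving the UMI into the read name amounts to for this read layout, here is a minimal Python sketch. It assumes the UMI is the 12 nt immediately after the adapter and appends it to the read ID with an underscore, which is the separator UMI-tools looks for by default. It uses exact adapter matching and is not how fastp or UMI-tools implement the step; `extract_umi` is a hypothetical helper, and the quality string is a placeholder rather than the real one.

```python
from typing import Optional, Tuple

ADAPTER = "AACTGTAGGCACCATCAAT"  # adapter sequence from the post
UMI_LEN = 12                     # UMI length stated in the question


def extract_umi(header: str, seq: str, qual: str) -> Optional[Tuple[str, str, str]]:
    """Hypothetical helper: move the UMI into the read ID and keep only the insert.

    Returns (new_header, insert_seq, insert_qual), or None when the adapter
    or a full-length UMI is not found (exact matching only).
    """
    idx = seq.find(ADAPTER)
    if idx == -1:
        return None
    umi_start = idx + len(ADAPTER)
    umi = seq[umi_start:umi_start + UMI_LEN]
    if len(umi) < UMI_LEN:
        return None
    # Append the UMI to the read ID (the part before the first space).
    read_id, _, comment = header.partition(" ")
    new_header = f"{read_id}_{umi}" + (f" {comment}" if comment else "")
    return new_header, seq[:idx], qual[:idx]


# Third record from the FASTQ excerpt in this thread
header = "@NB551007:45:HNKVLBGX5:1:11101:23470:1072 1:N:0:GCCAAT"
seq = "CGTGGNGAGGAACAATTCTGAGAACTGTAGGCACCATCAATGAACTCGAACCCAGATCGGAAGAGCACACGTCTGAACTCCAGT"
qual = "E" * len(seq)  # placeholder qualities, not the real ones

result = extract_umi(header, seq, qual)
print(result[0])  # @NB551007:45:HNKVLBGX5:1:11101:23470:1072_GAACTCGAACCC 1:N:0:GCCAAT
print(result[1])  # CGTGGNGAGGAACAATTCTGAG
```

Because the UMI sits after a variable-length insert here, its position has to be found relative to the adapter rather than at a fixed offset from the start of the read.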

My question now is: once I have done that, when trimming with cutadapt, should I also discard reads longer than, say, 40 bp, and keep only reads of 17-40 bp?
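
To make the effect of such a window concrete, the sketch below applies a 17-40 nt filter to three of the reads posted above (the ones in which the adapter matches exactly), after cutting at the adapter. It is a plain Python illustration with exact matching, not cutadapt itself; in cutadapt the corresponding options would be `--minimum-length 17 --maximum-length 40`. The helper name `insert_of` is made up for the example.

```python
ADAPTER = "AACTGTAGGCACCATCAAT"  # adapter sequence from the post
MIN_LEN, MAX_LEN = 17, 40        # length window proposed in the post

# Reads 1, 3 and 4 from the FASTQ excerpt in this thread
reads = [
    "CTGGANGCGAGCCAACTGTAGGCACCATCAATNCCGTGCCCTCNAGATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAAT",
    "CGTGGNGAGGAACAATTCTGAGAACTGTAGGCACCATCAATGAACTCGAACCCAGATCGGAAGAGCACACGTCTGAACTCCAGT",
    "TCGCTNCGATCTATTGAAAGTCGGCCCTCGACACAAGGGTTTGTAACTGTAGGCACCATCAATTCCCTTATTGCCAGATCGGAA",
]


def insert_of(read: str) -> str:
    """Return the part of the read before the adapter (exact match only)."""
    idx = read.find(ADAPTER)
    return read if idx == -1 else read[:idx]


inserts = [insert_of(r) for r in reads]
lengths = [len(s) for s in inserts]
kept = [s for s in inserts if MIN_LEN <= len(s) <= MAX_LEN]

print(lengths)  # [13, 22, 44]
print(kept)     # ['CGTGGNGAGGAACAATTCTGAG']
```

With this window, the 13 nt and 44 nt inserts are dropped and only the 22 nt insert is kept.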

(reply written 9 months ago by maria2019)