Question: miRNA fastqc sequence length distribution with UMI
maria2019 wrote:

I have single-end 75 bp miRNA reads with UMIs (QIAGEN miRNA kit).

The FastQC report shows a high peak at 83-84 bp in the sequence length distribution and flags the Illumina universal adapter.

After trimming the adapter (5'-3': AACTGTAGGCACCATCAAT) and discarding reads shorter than 17 bp with cutadapt, the sequence length distribution peaks at 22-23 bp.

I know that miRNAs should be around 18-22 nt and that the UMI is 12 nt long. Doesn't that mean I should see a peak around 30-34 bp?

The code that I used was:

cutadapt -a AACTGTAGGCACCATCAAT --minimum-length 17 -o tri.fastq sample.fastq

written 11 days ago by maria2019
swbarnes2 wrote:

How did you process your fastq? bcl2fastq can be configured to remove the UMI from the read and put it in the read name; are you sure that wasn't done?

written 11 days ago by swbarnes2

I believe not. The head of the FASTQ file is as follows:

@NB551007:45:HNKVLBGX5:1:11101:18335:1071 1:N:0:GCCAAT
CTGGANGCGAGCCAACTGTAGGCACCATCAATNCCGTGCCCTCNAGATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAAT
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEE#EAEEEEEEEE#EEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEE
@NB551007:45:HNKVLBGX5:1:11101:5844:1072 1:N:0:GCCAAT
CTGTANGCACCATCAATCGACGTGAACAGAGATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATGCCGT
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEAEEE< AE/AEEEEEEEEEEEEE
@NB551007:45:HNKVLBGX5:1:11101:23470:1072 1:N:0:GCCAAT
CGTGGNGAGGAACAATTCTGAGAACTGTAGGCACCATCAATGAACTCGAACCCAGATCGGAAGAGCACACGTCTGAACTCCAGT
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE
@NB551007:45:HNKVLBGX5:1:11101:12496:1074 1:N:0:GCCAAT
TCGCTNCGATCTATTGAAAGTCGGCCCTCGACACAAGGGTTTGTAACTGTAGGCACCATCAATTCCCTTATTGCCAGATCGGAA
+
AAAAA#EEAEAEEEEE6EEEEE/EEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEE
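
As a rough check of where the UMI sits in these reads, the third record above can be split around the adapter with a few lines of Python. This is only a sketch under one assumption: that the 12 nt immediately following AACTGTAGGCACCATCAAT are the UMI, which the posted reads look consistent with because the Illumina adapter (AGATCGGAAGAGC...) starts right after them. `split_read` is a hypothetical helper, not part of cutadapt, fastp, or UMI-tools.

```python
ADAPTER = "AACTGTAGGCACCATCAAT"  # adapter sequence from the cutadapt command
UMI_LEN = 12                     # UMI length stated in the question


def split_read(seq, adapter=ADAPTER, umi_len=UMI_LEN):
    """Split a read into (insert, umi, rest).

    Returns None when the adapter is not found as an exact match.
    Assumption: the UMI directly follows the adapter.
    """
    pos = seq.find(adapter)
    if pos == -1:
        return None
    umi_start = pos + len(adapter)
    insert = seq[:pos]
    umi = seq[umi_start:umi_start + umi_len]
    rest = seq[umi_start + umi_len:]
    return insert, umi, rest


# Third record from the posted FASTQ head.
read = ("CGTGGNGAGGAACAATTCTGAGAACTGTAGGCACCATCAAT"
        "GAACTCGAACCCAGATCGGAAGAGCACACGTCTGAACTCCAGT")

insert, umi, rest = split_read(read)
print(len(insert), insert)  # insert length and sequence
print(len(umi), umi)        # presumed UMI
print(rest)                 # remainder of the read (Illumina adapter)
```

If that layout is right, `cutadapt -a` trims the adapter together with everything that follows it, UMI included, so the trimmed length reflects the insert alone (22 nt for this read) rather than insert plus UMI.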

Downstream, I want to use UMI-tools for deduplication, so I need the UMI in the read name. From what I found, it looks like I can use fastp to remove the UMI from the read and move it into the read name.
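
To make the "UMI in the read name" step concrete, here is a minimal Python sketch of what such a transformation does to a single record. It is not how fastp or UMI-tools implement it. It assumes the UMI is the 12 nt directly after the adapter, and it appends the UMI to the read ID after an underscore, which is the separator UMI-tools looks for by default as far as I know. `umi_to_name` is a hypothetical helper name.

```python
ADAPTER = "AACTGTAGGCACCATCAAT"  # adapter sequence from the cutadapt command
UMI_LEN = 12                     # UMI length stated in the question


def umi_to_name(header, seq, qual, adapter=ADAPTER, umi_len=UMI_LEN):
    """Move the UMI into the read ID and keep only the insert.

    Returns (new_header, insert_seq, insert_qual), or None when the adapter
    is missing or the read ends before a full-length UMI.
    Assumption: the UMI directly follows the adapter.
    """
    pos = seq.find(adapter)
    if pos == -1:
        return None
    umi_start = pos + len(adapter)
    umi = seq[umi_start:umi_start + umi_len]
    if len(umi) < umi_len:
        return None
    # Append the UMI to the ID (first token) and keep the rest of the header.
    read_id, _, comment = header.partition(" ")
    new_header = f"{read_id}_{umi}"
    if comment:
        new_header += " " + comment
    return new_header, seq[:pos], qual[:pos]


# Third record from the posted FASTQ head.
header = "@NB551007:45:HNKVLBGX5:1:11101:23470:1072 1:N:0:GCCAAT"
seq = ("CGTGGNGAGGAACAATTCTGAGAACTGTAGGCACCATCAAT"
       "GAACTCGAACCCAGATCGGAAGAGCACACGTCTGAACTCCAGT")
qual = ("AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE"
        "EEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE")

new_header, new_seq, new_qual = umi_to_name(header, seq, qual)
print(new_header)
print(new_seq)
print("+")
print(new_qual)
```

Reads in which the adapter carries a sequencing error are dropped by this exact-match sketch; an error-tolerant trimmer would handle them differently.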

My question now is: once I have done that, when trimming with cutadapt, should I also discard reads longer than, say, 40 bp, and keep only reads of 17-40 bp?
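
For illustration, a 17-40 bp window applied to the inserts from the posted reads can be sketched in Python as below. The inserts are the parts of records 1, 3 and 4 that precede the adapter (record 2 has no exact adapter match). The function names are hypothetical, and the sketch only shows the effect of the window; it does not settle whether 40 is the right upper bound. In cutadapt itself, `--maximum-length` is the counterpart of `--minimum-length`.

```python
from collections import Counter


def length_histogram(seqs):
    """Count how many sequences have each length."""
    return Counter(len(s) for s in seqs)


def keep_in_window(seqs, min_len=17, max_len=40):
    """Keep sequences whose length lies in [min_len, max_len]."""
    return [s for s in seqs if min_len <= len(s) <= max_len]


# Inserts preceding the adapter in records 1, 3 and 4 of the posted reads.
inserts = [
    "CTGGANGCGAGCC",
    "CGTGGNGAGGAACAATTCTGAG",
    "TCGCTNCGATCTATTGAAAGTCGGCCCTCGACACAAGGGTTTGT",
]

hist = length_histogram(inserts)
kept = keep_in_window(inserts)
print(sorted(hist.items()))   # lengths before filtering
print([len(s) for s in kept]) # lengths that survive the 17-40 window
```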

written 11 days ago by maria2019