miRNA fastqc sequence length distribution with UMI
4.5 years ago
maria2019 ▴ 250

I have single-end 75 bp miRNA reads with UMIs (Qiagen miRNA kit).

The FastQC report shows a strong peak at 83-84 bp and flags the Illumina universal adapter.

After trimming the adapter (5'-3': AACTGTAGGCACCATCAAT) and discarding reads shorter than 17 bp with cutadapt, the sequence length distribution peaks at 22-23 bp.

I know that miRNAs should be around 18-22 nt and that the UMI is 12 nt long. Doesn't that mean I should see a peak around 30-34 bp?

The code that I used was:

cutadapt -a AACTGTAGGCACCATCAAT --minimum-length 17 -o tri.fastq sample.fastq
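A rough way to check the length arithmetic is to split one of the reads quoted further down in this thread by hand. The sketch below is plain Python, not cutadapt: it uses an exact string match, whereas cutadapt tolerates mismatches. The function name `split_read` is made up for illustration. It assumes the layout the quoted reads suggest, namely insert, then the adapter, then a 12 nt UMI, then the Illumina adapter. Since `-a` removes the adapter and everything after it, only the insert would remain under that layout.

```python
ADAPTER = "AACTGTAGGCACCATCAAT"  # adapter passed to cutadapt -a in the post
UMI_LEN = 12                     # UMI length stated in the post


def split_read(seq, adapter=ADAPTER, umi_len=UMI_LEN):
    """Split a read into (insert, umi), assuming insert + adapter + UMI.

    Returns None when the adapter is not found by exact match.
    """
    pos = seq.find(adapter)
    if pos == -1:
        return None
    umi_start = pos + len(adapter)
    return seq[:pos], seq[umi_start:umi_start + umi_len]


# Third read from the FASTQ excerpt quoted in this thread.
read = ("CGTGGNGAGGAACAATTCTGAGAACTGTAGGCACCATCAAT"
        "GAACTCGAACCCAGATCGGAAGAGCACACGTCTGAACTCCAGT")

insert, umi = split_read(read)
print(len(insert), insert)  # 22 CGTGGNGAGGAACAATTCTGAG
print(len(umi), umi)        # 12 GAACTCGAACCC
```

For this read the part left of the adapter is 22 nt, which would match a 22-23 bp peak after trimming. The 12 nt UMI sits to the right of the adapter and is cut away along with it.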

miRNA fastqc cutadapt trimming Qiagene • 1.3k views
4.5 years ago

How did you process your fastq? bcl2fastq can be configured to remove the UMI from the read and put it in the read name; are you sure that wasn't done?


I believe not. The head of the FASTQ file is as follows:

@NB551007:45:HNKVLBGX5:1:11101:18335:1071 1:N:0:GCCAAT
CTGGANGCGAGCCAACTGTAGGCACCATCAATNCCGTGCCCTCNAGATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAAT
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEE#EAEEEEEEEE#EEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEE
@NB551007:45:HNKVLBGX5:1:11101:5844:1072 1:N:0:GCCAAT
CTGTANGCACCATCAATCGACGTGAACAGAGATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATGCCGT
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEAEEE< AE/AEEEEEEEEEEEEE
@NB551007:45:HNKVLBGX5:1:11101:23470:1072 1:N:0:GCCAAT
CGTGGNGAGGAACAATTCTGAGAACTGTAGGCACCATCAATGAACTCGAACCCAGATCGGAAGAGCACACGTCTGAACTCCAGT
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEE
@NB551007:45:HNKVLBGX5:1:11101:12496:1074 1:N:0:GCCAAT
TCGCTNCGATCTATTGAAAGTCGGCCCTCGACACAAGGGTTTGTAACTGTAGGCACCATCAATTCCCTTATTGCCAGATCGGAA
+
AAAAA#EEAEAEEEEE6EEEEE/EEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEE

In a downstream analysis I want to use UMI-tools for deduplication, so I need the UMI in the read name. From what I found, it looks like I can use fastp to remove the UMI from the read and move it to the read name.
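To make "move the UMI to the read name" concrete, here is a minimal Python sketch of that step for a single FASTQ record. It is not how fastp or UMI-tools do it internally. The function name `move_umi_to_name` is hypothetical, and the layout (insert, adapter, 12 nt UMI) is an assumption based on the reads quoted above. The UMI is appended to the read ID after an underscore, which as far as I know is the separator UMI-tools looks for by default. The adapter match is exact, so reads with sequencing errors in the adapter are simply skipped here.

```python
ADAPTER = "AACTGTAGGCACCATCAAT"  # adapter from the cutadapt command in the post
UMI_LEN = 12                     # UMI length stated in the post


def move_umi_to_name(header, seq, qual, adapter=ADAPTER, umi_len=UMI_LEN):
    """Return (new_header, insert_seq, insert_qual), or None if unusable.

    Assumes the read is laid out as insert + adapter + UMI + rest.
    """
    pos = seq.find(adapter)
    if pos == -1:
        return None  # no exact adapter match
    umi_start = pos + len(adapter)
    umi = seq[umi_start:umi_start + umi_len]
    if len(umi) < umi_len:
        return None  # read ends before the full UMI
    # Append the UMI to the read ID and keep the comment field untouched.
    read_id, _, comment = header.partition(" ")
    new_header = f"{read_id}_{umi}"
    if comment:
        new_header += f" {comment}"
    return new_header, seq[:pos], qual[:pos]


header = "@NB551007:45:HNKVLBGX5:1:11101:23470:1072 1:N:0:GCCAAT"
seq = ("CGTGGNGAGGAACAATTCTGAGAACTGTAGGCACCATCAAT"
       "GAACTCGAACCCAGATCGGAAGAGCACACGTCTGAACTCCAGT")
qual = "E" * len(seq)  # placeholder qualities, not the real ones from the file

print(move_umi_to_name(header, seq, qual))
```

After this step the sequence holds only the insert, so any length filter applies to the miRNA itself rather than to miRNA plus UMI.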

My question now is: once I have done that, should I also remove reads longer than, say, 40 bp when trimming with cutadapt, and keep only reads of 17-40 bp?
