Question: Tool to analyse bulk RNA-seq data with UMIs
13 months ago by
Sara150
Sara150 wrote:

I have bulk RNA-seq data from a protocol that also uses UMIs. I am looking for a tool that can deal with UMIs in bulk RNA-seq, but I have not found any (they all seem to be made for single-cell RNA-seq). So my question is: is there any tool available for working with UMI-tagged bulk RNA-seq data?

next-gen • 1.3k views
ADD COMMENTlink modified 13 months ago by i.sudbery9.8k • written 13 months ago by Sara150
13 months ago by
i.sudbery9.8k
Sheffield, UK
i.sudbery9.8k wrote:

UMI-tools can handle any UMI-tagged sequencing data where deduplication happens after mapping: https://umi-tools.readthedocs.io/en/latest/index.html

The process is to extract the UMIs from the read sequence and add them to the read names. There are two ways to do this, and between them they provide the flexibility to handle any read configuration I can think of (see https://umi-tools.readthedocs.io/en/latest/regex.html).
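To make the extraction step concrete, here is a minimal Python sketch of what "move the UMI from the read sequence to the read name" means. It is only an illustration, not how UMI-tools is implemented: the 8-nt 5' UMI layout, the `UMI_PATTERN` regex, and the `move_umi_to_name` helper are all assumptions made up for this example. In practice you would let `umi_tools extract` do this.

```python
import re

# Assumed read layout for illustration: an 8-nt UMI at the 5' end of the read.
# The named group "umi_1" mirrors the naming style used in UMI-tools regexes.
UMI_PATTERN = re.compile(r"^(?P<umi_1>.{8})(?P<rest>.*)$")


def move_umi_to_name(name, seq, qual, pattern=UMI_PATTERN):
    """Cut the UMI off the read and append it to the read ID with '_'.

    Returns (new_name, trimmed_seq, trimmed_qual), or None if the read
    is too short to contain the UMI.
    """
    match = pattern.match(seq)
    if match is None:
        return None
    umi = match.group("umi_1")
    start = match.start("rest")  # first base after the UMI
    # Append the UMI to the read ID only, keeping any FASTQ comment intact.
    read_id, _, comment = name.partition(" ")
    new_name = f"{read_id}_{umi}" + (f" {comment}" if comment else "")
    return new_name, seq[start:], qual[start:]


record = move_umi_to_name("@r1 1:N:0", "ACGTACGTTTTTGGGG", "IIIIIIIIFFFFFFFF")
print(record)
```

The quality string is trimmed by the same offset as the sequence so the two stay the same length, which mappers require.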

You then map your reads with your favourite mapper.

The next step depends on whether your technique fragments the cDNA before or after PCR. If fragmentation happens after PCR, then the next step is to assign reads to features (e.g. genes), for example using featureCounts. If PCR happens after fragmentation, then you do the read assignment/quantification after deduping.

Then you group/dedup/count (depending on your downstream application). If fragmentation happened after PCR, then you need to do this on a per-gene basis, because copies of the same molecule can end up as fragments that map to different positions, so mapping position can no longer be used to identify duplicates.
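The difference between per-position and per-gene counting can be shown with a toy example. This is a sketch under simplifying assumptions: the read records and the `count_unique_umis` helper are invented for illustration, and UMIs are compared by exact match only, with none of the sequencing-error correction that UMI-tools applies.

```python
from collections import defaultdict


def count_unique_umis(reads, key):
    """Count distinct UMIs within each group defined by key(read)."""
    groups = defaultdict(set)
    for read in reads:
        groups[key(read)].add(read["umi"])
    return {group: len(umis) for group, umis in groups.items()}


# Toy data: three original molecules (two from geneA, one from geneB).
reads = [
    {"gene": "geneA", "pos": 100, "umi": "AAAA"},
    {"gene": "geneA", "pos": 250, "umi": "AAAA"},  # same molecule, different fragment
    {"gene": "geneA", "pos": 100, "umi": "CCCC"},
    {"gene": "geneB", "pos": 900, "umi": "AAAA"},
]

per_gene = count_unique_umis(reads, key=lambda r: r["gene"])
per_position = count_unique_umis(reads, key=lambda r: (r["gene"], r["pos"]))

print(per_gene)
print(sum(per_position.values()))
```

Grouping by position counts the `AAAA` molecule of geneA twice, once for each fragment, while grouping by gene counts it once. That is why the per-gene route is needed when fragmentation follows PCR.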

ADD COMMENTlink written 13 months ago by i.sudbery9.8k

Hi, I have collected single-end HTS data for the full E. coli ribosome using the Illumina platform. I find UMI-tools very interesting and useful. I used an 18-nt random barcode at the 5' end to avoid read duplication. I want to count the number of UMIs and reads at each position after mapping to a reference sequence. I have read the UMI-tools manual but couldn't figure out a solution; can you please suggest how I should proceed? Below is an example showing what I am aiming for and how much I have understood:

Say I have used UMI-tools to extract the random barcode (18 nt) from the 5' end of each read and append it to the read header ('_'-separated), as below. I will then map the reads to the reference sequence using Bowtie 2. Next, I want to count, from the SAM/BAM file, the number of reads at each position of the reference and the number of unique barcodes among those reads. That is, I want the number of molecules at each position and their UMIs. For example, if 100 reads map at position 15 and those 100 reads carry 75 distinct barcodes, I want to get both the number of reads (100) and the number of unique barcodes (75) for that position (here, position 15).

@ST-E00205:943:HCF3YCCX2:4:1101:11495:1678_CCAGCCCAAAGCCACCCG 1:N:0:NCCACGCG+NGATCTCG
ACCGGATGGTAGACCTGGAGGAGGGGAAAGCCGAGGTGGTGACGGGAGCGGCTGGGGGGGGAGTCCGGGATGGTAGGCGGAGCGGGCAGAGCACAGCAGCTCGTGTAGAAATGG
+
7-<--7--7-7F-----77----7---7-------------------7----77-7-----7------7---------7-7------7--7----77----------77-7---
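One possible way to get such per-position counts is sketched below in plain Python, reading SAM text directly. This is an illustration under stated assumptions, not a UMI-tools feature: the `reads_and_umis_per_position` helper is made up, the UMI is assumed to be the part of the read name after the last '_', UMIs are compared by exact match with no error correction, and the SAM POS field (leftmost mapped base) is used as the position, which is not the 5' end for reverse-strand reads.

```python
from collections import defaultdict


def reads_and_umis_per_position(sam_lines):
    """Return {(reference, position): (n_reads, n_unique_umis)} from SAM text."""
    read_counts = defaultdict(int)
    umi_sets = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue
        qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 4:  # 0x4 = read is unmapped
            continue
        umi = qname.rsplit("_", 1)[-1]  # UMI appended to the read name
        key = (rname, pos)
        read_counts[key] += 1
        umi_sets[key].add(umi)
    return {key: (read_counts[key], len(umi_sets[key])) for key in read_counts}


# Made-up alignments: three reads at position 15 carrying two distinct UMIs.
example = [
    "@HD\tVN:1.6",
    "r1_AAAA\t0\t16S\t15\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r2_AAAA\t0\t16S\t15\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r3_CCCC\t0\t16S\t15\t42\t4M\t*\t0\t0\tACGT\tIIII",
    "r4_GGGG\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
counts = reads_and_umis_per_position(example)
print(counts)
```

For a BAM file, the same loop could be fed from `samtools view`, which prints alignments as SAM text.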
ADD REPLYlink modified 5 months ago by GenoMax92k • written 5 months ago by naeem40thju0

This is a separate question. Can you please start a new post.

ADD REPLYlink written 5 months ago by i.sudbery9.8k

Okay, thank you very much.

ADD REPLYlink written 5 months ago by naeem40thju0
13 months ago by
swbarnes29.2k
United States
swbarnes29.2k wrote:

Have you looked at umi_tools?

ADD COMMENTlink written 13 months ago by swbarnes29.2k

@swbarnes2: I think umi_tools is only for scRNAseq, right?

ADD REPLYlink modified 13 months ago • written 13 months ago by Sara150

scRNA-seq data is still normal sequence data. Depending on the scheme you are using for your UMIs, you should be able to apply umi_tools. See the FAQ for examples of regular expressions you can use.

ADD REPLYlink written 13 months ago by GenoMax92k

UMI-tools was actually first created to analyse iCLIP data! There is absolutely no reason it shouldn't work with bulk RNA-seq; in fact, we are analysing some UMI-tagged bulk RNA-seq data with it ourselves right now.

ADD REPLYlink written 13 months ago by i.sudbery9.8k

It pulls the UMI out of a read and puts it in the read name; that's not specific to scRNA-seq.

ADD REPLYlink written 13 months ago by swbarnes29.2k