Simulate short-read RNA-seq data from long-read RNA-seq data
7 months ago
rhonddaskl • 0

Hi,

I've developed a tool and want to test it on both long-read and short-read single-cell RNA-seq datasets. To make the comparison fair, I'd like to simulate short-read single-cell RNA-seq data from the long-read RNA-seq data. For example, from each read in the long-read FASTQ file, I would randomly extract 100 bp subsequences to simulate a Smart-seq dataset, and then run my tool on both datasets. Is there an existing tool that does something like this? I've seen people use Polyester to simulate reads from transcript sequences, but I'd like to know whether I can simulate short-read RNA-seq directly from the FASTQ data.
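
To make the idea concrete, here is a minimal sketch of the sampling step in plain Python. It assumes an uncompressed, 4-line-per-record FASTQ; the function names (`read_fastq`, `sample_fragments`) and the parameters `frag_len`, `per_read`, and `seed` are my own placeholders, not part of any existing tool. It only cuts random windows out of each long read and does not model short-read errors or coverage.

```python
import random
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (name, sequence, quality)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (name, sequence, quality) from 4-line FASTQ records."""
    it = iter(lines)
    # Consume the input four lines at a time; a truncated final record is dropped.
    for header, seq, _plus, qual in zip(it, it, it, it):
        yield header.strip()[1:], seq.strip(), qual.strip()


def sample_fragments(
    records: Iterable[Record],
    frag_len: int = 100,
    per_read: int = 1,
    seed: int = 0,
) -> Iterator[Record]:
    """Cut `per_read` random windows of `frag_len` bases from each long read."""
    rng = random.Random(seed)  # fixed seed so the sampling is reproducible
    for name, seq, qual in records:
        if len(seq) < frag_len:
            continue  # read is too short to yield a full-length fragment
        for i in range(per_read):
            start = rng.randint(0, len(seq) - frag_len)  # inclusive bounds
            end = start + frag_len
            yield f"{name}_frag{i}_pos{start}", seq[start:end], qual[start:end]
```

With a real file, this could be driven by something like `with open("long_reads.fastq") as fh: frags = list(sample_fragments(read_fastq(fh)))`, where the file name is hypothetical. Note that the sampled fragments keep the base qualities of the long read, which will not resemble short-read quality profiles.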

Thank you!

Best, R

polyester single-cell simulation short-read long-read • 552 views
7 months ago
Mensur Dlakic ★ 27k

I think you could try this, but it wouldn't be appropriate. You'd be carrying long-read errors, which I believe occur at higher rates than short-read errors, into your "simulated" short reads. You also wouldn't be creating any of the error types that are hallmarks of short reads, so this would be a short-read sample in name only.

If I were to do this, I'd assemble the long reads first, which will correct some of those errors, and then simulate from the assembled material.

Thank you for your suggestion!
