How do I carry out small RNA assembly?
2.4 years ago
jaqx008 ▴ 110

Hello everyone. I have a small RNA library, and I am trying to see whether the reads target any kind of virus. I plan to assemble these short sequences into longer reads or contigs and then check whether they map to any viral genomes. I was advised to use Velvet for the assembly, and I installed it with conda install velvet. To run the assembly, I used the command below, based on my understanding of the Velvet help.

velveth output.fastq 191 -fastq sample.fastq


The problem is that I am not sure this is right: when I looked at the output file, it still contained short reads, as shown below.

output

>NS500519:44:HHHTLBGX2:3:11406:4633:13982   21398134    0
CATTGCACTCGTCCCGGCCTGA
>NS500519:44:HHHTLBGX2:3:11406:15103:13982  21398135    0
AGCACTGAGAACACTTTGGCCTTGGCAAG
>NS500519:44:HHHTLBGX2:3:11406:23757:13983  21398136    0
TCTTAGAACTCATCGGGAGGGAACATTAGC
>NS500519:44:HHHTLBGX2:3:11406:9883:13983   21398137    0

Velvet Assembly sequencing RNA-Seq • 561 views
Small RNAs are of a size that should not need assembly. If you need to see whether they map to viral genomes, you can do that with the data you have now. If the reads did not come from molecules that were long to begin with, there is no point in trying to assemble them.
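
To see the size issue concretely, it can help to tabulate the read lengths before choosing any assembler settings. Velvet's hash length has to be shorter than the reads, so a value like 191 cannot work on reads of roughly 22 to 30 nt. Also, velveth only prepares the hash, and contigs come from a separate velvetg step, which would explain why the output above still looks like the input. Below is a minimal sketch using only awk, sort, and uniq. The function name read_length_hist is made up for illustration, and the file is assumed to be a standard 4-line-per-record FASTQ.

```shell
#!/bin/sh
# Print a read-length histogram (count, then length) for a FASTQ file.
# Assumes the usual 4-lines-per-record layout with unwrapped sequences.
# read_length_hist is a hypothetical helper name, not part of any tool.
read_length_hist() {
    # In each 4-line record, line 2 holds the sequence.
    awk 'NR % 4 == 2 { print length($0) }' "$1" | sort -n | uniq -c
}

# sample.fastq is the file name used in the question; run only if present.
if [ -f sample.fastq ]; then
    read_length_hist sample.fastq
fi
```

If nearly all reads fall in the 20 to 30 nt range, mapping them directly to viral genomes with a short-read aligner is likely the more sensible route.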

Alternatively, you could look into tadpole.sh from the BBMap suite, which can be used for read extension and error correction.
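
As a rough sketch of what a read-extension run with tadpole.sh might look like: the file names here are placeholders, and the values for k, el (extend left), and er (extend right) are illustrative guesses rather than tested recommendations. The k-mer length has to be shorter than the reads, so the default is probably too long for small RNAs. The script prints the command and only runs it if tadpole.sh is on the PATH and the input exists.

```shell
#!/bin/sh
# Placeholder file names; replace with your own (no spaces in the paths).
reads="sample.fastq"
extended="extended.fastq"

# Assumed, untested values: k must be shorter than the ~22-30 nt reads;
# el/er cap how many bases may be added on each side.
k=17
extend_len=50

cmd="tadpole.sh in=$reads out=$extended mode=extend k=$k el=$extend_len er=$extend_len"
echo "$cmd"

# Execute only when the tool and the input file are actually available.
if command -v tadpole.sh >/dev/null 2>&1 && [ -f "$reads" ]; then
    $cmd
fi
```

Extension only works where overlapping k-mers support it, so with sparse small RNA coverage many reads may come back unchanged.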

I apologize for the late response. The data I have are small RNAs, mostly piRNAs and siRNAs, that can likely target viral elements (in invertebrates). Making the reads longer would help me BLAST my data against viral databases to see whether the small RNAs originate from any viral genomes, as I don't have a specific virus in mind. However, I will see if tadpole can help with the read extension. Thanks again.