3.9 years ago
Kumar
Hi,
I need to align MinION sequencing reads against a viral genome (TTV, about 3 kb long) to figure out how many mRNAs are produced, where the transcription start/stop sites are, and which regulatory elements the viral genome contains.
Please suggest pipelines and software for this. Should I use Bowtie or the GMAP mapper for this analysis?
Thank you!
Start with minimap2 to see what your data looks like. Then you may indeed need to use GMAP.
Sure! So minimap2 produces a .bam output file. How do I use this .bam for mRNA prediction and for finding transcription start/stop sites and regulatory elements?
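For reference, the alignment step suggested above could be sketched as follows. This is only a minimal sketch: it builds (but does not run) a minimap2 command line using the options the minimap2 documentation lists for Nanopore direct RNA reads (`-ax splice -uf -k14`). The function name and the file names `ttv.fa` and `reads.fastq` are placeholders, not anything from this thread. Note that `-a` makes minimap2 write SAM, so producing a sorted .bam would need a separate conversion step (for example with samtools).

```python
import shlex


def minimap2_direct_rna_cmd(reference_fasta, reads_fastq, threads=4):
    """Build (not run) a minimap2 command for Nanopore direct RNA reads.

    Hypothetical helper for illustration; the file names are supplied by the caller.
    """
    return [
        "minimap2",
        "-ax", "splice",    # -a: SAM output; -x splice: spliced alignment preset
        "-uf",              # only consider the forward transcript strand
        "-k14",             # smaller k-mer, suggested for noisy direct RNA reads
        "-t", str(threads),
        reference_fasta,
        reads_fastq,
    ]


if __name__ == "__main__":
    cmd = minimap2_direct_rna_cmd("ttv.fa", "reads.fastq")
    # Print a copy-pasteable shell command; redirect stdout to a .sam file when running it.
    print(shlex.join(cmd) + " > aln.sam")
```

Whether the splice preset is the right choice for a 3 kb viral genome is an assumption worth checking against the data once the reads are aligned.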
What kind of prep are you sequencing? Are you directly sequencing RNA?
Yes! The direct RNA sequencing approach. Total RNA was isolated using the Nucleospin RNA Kit (Macherey-Nagel) according to the manufacturer’s guidance.
I don't know whether you need to convert the U's to T's before aligning, but minimap2 may be smart enough to handle that. Until you align, you won't know what is happening with the data. Not every read is going to be full length, so you will need to look for read pileups to have strong evidence that transcripts are real/prevalent. Since you are sequencing RNA, no predictions should be needed.
Ok! Sounds good.
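If the U-to-T conversion does turn out to be necessary, it could be done with a few lines of Python. This is a sketch under the assumption that the FASTQ file uses plain four-line records; the function names are made up for illustration. Only the sequence line of each record is changed, because quality strings can legitimately contain the character `U`.

```python
def rna_to_dna(seq):
    """Replace U/u with T/t in a single sequence string."""
    return seq.translate(str.maketrans("Uu", "Tt"))


def convert_fastq_lines(lines):
    """Yield FASTQ lines with U->T applied to sequence lines only.

    Assumes strict four-line records: header, sequence, '+', quality.
    """
    for i, line in enumerate(lines):
        if i % 4 == 1:          # second line of each record is the sequence
            yield rna_to_dna(line)
        else:                   # header, separator, and quality stay untouched
            yield line


if __name__ == "__main__":
    record = ["@read1", "ACGUUGCA", "+", "UUUUIIII"]
    for out_line in convert_fastq_lines(record):
        print(out_line)
```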
I think I should pass the .sam file (output of minimap2 or GraphMap) to GATK, since I also want to call variants/SNPs, and then use IGV for visualization.
Tools designed for short-read sequencing don't always work well for long reads. So search for variant calling tools built for long reads, and don't just use GATK.
I found one, Longshot, for variant calling on long reads.
Do I need to do an assembly to find transcription start/stop sites and regulatory elements? I have .sam files generated by minimap2 and GraphMap. Do I need to use IGV to visualize the .sam files generated by GraphMap? Or is there another pipeline to identify regulatory elements?
You should always visualize the data to get an idea of what it looks like. Direct RNA sequencing is so new that you are blazing a new path instead of following a standard one. Remember that regulatory elements are not going to be directly present in this data. You are looking at the final product, i.e. the RNA, of those elements.
Can you give some steps for this analysis? I am not sure how to proceed from the alignment file generated by minimap2 to get transcription start/stop sites, etc.
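One possible way to follow up on the "look for read pileups" advice, offered as a rough sketch rather than an established pipeline: read the SAM file, work out where each primary alignment starts and ends on the genome, and count how many reads share the same 5' and 3' end positions. Positions where many read ends pile up are candidate transcription start and stop sites. The function names below are made up for illustration, and the code assumes standard SAM columns (FLAG in column 2, POS in column 4, CIGAR in column 6). Reads are often truncated, particularly at the 5' end, so the peaks should be treated as approximate and checked in IGV.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def reference_span(pos, cigar):
    """Return the 1-based inclusive (start, end) of an alignment on the reference."""
    length = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in REF_CONSUMING)
    return pos, pos + length - 1


def count_transcript_ends(sam_lines):
    """Count 5' and 3' alignment ends per (strand, position).

    Skips header lines and unmapped, secondary, and supplementary alignments.
    """
    five_prime, three_prime = Counter(), Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag, pos, cigar = int(fields[1]), int(fields[3]), fields[5]
        if flag & (0x4 | 0x100 | 0x800) or cigar == "*":
            continue
        left, right = reference_span(pos, cigar)
        if flag & 0x10:   # reverse strand: the transcript 5' end is the rightmost base
            five_prime[("-", right)] += 1
            three_prime[("-", left)] += 1
        else:
            five_prime[("+", left)] += 1
            three_prime[("+", right)] += 1
    return five_prime, three_prime


if __name__ == "__main__":
    with open("aln.sam") as handle:  # placeholder file name
        starts, stops = count_transcript_ends(handle)
    print("Most common 5' ends:", starts.most_common(5))
    print("Most common 3' ends:", stops.most_common(5))
```

Counting exact positions is the simplest choice; grouping ends that fall within a few bases of each other would probably be more robust for noisy long reads.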