Question

Mapping paired end

12 weeks ago

Sarah • 0

Hi, sorry if this is a very basic question!

I'm trying to analyse paired-end sequencing data using UMI-tools. I was instructed by my supervisor to first extract the UMIs from the R1 and R2 files separately. This gives me a Processed_R1_fastq.gz file, a Processed_R2_fastq.gz file and a processed.log file.

The UMI-tools guide doesn't really explain what to do next. With regard to paired-end reads, it just says: "After paired-end mapping, paired end deduplication can be achieved by adding the --paired option to the call to dedup".

My question is, how should I map the files? Do I map them both separately? What do I do after that?

Also, I wanted to check that I am using the correct files for genome indexing and alignment. I would like to use the hg19 genome. I downloaded hg19.fa.gz and hg19.2bit from https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/, unzipped them and placed them in my directory. Are these the correct files needed for mapping?

Thanks in advance!

dna-seq mapping umi umitools ngs • 403 views


I'm trying to analyse a paired end sequence using UMI-tools.

Are you certain your data has UMIs?

ADD REPLY • link 12 weeks ago by GenoMax 141k

Yes, I have been told there are UMIs as this data was successfully analysed by someone else previously.

ADD REPLY • link 12 weeks ago by Sarah • 0

score 0 · Answer 1 · 2024-01-31

12 weeks ago

colindaven 6.4k

Hi Sarah,

Please have a look at this basic tutorial, which covers these introductory concepts. You can follow it on Galaxy or on your own machine.

https://training.galaxyproject.org/training-material/topics/sequence-analysis/tutorials/mapping/tutorial.html
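For orientation, here is a minimal sketch of how the steps after UMI extraction could fit together on your own machine. It is not taken from the tutorial or the UMI-tools guide. It assumes BWA as the aligner and samtools for sorting and indexing (neither is named in the question), that both are installed alongside UMI-tools, and that read names still match between the two processed FASTQ files. The function names `build_commands` and `run_pipeline` and the `sample` output prefix are made up for illustration. The key point is that both mates go into one mapping call rather than being mapped separately, and that `--paired` is then passed to `umi_tools dedup`, as the quoted guide sentence describes.

```python
import subprocess


def build_commands(ref_fasta, r1, r2, prefix):
    """Return (argv, stdout_path) pairs for a paired-end map-then-dedup run.

    Assumed tools: bwa, samtools, umi_tools. stdout_path is None when the
    tool writes its own output file.
    """
    sam = f"{prefix}.sam"
    sorted_bam = f"{prefix}.sorted.bam"
    dedup_bam = f"{prefix}.dedup.bam"
    return [
        # Index the reference FASTA once (the .2bit file is not used here).
        (["bwa", "index", ref_fasta], None),
        # Map R1 and R2 together in a single call so the mates stay paired.
        (["bwa", "mem", ref_fasta, r1, r2], sam),
        # umi_tools dedup expects a coordinate-sorted, indexed BAM.
        (["samtools", "sort", "-o", sorted_bam, sam], None),
        (["samtools", "index", sorted_bam], None),
        # Paired-end deduplication, as in the UMI-tools guide quote.
        (["umi_tools", "dedup", "-I", sorted_bam, "--paired", "-S", dedup_bam], None),
    ]


def run_pipeline(commands):
    """Run each command in order, redirecting stdout to a file where requested."""
    for argv, stdout_path in commands:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


# File names follow the question; "sample" is an arbitrary output prefix.
commands = build_commands(
    "hg19.fa", "Processed_R1_fastq.gz", "Processed_R2_fastq.gz", "sample"
)
for argv, stdout_path in commands:
    print(" ".join(argv) + (f" > {stdout_path}" if stdout_path else ""))
```

Calling `run_pipeline(commands)` would execute the steps; the loop above only prints them so they can be checked first. An aligner such as BWA builds its index from the FASTA file, so `hg19.2bit` is not needed for this. Note also that the download URL in the question points to the hg38 directory, so it is worth confirming which assembly the file actually contains.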
