Question: How to align raw data to custom regions of the genome?
3 months ago, Mag.ds wrote:

I have a bed file of some custom regions I have interest in, e.g.

chr1 1000 1015 my_region 0 + ...

I believe these regions may be translated, but are not necessarily annotated genes. So I would like to map some reads to these regions to confirm my suspicions.

The dataset GSE109313 has 1 trillion reads from various human tissues.

Clicking on the run selector I can see the 37 SRR samples.

Using prefetch, I downloaded these files.

Often reads are mapped to a GTF file; e.g. the human GTF file can be found on Ensembl.

So now I have a BED file of regions I would like to map to, the human GTF file, and the ~400 GB of SRR files.

Where do I go from here?

Tags: rna-seq alignment

Map to the whole genome and then look at your regions of interest. Don't do it any other way, especially if the original data is from the entire genome.

written 3 months ago by genomax

So, for clarity: use bowtie with hg38, and then what do you suggest for looking at my regions of interest?

written 3 months ago by Mag.ds

You can use mosdepth or bedtools coverage to look at your regions of interest.
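
To make concrete what a per-region coverage summary contains, here is a minimal Python sketch. It is not how mosdepth or bedtools are implemented, and the alignment intervals are made-up toy data; in practice they would come from the BAM file produced by mapping the reads to the whole genome. The names `parse_bed_line` and `region_coverage` are hypothetical helpers introduced only for this illustration. BED coordinates are treated as 0-based and half-open, so the example region spans 15 bases.

```python
from collections import namedtuple

Region = namedtuple("Region", "chrom start end name")


def parse_bed_line(line):
    """Parse the first four BED columns (0-based, half-open coordinates)."""
    fields = line.split()
    return Region(fields[0], int(fields[1]), int(fields[2]), fields[3])


def region_coverage(region, alignments):
    """Return (number of overlapping reads, fraction of region bases covered).

    `alignments` is an iterable of (chrom, start, end) tuples,
    also 0-based and half-open.
    """
    length = region.end - region.start
    if length <= 0:
        return 0, 0.0
    depth = [0] * length
    n_reads = 0
    for chrom, start, end in alignments:
        if chrom != region.chrom:
            continue
        lo = max(start, region.start)
        hi = min(end, region.end)
        if lo >= hi:
            continue  # read does not overlap the region
        n_reads += 1
        for pos in range(lo, hi):
            depth[pos - region.start] += 1
    covered = sum(1 for d in depth if d > 0)
    return n_reads, covered / length


region = parse_bed_line("chr1 1000 1015 my_region 0 +")

# Toy alignments (assumed data, for illustration only).
toy_alignments = [
    ("chr1", 990, 1005),   # overlaps the first 5 bases of the region
    ("chr1", 1010, 1040),  # overlaps the last 5 bases of the region
    ("chr1", 2000, 2050),  # same chromosome, no overlap
    ("chr2", 1000, 1015),  # different chromosome
]

n_reads, fraction = region_coverage(region, toy_alignments)
print(region.name, n_reads, round(fraction, 3))
```

With the toy data, two reads overlap the region and 10 of its 15 bases are covered. The real tools do this efficiently on indexed BAM files and are the better choice for a dataset of this size.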

written 3 months ago by genomax
3 months ago, lieven.sterck (VIB, Ghent, Belgium) wrote:

As far as I know, reads are never mapped to a GTF (or GFF) file; they are always mapped to a FASTA file (or an indexed version of a FASTA file), i.e. a file with the actual sequence in it. The GTF/GFF files come into play when you want to link your mappings to annotated features (e.g. genes, promoters, lncRNA, ...).

The best you can do is to map all your reads to the human genome.

As an alternative, you could subselect the human genome to contain only the regions you might be interested in. Based on your BED file, you could do this with e.g. getfasta from the bedtools package. Do keep in mind that this has a huge potential to bias your analysis: reads that originate elsewhere in the genome may be forced onto your regions.
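
As a rough illustration of what subselecting the genome by a BED file means, the following Python sketch slices region sequences out of a FASTA record. It only mimics the idea behind bedtools getfasta and is not that tool's implementation; the sequence `chrT`, the two regions, and the helper names `read_fasta` and `extract_regions` are all made up for this example. Strand is ignored here, and coordinates are 0-based and half-open as in BED.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of {sequence_name: sequence}."""
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def extract_regions(genome, bed_lines):
    """Return {region_name: subsequence} for each BED line (strand ignored)."""
    out = {}
    for line in bed_lines:
        chrom, start, end, name = line.split()[:4]
        out[name] = genome[chrom][int(start):int(end)]
    return out


# Toy reference and regions (assumed data, for illustration only).
toy_fasta = ">chrT toy sequence\nACGTACGTAC\nGGGTTTAAAC\n"
toy_bed = [
    "chrT 2 8 region_a 0 +",
    "chrT 10 16 region_b 0 -",
]

genome = read_fasta(toy_fasta)
subseqs = extract_regions(genome, toy_bed)
for name, seq in subseqs.items():
    print(f">{name}\n{seq}")
```

A reference built this way contains nothing but the selected regions, which is exactly why mapping against it can mislead: the aligner has no better location to offer for reads that really come from somewhere else.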


Right, sorry for misstating. I am interested in linking my mappings to annotated features (e.g. my BED file), which is why I brought up the GTF file... How do I turn my BED file into a GTF file?

written 3 months ago by Mag.ds

No worries. Actually, there is no need to convert your BED file to GFF or GTF: genomax's suggestion above works directly on the BED file.

written 3 months ago by lieven.sterck