Get read depth from multiple BAM files
4.1 years ago

Hello everyone,

I need some help. I have panel sequencing data (107 genes) from 18000 individuals, already aligned to the hg38 reference genome (one BAM file per individual). I have to merge this data with WES data, but I do not have the target region information. Is there a way or tool to extract the regions from those 18000 BAM files into a single file that can then be used to pull the same regions out of the WES data?

I would really appreciate your help/suggestions.

Regards,

genome assembly next-gen • 1.4k views

way/tool to extract the regions/depth from those 18000 bam files into a single file

For the entire genome? Take a look at mosdepth.


Hi,

I actually wanted to get the regions out of those 18000 BAM files into a single BED file. Based on that file, I want to extract exactly the same regions from the WES data so that I can combine both data sets and do the downstream analysis.

An example BED file can be downloaded with the command below:

wget -nd biobank.ndph.ox.ac.uk/showcase/showcase/auxdata/GRCh38_alt_mapping_noCHR.sorted.merged.bed

I was wondering whether it is possible to produce something like this from those 18000 BAM files.

Regards,
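
One possible way to derive such a BED file, sketched here under assumptions rather than as a tested pipeline: take per-base depth in the three-column layout that `samtools depth` prints for a single BAM (chromosome, 1-based position, depth) and collapse consecutive positions at or above a threshold into intervals. The function name `depth_to_bed` and the default threshold of 20 are illustrative choices, not something given in this thread.

```shell
#!/usr/bin/env bash
# Collapse per-base depth (chrom <TAB> 1-based pos <TAB> depth) read from stdin
# into BED intervals where depth >= min_depth.
# Assumes input is sorted by chromosome, then position.
depth_to_bed() {
    local min_depth="${1:-20}"   # hypothetical default threshold
    awk -v min="$min_depth" 'BEGIN { OFS = "\t" }
        $3 >= min {
            # Extend the current run if this position is adjacent to it
            if ($1 == chrom && $2 == run_end + 1) { run_end = $2; next }
            # Otherwise flush the previous run (BED start is 0-based)
            if (chrom != "") print chrom, run_start - 1, run_end
            chrom = $1; run_start = $2; run_end = $2
        }
        END { if (chrom != "") print chrom, run_start - 1, run_end }'
}
```

In practice this might be called per sample as `samtools depth sample.bam | depth_to_bed 20 > sample.covered.bed`, after which the per-sample files could be concatenated, sorted, and passed through `bedtools merge` to get one BED file. With 18000 samples, a subset of BAMs is probably enough to recover the panel targets.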


Do you want the depth for each base or for each BED record?


If it is interval-wise then featureCounts can be a fast option. You will need to convert the BED file to SAF (BED starts are 0-based while SAF starts are 1-based, hence the $2+1), like

awk 'BEGIN {OFS="\t"} {print $1"_"$2"_"$3, $1, $2+1, $3, "."}' in.bed > out.saf

and then count reads over these regions like

featureCounts -a out.saf -T ${Cores} -F SAF -o out.counts *.bam

For per-basepair use mosdepth as genomac suggested.
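
As a rough follow-up sketch, the featureCounts table (a `#` comment line, a header row starting with `Geneid`, then count columns from column 7 onward, one per BAM) can be summarised to see how many samples have reads over each interval. The function name `intervals_covered` and the default cutoff of 10 reads are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Read a featureCounts table on stdin and print, per interval:
#   interval ID, number of samples with >= min_reads, total number of samples.
intervals_covered() {
    local min_reads="${1:-10}"   # hypothetical cutoff
    awk -v min="$min_reads" 'BEGIN { FS = OFS = "\t" }
        /^#/ { next }             # skip the program/command comment line
        $1 == "Geneid" { next }   # skip the column header
        {
            n = 0
            for (i = 7; i <= NF; i++) if ($i >= min) n++
            print $1, n, NF - 6
        }'
}
```

Usage would look like `intervals_covered 10 < out.counts`. Intervals with reads in nearly all panel samples are likely true targets, while intervals with reads in almost none are probably off-target.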


ATpoint: Hi, I want to produce coverage plots for multiple .bam files together against a single .bed target file. I browsed through the mosdepth manual but couldn't find any direct command to produce coverage plots across multiple samples. Do you suggest loading the outputs from the individual files into R, as shown in the blog post here? Or are there any tools that produce such plots directly?
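
Since mosdepth processes one BAM per run, one option is to run it per sample with `--by targets.bed` and then stack the resulting `<prefix>.regions.bed.gz` files into one long table for plotting. This is a minimal sketch: it assumes the mean depth is the last column of each regions file and that the sample name is the file prefix. `combine_region_depths` and `targets.bed` are hypothetical names.

```shell
#!/usr/bin/env bash
# Stack per-sample mosdepth region outputs into a long table:
#   region <TAB> sample <TAB> mean_depth
# Arguments: one or more <prefix>.regions.bed.gz files made with the same BED.
combine_region_depths() {
    local f sample
    for f in "$@"; do
        sample=$(basename "$f" .regions.bed.gz)   # sample name taken from the prefix
        gzip -dc "$f" |
            awk -v sample="$sample" 'BEGIN { OFS = "\t" }
                { print $1 ":" $2 "-" $3, sample, $NF }'   # $NF = mean depth (assumed last column)
    done
}
```

For example, `combine_region_depths *.regions.bed.gz > all_samples.depth.tsv` gives a long-format table that can be read into R and plotted with one line or box per sample.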


ijlal.hyder2012 : Please do not delete posts when they have received comments/answers.
