BAM to gene names
1
0
Entering edit mode
6.0 years ago
hbrwatkins • 0

I am very new to NGS analysis and need some help.

I have a BAM file which I have filtered to contain only mapped pairs of duplicates. I want to find out the gene names that my reads are covering. What should the next step in my workflow be?

Many, many thanks,

Heather Watkins

BAM targeted • 3.6k views
ADD COMMENT
1
Entering edit mode

Do you have a genome annotation (generally in the GTF or GFF3 formats)? If you do, you can count reads mapping to each gene using featureCounts.

ADD REPLY
0
Entering edit mode

Thank you. I mapped against hg38. Will this work?

ADD REPLY
0
Entering edit mode

Yes, that should be fine. Just download a GTF (preferably from the same source as your genome fasta) for featureCounts.

ADD REPLY
0
Entering edit mode

Thank you very much. I will give this a try.

ADD REPLY
0
Entering edit mode

Is your BAM file mapped against a reference? Is it one of the commonly available genomes?

ADD REPLY
0
Entering edit mode

I mapped it against hg38.

ADD REPLY
2
Entering edit mode
6.0 years ago

As you said in the comment, you got a hg38 reference genome, I can suggest you to download the gtf or gff of this genome ( https://www.gencodegenes.org/releases/current.html ). It will be usefull for the last part.

You can process your bam file with bedtools :

bedtools genomecov -bg -ibam your.bam > bedgraph.csv

That will give you a BedGraph output format (you got chromosomes, positions and even the coverage on position)

Then you can write a script (Python, Perl, whatever your want) that will do the following :

  • Foreach chromosome/position of your bedgraph.csv (you can even here filter out the low coverage hit)
  • Research this chromosome/position in your gtf file
  • Save the corresponding gene in a list if the gene does not exist yet
  • Out of the foreach save your list in a txt file

Maybe some tools already manage this task but here is my way to go.

or (as said in comments)

  • Use FeatureCounts on your bam with your gtf
  • Filter out genes with 0 counts
  • Use Biomart to get current gene names
ADD COMMENT
0
Entering edit mode

Thank you very much. I will give this a try.

ADD REPLY

Login before adding your answer.

Traffic: 3039 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6