Question: Get gene counts from *.sam file and annotation
0
gravatar for tim.ivanov.92
11 months ago by
tim.ivanov.9210 wrote:

I have a list of alignment (.sam/.bam formats) from RNAseq experiment and i want to get gene counts from them. The problem is that these alignments have chromosomes as reference, not genes:

NB501288_407_HGYJKBGX7:1:11101:22149:1027#GAGATTGTCAGT  0       chr8    141541926       37      70M     *       0       0       ATTTCNTATGACATTGGGTTCTCATACAGGTCTGCAGTTCAAAATCCAAGTTTGAAATCTGGGACGGAAG  AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEE  XT:A:U  NM:i:1  X0:i:1  X1:i:0  XM:i:1  XO:i:0  XG:i:0  MD:Z:5C64
NB501288_407_HGYJKBGX7:1:11101:6394:1028#GAGATTGTCAGT   0       chr1    33790402        37      70M     *       0       0       GTCTGNCTGGCTCTGCTCAGTCGGGAGGAGGCGCCCTGGGCCAGAATCCTGGCTGCCACCAGCCCTAGGA  AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEE/EEAEEEEEEEE/  XT:A:U  NM:i:1  X0:i:1  X1:i:0  XM:i:1  XO:i:0  XG:i:0  MD:Z:5T64
NB501288_407_HGYJKBGX7:1:11101:19502:1031#GAGATTGTCAGT  16      chr2    97474290        25      70M     *       0       0       CCCCTCCGACCGTTCCCCAGCACACCCCACCCCACTCAGCCGCTCAGCCTCCCTCAGTTACCCANACCGC  <EEEEEEAEAEEEEEEEEEAE/EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE#AAAAA  XT:A:U  NM:i:4  X0:i:1  X1:i:0  XM:i:4  XO:i:0  XG:i:0  MD:Z:3T0G1A57G5

Is there a tool from samtools package to intersect them with gtf annotation to get gene counts, not chromosome?

rna-seq samtools alignment • 521 views
ADD COMMENTlink modified 11 months ago • written 11 months ago by tim.ivanov.9210

featureCounts and htseq-count are standard software programs to get gene counts from SAM/BAM files.

ADD REPLYlink written 11 months ago by genomax71k

thats not really solving my problem, i first need to use something like bedtools intersect

ADD REPLYlink written 11 months ago by tim.ivanov.9210

To get the gene counts you need to annotate your aligned reads first (based on the coordinates in the 3rd and the 4th column!)

ADD REPLYlink written 11 months ago by reza.jabal330

yeah, thats what my goal is the beast i've found so far is bedtools intersect

ADD REPLYlink written 11 months ago by tim.ivanov.9210

You are on track!

1) Download a bed file containing the coordinates of all Gencode genes (You can download it from the UCSC table browser).

2) Convert your bam to bed:

samtools view -bq 20 -F 1796 "XXX.bam" | bamToBed -i stdin >  XXX.bed

3) In bedtools run:

bedtools intersect -a XXX.bed -b Genes.bed  -wa  >  intersect.bed

4) Get the total number of overlapped baits:

wc -l  intersect.bed
ADD REPLYlink modified 11 months ago • written 11 months ago by reza.jabal330
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2015 users visited in the last hour