Question: the number of reads that have mapped to the each gene
0
gravatar for F
3.9 years ago by
F3.4k
Iran
F3.4k wrote:

hey friends,

i have two acceped_hits.bam files one for rna-seq and another for ribo-seq produced by tophat2...today my adviser asked me how i can count the number of reads that have mapped to the each gene...i searched in biostar posts but i could not figure out yet...can you help me please?

thank you

rna-seq toptah2 bam ribo-seq • 1.9k views
ADD COMMENTlink modified 11 months ago by Biostar ♦♦ 20 • written 3.9 years ago by F3.4k
3

http://lmgtfy.com/?q=site%3Abiostars.org+number+reads+gene

ADD REPLYlink written 3.9 years ago by Pierre Lindenbaum121k

thank you Pierre for your complete guidance!! 

ADD REPLYlink written 3.9 years ago by F3.4k
1

You are going to have plenty of tools to do that (like this answer Extracting Read Count For Each Gene/Exon From Rna-Seq Bam Files). The real question is, what and how are you counting exactly? You can count only uniquely mapped reads, reads that fall entirely into exons of your genes or just overlap them etc... You may also want a way to normalize your values (say, using FPKM).

This gets messy: ask your PI what exact method you should use. Ideally, you might find an article on the subject you are researching whose Materials and Methods RNASeq section can be recycled - you will be able to make more rigorous comparisons of your finds and those of the articles.

At minimum, you will need an annotation file with the position of the features you are counting.

ADD REPLYlink written 3.9 years ago by cyril-cros890

thank you  cyril-cros

ADD REPLYlink written 3.9 years ago by F3.4k
2
gravatar for Alex Reynolds
3.9 years ago by
Alex Reynolds28k
Seattle, WA USA
Alex Reynolds28k wrote:

Via BEDOPS gff2bed (or gtf2bed, depending on the annotation format), bam2bed, and bedmap:

$ gff2bed < genes.gff > genes.bed
$ bam2bed < reads.bam | bedmap --echo --count genes.bed - > answer.bed
ADD COMMENTlink written 3.9 years ago by Alex Reynolds28k

sorry Alex,

i downloaded x86-64 (64-bit) binaries for fedora 64 bit, then with tar xvjf bedops_linux_x86_64-v2.4.14.tar.bz2

unzipped that, but there is no unzipped file...am i right for unzipping?

sorry again,

irecognized right now...i file named bin have been created there

thank you

ADD REPLYlink modified 3.9 years ago • written 3.9 years ago by F3.4k

sorry Alex,

i have a genes.gtf file with

[izadi@lbox161 ~]$ bash
[izadi@lbox161 ~]$ cd /usr/data/nfs6/izadi/angel/ouput/
[izadi@lbox161 ouput]$ export BED=/usr/data/nfs6/izadi/gij/bin/
[izadi@lbox161 ouput]$ echo $BED
/usr/data/nfs6/izadi/gij/bin/
[izadi@lbox161 ouput]$ $BED/gtf2bed genes.gtf > genes.bed
/usr/data/nfs6/izadi/gij/bin//gtf2bed: line 122: convert2bed: command not found
[izadi@lbox161 ouput]$

what is the reason please?

 

ADD REPLYlink modified 3.9 years ago • written 3.9 years ago by F3.4k
3

please,  learn from your previous experience: no happen when runing tophat in fedora

ADD REPLYlink written 3.9 years ago by Pierre Lindenbaum121k

ohhh Pierre please,

last night i tried too much
 

ADD REPLYlink written 3.9 years ago by F3.4k
1
$ sudo cp $BED/* /usr/local/bin

Or similar.

ADD REPLYlink written 3.9 years ago by Alex Reynolds28k

sorry Alex, what does this mean please??? i should copy what???

ADD REPLYlink written 3.9 years ago by F3.4k

Alex,

i tried

[izadi@lbox161 ouput]$ $BED/bam2bed accepted_hits.bam | bedmap --echo --count hgTables.bed - > answer.bed
/usr/data/nfs6/izadi/gij/bin//bam2bed: line 148: convert2bed: command not found
bash: bedmap: command not found...
[izadi@lbox161 ouput]$ sudo cp $BED/* /usr/local/bin

We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:

    #1) Respect the privacy of others.
    #2) Think before you type.
    #3) With great power comes great responsibility.

[sudo] password for izadi: 
[izadi@lbox161 ouput]$ $BED/gtf2bed genes.gtf > genes.bed
[izadi@lbox161 ouput]$ $BED/bam2bed accepted_hits.bam | bedmap --echo --count hgTables.bed - > answer.bed
/usr/data/nfs6/izadi/gij/bin//bam2bed: line 148: convert2bed: command not found
bash: bedmap: command not found...
[izadi@lbox161 ouput]$ $BED/bam2bed accepted_hits.bam | bedmap --echo --count hgTables.bed > answer.bed
/usr/data/nfs6/izadi/gij/bin//bam2bed: line 148: convert2bed: command not found
bash: bedmap: command not found...
[izadi@lbox161 ouput]$ 

what can i do please?

ADD REPLYlink written 3.9 years ago by F3.4k
1

After copying binaries and scripts to a folder that is already in your PATH (like /usr/local/bin), then run:

$ gff2bed < genes.gff > genes.bed
$ bam2bed < reads.bam | bedmap --echo --count genes.bed - > answer.bed

You do not need the prefix $BED/ so long as binaries and scripts are copied to a folder already in your PATH (which usually includes /usr/local/bin).

The PATH variable helps your command-line session find and run binaries that are in other folders. This is an important part of UNIX to learn, so it is worth spending a couple minutes to read about.

Replace genes.gff with your source of genes, if you are converting GFF-formatted annotations to BED.

If you don't need to convert GFF-formatted annotations to BED, then you can skip this first step and just use your BED file (whatever it is) directly.

For instance, if your BED file of genes is hgTables.bed, and your BAM file of reads is accepted_hits.bam, then you can run:

$ bam2bed < accepted_hits.bam | bedmap --echo --count hgTables.bed - > answer.bed

If you need other statistics than a read count, read the bedmap documentation for available options.

ADD REPLYlink modified 3.9 years ago • written 3.9 years ago by Alex Reynolds28k

thank you so much Alex,

i copied accepted_hits.bam and gtf.genes in binary file

then i typed

[izadi@lbox161 ~]$ bash
[izadi@lbox161 ~]$ cd /usr/data/nfs6/izadi/gij/bin/
[izadi@lbox161 bin]$ gtf2bed genes.gtf > genes.bed
Error: No input is specified; please redirect or pipe in formatted data
convert2bed
  version: 2.4.13
  author:  Alex Reynolds

 

ADD REPLYlink written 3.9 years ago by F3.4k
1
$ gtf2bed < foo.gtf > foo.bed

Note the use of the UNIX input and output redirection operators "<" and ">".

ADD REPLYlink modified 3.9 years ago • written 3.9 years ago by Alex Reynolds28k

thank you Alex

ADD REPLYlink written 3.9 years ago by F3.4k

thank youuuuuuuuuuuuuuuuuuuuu Alex, you are really clever do develope such a tool

i could get the result

this is few rows but all of columns of my bed file

may you please tell me where is the the number of reads per gene?

I    334    337    "YAL069W    .    +    protein_coding    start_codon    0    exon_number "1"; gene_id "YAL069W"; gene_name "YAL069W"; p_id "P3633"; transcript_id "YAL069W"; transcript_name "YAL069W"; tss_id "TSS1128";|0
I    334    646    "YAL069W    .    +    protein_coding    CDS    0    exon_number "1"; gene_id "YAL069W"; gene_name "YAL069W"; p_id "P3633"; protein_id "YAL069W"; transcript_id "YAL069W"; transcript_name "YAL069W";

thank you

ADD REPLYlink written 3.9 years ago by F3.4k
1

The count value is at the end of the string. In your first line, for example, you see "...;|0" That means there were zero reads over the gene annotation (exon). In other situations, you might see "...;|1", "...;|2" etc. for one, two or more reads over the annotation.

It may be easier to add --delim '\t' to the bedmap command:

$ bam2bed < reads.bam | bedmap --echo --count --delim '\t' genes.bed - > answer.bed

The count value is then in its own column at the end of the line.

ADD REPLYlink written 3.9 years ago by Alex Reynolds28k

Tnxxxxxxxxxxxxxxxxxx a lot briliant guy

ADD REPLYlink written 3.9 years ago by F3.4k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 831 users visited in the last hour