Question: featureCounts: 0% successfully assigned fragments on multiple PE .BAM files
0
gravatar for imesi
12 months ago by
imesi20
imesi20 wrote:

I have 9 files of paired RNA-seq reads from mice. I have aligned them with Rsubread (there is one output .BAM file for each paired-end reads). I am counting with featureCounts. The results show 0% assigned reads for all the files. I have no idea what I am doing wrong, any help will be appreciated. I am using bash as a wrapper to run R in a cluster, so I am not running R directly in the command line. I have searched for previous questions that addressed this and have not had any resolution. This is the code

all.bam <- list.files(path=".", pattern = ".BAM$" )
counts <- featureCounts(all.bam,annot.inbuilt="mm10",isPairedEnd=TRUE,strandSpecific=1,nthreads=16)

Here is the output.

==========     _____ _    _ ____  _____  ______          _____
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 1.28.1

//========================== featureCounts setting ===========================\\
||                                                                            ||
||             Input files : 9 BAM files                                      ||
||                           P Control-1.BAM                                  ||
||                           P Control-2.BAM                                  ||
||                           P Control-3.BAM                                  ||
||                           P IFNg-1.BAM                                     ||
||                           P IFNg-2.BAM                                     ||
||                           P IFNg-3.BAM                                     ||
||                           P M10.BAM                                        ||
||                           P M6.BAM                                         ||
||                           P M7.BAM                                         ||
||                                                                            ||
||      Dir for temp files : .                                                ||
||                 Threads : 16                                               ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||         Strand specific : stranded                                         ||
||      Multimapping reads : not counted                                      ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : counted                                          ||
||        Both ends mapped : not required                                     ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file /c1/apps/R/3.4.2/lib64/R/library/Rsubread/annot/m ... ||
||    Features : 222996                                                       ||
||    Meta-features : 27179                                                   ||
||    Chromosomes/contigs : 43                                                ||
||                                                                            ||
|| Process BAM file Control-1.BAM...                                          ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 169195669                                             ||
||    Successfully assigned fragments : 45937 (0.0%)                          ||
||    Running time : 0.75 minutes                                             ||
||                                                                            ||
|| Process BAM file Control-2.BAM...                                          ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 165739901                                             ||
||    Successfully assigned fragments : 45111 (0.0%)                          ||
||    Running time : 0.75 minutes                                             ||
||                                                                            ||
|| Process BAM file Control-3.BAM...                                          ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 131195114                                             ||
||    Successfully assigned fragments : 34758 (0.0%)                          ||
||    Running time : 0.64 minutes                                             ||
||                                                                            ||
|| Process BAM file IFNg-1.BAM...                                             ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 155071610                                             ||
||    Successfully assigned fragments : 40433 (0.0%)                          ||
||    Running time : 0.78 minutes                                             ||
||                                                                            ||
|| Process BAM file IFNg-2.BAM...                                             ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 157528629                                             ||
||    Successfully assigned fragments : 41355 (0.0%)                          ||
||    Running time : 0.76 minutes                                             ||
||                                                                            ||
|| Process BAM file IFNg-3.BAM...                                             ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 156642027                                             ||
||    Successfully assigned fragments : 41787 (0.0%)                          ||
||    Running time : 0.74 minutes                                             ||
||                                                                            ||
|| Process BAM file M10.BAM...                                                ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 167695503                                             ||
||    Successfully assigned fragments : 67108 (0.0%)                          ||
||    Running time : 0.81 minutes                                             ||
||                                                                            ||
|| Process BAM file M6.BAM...                                                 ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 160942341                                             ||
||    Successfully assigned fragments : 69735 (0.0%)                          ||
||    Running time : 0.78 minutes                                             ||
||                                                                            ||
|| Process BAM file M7.BAM...                                                 ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
rsubread featurecounts • 1.4k views
ADD COMMENTlink modified 12 months ago by genomax71k • written 12 months ago by imesi20

In this type of situations most likely culprit is that the chromosome names in your BAM files are not matching the annotation being used. I see that you are using the built-in annotation for featureCounts. I am not sure if that is from UCSC or NCBI/Ensembl genome builds. What does samtools view -H one_of_your.bam files show?

You may find it easier to use featureCounts outside R so you can use an annotation file from the same source where you obtained your genome sequence from.

ADD REPLYlink modified 12 months ago • written 12 months ago by genomax71k

Hi genomax, That was my first thought. Also, like you pointed out, the annotation file I used is built into featureCounts (mm10), so per the manual it is from NCBI annotations. Unfortunately I have Windows and Subread is not Windows-enabled. I have looked into how to work around it and found that you can by installing Cygwin........ I'll try your samtools suggestion. Many thanks.

Edited to add:

samtools view Control-1.BAM | head

GWNJ-0965:236:GW1806181178:1:1101:21684:1713    83      NC_000076.6     117061510       13                                                                              75M75S  =       117061332       -328    GCAGCTCTGCGGCTAAGACAGTAACAGAGGTAGTGCCATCACCAACTTCA                                                                              TCATCTTGAACCCTTGACATATCAACTAGAACCTTTGCTGCGGGGTTGTCCACACCAATGTTCTTGAGAATGGTAGCACCGTCGTTGGTC                                                                              ACCATCAGAN      JJFJFFJFJJJJJJJJJJJJJJJF<7JAJJJJ<A-JJJJJFJJJJJJJJJFFJFJFFJJJJJF7FFJJJJJJJJ                                                                              JJFJJJJJJJJFJJJJJJFJFAFAAFAAJJJJJJJFJ<FAFJJJFAJJJJJFJJJFJJJJJJ<JFJJJAFAAFAA#    HI:i:1  NH                                                                              :i:1    NM:i:1
ADD REPLYlink modified 12 months ago • written 12 months ago by imesi20

You probably want the "reverse" strand-specific setting. That's correct for most libraries these days.

ADD REPLYlink written 12 months ago by Devon Ryan91k

Hi Devon, Thanks for your input, but I already tried 2 standSpecific options (1 and 2). Here's the output for 2/reverse stranded:

        ==========     _____ _    _ ____  _____  ______          _____
        =====         / ____| |  | |  _ \|  __ \|  ____|   /\   |  __ \
          =====      | (___ | |  | | |_) | |__) | |__     /  \  | |  | |
            ====      \___ \| |  | |  _ <|  _  /|  __|   / /\ \ | |  | |
              ====    ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
        ==========   |_____/ \____/|____/|_|  \_\______/_/    \_\_____/
       Rsubread 1.28.1

//========================== featureCounts setting ===========================\\
||                                                                            ||
||                                                                            ||
||             Input files : 9 BAM files                                      ||
||                           P Control-1.BAM                                  ||
||                           P Control-2.BAM                                  ||
||                           P Control-3.BAM                                  ||
||                           P IFNg-1.BAM                                     ||
||                           P IFNg-2.BAM                                     ||
||                           P IFNg-3.BAM                                     ||
||                           P M10.BAM                                        ||
||                           P M6.BAM                                         ||
||                           P M7.BAM                                         ||
||                                                                            ||
||      Dir for temp files : .                                                ||
||                 Threads : 16                                               ||
||                   Level : meta-feature level                               ||
||              Paired-end : yes                                              ||
||         Strand specific : reversely stranded                               ||
||      Multimapping reads : not counted                                      ||
|| Multi-overlapping reads : not counted                                      ||
||   Min overlapping bases : 1                                                ||
||                                                                            ||
||          Chimeric reads : counted                                          ||
||        Both ends mapped : not required                                     ||
||                                                                            ||
\\===================== http://subread.sourceforge.net/ ======================//

//================================= Running ==================================\\
||                                                                            ||
|| Load annotation file /c1/apps/R/3.4.2/lib64/R/library/Rsubread/annot/m ... ||
||    Features : 222996                                                       ||
||    Meta-features : 27179                                                   ||
||    Chromosomes/contigs : 43                                                ||
||                                                                            ||
|| Process BAM file Control-1.BAM...                                          ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||
||    Total fragments : 169195669                                             ||
||    Successfully assigned fragments : 30128 (0.0%)                          ||
||    Running time : 0.76 minutes                                             ||
||                                                                            ||
|| Process BAM file Control-2.BAM...                                          ||
||    Paired-end reads are included.                                          ||
||    Assign fragments (read pairs) to features...                            ||

It is the same 0% assigned fragments. I have a suspicion of what the problem might be. I think I might have to update my RSubread version (1.28.1) to the current release (1.31.7). I'll effect that change and see what happens.

ADD REPLYlink modified 12 months ago • written 12 months ago by imesi20

@Devon Ryan, I have updated R and I still have the same issue.

ADD REPLYlink written 12 months ago by imesi20

Post one of the BAM files somewhere and give us a link to the GTF file you're using. Then one of us can have a look at why this is happening.

ADD REPLYlink written 12 months ago by Devon Ryan91k

Hi Devon,

Apologies for the delayed response. Could you please give some details about where you want me to post one of the BAM files? The GTF file I can give the link to, it is the NCBI mm10 which is built into featurecounts so all I had to do was jsut write it out per the manual.

ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/vertebrate_mammalian/Mus_musculus/latest_assembly_versions/GCF_000001635.26_GRCm38.p6 And thank you so much for your help.

ADD REPLYlink modified 12 months ago • written 12 months ago by imesi20

You can post the BAM file where ever you want (drop box, google drive, etc.). You'll need to post a link then (or email one to me).

ADD REPLYlink written 12 months ago by Devon Ryan91k

Great, thanks. I am uploading it to google drive right now. Could I please have your email?

ADD REPLYlink written 12 months ago by imesi20

dpryan79@gmail.com is the easiest given that it's google drive.

ADD REPLYlink written 12 months ago by Devon Ryan91k

Thank you Devon, I have sent it.

ADD REPLYlink written 12 months ago by imesi20
1
gravatar for Devon Ryan
12 months ago by
Devon Ryan91k
Freiburg, Germany
Devon Ryan91k wrote:

Ah, you actually aligned against the fasta file from NCBI, that seems to be the problem. FeatureCounts is able to convert between standard chromosome names, but since NCBI doesn't use anything standard it can't match the chromosomes you have and those in its built-in annotation (this uses UCSC chromosome names). As a rule, only ever use files from Ensembl (better) or UCSC (OK), never NCBI. It's unfortunate that the Rsubread documentation links to NCBI at all, that's a serious oversight on their part.

You might get lucky and Rsubread will accept the GFF file that you can download from NCBI, but if not I suggest you change the chromosome names to something more standard or realign against the genome from Ensembl/Gencode, or UCSC.

ADD COMMENTlink written 12 months ago by Devon Ryan91k

Thank you Devon. I used NCBI files for a couple of reasons. First is that my analyses workflow is built off of a paper where Rsubread/subread and NCBI references were used. Second, the Rsubread user guide advocates the use of NCBI references and at first first blush it seemed pretty nifty, especially with the built-in annotations. Now I know. I don't think Rsubread accepted the GFF file, that's the genesis of the 0% fragment count I believe. I will consider realigning with Ensembl. Do you have any idea as to how I change the chromosome names? Your time, patience and explanations are truly appreciated.

ADD REPLYlink written 12 months ago by imesi20
1

The easiest way to change the chromosome names of the BAM file is:

samtools view -H your_file.bam > header
...edit the header in your favourite text editor ...
samtools reheader header your_file.bam > fixed.bam

Or something along those lines. I don't have the chromosome name conversions stored anywhere for mm10 between NCBI and something standard, unfortunately, so you'll have to look that up :(

ADD REPLYlink written 12 months ago by Devon Ryan91k

Great, thank you again.

ADD REPLYlink written 12 months ago by imesi20
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1948 users visited in the last hour