Excessive Unassigned_FragmentLength in featurecounts

8 months ago

alex • 0

Hello,

I've run featureCounts and about 40% of my fragments end up as Unassigned_FragmentLength. My reads are paired-end 150 bp, aligned with STAR (~90-93% uniquely mapped), with an average input read length of ~299 for all my samples. Below are the featureCounts command I ran and its summary output.

featureCounts \
  -p --countReadPairs -B -P \
  -F "GTF" \
  -J \
  -C \
  -T 16 \
  -g gene_id \
  -t exon \
  -a $annotation_file \
  --extraAttributes "gene_type" \
  -o processing/counts/output_file_name \
  processing/mapping/star/*.bam

Summary output:


Assigned    14367477    13814907    15921754    14159227    13658284    14979822    15491455    13762178    15848496    13407749    14694807    13113942
Unassigned_Unmapped 0   0   0   0   0   0   0   0   0   0
Unassigned_Read_Type    0   0   0   0   0   0   0   0   0   0
Unassigned_Singleton    0   0   0   0   0   0   0   0   0   0
Unassigned_MappingQuality   0   0   0   0   0   0   0   0   0
Unassigned_Chimera  0   0   0   0   0   0   0   0   0   0
Unassigned_FragmentLength   11617180    10800353    12659685    11053441    11040810    9806741    12528930    11007146    11982978    10008445    11223062    9763096
Unassigned_Duplicate    0   0   0   0   0   0   0   0   0   0
Unassigned_MultiMapping 0   0   0   0   0   0   0   0   0   0

I know that Unassigned_FragmentLength can result from fragments being longer than 600 bp or shorter than 50 bp under the default settings, but based on the BAM summary that does not seem to be the case. What else could cause this, and how might I resolve it?
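One way to check this directly is to look at the template lengths stored in the BAM rather than at STAR's summary, since STAR's "average input read length" for paired-end data is the sum of the two mates' read lengths, not the fragment size. The sketch below tallies TLEN (SAM column 9) against the 50/600 thresholds. It assumes that the fragment length test applied with -P is based on the genomic span of the pair, which for spliced RNA-seq pairs would include introns; that assumption is not confirmed in this thread. The function name `tlen_summary` is made up for illustration.

```shell
# Tally TLEN (SAM column 9) against a min/max fragment length.
# Reads headerless SAM records on stdin.
# $1 = minimum length (default 50), $2 = maximum length (default 600).
tlen_summary() {
  awk -F'\t' -v min="${1:-50}" -v max="${2:-600}" '
    $9 > 0 {                       # positive TLEN only: counts each pair once
      n++
      if ($9 < min)      short++
      else if ($9 > max) long++
    }
    END {
      printf "pairs=%d below_min=%d above_max=%d\n", n, short, long
    }'
}
```

Assuming samtools is installed, it could be fed properly paired primary alignments like this (sample.bam is a placeholder file name): `samtools view -f 0x2 -F 0x900 processing/mapping/star/sample.bam | tlen_summary`. A large `above_max` count would point to pairs whose genomic span exceeds 600 bp, for example pairs spanning an intron.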

Please let me know if there is any other information that would be helpful to resolve this!

RNA-seq featurecounts • 476 views

updated 8 months ago by ATpoint 85k • written 8 months ago by alex • 0

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or use one of (a) the option highlighted in the image below or (b) fenced code blocks for multi-line code. Fenced code blocks are useful for syntax highlighting. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.

8 months ago by Ram 44k

Run it again without the -P flag. Does that then include the problematic alignments?
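As a sketch of that rerun, the wrapper below repeats the options from the original command with only -P dropped, so the fragment length check is not applied. The function name `count_without_P` and its argument order are made up for illustration; everything else is copied from the command in the question.

```shell
# Hypothetical wrapper: the original featureCounts options, minus -P.
# Usage: count_without_P <annotation.gtf> <output_file> <bam> [<bam> ...]
count_without_P() {
  fc_annotation=$1
  fc_output=$2
  shift 2                          # remaining arguments are the BAM files
  featureCounts \
    -p --countReadPairs -B \
    -F GTF \
    -J \
    -C \
    -T 16 \
    -g gene_id \
    -t exon \
    -a "$fc_annotation" \
    --extraAttributes gene_type \
    -o "$fc_output" \
    "$@"
}
```

Writing to a different output file than the first run keeps both `.summary` files, so the Unassigned_FragmentLength rows can be compared side by side.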

8 months ago by ATpoint 85k