Biased GC content distribution in miRNA Illumina sequencing data
2.0 years ago

Hi All,

I am analysing small RNA (miRNA) sequencing data from a human glioblastoma cell line and am going through the FastQC results for my samples. The adapters have already been trimmed. However, the GC content pattern looks very different from the theoretical distribution. The per base sequence quality also looks a bit odd, and I am not able to interpret it.

  1. Should I filter the reads with a quality control tool (such as NGS QC Toolkit or any other) before proceeding with the alignment? (Attached Image 1)
  2. How do I interpret the GC content pattern in this case? (Attached Image 3)
  3. The number of N's also increases after the 22nd base position. How should I handle this? (Attached Image 2)

I would like to understand whether I can use these samples for downstream analysis.

[Attached images: Per base sequence quality; Per base N content; Per base GC content]
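To look at the N and GC patterns outside of FastQC, something along these lines could be used. This is only a minimal sketch, assuming plain uncompressed 4-line FASTQ records; the function names (`parse_fastq`, `n_fraction_by_position`, `gc_percent`) are made up for illustration and are not part of FastQC or any other tool.

```python
from collections import Counter


def parse_fastq(lines):
    """Yield (header, sequence, quality) from an iterable of FASTQ lines.

    Assumes strict 4-line records (no wrapped sequences).
    """
    it = iter(lines)
    for header in it:
        seq = next(it).strip()
        next(it)  # '+' separator line, ignored
        qual = next(it).strip()
        yield header.strip(), seq, qual


def n_fraction_by_position(seqs):
    """Return a list where entry i is the fraction of reads with N at 0-based position i.

    Only reads that are long enough to cover position i are counted for that position.
    """
    totals = Counter()
    n_counts = Counter()
    for seq in seqs:
        for i, base in enumerate(seq):
            totals[i] += 1
            if base in "Nn":
                n_counts[i] += 1
    return [n_counts[i] / totals[i] for i in range(len(totals))]


def gc_percent(seq):
    """GC percentage of one read over called bases only; None if every base is N."""
    called = [b for b in seq.upper() if b in "ACGT"]
    if not called:
        return None
    return 100.0 * sum(b in "GC" for b in called) / len(called)
```

Reading the file would then be roughly `records = parse_fastq(open("sample.fastq"))` (the file name is a placeholder), passing the sequences to `n_fraction_by_position` and collecting `gc_percent` per read to compare against the FastQC plots.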

smallRNA fastqc analysis • 530 views

There is something seriously wrong with this dataset. There should not be this many N's or such poor quality. Did you get this data from a public source, or is this your own dataset?


Hi. This is our own dataset. I am really not sure why this happened. I agree, there is something wrong with this dataset. Can I try trimming the bases after 22 bp?
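What I have in mind is roughly the following. It is only a sketch of the idea, not a tested pipeline: the function name `trim_records`, the 15-base minimum length, and dropping every read that still contains an N after trimming are my own assumptions, and a dedicated trimming tool would normally be used for this step.

```python
def trim_records(records, keep=22, min_len=15):
    """Hard-trim reads to the first `keep` bases and filter the result.

    records: iterable of (header, sequence, quality) tuples.
    A read is dropped if, after trimming, it is shorter than `min_len`
    (assumed cutoff) or still contains an N.
    """
    for header, seq, qual in records:
        # Keep sequence and quality strings in sync when truncating.
        seq, qual = seq[:keep], qual[:keep]
        if len(seq) < min_len or "N" in seq.upper():
            continue
        yield header, seq, qual
```

With this, a read whose N's all fall after position 22 is kept in trimmed form, while a read with an N inside the first 22 bases is discarded. Trimming only hides the symptom, so it would not explain why the N's appear in the first place.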
