I ran FastQC on the raw reads of a SAGE-RNASeq experiment from the SRA database. Numerous over-represented sequences turned up, matching Illumina single-end adapters, paired-end PCR primers, index primers, etc. (see the block quote below).
I am unsure how to carry out QC on these reads. Kindly help.
Sequence  Count  Percentage  Possible Source
ATAATAAAGATTGCTCTCATCATAATCGTATGCCGTCTTCTGCTTG  73871  1.5328089718844269  Illumina Single End Adapter 2 (95% over 22bp)
TAATAATTGGAACTTTACATCATAATCGTATGCCGTCTTCTGCTTG  45681  0.9478719205730599  Illumina Single End Adapter 1 (95% over 22bp)
AAATATAATTTCTTCATCATCATAATCGTATGCCGTCTTCTGCTTG  42720  0.8864317428883149  Illumina Single End Adapter 2 (95% over 22bp)
CTAATTGTGATATAAATCATCATAATCGTATGCCGTCTTCTGCTTG  35652  0.7397721090228044  Illumina Single End Adapter 1 (95% over 22bp)
AGCACCAATAAATAACTCCATCATAATCGTATGCCGTCTTCTGCTT  27850  0.5778821170280799  Illumina Paired End PCR Primer 2 (95% over 21bp)
TACCCCGGTATCGCCGACCATCATAATCGTATGCCGTCTTCTGCTT  27671  0.5741679016259963  Illumina Single End Adapter 1 (95% over 21bp)
TTTACACGTGATGTAATCATCATAATCGTATGCCGTCTTCTGCTTG  27570  0.5720721711477258  Illumina Single End Adapter 1 (95% over 22bp)
AGGTGATGCTAAACATCCATCATAATCGTATGCCGTCTTCTGCTTG  26864  0.5574228076065472  Illumina Single End Adapter 1 (95% over 22bp)
ATGGCGAGTGTGTTTCTCATCATAATCGTATGCCGTCTTCTGCTTG  25399  0.5270243407682658  Illumina Single End Adapter 2 (95% over 22bp)
GGCAGGTGATCTACACGCCATCATAATCGTATGCCGTCTTCTGCTT  22834  0.47380108654287884  Illumina Single End Adapter 1 (95% over 21bp)
AGGTGATGCTAAACATCACATCATAATCGTATGCCGTCTTCTGCTT  22822  0.4735520888622923  Illumina Single End Adapter 2 (95% over 21bp)
GGTACACTCAAGAAGGATCATCATAATCGTATGCCGTCTTCTGCTT  22481  0.4664764047722895  Illumina Single End Adapter 1 (95% over 21bp)
GGTACGAAATGGAAGGCCATCATAATCGTATGCCGTCTTCTGCTTG  21503  0.4461830938044812  Illumina Single End Adapter 1 (95% over 22bp)
CAAAAGAAACTTAAAATCATCATAATCGTATGCCGTCTTCTGCTTG  21394  0.4439213648724862  Illumina Single End Adapter 1 (95% over 22bp)
GCGTAGAAGACATCACAACATCATAATCGTATGCCGTCTTCTGCTT  20473  0.4248107928874642  Illumina Single End Adapter 2 (95% over 21bp)
GTGGTATTTATTTTCGACATCATAATCGTATGCCGTCTTCTGCTTG  19728  0.40935218688437913  Illumina Single End Adapter 2 (95% over 22bp)
TAAAAATGGAAAAAAAACATCATAATCGTATGCCGTCTTCTGCTTG  19107  0.3964665569140224  Illumina Single End Adapter 1 (95% over 22bp)
CACAAAGACAATAAAGTTCATCATAATCGTATGCCGTCTTCTGCTT  18616  0.3862784018166871  Illumina Single End Adapter 1 (95% over 21bp)
CCTGTACAGAAACAAACCATCATAATCGTATGCCGTCTTCTGCTTG  18389  0.38156819569225714  Illumina Single End Adapter 2 (95% over 22bp)
GTAAATATTTTCAAGGTCATCATAATCGTATGCCGTCTTCTGCTTG  17817  0.3696993062509623  Illumina Paired End PCR Primer 2 (95% over 22bp)
CACGCAAATTCGGACCCCATCATAATCGTATGCCGTCTTCTGCTTG  17242  0.3577681673895208  Illumina Single End Adapter 1 (95% over 22bp)
TTGGCTTCACCAACAACCATCATAATCGTATGCCGTCTTCTGCTTG  16580  0.3440317953438264  Illumina PCR Primer Index 1 (95% over 22bp)
TAATGATAGAAAAAATACATCATAATCGTATGCCGTCTTCTGCTTG  16508  0.34253780926030675  Illumina Paired End PCR Primer 2 (95% over 22bp)
CTAATTGTGATATAAATACATCATAATCGTATGCCGTCTTCTGCTT  16443  0.34118907182379593  Illumina Paired End PCR Primer 2 (95% over 21bp)
TATTGTAGCTAGTCATCTCATCATAATCGTATGCCGTCTTCTGCTT  16357  0.3394045884462586  Illumina Single End Adapter 2 (95% over 21bp)
AACAATTCCTTCATTAGCATCATAATCGTATGCCGTCTTCTGCTTG  15637  0.32446472761106226  Illumina Paired End PCR Primer 2 (95% over 22bp)
I suggest you trim adapters as indicated at the top of this thread.
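
To make the idea of 3' adapter trimming concrete, here is a minimal Python sketch. It is only an illustration, not a replacement for a dedicated trimmer. The ADAPTER constant is an assumption: it is simply the 3' stretch that every over-represented sequence listed in the question has in common, so check the library prep documentation for the real adapter before relying on it. The function names, min_overlap and min_length are hypothetical choices made for this example, and the matching is exact (no mismatches allowed).

```python
import io

# Assumption: the 3' stretch shared by all over-represented sequences in the
# question. Replace it with the adapter documented for your library prep.
ADAPTER = "CATCATAATCGTATGCCGTCTTCTGCTT"


def trim_3prime_adapter(seq, qual, adapter=ADAPTER, min_overlap=8):
    """Cut the adapter, and everything after it, from a read and its qualities."""
    # Case 1: the complete adapter occurs somewhere in the read.
    pos = seq.find(adapter)
    if pos == -1:
        # Case 2: the read ends inside the adapter, so look for the longest
        # adapter prefix (at least min_overlap bases) that is a read suffix.
        longest = min(len(adapter), len(seq)) - 1
        for k in range(longest, min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                pos = len(seq) - k
                break
    if pos == -1:
        return seq, qual  # no adapter found, read left unchanged
    return seq[:pos], qual[:pos]


def trim_fastq(in_handle, out_handle, adapter=ADAPTER, min_length=15):
    """Trim every 4-line FASTQ record; drop reads shorter than min_length."""
    kept = 0
    while True:
        header = in_handle.readline().rstrip("\n")
        if not header:
            break  # end of input
        seq = in_handle.readline().rstrip("\n")
        plus = in_handle.readline().rstrip("\n")
        qual = in_handle.readline().rstrip("\n")
        seq, qual = trim_3prime_adapter(seq, qual, adapter)
        if len(seq) >= min_length:
            out_handle.write(f"{header}\n{seq}\n{plus}\n{qual}\n")
            kept += 1
    return kept


if __name__ == "__main__":
    # Demo on the most frequent sequence from the question (made-up qualities).
    read = "ATAATAAAGATTGCTCTCATCATAATCGTATGCCGTCTTCTGCTTG"
    fastq = io.StringIO(f"@read1\n{read}\n+\n{'I' * len(read)}\n")
    out = io.StringIO()
    trim_fastq(fastq, out)
    print(out.getvalue(), end="")
```

For real data, an established trimmer is the safer choice, since it tolerates sequencing errors in the adapter and can also trim by quality. With cutadapt, for example, the 3' adapter is passed with -a and the output file with -o. Re-running FastQC on the trimmed reads shows whether the over-represented adapter sequences are gone.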