How to know if the sample from SRA is trimmed or un-trimmed
7 weeks ago
FadyNabil ▴ 10

I am searching for a human mRNA sample in the SRA database that is untrimmed, but I do not know how to check whether a sample has been trimmed or not.

FASTQC NGS SRA FASTQ NCBI • 339 views

As I understand it, we submit raw data (not trimmed data) to the NCBI GEO database, together with md5sum information. Still, you can run a quality check to see whether the data have been trimmed.

7 weeks ago
sc-ruzafa ▴ 10

You can use FastQC to check whether the sequences have been trimmed or whether you still need to remove adapters, etc.


FastQC can help. If the data are untrimmed, all reads will be reported at full length, matching the read length reported for the sequencing run. After trimming, reads will generally show a distribution of lengths in the FastQC sequence length distribution plot, since not all of them remain full length.

Note: It is possible that the data contain NO extraneous sequence, in which case all reads would remain full length even after trimming.
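If you want a quick look at read lengths without running FastQC, something like the following could work. It is only a minimal sketch: it assumes a standard 4-line-per-record FASTQ (as typically produced when dumping from SRA), and the function names `open_fastq`, `read_length_counts` and `looks_trimmed` are made up for this example. Keep the note above in mind: a single read length does not prove that the data are untrimmed.

```python
import gzip
from collections import Counter
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def read_length_counts(lines, max_reads=100_000):
    """Count read lengths from an iterable of FASTQ lines.

    Assumes 4 lines per record, so the sequence is every 4th line
    starting from the 2nd one. Only the first max_reads are sampled.
    """
    seq_lines = islice(lines, 1, None, 4)
    return Counter(len(s.rstrip("\r\n")) for s in islice(seq_lines, max_reads))


def looks_trimmed(counts):
    """Heuristic only: more than one distinct read length hints at trimming."""
    return len(counts) > 1


# Tiny in-memory example: two reads of different length.
example = ["@r1", "ACGTACGT", "+", "IIIIIIII",
           "@r2", "ACGTA", "+", "IIIII"]
counts = read_length_counts(example)
for length, n in sorted(counts.items()):
    print(f"{length}\t{n}")
print("variable read lengths:", looks_trimmed(counts))
```

For a real file you would pass the handle instead of the list, e.g. `with open_fastq("sample.fastq.gz") as fh: counts = read_length_counts(fh)` (the file name is a placeholder).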

7 weeks ago
ATpoint 52k

Basically, you can't. While it is convention (afaik) to upload the raw data as they come from demultiplexing, the uploaded data are simply whatever the authors uploaded, and in theory that can be anything. There is no bullet-proof way to know besides emailing them.

That said, trimming usually results in unequal read lengths within the files (adapter-containing reads get trimmed, the others stay full length), so this is something you can check. In the end, though, it does not really matter, does it? If you want to use a public dataset, you have to work with what is provided. Good QC should always start with something like FastQC to assess whether adapter or quality trimming is necessary, so you have to do this anyway, regardless of how the uploader treated the data beforehand.
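As a rough complement to the adapter check mentioned above, you could count how many reads still contain an adapter sequence. This is a sketch, not a substitute for FastQC: it assumes 4-line FASTQ records and uses the 13-base start of the common Illumina adapter (`AGATCGGAAGAGC`) as the default. The library in question may have used a different adapter, and `adapter_fraction` is a hypothetical helper name.

```python
from itertools import islice

# Assumption: the dataset used the common Illumina adapter; other kits differ.
ILLUMINA_ADAPTER_PREFIX = "AGATCGGAAGAGC"


def adapter_fraction(lines, adapter=ILLUMINA_ADAPTER_PREFIX, max_reads=100_000):
    """Return the fraction of sampled reads containing the adapter string.

    `lines` is any iterable of FASTQ lines (4 lines per record assumed).
    Only exact matches are counted, so partial adapters at the very
    3' end of a read are missed.
    """
    total = 0
    hits = 0
    for seq in islice(islice(lines, 1, None, 4), max_reads):
        total += 1
        if adapter in seq:
            hits += 1
    return hits / total if total else 0.0


# In-memory example: one read with adapter read-through, one without.
example = ["@r1", "ACGTACGTAGATCGGAAGAGCACAC", "+", "I" * 25,
           "@r2", "ACGTACGTACGTACGTACGTACGTA", "+", "I" * 25]
print(f"reads with adapter: {adapter_fraction(example):.1%}")
```

A clearly non-zero fraction suggests that adapters were not removed. A fraction near zero is compatible with either trimmed data or a library whose inserts are longer than the read length.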
