Characteristic features of ChIP-seq and RNA-seq data
7.4 years ago
cl10101 ▴ 80

What are the characteristic features of ChIP-seq and RNA-seq data? If I have fastq files that are the results of ChIP-seq and RNA-seq experiments, is it possible to differentiate them, for example by comparing them to a ChIP-seq input result, which is explicitly marked?

RNA-Seq ChIP-Seq sequencing • 2.7k views

While it may be possible to differentiate the data, not having clear information about what is what just seems like bad experimental practice. If someone gave you this data, then you should go back and get additional information from them. If you analyze the data as is and it turns out that some of your assumptions were wrong, then you will get blamed for the fallout.


You are totally right. I interpreted the question as a theoretical one, but if you really end up in a situation where you don't know what your data is, then guessing the data type just by looking at it is rather desperate.

Moreover, there is more to know than just the distinction between ChIP-seq and RNA-seq, such as the library preparation used, the origin of the samples, whether there was some kind of selection (e.g. ribodepletion), etc. All of this is important for interpreting the data.


Let us hope the question was indeed theoretical :)

Your answer below gives good hints on how to distinguish the samples (in theory) if cl10101 has no other option but to press on.

7.4 years ago

RNA-seq: (in eukaryotes) splicing (some reads must be split to map), very uneven read coverage, especially in total RNA-seq, where rRNA reads dominate.
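To make the splicing hint concrete, here is a minimal sketch that estimates the fraction of mapped reads with a split alignment, assuming the reads were mapped with a splice-aware aligner and the alignments are available as SAM text. In the SAM format, a skipped region (typically an intron) shows up as an `N` operation in the CIGAR string. The function name `spliced_fraction` is made up for illustration, and any cut-off for calling a sample "RNA-seq" would be your own assumption.

```python
def spliced_fraction(sam_lines):
    """Fraction of primary mapped reads whose CIGAR contains an N (skipped region)."""
    mapped = 0
    spliced = 0
    for line in sam_lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:               # not a valid alignment record
            continue
        flag = int(fields[1])
        cigar = fields[5]
        if flag & 0x4 or cigar == "*":    # unmapped read
            continue
        if flag & 0x900:                  # secondary or supplementary alignment
            continue
        mapped += 1
        if "N" in cigar:                  # split (spliced) alignment
            spliced += 1
    return spliced / mapped if mapped else 0.0


# Tiny made-up example: one spliced read, two unspliced, one unmapped.
example_sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t20M500N30M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t200\t60\t50M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r4\t16\tchr1\t300\t60\t50M\t*\t0\t0\tACGT\tIIII",
]
print(spliced_fraction(example_sam))
```

A ChIP-seq sample should give a value close to zero, while a eukaryotic RNA-seq sample should show a clearly non-zero fraction. If a non-spliced aligner was used, junction reads tend to be clipped or left unmapped instead, so this check would not work.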

ChIP-seq: relatively even coverage in the input fraction.
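One rough way to put a number on "even" versus "uneven" coverage is the coefficient of variation (standard deviation divided by mean) of read counts in fixed-size bins along a chromosome. The sketch below assumes you already have 0-based read start positions for one chromosome; the function name `binned_coverage_cv` and the default bin size are illustrative choices, not an established method.

```python
from statistics import mean, pstdev


def binned_coverage_cv(positions, chrom_length, bin_size=1000):
    """Coefficient of variation of read counts per bin (0.0 means perfectly even)."""
    n_bins = (chrom_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in positions:
        idx = pos // bin_size
        if 0 <= idx < n_bins:             # ignore positions outside the chromosome
            counts[idx] += 1
    avg = mean(counts) if counts else 0.0
    return pstdev(counts) / avg if avg else 0.0


# Made-up positions: one read per bin versus all reads piled into one bin.
even_cv = binned_coverage_cv([0, 1000, 2000, 3000], chrom_length=4000)
uneven_cv = binned_coverage_cv([0, 1, 2, 3], chrom_length=4000)
print(even_cv, uneven_cv)
```

Comparing this value between an unknown sample and the explicitly marked ChIP-seq input gives a relative measure only: a sample that is much more uneven than the input is more likely RNA-seq or a ChIP fraction with strong peaks.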

Checking genome coverage requires mapping, but over-represented sequences can be analysed with FastQC and BLAST to get a quick indication directly from the fastq files: if there are over-represented sequences that correspond to highly expressed genes, then you are dealing with RNA-seq data.
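If you want to see the idea without running FastQC, the following is a simplified sketch of counting over-represented sequences straight from FASTQ lines. It assumes plain four-line FASTQ records and counts exact full-length matches only, which is cruder than what FastQC does; the function name and the `min_fraction` threshold are arbitrary choices for illustration.

```python
from collections import Counter


def overrepresented_sequences(fastq_lines, min_fraction=0.01, top=10):
    """Return (sequence, fraction) pairs for the most frequent read sequences."""
    counts = Counter()
    total = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:                    # second line of each record is the sequence
            counts[line.strip()] += 1
            total += 1
    if total == 0:
        return []
    return [
        (seq, n / total)
        for seq, n in counts.most_common(top)
        if n / total >= min_fraction
    ]


# Made-up records; for a real file, pass an open handle instead,
# e.g. gzip.open(path, "rt") for a gzipped FASTQ.
example_fastq = [
    "@r1", "AAAA", "+", "IIII",
    "@r2", "AAAA", "+", "IIII",
    "@r3", "CCCC", "+", "IIII",
]
hits = overrepresented_sequences(example_fastq, min_fraction=0.5)
print(hits)
```

The sequences returned can then be pasted into BLAST; hits against rRNA or highly expressed transcripts point towards RNA-seq.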


Thank you for your response. I mapped my fastq files to the reference genome and am now trying to differentiate them visually using IGV. [Image: Samples mapped to genome] It seems to me that sample A (upper sample) has the most uneven coverage, but the peak locations do not correspond to gene locations (it is mRNA-seq data). What is the best way to differentiate them?


"but the peak locations do not correspond to gene locations (it is mRNA-seq data)"

Well, a characteristic feature of mRNA-seq data is that "peaks" correspond to genes. You say that it is mRNA-seq data, but are you sure about this? What do you really want to achieve here? It sounds like an XY problem...
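Rather than judging by eye in IGV, the "peaks correspond to genes" criterion can be checked by counting how many reads start inside annotated genes. This is a minimal sketch under strong assumptions: a single chromosome, 0-based read start positions, and gene intervals given as non-overlapping half-open `(start, end)` pairs. The function name `fraction_in_genes` is hypothetical.

```python
from bisect import bisect_right


def fraction_in_genes(read_positions, gene_intervals):
    """Fraction of read positions falling inside non-overlapping (start, end) intervals."""
    genes = sorted(gene_intervals)
    starts = [start for start, _ in genes]
    inside = 0
    for pos in read_positions:
        i = bisect_right(starts, pos) - 1  # last gene starting at or before pos
        if i >= 0 and pos < genes[i][1]:   # end coordinate is exclusive
            inside += 1
    return inside / len(read_positions) if read_positions else 0.0


# Made-up coordinates: three of five reads fall inside a gene.
example_genes = [(100, 200), (500, 600)]
example_reads = [150, 199, 200, 550, 50]
print(fraction_in_genes(example_reads, example_genes))
```

For mRNA-seq you would expect most reads to fall within annotated genes, whereas for a ChIP-seq input the fraction should be roughly what the genic share of the genome predicts.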
