Question: Characteristic features of ChIP-seq and RNA-seq data
cl10101 wrote (2.4 years ago):

What are the characteristic features of ChIP-seq and RNA-seq data? If I have fastq files that come from ChIP-seq and RNA-seq experiments, is it possible to tell them apart, for example by comparing them to the ChIP-seq input sample, which is explicitly marked?

Tags: sequencing, rna-seq, chip-seq

While it may be possible to differentiate the data, not having clear information about what is what just seems like bad experimental practice. If someone gave you this data, you should go back and get additional information from them. If you analyze the data as is and some of your assumptions turn out to be wrong, you will get blamed for the fallout.

written 2.4 years ago by genomax

You are totally right. I interpreted the question as a theoretical one, but if you really end up in a situation where you don't know what your data is, then guessing the data type just by looking at it is rather desperate.

Moreover, there is more to know than just the distinction between ChIP-seq and RNA-seq, such as the library preparation used, the origin of the samples, and whether there was some kind of selection (e.g. ribodepletion). All of this is important for interpreting the data.

written 2.4 years ago by Carlo Yague

Let us hope the question was indeed theoretical :)

Your answer below gives good hints of how to distinguish the samples (in theory) if cl10101 has no other option but to press on.

written 2.4 years ago by genomax

Carlo Yague (Belgium) wrote, 2.4 years ago:

RNA-seq: (in eukaryotes) splicing, so some reads must be split across exon junctions to map; very uneven read coverage, especially in total RNA-seq, where rRNA reads dominate.

ChIP-seq: relatively even coverage in the input fraction; in the immunoprecipitated fraction, reads pile up into peaks at the bound regions.
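
To make the splicing criterion concrete, here is a minimal sketch that counts the fraction of mapped reads whose CIGAR string contains an `N` operation (a skipped reference region, which is how SAM records a read split across an intron). It assumes the reads were mapped with a splice-aware aligner and exported as SAM text; the function name `spliced_fraction` is hypothetical, and the result is only a hint, not a classifier.

```python
import re

# A CIGAR string is a series of <length><operation> pairs.
# 'N' marks a skipped region on the reference (typically an intron).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def spliced_fraction(sam_lines):
    """Return the fraction of primary mapped reads with an N in their CIGAR."""
    mapped = 0
    spliced = 0
    for line in sam_lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag = int(fields[1])
        cigar = fields[5]
        if flag & 0x4 or cigar == "*":    # unmapped read
            continue
        if flag & 0x900:                  # secondary or supplementary alignment
            continue
        mapped += 1
        if any(op == "N" for _, op in CIGAR_OP.findall(cigar)):
            spliced += 1
    return spliced / mapped if mapped else 0.0
```

A value close to zero for every sample may simply mean the aligner was not splice-aware, so this is only informative when all samples were mapped the same way.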

Checking genome coverage requires mapping, but over-represented sequences can be analysed with FastQC and BLAST to get a quick indication directly from the fastq files: if there are over-represented sequences that correspond to highly expressed genes (or to rRNA), then you are dealing with RNA-seq data.
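
As a rough stand-in for the over-represented-sequences check, the sketch below tallies the most frequent read sequences in the first records of a fastq file; the top hits could then be pasted into BLAST. It is not how FastQC computes its report. It assumes plain 4-line fastq records, and the function name `top_sequences` and the file name in the usage note are hypothetical.

```python
from collections import Counter
from itertools import islice

def top_sequences(fastq_lines, max_reads=100_000, top_n=5):
    """Return (sequence, fraction) pairs for the most frequent reads.

    Only the first max_reads records are examined. Assumes each record
    spans exactly 4 lines (header, sequence, '+', qualities).
    """
    counts = Counter()
    total = 0
    for i, line in enumerate(islice(fastq_lines, 4 * max_reads)):
        if i % 4 == 1:                    # second line of a record is the sequence
            counts[line.strip()] += 1
            total += 1
    return [(seq, n / total) for seq, n in counts.most_common(top_n)]
```

Usage would look something like `with gzip.open("sample.fastq.gz", "rt") as fh: print(top_sequences(fh))`. A handful of sequences each taking up a noticeable share of the reads is what one would expect from RNA-seq, but adapter dimers and PCR duplicates can produce the same picture, so the hits still need to be identified.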


Thank you for your response. I mapped my fastq files to the reference genome and now I am trying to differentiate them visually using IGV. [Image: samples mapped to genome] It seems to me that sample A (the upper sample) has the most uneven coverage, but the peak locations do not correspond to gene locations (it is mRNA-seq data). What is the best way to differentiate them?

written 2.4 years ago by cl10101

but peaks location do not correspond to genes locations (it is mRNA-seq data)

Well, a characteristic feature of mRNA-seq data is that "peaks" correspond to genes. You say that it is mRNA-seq data, but are you sure about this? What do you really want to achieve here? It sounds like an XY problem...
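
One way to put a number on "peaks correspond to genes", rather than eyeballing it in IGV, is to compute the fraction of mapped read positions that fall inside annotated gene intervals. This is a minimal sketch under stated assumptions: read positions and gene coordinates have already been extracted (for instance from a BAM file and a GTF file), intervals are half-open `[start, end)`, and the names `merge_intervals` and `fraction_in_genes` are hypothetical.

```python
from bisect import bisect_right

def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def fraction_in_genes(read_positions, genes):
    """Fraction of reads whose position lies inside any gene interval.

    read_positions: iterable of (chrom, pos) tuples.
    genes: dict mapping chrom -> list of half-open (start, end) intervals.
    """
    merged = {chrom: merge_intervals(iv) for chrom, iv in genes.items()}
    starts = {chrom: [s for s, _ in iv] for chrom, iv in merged.items()}
    inside = 0
    total = 0
    for chrom, pos in read_positions:
        total += 1
        intervals = merged.get(chrom)
        if not intervals:
            continue
        # Last interval starting at or before pos, if any.
        i = bisect_right(starts[chrom], pos) - 1
        if i >= 0 and pos < intervals[i][1]:
            inside += 1
    return inside / total if total else 0.0
```

The number only means something relative to the share of the genome that the gene intervals cover: an input-like sample should land near that share, whereas mRNA-seq would be expected to sit well above it. No cut-off is implied here.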

written 2.4 years ago by Carlo Yague