How to discover condition-dependent SNPs in transcriptomic data where each gene has several isoforms?
7.3 years ago
Farbod ★ 3.4k

Dear Biostars, Hi.

I have asked a similar question before but did not receive an acceptable answer or guideline, so I am asking it again in other words:

Imagine that I have done a de novo transcriptome assembly from RNA-seq data using Trinity (no reference genome).

I have 3 biological replicates for condition 1 and 3 for condition 2 (I ran the standard pipeline using RSEM and DESeq2 for DEG analysis).

Questions:

1- Is it practical to discover condition-dependent SNPs from this kind of RNA-seq project (non-model animals)?

2- If the answer to 1 is yes, how? (Please mention appropriate software or a pipeline.)

This is an example of a SNP report in one paper:

Finally, a thorough genetic marker discovery pipeline led to the retrieval of 85,189 SNPs and 29,076 microsatellites enriching the available genetic markers for this species.

~ Best

NOTE: almost every gene/transcript in the Trinity assembly has several isoforms.

RNA-Seq next-gen Assembly snp SNP • 2.0k views

Just like in the previous topic, I wonder if this makes sense.
SNP = single nucleotide polymorphism => (germline) genetic variant.
Why would you expect to find variants specific to a condition?


Dear @Wouter, Hi.

Yes, I simplified my other, similar question about SNPs.

I wonder how, in some RNA-seq papers, the authors report their SNPs along with DEGs, SSRs, TFs, etc.

Is there any DE-SNP discovery pipeline that I am not aware of?

This is the purpose of this question. ;-)


Dear @Wouter, would you please have a look at the "Detection genetic markers" section on SNP detection from a de novo transcriptome and help me with it?


Seems like an "almost" standard way of variant calling. What's the problem with that? As you know, I work in human genetics, so I might overlook some issues in variant calling after de novo transcriptome assembly.
But nothing is written about those SNPs being condition dependent.

I recently saw this paper which might be of interest for you: SuperTranscript: a data driven reference for analysis and visualisation of transcriptomes
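
A much cruder way to reduce isoform redundancy before read mapping, and not the SuperTranscript method itself, is to keep only the longest isoform of each Trinity gene. The sketch below assumes the usual Trinity naming scheme, in which isoform IDs end in `_i<N>` (for example `TRINITY_DN1000_c115_g5_i1`); the function names are hypothetical and the approach discards sequence that is unique to shorter isoforms.

```python
import re
from typing import Dict, Iterable, Iterator, Tuple

# Assumption: Trinity-style IDs, where the gene ID is the isoform ID
# with the trailing "_i<N>" removed.
ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (sequence_id, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the ID, dropping "len=..." and "path=..." fields.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def longest_isoform_per_gene(
    records: Iterable[Tuple[str, str]]
) -> Dict[str, Tuple[str, str]]:
    """Map each gene ID to the (isoform_id, sequence) of its longest isoform."""
    best: Dict[str, Tuple[str, str]] = {}
    for name, seq in records:
        gene = ISOFORM_SUFFIX.sub("", name)
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (name, seq)
    return best
```

With a real assembly, something like `longest_isoform_per_gene(read_fasta(open("Trinity.fasta")))` would give one representative sequence per gene to use as the mapping reference.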


Dear @WouterDeCoster, thank you,

Maybe I am not very well aware of what SNP means. I guess it means a substitution of one nucleotide between distinct populations, or between healthy and sick tissue/genes, or one responsible for two phenotypes?

1- Please correct me if that is not true.

2- I have 6 FASTQ files of RNA-seq reads for males and 6 for females of the same species (similar to the paper I mentioned; no reference genome available). Can I use the approach in that paper for SNP discovery?

3- If the answer to 2 is yes, what do those SNPs mean? I mean, what does that "polymorphism" show if it is not case/control related?


Hi Farbod,

SNPs (single nucleotide polymorphisms) are genetic variants: strictly a substitution of a single nucleotide, although the term is often used loosely to include small insertions and deletions. Most (~99.99%) SNPs are entirely meaningless/harmless or have only a small effect on the protein. A human genome has millions of SNPs relative to the reference. Every individual differs at many positions from other individuals (even monozygotic twins are not completely identical).

You definitely can map your reads to your assembly and call SNPs. But I'm not the right person to give advice on that.
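
For readers who want a starting point, the map-then-call idea can be sketched as a list of shell commands built in Python. This is only one possible toolchain (bowtie2, samtools, bcftools), chosen here as an assumption and not the pipeline of the paper discussed in this thread; the file names and the `build_commands` helper are hypothetical, nothing is executed, and the options should be checked against the installed tool versions.

```python
from typing import Dict, List, Tuple


def build_commands(
    assembly: str, samples: Dict[str, Tuple[str, str]]
) -> List[List[str]]:
    """Return argv lists for mapping paired-end reads to an assembly
    and jointly calling variants across all samples (dry run only)."""
    index = assembly + ".idx"
    cmds: List[List[str]] = [
        ["bowtie2-build", assembly, index],   # index the assembly for mapping
        ["samtools", "faidx", assembly],      # FASTA index used by mpileup
    ]
    bams: List[str] = []
    for name, (reads1, reads2) in samples.items():
        sam, bam = name + ".sam", name + ".sorted.bam"
        cmds.append(["bowtie2", "-x", index, "-1", reads1, "-2", reads2, "-S", sam])
        cmds.append(["samtools", "sort", "-o", bam, sam])
        cmds.append(["samtools", "index", bam])
        bams.append(bam)
    # Pile up all samples together so each SNP gets per-sample genotypes.
    cmds.append(["bcftools", "mpileup", "-f", assembly, "-Ou", "-o", "pileup.bcf"] + bams)
    # Multiallelic caller (-m), report variant sites only (-v), write plain VCF.
    cmds.append(["bcftools", "call", "-m", "-v", "-Ov", "-o", "calls.vcf", "pileup.bcf"])
    return cmds


if __name__ == "__main__":
    example = {"female1": ("F1_1.fq", "F1_2.fq"), "male1": ("M1_1.fq", "M1_2.fq")}
    for cmd in build_commands("Trinity.fasta", example):
        print(" ".join(cmd))
```

Because Trinity isoforms of one gene share sequence, reads may map ambiguously across them, so calls from such a run would need filtering on mapping quality and depth before being trusted.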

Did that answer your questions?


Yes, it did. I really appreciate all your support.

Regarding my question #3: if "most SNPs are entirely meaningless/harmless" and "every individual differs at many positions from other individuals (even monozygotic twins)", then what is the use of collecting SNPs in RNA-seq, and what do those "polymorphisms" show if they are so abundant?


I'm not sure about the use of variant calling in RNA-seq. It's definitely not optimal, too. Allele-specific expression analysis can be done, but it's tricky at best.
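
To make the allele-specific expression idea concrete, here is a minimal sketch of the simplest possible check at one heterozygous site in one sample: an exact two-sided binomial test of the reference and alternate read counts against an expected 50/50 ratio. The function name is hypothetical, and the test is deliberately naive: it ignores mapping bias towards the reference allele, overdispersion and replicate structure, which is a large part of why this analysis is tricky.

```python
from math import comb


def allelic_imbalance_pvalue(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial p-value for deviation from a 50/50 allele ratio."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    minor = min(ref_reads, alt_reads)
    # Probability of seeing `minor` or fewer reads of one allele when p = 0.5.
    tail = sum(comb(total, i) for i in range(minor + 1)) / 2 ** total
    # The distribution is symmetric at p = 0.5, so double the tail and cap at 1.
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    print(allelic_imbalance_pvalue(10, 0))  # all reads carry one allele
    print(allelic_imbalance_pvalue(5, 5))   # perfectly balanced
```

In a real analysis the p-values would also need multiple-testing correction across all tested sites.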

In genetic association analysis those SNPs do have value (and for sure there are pathogenic SNPs too) but that's a different story.
