Question: How to discover condition-dependent SNPs in transcriptomic data where each gene has several isoforms?
Farbod (Toronto) wrote, 9 months ago:

Dear Biostars, Hi.

I have asked a similar question but have not received an acceptable answer or guideline, so I am asking it again in other words:

Imagine that I have done a de novo transcriptome assembly from RNA-seq data using Trinity (no reference genome).

I have 3 biological replicates for condition 1 and 3 for condition 2 (I ran the standard pipeline using RSEM and DESeq2 for DEG analysis).

Questions:

1- Is it practical to discover condition-dependent SNPs from this kind of RNA-seq project (non-model animals)?

2- If the answer to 1 is yes, how? (Please mention appropriate software or a pipeline.)

This is an example of an SNP report in one paper:

Finally, a thorough genetic marker discovery pipeline led to the retrieval of 85,189 SNPs and 29,076 microsatellites enriching the available genetic markers for this species.

~ Best

NOTE: almost every gene/transcript in the Trinity assembly has several isoforms.

snp rna-seq next-gen assembly • 423 views
modified 9 months ago • written 9 months ago by Farbod

Just like in the previous topic, I wonder if this makes sense.
SNP = single nucleotide polymorphism => (germline) genetic variant.
Why would you expect to find variants specific to a condition?

written 9 months ago by WouterDeCoster

Dear @Wouter, Hi.

Yes, I simplified my other, similar question about SNPs.

I wonder how, in some RNA-seq papers, the authors report their SNPs alongside DEGs, SSRs, TFs, etc.

Is there a DE-SNP discovery pipeline that I am not aware of?

This is the purpose of this question. ;-)

written 9 months ago by Farbod

Dear @Wouter, would you please have a look at the "Detection genetic markers" section, which covers SNP detection from a de novo transcriptome, and help me with it?

written 9 months ago by Farbod

Seems like an "almost" standard way of variant calling. What's the problem with that? As you know, I work in human genetics, so I might overlook some issues in variant calling after de novo transcriptome assembly.
But nothing is written about those SNPs being condition-dependent.

I recently saw this paper, which might be of interest to you: SuperTranscript: a data driven reference for analysis and visualisation of transcriptomes
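
On the isoform problem raised in the question: one common simplification is to keep a single representative sequence per Trinity gene before mapping, so that reads are not split across near-identical isoforms. The sketch below keeps the longest isoform per gene. This is not the SuperTranscript method from the paper (which merges isoforms rather than picking one); it is a cruder stand-in. It assumes Trinity-style IDs of the form `<gene>_i<N>`, and the sequences in `EXAMPLE_FASTA` are made up for illustration.

```python
import re

def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def longest_isoform_per_gene(records):
    """Map each gene ID to the (transcript_id, sequence) of its longest isoform."""
    best = {}
    for header, seq in records:
        transcript_id = header.split()[0]
        # Assumption: isoforms are named <gene>_i<N>; strip the suffix to get the gene ID.
        gene_id = re.sub(r"_i\d+$", "", transcript_id)
        if gene_id not in best or len(seq) > len(best[gene_id][1]):
            best[gene_id] = (transcript_id, seq)
    return best

# Hypothetical toy assembly: one gene with two isoforms, one gene with a single isoform.
EXAMPLE_FASTA = """\
>TRINITY_DN100_c0_g1_i1 len=8
ACGTACGT
>TRINITY_DN100_c0_g1_i2 len=12
ACGTACGTAAGG
>TRINITY_DN200_c1_g2_i1 len=4
ACGT
"""

best = longest_isoform_per_gene(read_fasta(EXAMPLE_FASTA.splitlines()))
for gene_id, (transcript_id, seq) in best.items():
    print(gene_id, transcript_id, len(seq))
```

Picking the longest isoform discards variants that fall only in the dropped isoforms, which is the gap that merged references such as SuperTranscripts aim to close.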

written 9 months ago by WouterDeCoster

Dear @WouterDeCoster, thank you,

Maybe I do not fully understand what SNP means. I guess it means a substitution of one nucleotide that differs between distinct populations, between healthy and sick tissues/genes, or that is responsible for two phenotypes?

1- Please correct me if this is not true.

2- I have 6 FASTQ files of RNA-seq reads for males and 6 for females of the same species (similar to the paper I mentioned; no reference genome available). Can I use the approach in that paper for SNP discovery?

3- What do those SNPs mean (if the answer to 2 is yes)? I mean, what does that "polymorphism" show if it is not case/control related?

modified 9 months ago • written 9 months ago by Farbod

Hi Farbod,

SNPs (single nucleotide polymorphisms) are genetic variants, most commonly just a substitution or, more broadly, a small deletion. Most SNPs (~99.99%) are entirely meaningless/harmless or have only a small effect on the protein. A human genome has millions of SNPs relative to the reference. Every individual differs at many positions from other individuals (even monozygotic twins are not completely identical).

You definitely can map your reads to your assembly and call SNPs. But I'm not the right person to give advice on that.
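
For orientation only, the map-then-call step could look roughly like the outline below. The thread names no specific tools, so the choice of bowtie2, samtools and bcftools is an assumption, as are all file and sample names. The snippet only builds and prints the command lines; it does not run anything, and it does not address transcriptome-specific issues such as reads multi-mapping across isoforms.

```python
import shlex

def build_commands(assembly, samples, index_prefix="assembly_index", vcf_out="calls.vcf"):
    """Return shell command strings for a generic map-and-call outline.

    `samples` maps a sample name to its (R1, R2) paired-end FASTQ paths.
    Nothing is executed here; the caller decides how to run the commands.
    """
    q = shlex.quote
    cmds = [f"bowtie2-build {q(assembly)} {q(index_prefix)}"]
    bams = []
    for name, (r1, r2) in samples.items():
        bam = f"{name}.sorted.bam"
        # Align paired reads to the assembly and coordinate-sort the output.
        cmds.append(
            f"bowtie2 -x {q(index_prefix)} -1 {q(r1)} -2 {q(r2)} "
            f"| samtools sort -o {q(bam)} -"
        )
        cmds.append(f"samtools index {q(bam)}")
        bams.append(bam)
    # Index the reference FASTA, then call variants jointly across all samples.
    cmds.append(f"samtools faidx {q(assembly)}")
    bam_list = " ".join(q(b) for b in bams)
    cmds.append(
        f"bcftools mpileup -f {q(assembly)} -Ou {bam_list} "
        f"| bcftools call -mv -Ov -o {q(vcf_out)}"
    )
    return cmds

# Hypothetical inputs: a Trinity assembly and two paired-end samples.
example = build_commands(
    "Trinity.fasta",
    {
        "male_1": ("male_1_R1.fq", "male_1_R2.fq"),
        "female_1": ("female_1_R1.fq", "female_1_R2.fq"),
    },
)
for cmd in example:
    print(cmd)
```

Calling all samples jointly yields per-sample genotypes in one VCF, which is what any later comparison between groups would start from.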

Did that answer your questions?

written 9 months ago by WouterDeCoster

Yes, it did. I really appreciate all your support.

Regarding my question #3: if "most SNPs are entirely meaningless/harmless" and "every individual differs at many positions from other individuals (even monozygotic twins)", then what is the use of collecting SNPs in RNA-seq, and what do those "polymorphisms" show if they are so abundant?

modified 9 months ago • written 9 months ago by Farbod

I'm not sure about the use of variant calling in RNA-seq. It's definitely not optimal, either. Allele-specific expression analysis can be done, but it's tricky at best.

In genetic association analysis those SNPs do have value (and for sure there are pathogenic SNPs too) but that's a different story.
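
To make the association idea concrete, here is a minimal sketch of a per-SNP test of whether allele counts differ between two groups (for example males versus females), using a two-sided Fisher's exact test written in plain Python. The counts are hypothetical, and the sketch assumes genotypes have already been called and converted to allele counts. Genotypes called from RNA-seq can also be distorted by allele-specific expression, so this illustrates the logic rather than a recommended analysis.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups, columns are alleles (reference, alternate).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical SNP: 3 diploid individuals per group, so 6 alleles per group.
# Group 1 carries only the reference allele, group 2 only the alternate allele.
p_value = fisher_exact_two_sided(6, 0, 0, 6)
print(p_value)
```

With 3 individuals per group, even this fully differentiated SNP cannot go below p = 2/924 (about 0.002), which will not survive multiple-testing correction across tens of thousands of SNPs. That is one practical reason small RNA-seq designs are poorly suited to discovering group-associated SNPs.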

written 9 months ago by WouterDeCoster