Advice on downstream analysis of RNA-Seq data
3.4 years ago
marcelolaia ▴ 10

My scenario:

I ran featureCounts in two ways:

  1. Approach a: featureCounts -p -B -a Specie.transcript.fa.gtf -t exon -g gene_id -o A1.counts.txt -f A1.bbduk.bam
  2. Approach b: featureCounts -p -B -a Specie.gene_exons.gtf -t exon -g transcript_id -o A1.counts_transcript_id.txt -f A1.bbduk.bam

From 'a', I obtained a list of 1,714 differentially expressed genes (DEGs) with the NOISeq package. From 'b', I obtained a list of 3,067 differentially expressed (DE) exons.

I submitted both lists to Blast2GO and got Blastx, InterPro and EC annotations for almost all sequences in each list.

I have also downloaded GeneSCF and will try it, too.

From here, I need help.

I would like to carry out a more refined analysis of these data. I tried to draw a heatmap (pheatmap package in R), but with this much data the plot is unintelligible. So I subset the data on the M value (NOISeq fold change), keeping only features with |M| > X, and was left with 84 DE exons/genes, which is few enough to plot. However, looking at that plot, I suspect that subsetting the data in this way is not a good idea.
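
The subsetting step described above can be sketched as follows. This is only a minimal illustration, written in Python rather than R, and it assumes a hypothetical tab-separated results table with an `id` column and an `M` column. The toy rows and the threshold of 2.0 are invented for the example and are not values from the actual analysis.

```python
import csv
import io


def filter_by_abs_m(rows, threshold):
    """Keep rows whose absolute M value is strictly greater than threshold."""
    kept = []
    for row in rows:
        try:
            m_value = float(row["M"])
        except ValueError:
            # Skip rows where M is not a number (for example "NA").
            continue
        if abs(m_value) > threshold:
            kept.append(row)
    return kept


# Hypothetical toy table standing in for an exported results file.
toy_table = (
    "id\tM\n"
    "geneA\t3.1\n"
    "geneB\t-0.4\n"
    "geneC\t-2.7\n"
    "geneD\tNA\n"
    "geneE\t1.9\n"
)

rows = list(csv.DictReader(io.StringIO(toy_table), delimiter="\t"))
selected = filter_by_abs_m(rows, threshold=2.0)
print([row["id"] for row in selected])
```

With a real file, `io.StringIO(toy_table)` would be replaced by an opened file handle; the resulting list of ids could then be used to pick the matching rows of the count matrix before plotting.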

Have you ever been in a situation like this, with a large amount of data? How did you extract the best biological information from it?

Any suggestions or advice are very welcome!

I have been a Debian user since Potato, but I am not a programmer.

If this is off topic, please don't hesitate to tell me and I will delete the post immediately.


differentially-expressed-genes RNA-Seq • 675 views
