Question: Advice on downstream analysis: data from RNA-Seq
Asked 5 months ago by marcelolaia (Brazil):

My scenario: I ran featureCounts in two ways:

a) featureCounts -p -B -a Specie.transcript.fa.gtf -t exon -g gene_id -o A1.counts.txt -f A1.bbduk.bam
b) featureCounts -p -B -a Specie.gene_exons.gtf -t exon -g transcript_id -o A1.counts_transcript_id.txt -f A1.bbduk.bam

From "a" I obtained a list of 1,714 differentially expressed (DE) genes with the NOISeq package. From "b" I obtained a list of 3,067 DE exons.

I submitted those two lists to Blast2GO and got BLASTX, InterPro and EC results for almost all sequences in each list.

I have also downloaded GeneSCF and will try it, too.

From here, I need help.

I would like to carry out a more refined analysis of these data. I tried to draw a heatmap (pheatmap package in R), but with this much data the plot is unintelligible. So I subset the data by M value (the NOISeq fold change), keeping only features with |M| > X, and ended up with 84 DE exons/genes, which is few enough to plot. However, looking at that plot, I suspect that subsetting the data in this manner is not a good idea.

Have you ever been in a situation like this, with a large amount of data? How did you extract the most useful biological information from it?
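
To make the subsetting step concrete, it can be sketched roughly as the awk filter below. This is only an illustration under assumptions: it supposes the NOISeq results were exported to a tab-delimited file with a header line and a column named M. The function name `filter_by_abs_m` and the file names in the usage line are hypothetical, not the commands actually used.

```shell
# Keep the rows of a tab-delimited results table whose |M| exceeds a cutoff.
# Assumption: the table has a header line and a column named "M"
# (a hypothetical export layout, adjust to the real file).
filter_by_abs_m() {
    # $1 = cutoff X, $2 = input file
    awk -F'\t' -v x="$1" '
        NR == 1 {
            # Locate the M column by its header name.
            for (i = 1; i <= NF; i++) if ($i == "M") col = i
            if (!col) { print "no M column found" > "/dev/stderr"; exit 1 }
            print          # keep the header line
            next
        }
        {
            # Keep the row when the absolute value of M is above the cutoff.
            m = $col + 0
            if (m < 0) m = -m
            if (m > x) print
        }
    ' "$2"
}
```

For example, `filter_by_abs_m 2 noiseq_results.tsv > subset.tsv` would keep only the features with |M| > 2, where both file names are placeholders.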

Any suggestion/advice is very welcome!

I have been a Debian user since Potato, but I am not a programmer.

If this is off topic, please don't hesitate to tell me and I will delete the post immediately.

Best

Modified 4 months ago • written 5 months ago by marcelolaia