Question: Help: RNA-seq analysis
lucilepain wrote (20 months ago):

Hello everyone,

My post will probably look "basic" to many of you, but here goes. I am a real beginner (my foundation is the modules available via Coursera), so my terminology may be incorrect.

I have to analyze RNA-seq data in order to find the top differentially expressed genes between my disease and healthy samples. The analyst who was in charge of performing the RNA-seq on my samples did the read alignment and sent me the aligned data in .bed, .bam and .bai format.

When I look for analysis tools, I understand that I must proceed to the assembly, then quantify the expression of the transcripts, to arrive at differential splicing and expression. I found a huge number of tools, commands and environments, but it is still nebulous to me. I wanted to use Galaxy (Cufflinks -> Cuffmerge -> Cuffdiff) to get my results, but apparently my files are too large for this tool :/ Would anyone have suggestions for pipelines, workflows, or even better, tools or environments that are better than others for analyzing this type of data, or that are used routinely? Every suggestion or piece of advice is welcome.

Thank you very much for your help.

mforde84 wrote (20 months ago):

https://www.bioconductor.org/help/workflows/rnaseqGene/


Thanks, I already read that page, but what remained unclear to me was the purpose of the .bed and .bai files if, according to the page, only my .bam files are required. Any idea?

Thank you very much for your help.

Reply written 20 months ago by lucilepain

BAM files are compressed alignment files, and BAI files are indices on those files. You shouldn't really have to work directly with BAI files, though occasionally you will have to generate them explicitly for other programs to use (e.g., IGV). BED files are interval files that describe features such as alignments and read pileups by genomic coordinates; you will use these for specific applications, and I'm not entirely sure what your pipeline will need them for. Ultimately you can generate them from BAM files. Also, there's some general documentation of file formats from UCSC: https://genome.ucsc.edu/FAQ/FAQformat

To view a BAM file as plain text you can use samtools (one file per call; sample.bam is a placeholder name):

samtools view sample.bam | head
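
If you want to inspect several files the same way, a small wrapper may help. This is only a sketch, assuming samtools is installed and on your PATH; `peek_bam` and the `SAMTOOLS` override variable are hypothetical names introduced here, not part of samtools. It prints the header first, then the first few alignment records.

```shell
# peek_bam FILE [N]: print the BAM header, then the first N alignment records.
# peek_bam and the SAMTOOLS override variable are hypothetical helper names.
peek_bam() {
    local bam="$1"
    local n="${2:-10}"
    if [ ! -e "$bam" ]; then
        echo "no such file: $bam" >&2
        return 1
    fi
    # -H prints only the header (reference names, sort order, programs used).
    "${SAMTOOLS:-samtools}" view -H "$bam"
    # Without -H or -h, view prints the alignment records as SAM text.
    "${SAMTOOLS:-samtools}" view "$bam" | head -n "$n"
}

# Usage (sample.bam is a placeholder name):
#   peek_bam sample.bam 5
```

The header is often the quickest way to see which reference and which aligner the analyst used, which matters when you later pick an annotation file.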

To generate a BAI file you can use samtools (again one BAM per call; sample.bam is a placeholder name):

samtools index sample.bam
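
With many samples, indexing one file per call can be wrapped in a loop. A minimal sketch, assuming samtools is on your PATH and the BAM files are coordinate-sorted (indexing requires that); `index_bams` and the `SAMTOOLS` override variable are hypothetical names used only for illustration. Files that already have an index next to them are skipped.

```shell
# index_bams [DIR]: index every BAM in DIR, one file per samtools call.
# index_bams and the SAMTOOLS override variable are hypothetical helper names.
index_bams() {
    local dir="${1:-.}"
    local bam
    for bam in "$dir"/*.bam; do
        # Skip the unexpanded pattern when the directory holds no BAM files.
        [ -e "$bam" ] || continue
        # Skip files that already have an index (sample.bam.bai or sample.bai).
        if [ -e "$bam.bai" ] || [ -e "${bam%.bam}.bai" ]; then
            echo "index exists: $bam"
            continue
        fi
        "${SAMTOOLS:-samtools}" index "$bam" && echo "indexed: $bam"
    done
}

# Usage (the directory name is a placeholder):
#   index_bams /path/to/bams
```

Since your analyst already sent .bai files, this would most likely just report that the indices exist.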

To generate BED files you can use bedtools:

bedtools bamtobed ...
Reply written 20 months ago by mforde84