Question: Help: RNA-seq analysis
2.5 years ago, lucilepain wrote:

Hello everyone,

My post will probably look "basic" to many of you, but here goes. I am a real beginner (my foundations come from the modules available via Coursera), so my terminology may be incorrect.

I have to analyze RNA-seq data to find the top differentially expressed genes between my disease and healthy samples. The analyst who was in charge of performing the RNA-seq on my samples did the read alignment and sent me the aligned data in .bed, .bam and .bai format.

When I look for analysis tools, I understand that I must proceed with the assembly, then the quantification of transcript expression, to arrive at differential splicing and expression. I found a huge number of tools, commands and environments to use, but it's still nebulous to me. So I wanted to use Galaxy (Cufflinks -> Cuffmerge -> Cuffdiff) to get my results, but apparently my files are too large for this tool :/ Would anyone have suggestions of pipelines, workflows or, even better, tools/environments that are better than others for analyzing this type of data, or that are routinely used for analysis? Every suggestion or advice is welcome.

Thank you very much for your help.

2.5 years ago, mforde84 wrote:


Thanks, I already read that page, but what remained unclear to me was the use of the .bed and .bai files if, according to the page, only my .bam files are required. Any idea?

Thank you very much for your help.

written 2.5 years ago by lucilepain

BAM files are compressed alignment files, and BAI files are indices on those files. You shouldn't really have to work directly with BAI files, though occasionally you will have to generate them explicitly for other programs to use (e.g., IGV). BED files are interval files that describe features such as alignments and read pileups by genomic coordinates; you will use these for specific applications, and I'm not entirely sure what your pipeline will need them for. Ultimately you can generate them from BAM files. Also, there's some general documentation of file formats from UCSC:

To view BAM files in plain text you can use samtools (one file at a time, since samtools view takes a single input file):

for f in *.bam; do samtools view "$f" | head; done

To generate BAI files you can use samtools, again one file at a time:

for f in *.bam; do samtools index "$f"; done
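
If it helps, here is a minimal sketch for spotting which BAM files in a directory still lack an index before you hand them to another tool. The function name `missing_index` is made up for illustration, and it assumes the index sits next to the BAM as either `sample.bam.bai` or `sample.bai`; adjust if your files are laid out differently.

```shell
#!/bin/sh
# Hypothetical helper: print every BAM in a directory that has no .bai index.
# Assumption: the index sits next to the BAM as sample.bam.bai or sample.bai.
missing_index() {
    dir=${1:-.}
    for bam in "$dir"/*.bam; do
        [ -e "$bam" ] || continue          # the glob matched nothing
        if [ ! -e "$bam.bai" ] && [ ! -e "${bam%.bam}.bai" ]; then
            printf '%s\n' "$bam"
        fi
    done
}
```

Each path it prints could then be passed, one file at a time, to samtools index.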

To generate BED files you can use bedtools:

bedtools bamtobed ...
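
To make the BED layout a bit more concrete, here is a small sketch on made-up intervals (the coordinates and read names are invented, not real data). The first three BED columns are chromosome, start and end; start is 0-based and end is exclusive, so the length of an interval is simply end minus start. The helper name `bed_lengths` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append the length of each BED interval as an extra column.
# BED start is 0-based and end is exclusive, so length = end - start.
bed_lengths() {
    awk 'BEGIN { FS = OFS = "\t" } { print $0, $3 - $2 }' "$@"
}

# Made-up example intervals (chrom, start, end, name); not real data.
printf 'chr1\t100\t150\tread1\nchr1\t120\t170\tread2\n' | bed_lengths
```

The same function also accepts a file name instead of standard input, so it could be pointed at a BED file you already have.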
modified 2.5 years ago • written 2.5 years ago by mforde84