Differential expression analysis on miRNA count files
4
0
3.5 years ago

Dear all,

I want to perform differential expression analysis between two mirSeq samples.

This is not my own project. I received BAM files in the following format, but they look more like count files than alignment files:

QNAME   FLAG    RNAME   POS MAPQ    CIGAR   MRNM    MPOS    ISIZE   SEQ QUAL    OPT
@HD VN:1.0 SO:coordinate
@SQ SN:hsa-let-7a-1 LN:80
@SQ SN:hsa-let-7a-2 LN:72
@SQ SN:hsa-let-7a-3 LN:74
@SQ SN:hsa-let-7b LN:83
@SQ SN:hsa-let-7c LN:84
@SQ SN:hsa-let-7d LN:87
@SQ SN:hsa-let-7e LN:79
@SQ SN:hsa-let-7f-1 LN:87
@SQ SN:hsa-let-7f-2 LN:83
@SQ SN:hsa-mir-15a LN:83
@SQ SN:hsa-mir-16-1 LN:89
@SQ SN:hsa-mir-17 LN:84
@SQ SN:hsa-mir-18a LN:71
@SQ SN:hsa-mir-19a LN:82
@SQ SN:hsa-mir-19b-1 LN:87
@SQ SN:hsa-mir-19b-2 LN:96
@SQ SN:hsa-mir-20a LN:71
@SQ SN:hsa-mir-21 LN:72
@SQ SN:hsa-mir-22 LN:85
@SQ SN:hsa-mir-23a LN:73


Can you tell me the best way to get differentially expressed miRNAs from these files?

DESeq2 does not recognize these files as count files.

Nazanin

mirSeq Differential expression • 2.1k views
1

Is this indeed a BAM file? It looks like one, but a BAM file should not contain a header line explaining the columns. What is the output of samtools view your.bam | head and of samtools view -H your.bam | head?
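
If samtools is not at hand, a rough check on whether the file is a real (binary) BAM or just SAM-like text can be done in a few lines of Python. This is only a sketch: it relies on BAM being BGZF-compressed (which Python's gzip module can read) and on the decompressed stream starting with the magic bytes BAM\1. The function name looks_like_bam and the path your.bam are placeholders, not part of any tool.

```python
import gzip


def looks_like_bam(path):
    """Return True if the file at `path` starts with the BAM magic bytes."""
    try:
        with gzip.open(path, "rb") as handle:
            # BAM is BGZF-compressed (gzip-compatible); the decompressed
            # stream begins with the 4-byte magic b"BAM\x01".
            return handle.read(4) == b"BAM\x01"
    except (OSError, EOFError):
        # Not gzip-compressed (e.g. a plain-text SAM file) or truncated.
        return False


# Hypothetical usage:
# print(looks_like_bam("your.bam"))
```

If this returns False for a file with a .bam extension, the file is probably plain-text SAM (or something else entirely) that was simply given the wrong name.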

0

Can you elaborate on how you quantified the aligned file and which tools were used? This does not look like a count matrix.

0

It doesn't matter which aligner was used; it looks like a BAM file. Can you run featureCounts on your BAM file and extract the read counts? One more thing: if you want to use DESeq2 for DE you need replicates, but you said you have only two samples, so it would be better to go with NOISeq.
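
To make the replicate point concrete: with one sample per condition there is no way to estimate variability, so about all you can compute directly is a descriptive fold change. The sketch below is not NOISeq and is not a statistical test; it only scales each sample to counts per million (CPM) and takes a log2 ratio with a pseudocount. The function name, the pseudocount of 1 and the example counts are all made up for illustration.

```python
import math


def log2_fold_changes(counts_a, counts_b, pseudocount=1.0):
    """Descriptive log2(B/A) per miRNA after CPM scaling. Not a test."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each sample needs at least one counted read")
    result = {}
    for name in sorted(set(counts_a) | set(counts_b)):
        # Scale raw counts to counts per million within each sample.
        cpm_a = counts_a.get(name, 0) / total_a * 1e6
        cpm_b = counts_b.get(name, 0) / total_b * 1e6
        # The pseudocount avoids division by zero for unobserved miRNAs.
        result[name] = math.log2((cpm_b + pseudocount) / (cpm_a + pseudocount))
    return result


# Made-up counts, purely for illustration.
sample_a = {"hsa-mir-21": 100, "hsa-let-7b": 100}
sample_b = {"hsa-mir-21": 300, "hsa-let-7b": 100}
for mirna, lfc in log2_fold_changes(sample_a, sample_b).items():
    print(mirna, round(lfc, 2))
```

A ranking like this can suggest candidates, but without replicates it cannot say whether any difference is larger than ordinary sample-to-sample noise.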

0

The simplest way is this online tool:

http://oasis.dzne.de/

0
3.5 years ago

Hi,

As I said in my post, I do not know which program generated these files.

Some friends sent me these files and asked me to run a differential expression analysis on them, but the format is unfamiliar to me. The files are in BAM format; however, as you said, they look like count files.

We will try to contact the sequencing center and ask for the raw data.

Thank you anyway

0
3.5 years ago

That does not look like a count file; it looks like the header of a BAM file made by aligning a single sample to a FASTA of short targets. It is not at all clear to me that this is the best way to align data for an experiment of this kind. But if you want counts, you could probably use samtools idxstats to quickly total up how many reads hit each element.
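
As a rough sketch of that idea: samtools idxstats prints one tab-separated row per reference (name, sequence length, mapped reads, unmapped reads) plus a final * row, and the BAM usually needs to be indexed with samtools index first. Assuming you saved that output for each sample to a text file, something like the Python below could combine two samples into one table. The function names, sample labels and numbers are invented for illustration.

```python
def parse_idxstats(text):
    """Map reference name -> mapped read count from idxstats output."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _length, mapped, _unmapped = line.split("\t")
        if name == "*":  # summary row for reads without a reference
            continue
        counts[name] = int(mapped)
    return counts


def count_matrix(samples):
    """samples: dict of sample label -> counts dict. Returns (header, rows)."""
    labels = sorted(samples)
    refs = sorted(set().union(*(samples[s] for s in labels)))
    rows = [[ref] + [samples[s].get(ref, 0) for s in labels] for ref in refs]
    return ["miRNA"] + labels, rows


# Invented idxstats output for two samples.
sample_a = "hsa-let-7a-1\t80\t120\t0\nhsa-mir-21\t72\t900\t0\n*\t0\t0\t15\n"
sample_b = "hsa-let-7a-1\t80\t60\t0\nhsa-mir-21\t72\t1800\t0\n*\t0\t0\t7\n"
header, rows = count_matrix(
    {"A": parse_idxstats(sample_a), "B": parse_idxstats(sample_b)}
)
print("\t".join(header))
for row in rows:
    print("\t".join(str(value) for value in row))
```

Note that these are counts per hairpin reference as listed in the @SQ lines, not per mature miRNA strand, so they may not match what a dedicated miRNA quantifier would report.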

0
3.5 years ago
msBinf • 0

Not sure if it is the case for these files, but running the bcbio smallrnaseq pipeline produces BAM files with a header like that, perhaps written by the tools miraligner/seqbuster.

If they did use this pipeline, which uses the seqbuster tool for quantification, there should also be a counts file per sample with more obvious count-type data, which also quantifies the various types of isomiRs; bcbio calls it $sample_name-mirbase-ready.counts. bcbio also generates an overall counts TSV, called counts.tsv, with counts per miR strand per sample in the final/YYYY-MM-DD_$run_name folder. That could be easier if you're only interested in the miR strands and not all the isomiRs.

Anyway, hopefully that helps you figure out the files.

0
3.4 years ago

Hi

First, run the htseq-count program with the human miRNA GTF file (available on the miRBase website) to get counts from the BAM/SAM files.

Then you could use DESeq2 or edgeR for DE analysis of the miRNAs.
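
Since htseq-count writes one two-column file per sample (feature ID, count) and ends it with summary rows whose names start with a double underscore (such as __no_feature and __ambiguous), the per-sample files still have to be merged into a single matrix before DESeq2 or edgeR can use them. A minimal Python sketch of that merge is below; the function names, sample labels and counts are invented, and in practice you would read the text from your own htseq-count output files.

```python
import csv
import io


def read_htseq_counts(text):
    """Map feature ID -> count, skipping htseq-count's '__' summary rows."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        feature, value = line.split("\t")
        if feature.startswith("__"):
            continue
        counts[feature] = int(value)
    return counts


def write_matrix(samples, handle):
    """Write a tab-separated feature-by-sample count matrix to `handle`."""
    labels = sorted(samples)
    features = sorted(set().union(*(samples[s] for s in labels)))
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["feature"] + labels)
    for feature in features:
        # Features missing from a sample are written as zero counts.
        writer.writerow([feature] + [samples[s].get(feature, 0) for s in labels])


# Invented htseq-count output for two samples.
s1 = "hsa-let-7b\t10\nhsa-mir-21\t5\n__no_feature\t3\n__ambiguous\t0\n"
s2 = "hsa-let-7b\t20\n__no_feature\t1\n"
buffer = io.StringIO()
write_matrix({"s1": read_htseq_counts(s1), "s2": read_htseq_counts(s2)}, buffer)
print(buffer.getvalue(), end="")
```

The resulting table has one row per miRNA and one column per sample, which is the layout both DESeq2 and edgeR expect for raw counts.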

0

Thanks. My problem was solved a few weeks ago.