Can I directly compare the regions defined by peaks from pair-end 50 and pair-end 100 ATAC-seq?
0
1
Entering edit mode
19 months ago
Dan ▴ 180

My ATAC-seq samples have both pair-end 50 and pair-end 100 reads. After peak calling using macs2, can I directly compare the counts of the regions defined by these peaks from different read lengths?

library(Rsubread)
library(GenomicRanges)
library(DESeq2)

peaks <- dir("peaks/", pattern = "*.narrowPeak", 
             full.names = TRUE)

myPeaks <- lapply(peaks, ChIPQC:::GetGRanges, simple = TRUE)

myGRangesList<-GRangesList(myPeaks)   
reduced <- reduce(unlist(myGRangesList))
consensusIDs <- paste0("consensus_", seq(1, length(reduced)))
mcols(reduced) <- do.call(cbind, lapply(myGRangesList, function(x) (reduced %over% x) + 0))
reducedConsensus <- reduced
mcols(reducedConsensus) <- cbind(as.data.frame(mcols(reducedConsensus)), consensusIDs)

bamsToCount <- dir("bam/", full.names = TRUE, pattern = "*.\\.bam$")


regionsToCount <- data.frame(GeneID = paste("ID", seqnames(reducedConsensus), 
                                            start(reducedConsensus), end(reducedConsensus), sep = "_"), Chr = seqnames(reducedConsensus), 
                             Start = start(reducedConsensus), End = end(reducedConsensus), Strand = strand(reducedConsensus))


fcResults <- featureCounts(bamsToCount, annot.ext = regionsToCount, isPairedEnd = TRUE, 
                           countMultiMappingReads = FALSE, maxFragLength=1000)

myCounts <- fcResults$counts

colnames(myCounts) <- c("sample_1", "sample_2", "sample_3", "sample_4")



# [generate the metadata] to describe/group the samples
Group <- factor(c("Control", "Control", "Treat","Treat"))

metaData <- data.frame(Group, row.names = colnames(myCounts))
atacDDS <- DESeqDataSetFromMatrix(myCounts, metaData, ~Group, rowRanges = reducedConsensus)
atacDDS <- DESeq(atacDDS)

atac_Rlog <- rlog(atacDDS)

rawCounts <- counts(atacDDS, normalized = FALSE)

Thanks

ATAC-seq • 1.1k views
ADD COMMENT
0
Entering edit mode

Please define compare.

ADD REPLY
0
Entering edit mode

I am sorry the question was not clear. I edited it. Could you tell me if the codes are reasonable or not? Thanks

ADD REPLY
0
Entering edit mode

I see. Basically that is correct I think, so using a dedicated stats framework like DESeq2. The impact of read length is probably minor and limited to regions that are difficult to map. Still, to be sure you can trim the reads all down to 50bp and then repeat alignment.

ADD REPLY
0
Entering edit mode

I used trim_galore --illumina --paired --fastqc to trim the adaptors, is it possible and how should I trim the reads to pair-end 50? Thanks a lot.

ADD REPLY
0
Entering edit mode

trim_galore has parameters that can trim specific bp from either end: --clip_R1, --clip_R2, --three_prime_clip_R1, --three_prime_clip_R2, which parameter should I use for my situation? Thanks a lot.

ADD REPLY
2
Entering edit mode

You will need to clip on 3'-end for both reads.

ADD REPLY
0
Entering edit mode

I see. Thanks

ADD REPLY
0
Entering edit mode

I always use seqtk for these kinds of things.

ADD REPLY
0
Entering edit mode

Is seqtk better than trim_galore?

ADD REPLY
0
Entering edit mode

I do not know, I do not use trim_galore. If trim_galore can do the job it is fine.

ADD REPLY

Login before adding your answer.

Traffic: 1540 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6