I processed BAM files generated by 3 different pipelines (3 BAM files per pipeline, so 9 BAM files in total) using scripts written in bash and Python. I extracted the mapped reads from the BAM files and stored them in Python sets. Then I performed pairwise intersection operations to see which reads are shared between which BAM files, despite the different pipelines.
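The pairwise intersection step can be sketched roughly as follows. This is a toy illustration under assumptions, not the actual script: the read names are made up, and the names `reads`, `pairwise_intersection_counts` and `to_tsv` are hypothetical. Only the set labels are taken from the R call further down.

```python
# Toy stand-ins for the mapped read names extracted from each BAM file.
# The read names here are invented for illustration only.
reads = {
    "reconstructed": {"r1", "r2", "r3"},
    "shuffled": {"r1", "r2", "r3", "r4"},
    "trimmed": {"r2", "r3", "r4"},
}


def pairwise_intersection_counts(read_sets):
    """Return a square matrix of |A & B| for every ordered pair of sets."""
    names = list(read_sets)
    return [
        [len(read_sets[a] & read_sets[b]) for b in names]
        for a in names
    ]


def to_tsv(matrix):
    """Render the matrix as tab-separated text, one row per line."""
    return "\n".join("\t".join(str(n) for n in row) for row in matrix)


matrix = pairwise_intersection_counts(reads)
print(to_tsv(matrix))
```

In such a matrix the diagonal holds the size of each set on its own, and the matrix is symmetric because intersection does not depend on the order of the two sets.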
The output 3x3 matrix was written into a tsv file:
14659 14659 14647
14659 15731 15709
14647 15709 15709
Each number is the count of reads in the intersection of 2 files; the diagonal entries are the intersection of a file with itself, i.e. the total number of reads in that file.
Now I want to load the matrix into R and create an UpSet plot with the UpSetR package. I know that a Venn diagram would also work, but later on I will have more pipelines to compare, so I chose UpSet plots. I tried this code:
upset(test_df, sets = 'reconstructed', 'shuffled', 'trimmed', number.angles = 30, point.size = 3.5, line.size = 2, mainbar.y.label = "Read Intersections", sets.x.label = "Blabla", text.scale = c(1.3, 1.3, 1, 1, 2, 0.75), mb.ratio = c(0.55, 0.45), order.by = 'sets', keep.order = TRUE)
But an error occurred:
Error in start_col:end_col : argument of length 0
Unfortunately, I am a beginner in R with no prior experience. Maybe someone here has more experience with R or the UpSetR package and can point out what I am doing wrong.