Viewing MAF files over 2 GB in size as a dataframe
23 months ago
noon • 0

I used GATK's Funcotator to annotate a VCF file, and it produced a MAF file just over 2 GB in size. I've tried pandas in Python and maftools in R, and neither has worked. Specifically, the file seems to be too large to open in R, which throws this error:

Error in data.table::fread(file = maf, sep = "\t", stringsAsFactors = FALSE,  : File '' does not exist or is non-readable.

Pandas isn't really made for MAF files either. Usually, after running this annotation, it was enough to open the output in Excel, but this file is far too big. Does anybody know of an application or package (whether in R, Python, or something else) that can open MAF files of this size? Any help is appreciated.

R python gatk MAF annotations • 1.6k views

That error is not caused by memory. It indicates that the file is not located at the path your maf variable holds.

Why do you need to open it? Linux commands such as more, grep, and awk can help you view the content.
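If Python is handy, the same kind of quick look is possible without loading the whole file. This is only a minimal sketch: `preview` is a hypothetical helper name, and it assumes the MAF is plain tab-separated text whose metadata lines start with `#`. It also fails early when the path does not point to a file, which is the situation the error above describes.

```python
from itertools import islice
from pathlib import Path


def preview(path, n=5):
    """Return the first n non-comment lines of a text file without reading the rest."""
    p = Path(path).expanduser()
    # Fail early with a clear message if the path is wrong.
    if not p.is_file():
        raise FileNotFoundError(f"No readable file at {p}")
    with p.open("r", encoding="utf-8", errors="replace") as handle:
        # Lazily skip metadata lines that start with '#'.
        body = (line.rstrip("\n") for line in handle if not line.startswith("#"))
        # islice stops after n lines, so the rest of the file is never read.
        return list(islice(body, n))
```

Calling `preview('/Users/Downloads/funcotated.maf')` should then give the column header followed by the first four records, provided the path is correct.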

I wrote a simple helper alias for files like these:

alias tsview='column -s$'\''\t'\'' -t | less -S'

Spreadsheet-like view on the terminal!

Yes, this works, so I can view it! The problem is that I need to manipulate it as if it were a dataframe, so that I can apply certain thresholds to the data and select specific columns.
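Since a MAF is tab-separated text, one possible way to get such a dataframe in Python is to let pandas read it in chunks, keeping only the needed columns and the rows that pass a threshold. The sketch below is an illustration under assumptions, not a tested recipe for Funcotator output: the function names are hypothetical, the column names (`Hugo_Symbol`, `t_alt_count`) and the threshold are placeholders to adapt, and metadata is assumed to sit in leading lines that start with `#`.

```python
import pandas as pd


def count_header_comments(path):
    """Count the leading lines that start with '#' (metadata before the column header)."""
    n = 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if not line.startswith("#"):
                break
            n += 1
    return n


def load_filtered_maf(path, columns, min_count, count_column="t_alt_count",
                      chunksize=100_000):
    """Read a tab-separated MAF in chunks, keeping chosen columns and rows above a threshold.

    `columns` must include `count_column` (an assumed column name; adjust as needed).
    """
    reader = pd.read_csv(
        path,
        sep="\t",
        skiprows=count_header_comments(path),  # skip the '#' metadata lines
        usecols=columns,                       # load only the selected columns
        dtype=str,                             # avoid mixed-type guesses across chunks
        chunksize=chunksize,                   # iterate instead of loading everything
    )
    kept = []
    for chunk in reader:
        # Non-numeric or empty values become NaN and fail the comparison.
        counts = pd.to_numeric(chunk[count_column], errors="coerce")
        kept.append(chunk[counts >= min_count])
    if not kept:
        return pd.DataFrame(columns=list(columns))
    return pd.concat(kept, ignore_index=True)
```

Only the filtered rows are held in memory at once, so the result may be small enough to work with normally, or to write out with `to_csv(..., sep="\t")` and open elsewhere.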

Your maf variable doesn't seem to contain the path to the MAF file. Can you show the output of dput(maf)?

This is what I have:

funcotation = system.file('extdata', '/Users/Downloads/funcotated.maf',
                          package = 'maftools')
readfunc = read.maf(maf = funcotation)

When I use dput(maf) I get this:

""
23 months ago
Ram 36k

That's not how system.file works. Use: read.maf(maf = '/Users/Downloads/funcotated.maf') and skip the first line altogether.

Thank you, that was the problem! Sorry, I'm new to using maftools, so I was going off the documentation tutorial.

Thank you for your time!

That happens. I find it useful to take the following steps when adapting code from documentation:

  1. Run the code and ensure it works as given (it usually does)
  2. Understand each function call and parameter in the lines leading up to the line I wish to change: in your case, that would mean reading through ?system.file and looking at its parameters. It is here that you'd find that it finds files that are included within packages. You're using a custom file, so system.file is not for you.
  3. OK, then, how do you give a custom file? From the ?system.file documentation, it is clear that it returns a string with the path to the file. So, if you have that string, you can do what system.file() does. As it turns out, you have the path. Replace the call to system.file with the path you have, and you're all set.

It also helps to check each line's execution and output as that line is executed. You'd have noticed the error happening in line 1 and probably solved the problem yourself.

BTW, please accept my answer using the green check mark on the left to mark the question solved.
