Can You Please Tell Me Where I Find Information About .Fai File Format?
4
7
13.8 years ago
Biomed 5.0k

I am playing with GATK, and its website lists .fai as the reference file format. However, I only have access to reference alignments as MAF files. Can you help me with this?

maf gatk • 25k views
14
11.2 years ago

Purely for my future self if I google this again, the columns of a .fai file appear to be:

  • chromosome name
  • chromosome length, in bases
  • byte offset of the first base of the chromosome sequence in the file
  • number of bases on each FASTA line
  • number of bytes on each FASTA line, called "line_blen" in the source code. Appears to typically (for me) be the number of bases per line + 1, since it also counts the newline.

ETA: Oh, Pierre already answered this over here. blen is number of bytes in each fasta line.
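
The five columns listed above can be read into named fields with a few lines of Python. This is only a sketch: the field names mirror the column names used in the faidx(5) manual page (NAME, LENGTH, OFFSET, LINEBASES, LINEWIDTH), while `FaiRecord`, `parse_fai_line`, and the example line are made up for illustration.

```python
from typing import NamedTuple


class FaiRecord(NamedTuple):
    name: str       # sequence (chromosome) name
    length: int     # total number of bases in the sequence
    offset: int     # byte offset of the first base in the FASTA file
    linebases: int  # number of bases on each FASTA line
    linewidth: int  # number of bytes on each FASTA line, newline included


def parse_fai_line(line: str) -> FaiRecord:
    """Parse one tab-separated .fai line into a FaiRecord (hypothetical helper)."""
    name, length, offset, linebases, linewidth = line.rstrip("\r\n").split("\t")[:5]
    return FaiRecord(name, int(length), int(offset), int(linebases), int(linewidth))


# Made-up example line, not taken from a real genome index.
rec = parse_fai_line("chrTest\t100\t9\t60\t61\n")
print(rec)
```

In this made-up line, `linewidth - linebases` is 1, which matches the "+ 1" observation for files with Unix line endings.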

12
13.8 years ago
brentp 24k

That is an index of your FASTA file. Have a look at samtools:

Once installed, you can create an index of some.fasta as

samtools faidx some.fasta

This will create some.fasta.fai.

Also have a look here, where it describes how to set up your data for GATK.
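
To get a feel for what ends up in some.fasta.fai, here is a rough pure-Python sketch that derives the same kind of records from FASTA bytes. It is not a substitute for samtools faidx: `build_fai` is a hypothetical helper, it assumes every header has a name, takes the line geometry from the first sequence line of each record, and does not check that the remaining lines have a uniform length.

```python
def build_fai(fasta_bytes: bytes) -> list:
    """Return (name, length, offset, linebases, linewidth) tuples for FASTA bytes."""
    records = []
    current = None  # mutable record for the sequence being scanned
    pos = 0         # byte position of the start of the current line
    for raw in fasta_bytes.splitlines(keepends=True):
        if raw.startswith(b">"):
            if current is not None:
                records.append(tuple(current))
            # The name is the header text up to the first whitespace.
            name = raw[1:].split()[0].decode()
            # The sequence starts right after the header line.
            current = [name, 0, pos + len(raw), 0, 0]
        elif current is not None:
            bases = len(raw.rstrip(b"\r\n"))
            if current[3] == 0 and bases:
                current[3] = bases     # bases per line
                current[4] = len(raw)  # bytes per line, newline included
            current[1] += bases
        pos += len(raw)
    if current is not None:
        records.append(tuple(current))
    return records


# Tiny made-up FASTA used only for illustration.
example = b">seq1 desc\nACGTAC\nGTACGT\nAC\n>seq2\nGGGG\n"
index = build_fai(example)
for rec in index:
    print("\t".join(str(field) for field in rec))
```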

0

Is there a way to do this with Picard or GATK itself? I'd love to stay away from samtools if I could.

7
13.8 years ago

The FAIDX file is created by samtools faidx. The FAIDX file contains, among other things, the

  • name of the reference sequence (chr1, chr2...)
  • byte offset of the first base of this sequence in the file
  • length of the FASTA lines

With this information, samtools can quickly access any region of the genome.
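
To illustrate why those values are enough for random access, here is a minimal Python sketch of the offset arithmetic. It is an illustration under stated assumptions, not the samtools implementation: `fetch` is a hypothetical helper, coordinates are 0-based and half-open, and the line length is passed twice, once in bases (`linebases`) and once in bytes including the newline (`linewidth`).

```python
import io


def fetch(handle, offset, linebases, linewidth, start, end):
    """Return bases [start, end) of one sequence from an open binary FASTA handle."""
    # Byte position of a base: the sequence offset, plus the full lines
    # before it, plus its position within its own line.
    first = offset + (start // linebases) * linewidth + start % linebases
    last = offset + (end // linebases) * linewidth + end % linebases
    handle.seek(first)
    chunk = handle.read(last - first)
    # The chunk may span line breaks, so strip them out.
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()


# Made-up FASTA: an 11-byte header, then lines of 6 bases (7 bytes with newline).
fasta = io.BytesIO(b">seq1 desc\nACGTAC\nGTACGT\nAC\n")
print(fetch(fasta, 11, 6, 7, 4, 9))  # prints ACGTA
```

Only the requested bytes are read, so the cost does not depend on how far into the file the region lies.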

See also this post I wrote about faidx

0

Dear Pierre,

So if I understand right, if I need a BED file from my fa.fai, I can just run

awk 'OFS="\t" {print $1,$3,$2}' in.fai

? Thank you so much.
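
A note on the column order: in the .fai layout described in this thread, column 3 is a byte offset into the FASTA file rather than a coordinate, and BED intervals are 0-based and half-open. A whole-chromosome BED line would therefore presumably be name, 0, length. A minimal Python sketch of that conversion, where `fai_to_bed` is a hypothetical helper and the input values are made up:

```python
def fai_to_bed(fai_text: str) -> str:
    """Turn .fai text into BED lines covering each whole sequence."""
    bed_lines = []
    for line in fai_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Only the first two columns (name, length) are needed.
        name, length = line.split("\t")[:2]
        # BED is 0-based, half-open: the full sequence is [0, length).
        bed_lines.append(f"{name}\t0\t{int(length)}\n")
    return "".join(bed_lines)


# Made-up index content for illustration only.
example_fai = "chrA\t100\t6\t60\t61\nchrB\t50\t114\t60\t61\n"
print(fai_to_bed(example_fai), end="")
```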

2
4.7 years ago

HTSlib has provided a manual page describing this format for a few years now. See man 5 faidx, also on the web at faidx(5) manual page.
