Renaming fasta files with chromosomes
2.5 years ago
namea38 • 0

I have a .fasta file that I'm trying to run bedtools getfasta on. When I run it, I get this warning:

WARNING. chromosome (N) was not found in the FASTA file. Skipping.

I think this is because my .fasta headers look like this:

>HWI-D00270:252:CB1D5ANXX:8:1303:19141:48584/1
ACAGCTGATTAGACACAATGTCAACAAAGTACTGAAGACCAGAGAAAAACACTTATTATACTC
TTTGTTTTCAGGTGTGGAATGTGCTTTCTACCACGGCTACAAATACTACAAAGGATGTAGTA

and not like this:

>chrI
ACAGCTGATTAGACACAATGTCAACAAAGTACTGAAGACCAGAGAAAAACACTTATTATACTC
TTTGTTTTCAGGTGTGGAATGTGCTTTCTACCACGGCTACAAATACTACAAAGGATGTAGTA

Is there a way I can edit the headers so that they contain only the chromosome name?
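Before renaming anything, it may help to confirm that the warning really comes from a name mismatch. The following is a minimal Python sketch, not part of bedtools, that compares the chromosome names in a BED file against the sequence names in a FASTA file. It assumes the sequence name is the first whitespace-delimited word after `>`; the function names and the toy inputs are hypothetical and only for illustration.

```python
import io


def fasta_names(handle):
    """Collect sequence names: the first whitespace-delimited word after '>'."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def bed_chroms(handle):
    """Collect the first column (chromosome name) of every BED record."""
    chroms = set()
    for line in handle:
        line = line.strip()
        # Skip blank lines and the optional UCSC header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chroms.add(line.split()[0])
    return chroms


def missing_chroms(fasta_handle, bed_handle):
    """Return BED chromosome names that have no matching FASTA record."""
    return sorted(bed_chroms(bed_handle) - fasta_names(fasta_handle))


# Toy inputs modelled on the header and BED line shown in this thread.
fasta_text = ">HWI-D00270:252:CB1D5ANXX:8:1303:19141:48584/1\nACAGCTGATTAG\n"
bed_text = "chrI\t2653\t2738\t(TAG)n\t322\t+\n"

missing = missing_chroms(io.StringIO(fasta_text), io.StringIO(bed_text))
print(missing)
```

With real files, the two `io.StringIO` objects would be replaced by `open(...)` handles. Any name this reports is one that the intervals refer to but the FASTA file does not contain.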

fasta chromosome bedtools • 1.3k views

The first example looks like Illumina reads converted to fasta format. That does not appear to be chromosome data at all.

Are you using a bed file for the intervals? What does it look like?


I have a bed file I downloaded from the UCSC genome browser:

chrI 2653 2738 (TAG)n 322 +
chrI 2974 3011 AT_rich 30 +
chrI 3034 3069 AT_rich 21 +


Why are you using a file with Illumina read data in fasta format instead of using the genome sequence file from UCSC? I assume you want to retrieve the fasta sequence corresponding to those intervals?
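To make the idea of retrieving the sequence for each interval concrete, here is a rough Python sketch of that kind of lookup. It is an illustration only, not how bedtools is implemented. It assumes BED coordinates are 0-based and half-open, ignores the strand column, and labels each result as `chrom:start-end`; the function names and the toy genome are hypothetical.

```python
import io


def read_fasta(handle):
    """Read a FASTA file into a dict of name -> sequence string."""
    seqs = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def extract_intervals(seqs, bed_handle):
    """Slice out each BED interval; skip chromosomes absent from the FASTA."""
    records = []
    for line in bed_handle:
        if line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if chrom not in seqs:
            print(f"skipping {chrom}: not found in FASTA")
            continue
        # BED is 0-based, half-open, which matches Python slicing directly.
        records.append((f"{chrom}:{start}-{end}", seqs[chrom][start:end]))
    return records


# Toy genome and intervals; a real run would pass open() file handles.
genome = io.StringIO(">chrI\nACAGCTGATT\nAGACACAATG\n")
bed = io.StringIO("chrI\t2\t8\ttoy\t0\t+\nchrII\t0\t5\ttoy\t0\t+\n")

records = extract_intervals(read_fasta(genome), bed)
for header, seq in records:
    print(f">{header}\n{seq}")
```

The second toy interval is skipped because `chrII` has no record in the toy genome, which is the same kind of mismatch that occurs when the FASTA file holds reads rather than chromosomes.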


Yeah, I'm trying to use the DNA my lab sequenced from a particular region to look for transposons.
