Question: Renaming fasta files with chromosomes
namea38 wrote (8 months ago):

I have a .fasta file that I'm trying to run bedtools getfasta on. When I run it, I get this warning:

WARNING. chromosome (N) was not found in the FASTA file. Skipping.

which I think is because my .fasta headers look like this

>HWI-D00270:252:CB1D5ANXX:8:1303:19141:48584/1
ACAGCTGATTAGACACAATGTCAACAAAGTACTGAAGACCAGAGAAAAACACTTATTATACTC
TTTGTTTTCAGGTGTGGAATGTGCTTTCTACCACGGCTACAAATACTACAAAGGATGTAGTA

and not like this

>chrI
ACAGCTGATTAGACACAATGTCAACAAAGTACTGAAGACCAGAGAAAAACACTTATTATACTC
TTTGTTTTCAGGTGTGGAATGTGCTTTCTACCACGGCTACAAATACTACAAAGGATGTAGTA

Is there a way I can edit the headers so they contain only the chromosome name?

Tags: chromosome, bedtools, fasta

In your first example you appear to have Illumina reads converted to fasta format. That does not look like chromosome data at all.

Are you using a bed file for the intervals? What does it look like?
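One quick way to confirm a mismatch like this is to compare the sequence names in the fasta headers with the chromosome names in the first column of the bed file. Below is a minimal Python sketch of that check. The function names are hypothetical, the inline strings stand in for the real files, and it assumes a sequence name is the header text up to the first whitespace.

```python
import io

def fasta_names(handle):
    """Collect sequence names: header text after '>' up to the first whitespace."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names

def bed_chroms(handle):
    """Collect chromosome names from column 1 of a BED file."""
    chroms = set()
    for line in handle:
        line = line.strip()
        # Skip blank lines and the usual comment/header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chroms.add(line.split()[0])
    return chroms

def missing_chroms(fasta_handle, bed_handle):
    """Return BED chromosome names that have no matching FASTA header."""
    return sorted(bed_chroms(bed_handle) - fasta_names(fasta_handle))

# Toy data modelled on the examples in this thread (sequence shortened).
reads = io.StringIO(">HWI-D00270:252:CB1D5ANXX:8:1303:19141:48584/1\nACAGCTGATTAG\n")
bed = io.StringIO("chrI\t2653\t2738\t(TAG)n\t322\t+\n")
print(missing_chroms(reads, bed))
```

With real files, the two `io.StringIO` objects would be replaced by `open(...)` handles. Any name reported here is one that bedtools getfasta would be unable to find in the fasta file.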

written 8 months ago by genomax

I have a bed file I downloaded from the UCSC Genome Browser:

chrI 2653 2738 (TAG)n 322 +
chrI 2974 3011 AT_rich 30 +
chrI 3034 3069 AT_rich 21 +

written 8 months ago by namea38

Why are you using a file with Illumina read data in fasta format instead of using the genome sequence file from UCSC? I assume you want to retrieve the fasta sequence corresponding to those intervals?
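For reference, retrieving the sequence for an interval amounts to slicing the chromosome sequence by the bed coordinates, which are 0-based with an exclusive end. Here is a rough Python sketch of that idea. It is not how bedtools is implemented, the function names and the toy sequence are made up for illustration, and it ignores the strand column.

```python
import io

def read_fasta(handle):
    """Parse FASTA into {name: sequence}; name is the header up to first whitespace."""
    seqs = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs

def extract_intervals(seqs, bed_handle):
    """Yield (label, subsequence) for each BED line; skip unknown chromosomes."""
    for line in bed_handle:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if chrom not in seqs:
            print(f"WARNING: chromosome ({chrom}) not found in FASTA, skipping")
            continue
        # BED start is 0-based and end is exclusive, matching Python slicing.
        yield f"{chrom}:{start}-{end}", seqs[chrom][start:end]

# Toy genome and intervals (hypothetical data, not the real chrI).
genome = io.StringIO(">chrI\nACAGCTGATT\nAGACACAATG\n")
bed = io.StringIO("chrI\t2\t8\tfeature\t0\t+\nchrII\t0\t5\tother\t0\t+\n")
results = list(extract_intervals(read_fasta(genome), bed))
print(results)
```

The second toy interval names a chromosome that is absent from the fasta input, so it is skipped with a warning, which is the same situation as running the intervals against a file of reads.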

written 8 months ago by genomax

Yeah, I'm trying to use the DNA my lab sequenced from a particular region to look for transposons.

written 8 months ago by namea38