Cell Ranger custom GTF file
8 months ago
Arora • 0

I want to make a custom GTF file from a multi-line FASTA file that contains multiple transcripts, e.g.:

>NM_001282823.1 prolactin receptor (PRLR), mRNA
GCCAAGAGACTGGGAGTCAAAGAAAGTTTCTGAAATCAGTGGATTCTGCTTGAGAACAGAGCCTGGTTAT
>NM_001682822.1 SNAP25 (SNAP25), mRNA
GCCAAGAGACTGGGAGTCAAAGAAAGTTTCTGAAATCAGTGGATTCTGCTTGAGAACAGAGCCTGGTTAT
>NM_001287822.1 CACNA1F (CACNA1F), mRNA
GCCAAGAGACTGGGAGTCAAAGAAAGTTTCTGAAATCAGTGGATTCTGCTTGAGAACAGAGCCTGGTTAT

10x describes the commands below for adding a single marker gene (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/tutorial_mr#marker). Is there a way to use them so that the output is one GTF file with an entry for every FASTA record, rather than adding the records one by one?

cat NM_001282823.1 | grep -v "^>" | tr -d "\n" | wc -c
echo -e 'NM_001282823.1\tunknown\texon\t1\t922\t.\t+\t.\tgene_id "NM_001282823.1"; transcript_id "NM_001282823.1"; gene_name "NM_001282823.1"; gene_biotype "protein_coding";' > NM_001282823.1.gtf
gtf 10x single-cell • 338 views

Look for a way to loop over the FASTA entries and write one GTF line per entry based on its header. Biopython might be useful here. 10x's method is not meant for putting together an entire GTF the way you're doing right now, so that part is going to be on you.
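
As a rough illustration of that loop, here is a minimal sketch in plain Python (no Biopython needed, though `Bio.SeqIO.parse(handle, "fasta")` could replace the small parser). It assumes each record should become a single `exon` line in the same shape as the `echo` command in the question, that the record ID is the first token of the header, and that the end coordinate is the sequence length (what the `wc -c` step computes). The function names `read_fasta`, `gtf_line` and `fasta_to_gtf` are made up for this example.

```python
import io


def read_fasta(handle):
    """Yield (record_id, sequence_length) for each entry in a FASTA stream."""
    record_id, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if record_id is not None:
                yield record_id, length
            parts = line[1:].split()
            if not parts:
                raise ValueError("FASTA header without an ID")
            # Assumption: the ID is the first whitespace-delimited token.
            record_id, length = parts[0], 0
        elif record_id is not None:
            # Multi-line sequences are handled by summing line lengths.
            length += len(line)
    if record_id is not None:
        yield record_id, length


def gtf_line(record_id, length, biotype="protein_coding"):
    """Build one exon line spanning the whole record, on the + strand."""
    attrs = (
        f'gene_id "{record_id}"; transcript_id "{record_id}"; '
        f'gene_name "{record_id}"; gene_biotype "{biotype}";'
    )
    fields = [record_id, "unknown", "exon", "1", str(length), ".", "+", ".", attrs]
    return "\t".join(fields)


def fasta_to_gtf(fasta_handle, gtf_handle):
    """Write one GTF line per FASTA record; return the number of records."""
    count = 0
    for record_id, length in read_fasta(fasta_handle):
        gtf_handle.write(gtf_line(record_id, length) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory example; the sequences are placeholders.
    demo = io.StringIO(">tx1 first transcript\nACGTAC\nGT\n>tx2 second\nGGGCC\n")
    out = io.StringIO()
    fasta_to_gtf(demo, out)
    print(out.getvalue(), end="")
```

With real files, something like `with open("transcripts.fa") as fa, open("transcripts.gtf", "w") as gtf: fasta_to_gtf(fa, gtf)` should work (both file names are placeholders). Using the first header token as `gene_name` follows the question's example; pulling the symbol out of the parentheses (e.g. `PRLR`) would need extra parsing.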
