Herald: The Biostar Herald for Monday, August 15, 2022
23 months ago
Biostar 2.9k

The Biostar Herald publishes user-submitted links of bioinformatics relevance. It aims to summarize interesting and relevant information you may have missed. You too can submit links here.

This edition of the Herald was brought to you by contributions from Istvan Albert and was edited by Istvan Albert.

The first gapless, reference-quality, fully annotated genome from a Southern Han Chinese individual | bioRxiv (www.biorxiv.org)

We used long-read DNA sequencing to assemble the genome of a Southern Han Chinese male. We organized the sequence into chromosomes and filled in gaps using the recently completed CHM13 genome as a guide, yielding a gap-free genome, Han1, containing 3,099,707,698 bases. Using the CHM13 annotation as a reference, we mapped all genes onto the Han1 genome and identified additional gene copies, generating a total of 60,708 genes, of which 20,003 are protein coding.

submitted by: Istvan Albert

GitHub - lh3/srf: SRF: Satellite Repeat Finder (github.com)

Satellite Repeat Finder, or SRF in brief, assembles motifs in satellite DNA that are tandemly repeated many times in the genome. It takes short reads, accurate long reads or high-quality contigs as input and reports the consensus of each repeat unit. SRF can identify satellite repeats that are often missed in de novo assembly. For species enriched with high-order repeats (HORs), it tends to find HORs instead of the minimal repeat unit. SRF may also find truly circular genomes such as mitochondrial or chloroplastic genomes if their abundance is high.

submitted by: Istvan Albert
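
To make the distinction between a higher-order repeat and the minimal repeat unit concrete, here is a toy Python sketch. It is not how SRF works internally; it only handles a perfect, error-free tandem array given as a single string, and the function name `minimal_repeat_unit` is made up for illustration.

```python
def minimal_repeat_unit(seq: str) -> str:
    """Return the shortest unit that, repeated end to end, rebuilds seq exactly.

    Toy assumption: seq is a perfect tandem array with no mismatches,
    which real satellite DNA rarely is.
    """
    n = len(seq)
    for period in range(1, n + 1):
        # Only periods that divide the length can tile the sequence exactly.
        if n % period == 0 and seq[:period] * (n // period) == seq:
            return seq[:period]
    return seq  # reached only for the empty string


if __name__ == "__main__":
    # A 4-base motif repeated three times.
    print(minimal_repeat_unit("ACGTACGTACGT"))
    # No internal repetition: the whole sequence is its own unit.
    print(minimal_repeat_unit("ACGTT"))
```

A tool that reports a higher-order repeat would, in this toy picture, return a longer unit made of several slightly diverged copies of the minimal one.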

GitHub - agshumate/Liftoff: An accurate GFF3/GTF lift over pipeline (github.com)

Liftoff is a tool that accurately maps annotations in GFF or GTF between assemblies of the same or closely related species. Unlike current coordinate lift-over tools, which require a pre-generated “chain” file as input, Liftoff is a standalone tool that takes two genome assemblies and a reference annotation as input and outputs an annotation of the target genome. Liftoff uses Minimap2 (Li, 2018) to align the gene sequences from a reference genome to the target genome. Rather than aligning whole genomes, aligning only the gene sequences allows genes to be lifted over even if there are many structural differences between the two genomes.

submitted by: Istvan Albert

GitHub - EBIvariation/variant-remapping: The pipeline for remapping VCF variants between two arbitrary FASTA assemblies. (github.com)

Pipeline for remapping VCF variants between two arbitrary assemblies in FASTA format. No chain file is required. However, it does assume that the source and destination genomes are closely related and was designed with the explicit purpose of lifting over variants from one version of the genome to another.

submitted by: Istvan Albert

GIGGLE: a search engine for large-scale integrated genome analysis | Nature Methods (www.nature.com)

GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation.

submitted by: Istvan Albert
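
For readers unfamiliar with the kind of query GIGGLE answers, the following Python sketch ranks a few interval sets by how many query intervals they overlap. It is a deliberately naive, brute-force illustration: it uses neither GIGGLE's index nor its significance statistics, and the function names and sample data are invented for this example.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end) with half-open coordinates, as in BED files.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def rank_interval_sets(
    query: List[Interval], sets: Dict[str, List[Interval]]
) -> List[Tuple[str, int]]:
    """Rank named interval sets by the number of query intervals they hit.

    Brute force: every query interval is compared against every interval
    in every set, which would not scale to billions of intervals.
    """
    scores = {
        name: sum(1 for q in query if any(overlaps(q, t) for t in intervals))
        for name, intervals in sets.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    query = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
    sets = {
        "enhancers": [("chr1", 150, 160), ("chr1", 590, 700)],
        "promoters": [("chr2", 10, 60)],
        "silencers": [("chr3", 1, 10)],
    }
    for name, hits in rank_interval_sets(query, sets):
        print(name, hits)
```

The brute-force comparison is what an indexed search engine avoids; a raw overlap count is also only a stand-in for a proper significance score.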

