JOURNAL ARTICLE

Bacteriophage classification for assembled contigs using graph convolutional network

Jiayu Shang, Jing-Zhe Jiang, Yanni Sun

Year: 2021   Journal: Bioinformatics   Vol: 37 (Supplement_1)   Pages: i25-i33   Publisher: Oxford University Press

Abstract

Motivation

Bacteriophages (aka phages), which mainly infect bacteria, play key roles in the biology of microbes. As the most abundant biological entities on the planet, the number of discovered phages is only the tip of the iceberg. Recently, many new phages have been revealed using high-throughput sequencing, particularly metagenomic sequencing. Compared to the fast accumulation of phage-like sequences, there is a serious lag in taxonomic classification of phages. High diversity, abundance and limited known phages pose great challenges for taxonomic analysis. In particular, alignment-based tools have difficulty in classifying fast accumulating contigs assembled from metagenomic data.

Results

In this work, we present a novel semi-supervised learning model, named PhaGCN, to conduct taxonomic classification for phage contigs. In this learning model, we construct a knowledge graph by combining the DNA sequence features learned by convolutional neural network and protein sequence similarity gained from gene-sharing network. Then we apply graph convolutional network to utilize both the labeled and unlabeled samples in training to enhance the learning ability. We tested PhaGCN on both simulated and real sequencing data. The results clearly show that our method competes favorably against available phage classification tools.

Availability and implementation

The source code of PhaGCN is available via: https://github.com/KennthShang/PhaGCN.
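The semi-supervised idea in the abstract, a graph convolution that propagates information between labeled and unlabeled contigs while the loss is computed on labeled nodes only, can be illustrated with a minimal sketch. This is not the PhaGCN implementation: the single-layer design, the toy graph, the two-dimensional node features (standing in for CNN-derived sequence features), the fixed weight matrix, and all function names are assumptions made purely for illustration. The layer uses the common symmetric normalization D^{-1/2} (A + I) D^{-1/2} of the adjacency matrix.

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features during propagation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def gcn_layer(a_norm, features, weights):
    """One graph convolution followed by a per-node softmax: softmax(A_norm X W)."""
    return [softmax(r) for r in matmul(matmul(a_norm, features), weights)]

def masked_cross_entropy(probs, labels):
    """Average loss over labeled nodes only; labels[i] is None for unlabeled nodes."""
    losses = [-math.log(p[y]) for p, y in zip(probs, labels) if y is not None]
    return sum(losses) / len(losses)

# Hypothetical toy graph: 4 contigs, with edges 0-1 and 2-3
# (standing in for links such as those from a gene-sharing network).
adj = [[0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0]]
# Hypothetical 2-dimensional node features.
features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
# Fixed illustrative weights; a real model would learn these by gradient descent.
weights = [[2.0, -2.0], [-2.0, 2.0]]
# Only contigs 0 and 2 carry a taxonomic label; 1 and 3 are unlabeled.
labels = [0, None, 1, None]

a_norm = normalize_adjacency(adj)
probs = gcn_layer(a_norm, features, weights)
predictions = [max(range(len(p)), key=p.__getitem__) for p in probs]
loss = masked_cross_entropy(probs, labels)
```

In this toy setup the unlabeled contigs 1 and 3 receive the class of the labeled neighbor they are connected to, which is the sense in which unlabeled samples take part in the model even though they contribute nothing to the loss.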

Keywords:
Contig, Metagenomics, DNA sequencing, Computational biology, Convolutional neural network, Biology, Biological classification, Graph, Computer science, Artificial intelligence, Genome, Genetics, Gene, Theoretical computer science, Evolutionary biology

Metrics

Cited By: 136
FWCI (Field Weighted Citation Impact): 19.28
Refs: 43
Citation Normalized Percentile: 1.00
Is in top 1%
Is in top 10%

Topics

Bacteriophages and microbial interactions
Physical Sciences →  Environmental Science →  Ecology
Genomics and Phylogenetic Studies
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
RNA and protein synthesis mechanisms
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology