JOURNAL ARTICLE

Gene Sequence Representation Learning Based on Virus Transmission Network

Abstract

Gene sequence data obtained in practice almost always contain non-coding regions and missing segments. Existing gene sequence representation methods mostly extract features from high-dimensional sequences by hand, which is usually computationally expensive, and their prediction accuracy depends heavily on how well biological background knowledge is used. In this work, we propose a gene sequence representation method based on graph context information in a virus transmission network. After encoding the target node's virus sequence, we use an attention mechanism to aggregate the gene sequence information of its neighbor nodes, which yields a new representation of the target node's gene sequence. The representation model is optimized on the premise that the gene sequences of neighboring nodes are more similar to each other than those of non-neighboring nodes. Once trained, the new representation not only captures sequence features accurately but also greatly reduces the dimensionality of the gene sequence and improves computational efficiency. We train the model separately on a simulated transmission network, a SARS-CoV-2 transmission network, and an HIV transmission network, and then predict the unsampled infections in each network. The experimental results show that the model is effective and performs better than the other models it was compared with. Its success in predicting unsampled infections in virus transmission networks also has practical significance for epidemiological investigation. © 2021, Science Press. All rights reserved.
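
The pipeline described in the abstract (encode the target sequence, aggregate neighbor information with attention, and train so that neighbors end up more similar than non-neighbors) can be illustrated with a minimal Python sketch. The abstract does not specify the encoder, the form of the attention, or the loss, so everything below is an assumption made for illustration: the one-hot encoding, the scaled dot-product attention, the equal mixing of target and context, and the hinge-style margin loss are hypothetical stand-ins, not the authors' implementation.

```python
import math

NUCLEOTIDES = "ACGT"


def one_hot_encode(seq):
    """Flatten a sequence into a one-hot vector (assumed encoding).

    Symbols outside A/C/G/T (e.g. 'N' or '-' for missing bases)
    become all-zero blocks.
    """
    vec = []
    for ch in seq.upper():
        vec.extend(1.0 if ch == n else 0.0 for n in NUCLEOTIDES)
    return vec


def project(vec, weights):
    """Linear map to a lower-dimensional embedding; `weights` is a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(target, neighbors):
    """Mix the target embedding with an attention-weighted sum of neighbors.

    Scaled dot-product attention and the 50/50 mix are assumptions.
    """
    if not neighbors:
        return list(target)
    scale = math.sqrt(len(target))
    weights = softmax([dot(target, n) / scale for n in neighbors])
    context = [
        sum(w * n[i] for w, n in zip(weights, neighbors))
        for i in range(len(target))
    ]
    return [0.5 * t + 0.5 * c for t, c in zip(target, context)]


def neighbor_margin_loss(anchor, neighbor, non_neighbor, margin=1.0):
    """Hinge loss that reaches zero once the anchor-neighbor similarity
    exceeds the anchor-non-neighbor similarity by at least `margin`."""
    return max(0.0, margin - dot(anchor, neighbor) + dot(anchor, non_neighbor))
```

In a full model the projection weights and any attention parameters would be learned by minimizing a loss of this kind over sampled (node, neighbor, non-neighbor) triples from the transmission network. The sketch only shows how the pieces fit together.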

Keywords:
Computer science, Sequence (biology), Alignment-free sequence analysis, Representation (politics), Artificial intelligence, Gene, Computational biology, Sequence alignment, Genetics, Peptide sequence, Biology

Topics

Bioinformatics and Genomic Networks
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Gene expression and cancer classification
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Machine Learning in Bioinformatics
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
