JOURNAL ARTICLE

Decoy-free protein-level false discovery rate estimation

Ben Teng, Ting Huang, Zengyou He

Year: 2013   Journal: Bioinformatics   Vol: 30 (5)   Pages: 675-681   Publisher: Oxford University Press

Abstract

Motivation: Statistical validation of protein identifications is an important issue in shotgun proteomics. The false discovery rate (FDR) is a powerful statistical tool for evaluating protein identification results. Several methods have been proposed for estimating the FDR at the protein level. However, existing FDR estimation methods based on the target-decoy strategy still have certain drawbacks.

Results: In this article, we propose a decoy-free protein-level FDR estimation method. Under the null hypothesis that each candidate protein matches an identified peptide totally at random, we assign statistical significance to protein identifications in terms of the permutation P-value and use these P-values to calculate the FDR. Our method consists of three key steps: (i) generating random bipartite graphs with the same structure; (ii) calculating the protein scores on these random graphs; and (iii) calculating the permutation P-value and the final FDR. Because executing the protein inference algorithm thousands of times in step (ii) is time-consuming or even prohibitive, we first train a linear regression model using the original bipartite graph and the identification scores provided by the target inference algorithm. We then use the learned regression model as a substitute for the original protein inference method to predict protein scores on the shuffled graphs. We test our method on six publicly available datasets. The results show that our method is comparable to state-of-the-art algorithms in terms of estimation accuracy.

Availability: The source code of our algorithm is available at https://sourceforge.net/projects/plfdr/

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.
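The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the protein features (an intercept, the number of matched peptides, and the sum of matched peptide scores), the shuffling scheme (permuting peptide scores over a fixed graph, which keeps the graph structure unchanged), the use of the surrogate's own prediction as the observed score, and the Benjamini-Hochberg procedure as a stand-in for the FDR calculation, which the abstract does not specify. All function names and toy values are hypothetical.

```python
import numpy as np

def protein_features(adj, pep_scores):
    """Build per-protein features from a 0/1 protein-by-peptide matrix.

    Assumed features: intercept, peptide count, sum of peptide scores.
    """
    degree = adj.sum(axis=1)
    score_sum = adj @ pep_scores
    return np.column_stack([np.ones(adj.shape[0]), degree, score_sum])

def fit_surrogate(adj, pep_scores, protein_scores):
    """Fit a linear regression that mimics the target inference algorithm."""
    X = protein_features(adj, pep_scores)
    coef, *_ = np.linalg.lstsq(X, protein_scores, rcond=None)
    return coef

def permutation_pvalues(adj, pep_scores, coef, n_perm=1000, seed=0):
    """Permutation P-values from surrogate scores on shuffled graphs."""
    rng = np.random.default_rng(seed)
    # Assumption: the observed score is the surrogate's prediction on the
    # original graph, so observed and null scores share one scale.
    observed = protein_features(adj, pep_scores) @ coef
    exceed = np.zeros(adj.shape[0])
    for _ in range(n_perm):
        # Assumption: shuffling peptide scores over the fixed graph stands in
        # for "random bipartite graphs with the same structure".
        shuffled = rng.permutation(pep_scores)
        null = protein_features(adj, shuffled) @ coef
        exceed += null >= observed
    # Add-one correction keeps every P-value strictly positive.
    return (exceed + 1) / (n_perm + 1)

def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted P-values (stand-in FDR estimate)."""
    n = len(pvals)
    order = np.argsort(pvals)
    ranked = pvals[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity from the largest P-value downwards.
    q_sorted = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.minimum(q_sorted, 1.0)
    return q

# Toy data (hypothetical): 5 proteins, 6 peptides.
adj = np.array([
    [1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [1, 0, 0, 0, 0, 1],
])
pep_scores = np.array([0.95, 0.90, 0.40, 0.20, 0.15, 0.85])
protein_scores = np.array([0.98, 0.70, 0.10, 0.15, 0.92])

coef = fit_surrogate(adj, pep_scores, protein_scores)
pvals = permutation_pvalues(adj, pep_scores, coef, n_perm=500)
qvals = bh_qvalues(pvals)
print(np.round(pvals, 3), np.round(qvals, 3))
```

The point of the surrogate is cost: each permutation needs only a matrix product and a dot product, rather than a full run of the inference algorithm.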

Keywords:
False discovery rate, Computer science, Inference, Resampling, Bipartite graph, Decoy, Permutation (music), Statistical inference, Random permutation, Identification (biology), Statistical hypothesis testing, Algorithm, Data mining, Artificial intelligence, Graph, Statistics, Mathematics, Theoretical computer science, Biology

Metrics

Cited By: 7
FWCI (Field Weighted Citation Impact): 0.85
Refs: 16
Citation Normalized Percentile: 0.75

Topics

Gene expression and cancer classification
Life Sciences → Biochemistry, Genetics and Molecular Biology → Molecular Biology
Machine Learning in Bioinformatics
Life Sciences → Biochemistry, Genetics and Molecular Biology → Molecular Biology
Advanced Proteomics Techniques and Applications
Physical Sciences → Chemistry → Spectroscopy
