Leveraging Fecal Bacterial Survey Data to Predict Colorectal Tumors

Zhang, Bangzhou and Xu, Shuangbin and Xu, Wei and Chen, Qiongyun and Chen, Zhangran and Yan, Changsheng and Fan, Yanyun and Zhang, Huangkai and Liu, Qi and Yang, Jie and Yang, Jinfeng and Xiao, Chuanxing and Xu, Hongzhi and Ren, Jianlin (2019) Leveraging Fecal Bacterial Survey Data to Predict Colorectal Tumors. Frontiers in Genetics, 10. ISSN 1664-8021

Abstract

Colorectal cancer (CRC) ranks second in cancer-associated mortality and third in incidence worldwide. Most CRCs follow the adenoma-carcinoma sequence, and patients have a more than 90% chance of survival if diagnosed at an early stage. However, the recommended screening by colonoscopy is invasive, expensive, and poorly adhered to. Recently, several studies reported that fecal bacteria might provide non-invasive biomarkers for CRC and precancerous tumors. We therefore collected and uniformly re-analyzed these published fecal 16S rDNA sequencing datasets to verify the association and to identify biomarkers that classify and predict colorectal tumors using the random forest method. A total of 1674 samples (330 CRC, 357 advanced adenoma, 141 adenoma, and 846 control) from 7 studies were analyzed. Using random effects and fixed effects models, we observed significant differences in alpha-diversity and beta-diversity between individuals with CRC and those with a normal colon, but not between individuals with adenoma and those with a normal colon. We identified various bacterial genera with significant odds ratios for colorectal tumors at different stages. By building random forest models with 10-fold cross-validation and evaluating them on new test datasets, we classified individuals with CRC, advanced adenoma, adenoma, and normal colon. All approaches achieved comparable performance, as measured by AUC, at the entire OTU level, the entire genus level, and the common genus level. When all samples were combined, the AUC of the random forest model based on 12 common genera reached 0.846 for CRC, although prediction performed poorly for advanced adenoma and adenoma.
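
The abstract reports classifier performance as AUC. As a minimal illustration of that metric only, the sketch below computes AUC from predicted scores using the rank-sum (Mann-Whitney U) identity. It is not the authors' pipeline: the function name `roc_auc` and the toy labels and scores are assumptions made up for this example, and in the study the scores would come from a random forest evaluated on held-out cross-validation folds.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 values (1 = case, e.g. CRC; 0 = control).
    scores: iterable of predicted scores, higher meaning "more likely a case".
    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n

    # Assign 1-based ranks, averaging within each group of tied scores.
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")

    # U statistic for the positive class, normalised by the number of
    # positive/negative pairs, equals the AUC.
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Hypothetical toy data, not taken from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking of cases against controls and 1.0 to perfect separation, which is the scale on which the reported 0.846 for CRC should be read.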

Item Type: Article
Subjects: Academic Digital Library > Medical Science
Depositing User: Unnamed user with email info@academicdigitallibrary.org
Date Deposited: 08 Feb 2023 07:00
Last Modified: 23 Mar 2024 04:27
URI: http://publications.article4sub.com/id/eprint/587