
Table 6 Details of data normalization

From: Universal disease biomarker: can a fixed set of blood microRNAs diagnose multiple diseases?

  

| GEO ID | Disease | Data set names/Data retrieval methods | Data normalization timing | Data normalization methods |
|---|---|---|---|---|
| GSE46579 | AD | GSE46579_AD_ngs_data_summarized.xls.gz | before FE | zero mean, unit variance |
| GSE37472 | carcinoma | getGEO | before FE | zero mean, unit variance |
| GSE49823 | CAD | getGEO | after FE | zero mean, unit variance |
| GSE43329 | NPC | getGEO | before FE | zero mean, unit variance + |
| GSE50013 | HCC | getGEO | before FE # | zero mean, unit variance |
| GSE41922 | BC | GSE41922_series_matrix.txt.gz | after FE | zero mean, unit variance |
| GSE49665 | AML | getGEO | after FE | zero mean, unit variance |

  1. *No normalization for SVM/lasso; +no normalization for SVM with PCA-based FE; #after FE for PCA-based LDA with universal features. All normalizations were sample-based; i.e., each sample was normalized to have zero mean and unit variance. FE, feature extraction; AD, Alzheimer disease; CAD, coronary artery disease; NPC, nasopharyngeal carcinoma; HCC, hepatocellular carcinoma; BC, breast cancer; AML, acute myeloid leukemia. Data retrieval methods/data set names were used to name files and for analysis. getGEO indicates that individual sample profiles whose file names started with “GEO” were downloaded with the getGEO command in R.
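
The sample-based normalization described in the footnote can be illustrated with a minimal sketch. This is a Python illustration, not the authors' R code. The function names and the toy values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the table does not say which estimator was used.

```python
from statistics import mean, stdev


def normalize_sample(values):
    """Scale one sample's expression values to zero mean and unit variance."""
    mu = mean(values)
    # Assumption: sample standard deviation (n - 1 denominator).
    sd = stdev(values)
    if sd == 0:
        raise ValueError("a constant profile cannot be scaled to unit variance")
    return [(v - mu) / sd for v in values]


def normalize_samples(profiles):
    """Normalize each sample independently (sample-based, not feature-based).

    `profiles` maps a sample name to its list of expression values.
    """
    return {name: normalize_sample(vals) for name, vals in profiles.items()}


# Hypothetical toy profiles: two samples, four features each.
profiles = {
    "sample_1": [1.0, 2.0, 3.0, 4.0],
    "sample_2": [10.0, 10.5, 9.0, 12.5],
}
normalized = normalize_samples(profiles)
```

Each sample is scaled using only its own values, so the result for one sample does not depend on which other samples are in the data set.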