Physiological data of patients tested for breast cancer.

breastcancer

Format

A data frame containing 683 patients (rows) and 10 variables (columns).

thickness

Clump Thickness

cellsize.unif

Uniformity of Cell Size

cellshape.unif

Uniformity of Cell Shape

adhesion

Marginal Adhesion

epithelial

Single Epithelial Cell Size

nuclei.bare

Bare Nuclei

chromatin

Bland Chromatin

nucleoli

Normal Nucleoli

mitoses

Mitoses

diagnosis

Criterion: Absence/presence of breast cancer.

Values: FALSE vs. TRUE (65.0% vs. 35.0%).

Source

https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Original)

Original creator:

Dr. William H. Wolberg (physician), University of Wisconsin Hospitals, Madison, Wisconsin, USA

Details

We made the following enhancements to the original data for improved usability:

  • The ID number of the cases was excluded.

  • The numeric criterion with value 2 for benign and 4 for malignant was converted to logical (i.e., FALSE for benign, TRUE for malignant).

  • 16 cases were excluded because they contained NA values.

Other than that, the data remains consistent with the original dataset.
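
The three steps listed above can be sketched in a few lines. This is an illustration only, not the code used to build the dataset. It assumes a raw comma-separated layout of case ID, nine scores, then the class code, with "?" marking a missing value. The three sample rows and the helper name clean are made up for the example.

```python
import csv
import io

# Made-up rows in the assumed raw layout: case ID, nine scores, class code
# (2 = benign, 4 = malignant). "?" is assumed to mark a missing value.
RAW = """\
1001,5,1,1,1,2,1,3,1,1,2
1002,8,10,10,8,7,10,9,7,1,4
1003,8,4,5,1,2,?,7,3,1,4
"""

# Predictor names as documented on this page.
COLUMNS = ["thickness", "cellsize.unif", "cellshape.unif", "adhesion",
           "epithelial", "nuclei.bare", "chromatin", "nucleoli", "mitoses"]


def clean(raw_text):
    """Apply the three documented steps to raw comma-separated text."""
    rows = []
    for record in csv.reader(io.StringIO(raw_text)):
        if not record:
            continue  # skip blank lines
        # Step 1: split off the case ID and discard it.
        _case_id, *scores, label = record
        # Step 3: exclude any case with a missing score.
        if "?" in scores:
            continue
        row = {name: int(value) for name, value in zip(COLUMNS, scores)}
        # Step 2: recode the class as logical (assumed: 4 = malignant = True).
        row["diagnosis"] = (label == "4")
        rows.append(row)
    return rows


cleaned = clean(RAW)
```

With these sample rows, the third case is dropped because of its missing value, and each remaining case holds the nine predictors plus a logical diagnosis.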