Nathaniel Phillips, University of Basel
Reading Group, PWC, Zurich
"As the city’s principal public hospital, Cook County was the place of last resort for the hundreds of thousands of Chicagoans without health insurance. Resources were stretched to the limit. The hospital’s cavernous wards were built for another century. There were no private rooms, and patients were separated by flimsy plywood dividers. There was no cafeteria or private telephone—just a payphone for everyone at the end of the hall. In one possibly apocryphal story, doctors once trained a homeless man to do routine lab tests because there was no one else available." Malcolm Gladwell, Blink.
Neth et al. (2014). "Homo heuristicus in the financial world".
Property | Complex | Simple |
---|---|---|
Example | Regression, Random Forests, Bayes | Fast and Frugal Tree (FFT) |
Information Requirements | High | Low |
Cost of use | High | Low |
Search | Comprehensive | Sequential |
Speed | Slow | Fast |
Transparency, ease of use | Medium or Low | High |
Accuracy | Depends | Depends |
Problem: While there are many packages for creating non-frugal decision trees (like rpart), no such tool exists for fast and frugal trees.
Solution: FFTrees
An easy-to-use R package to create, visualize, and implement fast and frugal decision trees.
age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | diagnosis |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
63 | 1 | ta | 145 | 233 | 1 | hypertrophy | 150 | 0 | 2.3 | down | 0 | fd | 0 |
67 | 1 | a | 160 | 286 | 0 | hypertrophy | 108 | 1 | 1.5 | flat | 3 | normal | 1 |
67 | 1 | a | 120 | 229 | 0 | hypertrophy | 129 | 1 | 2.6 | flat | 2 | rd | 1 |
37 | 1 | np | 130 | 250 | 0 | normal | 187 | 0 | 3.5 | down | 0 | normal | 0 |
41 | 0 | aa | 130 | 204 | 0 | hypertrophy | 172 | 0 | 1.4 | up | 0 | normal | 0 |
56 | 1 | aa | 120 | 236 | 0 | normal | 178 | 0 | 0.8 | up | 0 | normal | 0 |
# Step 0: Install FFTrees (v.1.2.0)
install.packages("FFTrees")
# Step 1: Load the package
library("FFTrees")
# Step 2: Create an fft decision model with FFTrees
heart.fft <- FFTrees(formula = diagnosis ~ .,
                     data = heartdisease)
FFTs are very cheap to implement
Heart disease data
cue | cost | description | values |
---|---|---|---|
thal | $102 | Thallium scintigraphy, a nuclear imaging test that measures blood flow | normal (n), fixed defect (fd), reversible defect (rd) |
cp | $1 | Chest pain type | typical (ta), atypical (aa), non-anginal pain (np), asymptomatic (a) |
ca | $101 | Number of major vessels colored by fluoroscopy, an X-ray imaging tool | 0, 1, 2, or 3 |
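To make the cost argument concrete, here is a minimal base-R sketch of how sequential search keeps costs down. The cue costs come from the table above and the six patients from the heart disease table earlier. The cue order, the exit rules, and the function name `classify_fft` are hypothetical and chosen only for illustration; the sketch also assumes that the cost of classifying a patient is the sum of the costs of the cues inspected before the tree exits.

```r
# Cue costs in dollars, taken from the table above.
cue_cost <- c(thal = 102, cp = 1, ca = 101)

# Hypothetical tree (illustration only): check thal, then cp, then ca.
# Each step either exits with a decision or passes the patient on.
classify_fft <- function(thal, cp, ca) {
  cost <- cue_cost[["thal"]]
  if (thal %in% c("rd", "fd")) return(list(decision = 1, cost = cost))
  cost <- cost + cue_cost[["cp"]]
  if (cp != "a") return(list(decision = 0, cost = cost))
  cost <- cost + cue_cost[["ca"]]
  list(decision = as.numeric(ca > 0), cost = cost)
}

# First six patients from the heart disease table.
patients <- data.frame(
  thal = c("fd", "normal", "rd", "normal", "normal", "normal"),
  cp   = c("ta", "a", "a", "np", "aa", "aa"),
  ca   = c(0, 3, 2, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Classify each patient and record what it cost to reach a decision.
results <- lapply(seq_len(nrow(patients)), function(i) {
  classify_fft(patients$thal[i], patients$cp[i], patients$ca[i])
})
patients$decision <- vapply(results, function(r) r$decision, numeric(1))
patients$cost     <- vapply(results, function(r) r$cost, numeric(1))

mean_cost <- mean(patients$cost)  # average cost under sequential search
full_cost <- sum(cue_cost)        # cost of always measuring all three cues
```

Under these assumptions, only one of the six patients needs all three tests, so the average cost per patient is $119.50 against $204 for measuring every cue.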
The FFTrees package can be used with any dataset that has a binary criterion.
mushrooms.fft <- FFTrees(poisonous ~ ., data = mushrooms)
breast.fft <- FFTrees(diagnosis ~ ., data = breastcancer)
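When the outcome of interest is not binary yet, it has to be recoded first. The sketch below shows one way this might be done in base R, using the built-in `iris` data as a stand-in; the names `iris_binary` and `virginica` are assumptions made for this example.

```r
# iris ships with R; Species has three classes, so it is not a binary criterion.
iris_binary <- iris

# Recode the criterion: TRUE for virginica, FALSE for the other two species.
iris_binary$virginica <- iris_binary$Species == "virginica"

# Drop the original column so it cannot be used as a cue.
iris_binary$Species <- NULL

# Check the class balance of the new criterion.
criterion_counts <- table(iris_binary$virginica)
```

A data frame prepared this way could then be passed to the same call form used above, for example `FFTrees(virginica ~ ., data = iris_binary)`.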
install.packages("FFTrees")
install.packages("yarrr")
Calculate a decision threshold t for each cue that maximizes the cue’s balanced accuracy (bacc) in training.
Rank cues in order of their maximum balanced accuracy and select the top N cues.
Create all possible 2^{N−1} trees with these cues, using all exit structures.
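The three steps above can be sketched in base R. This is an illustration under simplifying assumptions, not the package's implementation: it uses only the six patients from the heart disease table, three numeric cues, and a single rule direction ("cue > t"). The helper names `bacc` and `best_threshold` are hypothetical.

```r
# First six patients from the heart disease table (three numeric cues only).
heart_six <- data.frame(
  age       = c(63, 67, 67, 37, 41, 56),
  chol      = c(233, 286, 229, 250, 204, 236),
  thalach   = c(150, 108, 129, 187, 172, 178),
  diagnosis = c(0, 1, 1, 0, 0, 0)
)
truth <- heart_six$diagnosis == 1

# Balanced accuracy: the mean of sensitivity and specificity.
bacc <- function(pred, truth) {
  sens <- sum(pred & truth) / sum(truth)
  spec <- sum(!pred & !truth) / sum(!truth)
  (sens + spec) / 2
}

# Step 1: try every observed value of a cue as threshold and keep the best.
# Simplification: only the rule "cue > threshold" is considered.
best_threshold <- function(x, truth) {
  candidates <- sort(unique(x))
  scores <- vapply(candidates, function(thr) bacc(x > thr, truth), numeric(1))
  best <- which.max(scores)
  data.frame(threshold = candidates[best], bacc = scores[best])
}

cues <- c("age", "chol", "thalach")
cue_stats <- do.call(rbind, lapply(cues, function(cue) {
  data.frame(cue = cue, best_threshold(heart_six[[cue]], truth),
             stringsAsFactors = FALSE)
}))

# Step 2: rank cues by their maximum balanced accuracy and keep the top N.
top_n <- 3
ranked <- cue_stats[order(cue_stats$bacc, decreasing = TRUE), ]
top_cues <- head(ranked, top_n)

# Step 3: enumerate exit structures. Each of the first N - 1 cues exits
# on either 0 or 1; the last cue always has both exits.
exit_structures <- expand.grid(rep(list(c(0, 1)), top_n - 1))
names(exit_structures) <- paste0("exit_", seq_len(top_n - 1))
n_trees <- nrow(exit_structures)  # equals 2^(top_n - 1)
```

With N = 3 cues this yields 2^{3−1} = 4 exit structures, one candidate tree per row of `exit_structures`.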