DAS 6B: Thinking Outside the Black Box: balancing model interpretability and accuracy

Track: Data Analytics Symposium

Session Number: DAS 6B
Date: Thu, Jul 27th, 2017
Time: 10:15 AM - 11:00 AM

Description:

Today’s technology makes it easier than ever to train complex predictive models for fundraising. Analytics professionals accustomed to predicting giving with generalized linear models can now choose more flexible models, such as gradient-boosted trees, random forests, and neural networks. In practice, these models avoid the restrictive assumptions of parametric models and often achieve greater accuracy, but they are frequently seen as “black box” approaches that are less insightful, harder to implement, or more prone to overfitting.

In this session, we will bridge this gap from both sides: with techniques to improve the performance of basic models and with approaches to visualizing and communicating more complex models, using planned giving data as a real-world example.

Because the presenter sympathizes with the unfortunate attendees who are all stuck in a windowless room across the street from the Happiest Place on Earth, he has themed this presentation in a fashion that will bring the Magic here to APRA (hopefully without getting sued).

Sub-Categorization: Enterprise Track
Session Type: Breakout Session (45 minutes)

Primary Competency: DA:Competency 4: Statistical Techniques and Competencies
Secondary Competency: DA:Competency 6: Communication
Tertiary Competency: None
Intended Audience Level: Level II
Learning Objective #1: Attendees will improve model accuracy by understanding the bias-variance tradeoff and the advantages of more flexible models, choosing the performance metrics most relevant to their organizations, and measuring performance on test data.
Learning Objective #2: Attendees will explore the barriers to interpretability posed by more flexible models and review model-specific and model-agnostic interpretation techniques.
Prerequisites: Prior experience with building and assessing predictive models for fundraising applications is recommended. This session assumes familiarity with statistical concepts but not software packages. Examples will be presented using R packages, but the equivalent Python libraries and SAS procedures (if applicable) will be referenced.