CS 5010

Programming and Systems for Data Science

Global Terrorism Data

Group 1

Grant Redfield, Brago Aboagye Nyame, Mary Youssef

Pro Football Focus Data

Group 3

Jesse Katz, Andrej Erkelens, Matthew Edwards


Group 5

Manpreet Dhindsa, Robel Semunegus, Amber Curran

Basketball Analytics

Group 7

Allie Ridgway, John Hazelton, Julie Crowe, Jake Kolessar

Housing Market

Group 9

Quinton Mays, Matt Suozzi, Eric Pratsch

Wildfires in California

Group 11

Will Tyree, Christian Schroeder, Antoine Edelman



COVID-19 Analytics

Group 13

Matthew Sachs, Thomas Butler, Karan Manwani

CFB Visualization Tool

Group 2

Sarah Rodgers, TJ McIntyre, Drew Haynes

Population Density & Climate

Group 4

Elena Tsvetkova, Jonathan Shakes, Anita Taucher

Corn Syrup

Group 6

Robert Knuuti, Swaroop Veerabhadrappa, Nitika Kataria

UN Refugee Agency Data

Group 8

Colin Warner, Allison Hansen, Amanda Maruca

Mental Illness Data

Group 10

Samy Kebaish, Gretchen Larrick, Kelly Farrell

Impact of COVID-19 on Bike Sharing

Group 12

Carol Moore, Jenny Jang, Elina Ribakova

Unemployment/Real Estate Market Investment

Group 14

Matt Litz, Abhijeet Chawhan, Abhishek Bada

Course Project Presentations

The objective of this course is to introduce basic data analysis techniques, including data analysis at scale. Students are introduced to essential programming techniques in Python, an increasingly prominent language for data science and “big data” manipulation.

This course is project-based, consisting of a semester project and final project presentations.

Evaluation Rubric

1. Introduction: Describe the project scenario.

2. An appropriate data set was used.

3. Appropriate data structures are used.

4. Data pre-processing.

5. Data Analysis / Data Processing.

6. Testing: Describe any test-driven development and/or unit tests.

7. Results displayed appropriately for each test.

8. Explanation of Results & Conclusions.

9. Presentation skills and video.