CS 5010

Programming and Systems for Data Science

The objective of this course is to introduce basic data analysis techniques, including data analysis at scale.

The course also covers essential complementary topics from software development methods, software testing and debugging, and visualization.

To facilitate data manipulation and analysis, students are introduced to essential programming techniques in Python, an increasingly prominent language for data science and “big data” manipulation.

Presentation Schedule

Time Slots            May 5           Apr 28
6:30 - 6:45 pm        Group 1
6:45 - 7:00 pm        Group 4         Group 13
7:00 - 7:15 pm        Group 14        Group 7
7:15 - 7:30 pm        Group 10        Group 2
7:30 - 7:45 pm        Group 6
7:45 - 8:00 pm        Group 5
8:00 - 8:15 pm        Group 11
8:15 - 8:30 pm        Group 8
8:30 - 8:45 pm        Group 9
8:45 - 9:00 pm        Group 12
9:00 - 9:15 pm        Group 3

Evaluation Rubric

  1. Introduction: Describe the project scenario.

  2. An appropriate data set is used.

  3. Appropriate data structures are used.

  4. Data pre-processing.

  5. Data Analysis / Data Processing.

  6. Testing: Describe any test-driven development and/or unit tests.

  7. Results displayed appropriately for each test.

  8. Explanation of Results & Conclusions.

  9. Presentation skills and video.

Course Project Presentations

CFB Visualization Tool

Group 2 Sarah Rodgers, TJ McIntyre, Drew Haynes

COVID-19 Analytics

Group 13 Matthew Sachs, Thomas Butler, Karan Manwani

Basketball Analytics

Group 7 Allie Ridgway, John Hazelton, Julie Crowe, Jake Kolessar
