Upcoming Webinars & On-Demand Learning

Data Ethics: What it Is and Why it Matters

In this webinar we’ll discuss the moral obligations data ethics encompasses and why every analyst and data scientist should care, the principles and best practices of data ethics, and how data ethics relates to the use of machine-learning algorithms. Don’t miss the last webinar of 2022!

Register Here »

On-Demand Learning

Want to pick up a new skill? Check out these on-demand data demos so you can brush up on your data science skills and learn something fun in the process. We’re constantly adding new content to our on-demand learning repository so check back often!

GNU Make

Learn some basics about GNU Make, which is a tool that controls the generation of executables and other non-source files of a program from the program’s source files. Make gets its knowledge of how to build your program from a file called the makefile, which lists each of the non-source files and how to compute it from other files.
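The mechanics described above can be sketched in a minimal, hypothetical makefile for a small C program (the file names here are illustrative, not from the demo). Each rule names a non-source file, the files it is computed from, and the command that rebuilds it:

```make
# Hypothetical makefile: build the executable `app` from two C source files.
# Format of a rule:  target: prerequisites  /  tab-indented recipe.
app: main.o utils.o
	cc -o app main.o utils.o

main.o: main.c utils.h
	cc -c main.c

utils.o: utils.c utils.h
	cc -c utils.c

clean:
	rm -f app *.o
```

Running `make` rebuilds only the targets whose prerequisites have changed since the last build.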

Linear Regression

Linear regression is a commonly used technique, and parts of it are often justified by heuristics. In this video, we discuss the assumptions behind linear regression, and we show how it can be derived from statistical principles. We focus on the use of mean squared error as the error metric and regularization terms, showing how they are both derived from assumptions about the nature of noise in our data.
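As a sketch of the kind of derivation the video covers: assuming the noise in the data is Gaussian, maximizing the likelihood is exactly minimizing squared error, and a Gaussian prior on the weights produces the L2 regularization term.

```latex
% Model: y_i = w^\top x_i + \varepsilon_i, with noise
% \varepsilon_i \sim \mathcal{N}(0, \sigma^2). The log-likelihood is
\log p(y \mid X, w)
  = -\frac{n}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \bigl(y_i - w^\top x_i\bigr)^2,
% so maximizing the likelihood over w is minimizing the mean squared error.
% Adding a Gaussian prior w \sim \mathcal{N}(0, \tau^2 I) and maximizing the
% posterior instead yields the L2 (ridge) penalty:
\hat{w} = \arg\min_{w}\;
  \sum_{i=1}^{n} \bigl(y_i - w^\top x_i\bigr)^2 + \lambda \lVert w \rVert^2,
\qquad \lambda = \frac{\sigma^2}{\tau^2}.
```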

Practical Pipelines

Do you constantly hear the word “pipeline” in the data world? Here’s your chance to get a basic understanding of what a pipeline is and how to implement one to fit your data needs.

Our instructor discusses the fundamentals of a data pipeline as well as common practices in the data community. You’ll learn when a pipeline is necessary, the technologies that can help us build one, and the different forms a pipeline can take.
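In its simplest form, a pipeline is just a chain of extract, transform, and load steps. Here is a minimal sketch with hypothetical data and function names (not code from the demo):

```python
# A minimal extract -> transform -> load pipeline sketch (illustrative names).

def extract():
    """Pull raw records from a source (here, a hard-coded list)."""
    return [{"name": " Ada ", "score": "81"}, {"name": "Grace", "score": "95"}]

def transform(records):
    """Clean each record: strip whitespace, cast the score to an integer."""
    return [{"name": r["name"].strip(), "score": int(r["score"])} for r in records]

def load(records, store):
    """Write the transformed records into a destination (an in-memory list)."""
    store.extend(records)
    return store

store = []
load(transform(extract()), store)
print(store)
```

Real pipelines swap the source and destination for databases, APIs, or files, but the shape of the chain stays the same.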

Visualizing Geospatial Data in Python

Been itching to pick up a new data science skill? Here’s another chance from TDI’s best and brightest!

Ana Hocevar demonstrates how to visualize geospatial data in Python. She shows how to use Python’s Altair library and how to plot maps overlaid with spatial data to build an interactive geospatial visualization. You’ll also learn how to:

  • Clean and prepare the data
  • Make plots in Python using Altair
  • Create a geospatial visualization with Altair
  • Make the visualizations interactive

Creating a web API

Have you ever wanted to learn how to create a web API? Now’s your chance to learn from the best at TDI and pick up a new data science skill!

Don Fox teaches aspiring data scientists how to create a web API using Flask. In addition, you’ll learn the following:

  • Build a web API that serves a machine learning model trained with scikit-learn
  • Use the web API to classify the sentiment of tweets pulled from the Twitter web API
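The core idea can be sketched as follows (a hypothetical toy example, not Don’s actual code; fetching tweets from the Twitter API is omitted): train a small scikit-learn text classifier, then expose it behind a Flask endpoint.

```python
from flask import Flask, jsonify, request
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Train a tiny text classifier (a stand-in for a real tweet-sentiment model).
texts = ["great day", "love this", "awful mess", "hate that"]
labels = ["pos", "pos", "neg", "neg"]
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect JSON like {"text": "some tweet"}; return the predicted label.
    text = request.get_json()["text"]
    return jsonify({"sentiment": model.predict([text])[0]})

if __name__ == "__main__":
    app.run()
```

A client would then POST `{"text": "..."}` to `/predict` and read the `sentiment` field from the JSON response.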

Machine Learning

Ana Hocevar builds a basic natural language machine learning model and deploys it as a simple web app.

Learn common tools and techniques for building a machine learning model in Python and for using it in a web app built with Streamlit. Ana walks through the steps to:

  • Clean and prepare the data
  • Train a machine learning model using scikit-learn
  • Create a simple web app using Streamlit
  • Set up a GitHub repository to share notebooks and software requirements
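The model-building step can be sketched like this (a hypothetical toy example, not Ana’s actual code): a scikit-learn text pipeline that a Streamlit app could then call interactively.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy labelled text data (a stand-in for a real, cleaned corpus).
texts = ["fast and reliable", "works perfectly", "slow and buggy", "totally broken"]
labels = ["good", "good", "bad", "bad"]

# A natural-language pipeline: TF-IDF features into a Naive Bayes classifier.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, labels)

# In a Streamlit app (launched with `streamlit run app.py`), the model could
# be wired to a text box roughly like this:
#   import streamlit as st
#   text = st.text_input("Enter a review")
#   if text:
#       st.write(model.predict([text])[0])
print(model.predict(["fast and reliable"])[0])
```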

Reproducible Research

We introduce the how and the why of presenting results in data science, covering common tools and techniques for sharing interactive reports and conducting reproducible research.

Learn the steps to:

  • Clean up an exploratory Jupyter notebook into a presentable report
  • Set up a GitHub repository to share notebooks and software requirements
  • Create a new git branch to update a notebook without modifying the master copy
  • Distribute a report in an interactive form using Binder
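The branching step above can be sketched with a few git commands (an illustrative throwaway repository; assumes git is installed — the GitHub push and Binder setup are omitted):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo

# Commit the notebook and a requirements file (the file Binder reads).
echo '{"cells": []}' > report.ipynb
echo jupyter > requirements.txt
git add . && git commit -qm "initial notebook and requirements"

# Create a new branch to update the notebook without modifying the main copy.
git checkout -q -b update-notebook
echo '{"cells": ["edited"]}' > report.ipynb
git commit -qam "revise notebook"
git branch --show-current
```

The original branch still holds the untouched notebook; merging (or opening a pull request) brings the changes back once the report is ready.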

Data Science First Steps

We’ll introduce some common tools and techniques for exploring data and demonstrate how to use them to answer meaningful questions. Our instructor will show you the steps to:

  • Start with a question, then introduce a data set
  • Perform initial exploration of the data set
  • Turn the question into something we can compute
  • Find some “checkpoint” values we can compute
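The steps above can be sketched with pandas on a hypothetical dataset (not the webinar’s actual data):

```python
import pandas as pd

# Question: which product category has the highest average sale amount?
sales = pd.DataFrame({
    "category": ["books", "books", "games", "games", "games"],
    "amount": [12.0, 18.0, 30.0, 25.0, 35.0],
})

# Initial exploration of the data set.
print(sales.shape)
print(sales.describe())

# Turn the question into something we can compute: a group-by mean.
avg = sales.groupby("category")["amount"].mean()

# Checkpoint values we can sanity-check by hand against the raw rows.
print(avg)
print(avg.idxmax())
```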



Stay Current. Stay Connected.

Sign up for our newsletter!