Automatically Generating License Data from Python Dependencies

We all know how important it is for a startup to keep track of its open-source licensing. While many people think of open-source licenses as all being the same, there are meaningful differences that could have serious legal implications for your code base. From permissive licenses like MIT or BSD to so-called “reciprocal” or “copyleft” licenses, keeping track of the alphabet soup of licenses across the dependencies in your source code can be a pain.

Today, we’re releasing pylicense, a simple Python module that adds license data as comments directly to your requirements.txt or environment.yml files.


pylicense.py requirements.txt

or

pylicense.py -e environment.yml

Under the covers, it uses xmlrpclib to fetch package data from PyPI and looks for the “license” tag or the “License” classifier. The operation is also idempotent: if you’ve already commented the file with license data, it will not add the comments again.
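To make the lookup concrete, here is a rough sketch of how a license string might be picked out of a package's metadata. This is not pylicense's actual code: the function name pick_license, the shape of the metadata dict (a "license" string plus a "classifiers" list), and the choice to prefer the "license" field over the classifiers are all assumptions for illustration, and the network call to PyPI is left out entirely.

```python
def pick_license(metadata):
    """Return a license string from a PyPI-style metadata dict, or None.

    Assumption: prefer the free-form "license" field and fall back to
    any "License :: ..." trove classifiers.
    """
    declared = (metadata.get("license") or "").strip()
    # Some packages declare a placeholder instead of a real license.
    if declared and declared.upper() != "UNKNOWN":
        return declared

    names = []
    for classifier in metadata.get("classifiers") or []:
        parts = [p.strip() for p in classifier.split("::")]
        if parts[0] == "License":
            # Keep only the last segment, e.g. "BSD License".
            names.append(parts[-1])
    return ", ".join(names) or None
```

Calling pick_license on a dict with a real "license" value returns that value unchanged; the classifiers are only consulted when the field is empty or a placeholder.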

Starting with an environment.yml file like this:

name: website
dependencies:
- astroid=1.3.2=py27_0
- cairo=1.12.18
- cffi=1.1.2=py27_0
- dill=0.2.2=py27_0
....

will yield a file like this one:

name: website
dependencies:
- astroid=1.3.2=py27_0  # LGPL
- cairo=1.12.18  # LGPL 2.1, MPL 1.1
- cffi=1.1.2=py27_0  # MIT
- dill=0.2.2=py27_0  # 3-clause BSD
....
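One way to get this idempotent behavior is to skip any dependency line that already carries a comment. The sketch below assumes a hypothetical annotate helper and a name-to-license mapping that has already been fetched; neither is taken from pylicense's source, and it only handles environment.yml-style "- name=version" lines.

```python
import re

# Matches conda-style "- name=version=build" entries and captures the name.
DEP_LINE = re.compile(r"^\s*-\s*([A-Za-z0-9_.\-]+)")

def annotate(lines, licenses):
    """Append '  # <license>' to dependency lines that have no comment yet."""
    out = []
    for line in lines:
        match = DEP_LINE.match(line)
        # Lines that already contain a comment are left alone, so running
        # the function twice gives the same result as running it once.
        if match and "#" not in line and match.group(1) in licenses:
            line = "{}  # {}".format(line.rstrip(), licenses[match.group(1)])
        out.append(line)
    return out
```

Feeding the output of annotate back into annotate returns it unchanged, which is the property that makes re-running the tool on an already commented file safe.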

The Python module is free and easily installable from GitHub. For more information (including installation and usage), check out the GitHub page at https://github.com/thedataincubator/pylicense. For the record, pylicense itself is licensed under the MIT license. Enjoy!

Editor’s Note: The Data Incubator is a data science education company. We offer a free eight-week Fellowship helping candidates with PhDs and master’s degrees enter data science careers. Companies can hire talented data scientists or enroll employees in our data science corporate training.
