IMDb: The Internet Movie Database

What Is It?

The Internet Movie Database is available as a collection of flat text files, each of which contains simple relationships between movie entities. For instance, the actresses.txt file contains a list of all actresses within the IMDb and, for each actress, the list of movies (and TV shows, etc.) in which that actress has appeared. The data within these text files is a treasure trove of inspiration for graph problems and projects (think: The Kevin Bacon Game), and can also be used to motivate concepts such as parsing or relational databases.
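
As a rough illustration of the kind of graph problem this data supports, the sketch below computes Kevin Bacon style "degrees of separation" with a breadth-first search. The performer and movie names are made-up placeholders rather than real IMDb records, and the function names (invert, degrees_of_separation) are hypothetical; it assumes the appearance lists have already been loaded into a dictionary.

```python
from collections import deque

# Hypothetical sample data (not real IMDb records): performer -> titles.
appearances = {
    "Actor A": {"Movie 1", "Movie 2"},
    "Actor B": {"Movie 2", "Movie 3"},
    "Actor C": {"Movie 3"},
    "Actor D": {"Movie 4"},
}


def invert(appearances):
    """Build the reverse mapping: title -> set of performers in it."""
    cast = {}
    for person, titles in appearances.items():
        for title in titles:
            cast.setdefault(title, set()).add(person)
    return cast


def degrees_of_separation(appearances, source, target):
    """Return the number of shared-movie links between two performers,
    or None if they are not connected (or not in the data)."""
    if source not in appearances or target not in appearances:
        return None
    cast = invert(appearances)
    distance = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        if person == target:
            return distance[person]
        # Every co-star in any shared title is one link further away.
        for title in appearances[person]:
            for costar in cast[title]:
                if costar not in distance:
                    distance[costar] = distance[person] + 1
                    queue.append(costar)
    return None


print(degrees_of_separation(appearances, "Actor A", "Actor C"))
```

Breadth-first search is used here because every link has the same weight, so the first time a performer is reached is also the shortest path to them.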

The Internet Movie Database (IMDb) is a huge collection of movie data [that] ...started as a hobby project by an international group of movie fans.

 -- IMDb website, 2009-10-18

Cost

  • As of spring 2009, IMDb charges licensing fees (more than $15K per year) for commercial uses, but provides free access to its database to individuals for some non-commercial uses. (For example, you can't use the data to create a competing movie database for non-personal use.) Students who download and use the IMDb data for a course project likely meet the requirements of the IMDb license, as long as the course project does not make the IMDb data publicly accessible or create a web application that uses the IMDb data.

Features

  • The actors.txt and actresses.txt files are currently 136 MB and 74 MB, respectively. Most files in the database are a few MB to tens of MB in size, while some data files are as small as tens or hundreds of KB.
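
Because the largest files run to over a hundred MB, it can help to process them one line at a time rather than reading a whole file into memory. The sketch below shows one way to do that. It assumes a simplified record layout that is not described on this page: a name, a tab, and a title on the first line of each record, further titles on tab-indented lines, and a blank line between records. The real files may differ (headers, role annotations, encoding), and the name iter_appearances is hypothetical.

```python
import io


def iter_appearances(lines):
    """Yield (name, title) pairs from an iterable of text lines.

    Assumed layout (an illustration, not a format specification):
      Name<TAB>Title        first line of a record
      <TAB>Title            continuation lines for the same name
      (blank line)          end of record
    """
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            current = None  # a blank line closes the current record
            continue
        name, _, title = line.partition("\t")
        if name:
            current = name  # text before the first tab starts a new record
        title = title.strip()
        if current is not None and title:
            yield current, title


# Small in-memory stand-in for a file; the entries are made up.
sample = (
    "Doe, Jane\tMovie One (2001)\n"
    "\t\t\tMovie Two (2003)\n"
    "\n"
    "Roe, Mary\tMovie Two (2003)\n"
)
pairs = list(iter_appearances(io.StringIO(sample)))
print(pairs)
```

The same generator accepts an open file object, since iterating over a file yields one line at a time; the encoding to pass to open() would need to be checked against the actual files.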