Python API to TalkBankDB
The TBDBpy package provides access to TalkBankDB data from Python.
TalkBankDB is a database and set of tools for exploring TalkBank’s media and transcripts, specifying data to be extracted, and passing these data on to statistical programs for further analysis. The TBDBpy package (TalkBankDataBase - Python) provides easy access to all information within TalkBankDB, including clinical collections. Clinical Banks are password protected. Visit to learn about gaining access to these collections.
You can install TBDBpy from GitHub using pip:
pip install git+
Then import tbdb:
import tbdb
TBDBpy allows access to data from TalkBankDB through several functions. For example, to get the metadata table for a particular transcript in the childes/Eng-NA/MacWhinney collection:
import tbdb
transcripts = tbdb.getTranscripts({"corpusName": "childes", "corpora": [['childes', 'Eng-NA', 'MacWhinney', '010411a']]})
{'colHeadings': ['path', 'filename', 'languages', 'media', 'date', 'pid', 'designType', 'activityType', 'groupType'],
 'data': [['childes/Eng-NA/MacWhinney/010411a', '010411a', 'eng', 'audio', '1979-05-06', '11312/c-00016447-1', 'long', 'toyplay', 'TD']]}
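In the example above, the list passed in corpora holds the same levels as the slash-separated path column in the response. A minimal sketch of a helper that builds the request dictionary from such paths might look like this. The name build_request is hypothetical and not part of TBDBpy, and the sketch assumes that the first level of every path is the corpusName, as it is in the example.

```python
def build_request(*paths):
    """Build a TalkBankDB-style request dictionary from slash-separated paths.

    Hypothetical helper, not part of TBDBpy. Assumes every path starts with
    the same top-level corpus name, which becomes 'corpusName'.
    """
    if not paths:
        raise ValueError("at least one path is required")
    # Split each path into its levels, ignoring leading/trailing slashes.
    corpora = [p.strip("/").split("/") for p in paths]
    corpus_names = {levels[0] for levels in corpora}
    if len(corpus_names) != 1:
        raise ValueError(f"paths span several corpora: {sorted(corpus_names)}")
    return {"corpusName": corpus_names.pop(), "corpora": corpora}


request = build_request("childes/Eng-NA/MacWhinney/010411a")
print(request)
```

The resulting dictionary has the same shape as the one passed to tbdb.getTranscripts() in the example, so it could be handed to that function directly.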
The available functions for accessing different data sets are below. Each function has documentation accessible through help(functionName), for example:
import tbdb
# View docs for tbdb module:
help(tbdb)
# View docs for getTranscripts:
help(tbdb.getTranscripts)
Functions to extract data from TalkBankDB are:
Each of these functions takes a dictionary parameter that defines a corpusName and a set of optional fields describing the TalkBankDB request. Each returns a dictionary with two members:
{'colHeadings': [], 'data': [[]]}
- colHeadings: List of strings describing columns in data.
- data: List of lists, where each list represents a table row.
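Because colHeadings and data are kept separate, it is often convenient to pair them up before further analysis. The sketch below uses only the standard library and the sample response from the getTranscripts example. The helper names to_records and to_csv are illustrative and not part of TBDBpy.

```python
import csv
import io

# Sample response copied from the getTranscripts example above.
response = {
    "colHeadings": ["path", "filename", "languages", "media", "date", "pid",
                    "designType", "activityType", "groupType"],
    "data": [["childes/Eng-NA/MacWhinney/010411a", "010411a", "eng", "audio",
              "1979-05-06", "11312/c-00016447-1", "long", "toyplay", "TD"]],
}


def to_records(response):
    """Pair each row with the column headings, giving one dict per row."""
    headings = response["colHeadings"]
    return [dict(zip(headings, row)) for row in response["data"]]


def to_csv(response):
    """Serialize the response as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(response["colHeadings"])
    writer.writerows(response["data"])
    return buffer.getvalue()


records = to_records(response)
print(records[0]["filename"], records[0]["date"])

csv_text = to_csv(response)
print(csv_text)
```

If pandas is installed, pandas.DataFrame(response["data"], columns=response["colHeadings"]) builds an equivalent table in one call.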
Additional functions return metadata about TalkBankDB:
For troubleshooting, an additional function, validPath(), will return whether a given path is valid.
tbdb.validPath(['childes', 'childes', 'Clinical'])
If the path is not valid, it returns which level of the query is incorrect:
tbdb.validPath(['childes', 'childes', 'somethingThatDoesNotExist'])
To access protected collections, pass True as a final argument (auth). A dialog then asks which protected collection you are trying to access and prompts for its username and password. If the credentials are incorrect, a response describing the error is returned.
aphasia_transcripts = tbdb.getTranscripts({'corpusName': 'aphasia', 'corpora': [['aphasia', 'English', 'Aphasia', 'Adler']]}, True)