Throughout the site, icons indicate that an extract of data from the database may be downloaded in the corresponding format.
The internal structure of all of these files is specific to the particular data contained within them and this structure may be subject to change.
The language tree structure may be downloaded as JSON.
The list of all languages may be downloaded as JSON or CSV.
Any individual language (for example English), including its lexemes and geographical presences, may be downloaded as JSON.
The languages descending from any node in the language tree (for example Latin), including their lexemes and geographical presences, may be downloaded as JSON.
Any typological data set (for example Austronesia) may be downloaded as JSON or an Office Open XML Workbook (.xlsx).
For any lexeme (for example the Proto-Germanic word *bōk(j)ō- 'beech'), its etymological context may be downloaded as JSON. The same JSON representation is used to draw the etymological diagram on the lexeme's Details page.
The word list structure may be downloaded as JSON.
Any word list (for example Culture words for Indo-European) may be downloaded as JSON, including all the lexemes that are connected to it.
Data for some Swadesh word lists may be downloaded as JSON for a select number of language families (for example the Swadesh 100 list for Indo-European).
For some selected word lists, the connected lexemes may be downloaded arranged according to cognacy, as both JSON and CSV files bundled into one ZIP archive.
The entire lexical portion of the database may be downloaded as JSON.
Furthermore, some maps will have a save button on them, which allows the user to download the GeoJSON data with which the map was drawn.
Certain entities are represented in the JSON files consistently (this does not include the GeoJSON files). Their structure can be found below.
In order to minimize the resulting JSON file size, attributes with null values or default values are omitted. Examples of default values are false (for boolean attributes), 0 (for integer attributes), "" (for string attributes), or [] (an empty list).
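Because omitted attributes stand for null or default values, code that reads these files has to supply the defaults itself. The following is a minimal Python sketch of that idea, not the site's own tooling: the helper name `attr` and the sample data are hypothetical, and the sample only mimics the lexeme structure described further down.

```python
import json

# Hypothetical sample: "Meaning" is present, while "MeaningNote" and
# "SourceReferences" are omitted because they are empty.
sample = json.loads('{"Lexemes": {"1": {"FkLanguageId": 7, "Meaning": "beech"}}}')


def attr(entity, name, default=""):
    """Return an attribute, falling back to a default when it was omitted."""
    return entity.get(name, default)


lexeme = sample["Lexemes"]["1"]
meaning = attr(lexeme, "Meaning")             # present in the file
note = attr(lexeme, "MeaningNote")            # omitted, so the string default applies
refs = attr(lexeme, "SourceReferences", [])   # omitted, so the list default applies
```

Note that an omitted attribute cannot be distinguished from one that was null, so a reader that needs that distinction cannot recover it from the file.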
"MetaData": { "TimeStamp": "date and time of compilation as an ISO 8601 string", "JsonUrl": "the URL from which the JSON file was downloaded", "Readme": "general information" }
"Languages": { "language id": { "Name": "name", "ISO639-3": "3 letter iso code", "GlottoCode": "8 character Glottolog code", "AlternativeNames": "alternative language names", "LanguageArea": "language area description (one of 'Africa', 'Australia', 'Central Asia', 'East Asia', 'Europe', 'Middle East', 'North America', 'Pacific', 'South America', 'South Asia', or 'South East Asia')", "FkFocusAreaId": focus area id, "Reliability": "language reliability description (one of 'Modern language', 'Dead (well documented)', 'Dead (fragmentary)', or 'Reconstructed')", "TimeFrame": { "From": year, "Until": year }, "FocalPointWgs84": { "Latitude": latitude as a decimal number in the WGS 84 coordinate system, "Longitude": longitude as a decimal number in the WGS 84 coordinate system } } }
"Lexemes": { "lexeme id": { "FkLanguageId": associated language id, "FormTranscriptionNoMarkup": "transcription form without markup tags", "FormTranscriptionHtmlified": "transcription form with markup tags converted to HTML", "FormTransliteration" : "transliteration form", "FormIpa": "IPA form", "Meaning": "meaning", "MeaningNote": "meaning note". "GrammaticalData": "grammatical data", "LexemeNoteNoMarkup": "lexeme note without markup tags", "LexemeNoteHtmlified": "lexeme note with markup tags converted to HTML (truncated to 500 characters)", "SourceReferences": [source references (see below)] } }
"Etymologies": { "etymological link id": { "FkParentId": id of the parent lexeme, "FkChildId": id of the child lexeme, "FkReliabilityId": id of the etymological reliability (always included, even if 0), "NoteNoMarkup": "etymology note without markup tags", "NoteHtmlified": "etymology note with markup tags converted to HTML (truncated to 100 characters)", "SourceReferences": [source references (see below)] } }
"EtymologicalReliabilities": { "etymological reliability id": { "Description": "description", "Note": "note" } }
{
    "FkSourceId": associated source id,
    "LocationWithinSource": "description of the location within the source (e.g. page number)",
    "Note": "note"
}
"Sources": { "source id": { "CitationKey": "citation key", "FullCitation": "full citation", "Note": "source note", "SourceType": "source type description" } }
"WordLists": { "word list id": { "Name": "word list name", "Description": "word list description", "CognateSetsDownloadable": boolean value indicating if cognate sets may be downloaded for this word list, "WordListCategories": { "word list category id": { "Name": "word list category name", "WordListItems": { "word list item id": { "Name": "word list item name", "Note": "word list item note", "ConnectedLexemes": [list of lexeme ids] } } } } } }