Glossary

This glossary is adapted from a variety of sources, such as the Open Data Handbook and Data.gov.


API

An Application Programming Interface (API) is a set of instructions and standards that an application program uses to communicate with the operating system or another control program, such as a database management system.
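
As a rough illustration of a program talking to a database management system through an API, the sketch below uses Python's standard-library sqlite3 module with an in-memory database. The table name, column names, and values are made up for this example.

```python
import sqlite3

# Open a temporary in-memory database; nothing is written to disk.
connection = sqlite3.connect(":memory:")

# Each call below goes through the sqlite3 API rather than touching
# the database engine's internals directly.
connection.execute("CREATE TABLE cars (city TEXT, total INTEGER)")
connection.execute("INSERT INTO cars VALUES (?, ?)", ("Springfield", 1200))

cursor = connection.execute(
    "SELECT total FROM cars WHERE city = ?", ("Springfield",)
)
total = cursor.fetchone()[0]
connection.close()

print(total)
```

The program never needs to know how the database stores its rows; it only relies on the documented calls.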

 

APPLICATION

An application, or app, is a software program that is designed to connect to large databases and often provides real-time information on a computer, mobile phone, or other similar platform.

 

ATTRIBUTION

Acknowledging the source of data when consuming or re-publishing it. Data licenses may include this requirement when publishing open data.

 

CKAN

The Comprehensive Knowledge Archive Network (CKAN) is an open source data management system for storage and distribution of data across the web, built and maintained by Open Knowledge. It serves as the official data publishing platform for about 20 national governments and powers the data publishing efforts of a variety of local, community, and scientific organizations.

 

COPYRIGHT

A right for the creators of creative works to restrict others’ use of those works. A copyright holder is entitled to determine how others may use or restrict use of that work through a license.

 

CSV

‘Comma-separated values’, a standard format for spreadsheet data. Data is represented in a plain text file, with each data row on a new line and commas separating the values on each row. As a very simple open format, it is easy to consume and is widely used for publishing open data.
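
As a minimal sketch of the layout described above, the snippet below parses a small CSV text with Python's standard-library csv module. The column names and values are hypothetical sample data.

```python
import csv
import io

# Hypothetical sample: a header row followed by two data rows,
# one row per line, values separated by commas.
csv_text = (
    "city,year,total_cars\n"
    "Springfield,2020,1200\n"
    "Shelbyville,2020,950\n"
)

# DictReader treats the first row as column names for the later rows.
rows = list(csv.DictReader(io.StringIO(csv_text)))

for row in rows:
    print(row["city"], row["total_cars"])
```

Note that every value is read as text; converting "1200" to a number is left to the consumer.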

 

DATA

Data can be thought of as facts, statistics, or other values systematically collected for reference, analysis, and calculation. Data becomes informative when it's structured, often combined with other data, and analyzed to extract meaning.

 

DATA CATALOG

An authoritative listing of available open data organized in a manner that makes it easy to search and navigate.

 

DATASET

An organized collection of data commonly presented in a spreadsheet (where columns represent variables and each row contains values for those variables) or in the form of a map.

 

FILE FORMAT

A file's format is usually indicated by its extension, the last part of the file name. For example, a CSV file could be called totalcars.csv.
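
One way to read the extension from a file name is sketched below using Python's pathlib. The helper name file_extension is an assumption made for this example, and the extension is only a hint: it does not guarantee what the file actually contains.

```python
from pathlib import Path


def file_extension(name: str) -> str:
    """Return the lowercase extension of a file name, without the dot."""
    return Path(name).suffix.lower().lstrip(".")


print(file_extension("totalcars.csv"))
```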

 

GEOJSON

GeoJSON is an open standard format, based on JavaScript Object Notation (JSON), designed to represent geographical features along with their non-spatial attributes. The format was written and is maintained by an Internet working group of developers.
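
To make this concrete, here is a small sketch that reads a single GeoJSON point feature with Python's standard-library json module. The place name, property values, and coordinates are invented for illustration. GeoJSON lists coordinates as longitude first, then latitude.

```python
import json

# Hypothetical feature: a point geometry plus non-spatial properties.
geojson_text = """
{
  "type": "Feature",
  "geometry": {"type": "Point", "coordinates": [-122.42, 37.77]},
  "properties": {"name": "Example City Hall", "floors": 4}
}
"""

feature = json.loads(geojson_text)

# Coordinate order in GeoJSON is [longitude, latitude].
longitude, latitude = feature["geometry"]["coordinates"]

print(feature["properties"]["name"], longitude, latitude)
```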

 

GEOSPATIAL

Geospatial refers to data that has a geographic component to it such as coordinates, address, city, or zip code.

 

GROUPS

An organizational framework to group like datasets together to provide context and meaning to a user.

 

HACKATHON

A social event that brings together programmers, subject experts, and advocates to share information and work together to build applications, visualizations, or prototypes, often to address an issue or a set of interrelated issues.

 

HTML

HyperText Markup Language (HTML) is a language that describes the skeletal structure of a webpage. Web browsers read HTML to render the contents of a webpage for a user.

 

JSON

JavaScript Object Notation (JSON) is a lightweight data-interchange format. It is a text format that is completely language independent but uses conventions familiar to programmers of the C family of languages, such as C, C++, and C#.
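
The sketch below shows a round trip through JSON with Python's standard-library json module: a record is written out as text and then read back. The record's field names and values are hypothetical.

```python
import json

# Hypothetical record using types that JSON can represent directly.
record = {
    "title": "Total cars",
    "year": 2020,
    "tags": ["transport", "vehicles"],
    "open": True,
}

# Serialize to JSON text, then parse that text back into Python objects.
text = json.dumps(record)
restored = json.loads(text)

print(text)
```

Because the intermediate form is plain text, a program written in any other language with a JSON parser could read the same string.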

 

KML

Keyhole Markup Language (KML) is an XML-based language for managing the display of three-dimensional data in applications such as Google Earth. KML was developed for use with Google Earth and is accepted as an Open Geospatial Consortium standard.

 

LICENSE

The terms that accompany the publication of a dataset to convey how a user can use or reference the data.

 

MACHINE READABLE

Information or data that is in a format that can be easily read or processed by a computer without human intervention. Machine readable data must be structured and is often published in CSV, JSON, or XML file formats.

 

METADATA

Provides descriptive information about data to give it context. Descriptive elements such as title, description, publisher, and license information are important to the discovery and usability of the data that is published.
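
As an illustrative sketch, a metadata record can be modeled as a simple mapping and checked for the descriptive elements named above. The REQUIRED_FIELDS tuple, the helper name missing_metadata, and the sample record are assumptions for this example, not part of any particular metadata standard.

```python
# Assumed minimum set of fields, taken from the elements listed above.
REQUIRED_FIELDS = ("title", "description", "publisher", "license")


def missing_metadata(metadata: dict) -> list:
    """Return the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


# Hypothetical record that lacks license information.
example = {
    "title": "Total cars",
    "description": "Registered cars per city",
    "publisher": "Example Transport Office",
}

print(missing_metadata(example))
```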

 

OPEN DATA

Open data is the proactive release of government-collected data that is made publicly available through an open license to enable citizens to freely access, reuse, and redistribute it. From a technical perspective, open data is available in machine readable file formats and allows users to download data related to government operations or service delivery in bulk.

 

OPEN GOVERNMENT

Open government is a governing principle that serves to support transparency, collaboration, and engagement with the public by implementing policies and utilizing technology to emphasize the sharing of government information.

 

OPEN SOURCE

Open source software is software whose source code is freely available to the public. Users are free to inspect it, modify it, and use the code for their own purposes.

 

PUBLIC DOMAIN

No copyright exists over the work and users can utilize available data for their own purposes without restrictions.

 

SHAPEFILE

A shapefile stores non-topological geometry and attribute information for the spatial features in a dataset. The geometry for a feature is stored as a shape comprising a set of vector coordinates. Shapefiles can support point, line, and area features.

 

STRUCTURED DATA

Structured data refers to data where structural relationships between elements are retained and stored on a computer disk. PDFs and word processing documents are not structured forms of data because their logical structure cannot be extracted automatically, or is nearly impossible to extract.

 

TAGS

Keywords that help users discover datasets of interest.

 

URL

A Uniform Resource Locator (URL), when used with HTTP, is a character string, or web address, that references a web page.
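
To show the parts that make up such a web address, the sketch below splits a URL with Python's standard-library urllib.parse. The address itself is a made-up example on a placeholder domain.

```python
from urllib.parse import urlsplit

# Hypothetical address; data.example.org is a placeholder host.
parts = urlsplit("https://data.example.org/dataset/totalcars?format=csv")

# scheme: how to fetch; netloc: which server; path and query: what to fetch.
print(parts.scheme, parts.netloc, parts.path, parts.query)
```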

 

XML

Extensible Markup Language (XML) defines rules or standards for encoding content in a format that is easily readable by both humans and machines. XML can be used by any individual or group that wants to share information in a consistent way.
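
A short sketch of reading XML with Python's standard-library xml.etree.ElementTree follows. The element and attribute names are invented for this example; XML itself does not prescribe them.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: nested elements, one carrying attributes.
xml_text = (
    "<dataset>"
    "<title>Total cars</title>"
    "<record city='Springfield' total='1200'/>"
    "</dataset>"
)

root = ET.fromstring(xml_text)

# Element text and attributes are both reachable by name.
title = root.findtext("title")
record = root.find("record")

print(title, record.get("city"), record.get("total"))
```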

 

ZIP

A computer file whose contents are compressed to facilitate storage or transmission. It often carries the .zip file extension.
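
The following sketch, using Python's standard-library zipfile module, compresses a repetitive CSV payload into an in-memory ZIP archive and reads it back. The file name and payload are hypothetical, and how much space is saved depends on the data.

```python
import io
import zipfile

# Hypothetical, highly repetitive payload so compression is visible.
payload = b"city,total_cars\n" + b"Springfield,1200\n" * 500

# Write the archive into memory instead of onto disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("totalcars.csv", payload)

# Reopen the same bytes and extract the original contents.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    restored = archive.read("totalcars.csv")

print(len(payload), len(buffer.getvalue()))
```

The two printed numbers are the original size and the archive size in bytes; the extracted contents match the original exactly.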