This is a non-technical overview of how data publishing works. For technical details, see How to publish data: step by step.
Open by default
One idea that is becoming increasingly common in open data literature is the principle of ‘open by default’ (see, for example, the Socrata Field Guide). That is, there is a presumption that any information that is already public and that doesn’t contain personal information should be published as open data. (This is the approach adopted by New York City, for example.) An ‘open by default’ approach can streamline the publishing process and can work well where councils already publish data in non-open formats.
While this may be the recommended approach for certain councils, there is no requirement for all councils to open data by default. Councils wishing to become involved in open data can publish incrementally after assessing the relevant data, perhaps working towards targets for a certain number of datasets to be released. This incremental approach will also achieve the aims of open data and may be particularly appropriate for councils that are new to publishing.
Free or low cost
Although they may differ slightly when it comes to the issue of low cost vs no cost, commentators generally agree that fees should not be a barrier to accessing open data. This is particularly so where data is provided online and the costs associated with publishing are low.
Consistent with this, councils such as the City of Melbourne have explicitly stated that data should be provided for free except where this conflicts with legislative provisions. This accords with the Victorian Government DataVic Access Policy, which states that, where possible, data should be provided at no or minimal cost.
- For more information on no-cost data provision, see ‘Open Government Data’ by Joshua Tauberer.
Creative Commons licences provide a standardised way to give the public permission to share and use material.
The open council data toolkit will focus on releasing data under Creative Commons Attribution (CC BY) licences, as this is a common choice in the Australian open data context. A CC BY licence meets the requirements for open licensing, as it attaches minimal restrictions to the data, while still requiring the original content creator (e.g. the council) to be credited.
Some international commentators recommend releasing data into the public domain (for example, under ‘CC0’), which is the most open form of data release. For completeness, further guidance on release in the public domain can be found in ‘Open Government Data’ by Joshua Tauberer (USA).
- Creative Commons Australia is the leading source of information on Australian Creative Commons licences.
- The Open Data Institute has collated a useful summary of the various types of open data licences and the benefits of each.
Choosing what data to focus on
The toolkit includes a list of typical datasets that new councils may wish to release as a starting point. If you would like further guidance on the datasets to prioritise, you may want to consider:
- What datasets fit best with your council’s strategic plan?
- What data would be easiest to release?
- What data have other councils released? Could you also release this type of data? (This may make it easier for you to build on work already being done with other councils’ datasets.)
- What sorts of data releases might save you time in the long term? (For example, do you often get requests for a particular type of data? Would it save time in future if you could simply point people to your open data?)
- What sort of data does your community want?
The open council data toolkit is designed to give you the basics to get started. If you would like more guidance and perspectives from overseas, we have collated some further reading below:
- Socrata’s guide to what data you should publish first
- 6 steps to implementation
- 6 steps to open data success
- Opening and Publishing Data – Code for America
Some councils may prefer an incremental approach, beginning with the typical datasets set out in the open council data toolkit and branching out from there. However, if you’re ready to set more detailed strategic priorities, you may wish to review Socrata’s Open Data Field Guide, which has a helpful chapter on defining your open data goals.
The question of data quality
While data quality is an important consideration, it need not be a barrier to publication, provided you are up-front about any limitations on the data. Publication of open data can promote engagement with the public, and you may even be able to find people in the community who are willing to help improve the data quality.
Privacy and confidentiality
Open data is quite separate from personal or sensitive information. The term ‘open data’ applies to information collected by councils that does not have privacy or security implications. In contrast, personal information or information pertaining to topics such as national security is kept confidential and is not suitable for release to the general public.
Under Victorian privacy law, ‘personal information’ is information or an opinion about an individual:
- whose identity is apparent, or
- whose identity can reasonably be ascertained
Sometimes it is possible to alter or redact a dataset so that it no longer contains personal information, for example, by removing references to private addresses. It may also be possible to aggregate data so it shows broad trends rather than specific information about individuals or groups.
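For councils with someone comfortable running a short script, the two approaches above (removing personal fields and aggregating records) might look something like the following sketch. The column names and sample records are purely hypothetical and are not drawn from any real council dataset. Removing columns does not by itself guarantee that nobody’s identity can reasonably be ascertained, so the result should still be reviewed before release.

```python
from collections import Counter

# Hypothetical names of columns holding personal information.
# A real dataset will use different names and may have more of them.
PERSONAL_COLUMNS = {"owner_name", "street_address"}


def redact(rows, personal_columns=PERSONAL_COLUMNS):
    """Return copies of the rows with the personal columns removed."""
    return [
        {key: value for key, value in row.items() if key not in personal_columns}
        for row in rows
    ]


def aggregate_by(rows, column):
    """Count rows per value of `column`, giving totals instead of individual records."""
    return dict(Counter(row[column] for row in rows))


# Made-up sample records, for illustration only.
rows = [
    {"owner_name": "A. Citizen", "street_address": "1 Example St",
     "suburb": "Northtown", "animal": "dog"},
    {"owner_name": "B. Resident", "street_address": "2 Sample Rd",
     "suburb": "Northtown", "animal": "cat"},
    {"owner_name": "C. Person", "street_address": "3 Test Ave",
     "suburb": "Southtown", "animal": "dog"},
]

redacted = redact(rows)
totals = aggregate_by(redacted, "suburb")

print(redacted[0])
print(totals)
```

Very small totals (for example, a suburb with only one record) can still point to an individual, so it may be worth grouping or leaving out such rows.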
If you are unsure whether your data contains personal or sensitive information, you may want to put aside the relevant data and focus on releasing other datasets first. You could also look at what other councils have done with similar information. You can find further information through the Victorian Commissioner for Privacy and Data Protection.
Where to publish
For simplicity, the toolkit focuses on publishing to the Australian open data portal at data.gov.au. Publishing to a central repository will make it easier for the public to access data across jurisdictions. This will broaden the range of possible uses of the data.
Managing the publishing process
The process for publishing open data may seem daunting for councils without specialist knowledge in data management or GIS. However, organisations such as Socrata emphasise that open data implementation doesn’t always have to be centred on IT and can be relevant to anyone who handles information. In recognition of this, the open council data toolkit contains a step-by-step guide which walks through the technical aspects of preparing and publishing open data. For additional support, council staff can get advice from their peers through the Open Council Data Google Group.
How long will it take me to prepare and publish my data?
Councils that have published open data have reported that it can take as little as one to two hours to prepare and upload a dataset, assuming staff have prior experience with the process. Councils may choose to begin with simple datasets to minimise the initial investment required.