Open data

This is the place for you if you want to help us use open data in our journalism.

Open data is data that can be used, modified and republished with few if any copyright restrictions. Governments, non-profit organizations and private entities periodically release data. In the past, use and reuse of such data came with many restrictions and was limited to a certain audience. Now, as with the open source movement, data is released to the general public for scrutiny, use and reuse. Note, however, that open data is still released under a licence that describes how the data can be used, for example whether or not it can be used for commercial purposes.

This project documents different aspects of open data: its providers, its licences and the discussions on its pros and cons.

Guidelines

Whenever you want to write a story on the release of open data, don’t forget to mention the following information:

  • Data provider: Who is the provider of the data? What type of organization or entity is responsible for it? Is it a government or a private institution?
  • Data collection method: How was the data collected by the provider? For example: surveys, internal data, census data.
  • Licence: Common licence questions are:
    • Whether the data is released into the public domain (PD) or under one of the Creative Commons (CC) licences
    • Whether usage requires attribution (BY)
    • Whether derived works have to be equally openly licensed (look for share-alike or SA conditions)
    • Whether derivative works are allowed (ND is the usual abbreviation for No Derivatives)
    • Whether commercial use is allowed (NC is the usual abbreviation for NonCommercial)
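
For contributors who handle many datasets, the abbreviations above can be decoded mechanically. The following is only a minimal sketch in Python: the function name `describe_licence` and the wording of each condition are illustrative assumptions, not part of any WikiTribune tool, and the code covers only the abbreviations listed above.

```python
# Hypothetical helper: the names and wording here are illustrative assumptions.
CONDITIONS = {
    "BY": "attribution required",
    "SA": "derived works must be shared under the same licence",
    "ND": "no derivative works",
    "NC": "no commercial use",
}


def describe_licence(code):
    """Turn a code such as 'CC BY-SA 4.0' into plain-language conditions."""
    normalized = code.strip().upper()
    if normalized in {"PD", "CC0", "PUBLIC DOMAIN"}:
        return ["public domain: no conditions"]
    # Drop the leading 'CC', then split 'BY-NC-ND' into its elements.
    parts = normalized.replace("CC", "", 1).replace("-", " ").split()
    # Ignore version tokens such as '4.0'.
    parts = [p for p in parts if not p.replace(".", "").isdigit()]
    unknown = [p for p in parts if p not in CONDITIONS]
    if unknown:
        raise ValueError(f"unrecognised licence element(s): {unknown}")
    return [CONDITIONS[p] for p in parts]


if __name__ == "__main__":
    for line in describe_licence("CC BY-NC-ND 4.0"):
        print(line)
```

A decoder like this does not replace reading the licence text itself; it only helps make sure a story mentions every condition the abbreviation implies.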

Stories in Draft

Published Stories

Suggested Resources

Interested Collaborators

Journalists

See also

Technology

History for projects "Open data"

07 February 2018

13:40:17, 07 Feb 2018 . .‎ Fiona Apps (Updated → sp)

06 February 2018

17:30:32, 06 Feb 2018 . .‎ Ingrid Strauch (Updated → added resource)

29 January 2018

15:50:23, 29 Jan 2018 . .‎ Jimmy Wales (Updated → adding call to action so it will appear on the homepage in a more appealing way)

24 January 2018

08:33:43, 24 Jan 2018 . .‎ Fiona Apps (Updated → Accepting edits with minor changes)
08:05:50, 24 Jan 2018 . .‎ Leo Sauermann (Updated → )

19 January 2018

15:13:52, 19 Jan 2018 . .‎ Fiona Apps (Updated → Saving as 'is' as correct grammar)
14:59:41, 19 Jan 2018 . .‎ Arno Klein (Updated → )

08 January 2018

10:50:32, 08 Jan 2018 . .‎ Fiona Apps (Updated → Adding story)

03 December 2017

14:14:19, 03 Dec 2017 . .‎ John Samuel (Updated → Add link about open data publishing)

21 November 2017

12:52:18, 21 Nov 2017 . .‎ Jonathan Cardy (Updated → expand)

17 November 2017

09:05:04, 17 Nov 2017 . .‎ Fiona Apps (Updated → Adding header and rearranging)

14 November 2017

06:07:23, 14 Nov 2017 . .‎ Stephanie Kemna (Updated → minor fixes of spelling, grammar, wording for consistency)

09 November 2017

09:34:01, 09 Nov 2017 . .‎ Fiona Apps (Updated → Adding story in draft and publishing)

08 November 2017

21:08:40, 08 Nov 2017 . .‎ John Samuel (Updated → corrections)
20:59:49, 08 Nov 2017 . .‎ John Samuel (Updated → guidelines and introduction)

06 November 2017

19:27:14, 06 Nov 2017 . .‎ John Samuel (Updated → Submission)
19:26:55, 06 Nov 2017 . .‎ John Samuel (Updated → Add sections)
19:03:33, 06 Nov 2017 . .‎ John Samuel (Updated → Update title)

Talk for Project "Open data"

Talk about this Project

  1. I would also recommend Enigma Public (https://public.enigma.com/) as a resource as it has a very broad collection of public datasets from US federal, state, and local governments. They are basically just taking all of the US public datasets and putting them under one common UI and API.

  2. Gabriel Withington – as you are into the topic, are you aware of DataHub? http://datahub.io/

    They manage the largest collection of available datasets and they are looking for volunteers for data curation: http://datahub.io/docs/core-data/curators

    Maybe collaborating with an existing initiative (and making it better) would be even better than starting something new.

    I think they are trustworthy because my friends Anja and Richard and the new maintainers of lod-cloud.net build on their data. Lod-cloud has been the most cited open data index for some years.

    1. Leo,

      Thanks for your feedback and no, I wasn’t aware of them.

      I’ll take a look at what they’re working on because it sounds fascinating. But my interest isn’t really in accumulating lots of data; I really feel like there’s too much of it out there already. I’m more interested in the direct application of data because I feel like that is a sorely lacking piece in all of this.

      And I really am interested in exploiting the difference in rules between use and distribution. There could be great big piles of information that WikiTribune could make available to contributors but not publish directly.

      It’s good to know that the work is already being done for a lot of the public data though.

  3. It might be useful to distinguish between using and copying. There may be restrictions on publication of data where it would still be possible to make a copy of the data as a reference for journalists. (Publications of statistics or observations about the data may be a gray area but I’m guessing it’s not an issue.) The guiding legal principle is that facts can’t be copyrighted.

    And regardless, there is lots of interesting data which is published by the government and government-like organizations that is in the public domain and is intended to be publicized.

    If WikiTribune would be interested in creating a repository of such information, I’ve spent a fair amount of time building web scrapers and similar automation tools and would be happy to provide some guidance.

  4. Added two more resources in a suggested edit:
    https://webfoundation.org/research/open-data-barometer-fourth-edition/

    http://lod-cloud.net/
    possible interview candidates there: [email protected] and [email protected]. The original version was developed by Richard Cyganiak and Anja Jentzsch.

  5. Thanks for this open conversation.

    tl;dr
    I hope you will consider the fact that data can still be free.

    Thanks for possibly moving towards an open data discussion. I am not sure the definition of Open Data should include semi-open data, e.g. “Open data is data that can be used, modified and republished with few if any copyright restrictions.”

    “Open Data” is free of license or copyright of any kind.

    Unlike a store, digital data doesn’t need to be open part-time or inherit restriction, regulation or license to have or hold. Open Data may be associated with Open Source but it’s not the same thing. Open source licensing is not a requirement for data imho.

    At the end of the century, our efforts to honor and allow “data freedom” will be a determining factor in our personal freedom. We are in fact, data. This makes us one and many at the same time, a conundrum of opportunity for bright minds. Defining Open Data is critical. If WikiTribune were a Wikipedia entry I would have skipped over the link.

    If data are stored on a server owned by someone who requires an open source license, then the data is open like a store. If a government or group requires oversight of the data, e.g. a license, then a control over the data exists.

    Please know that I highly respect your privacy policy and TOS efforts.

    Is there a way we could work towards restoring the “open” in open data before discussing Open Data? Or does WikiTribune require open data to be licensed, for example? Under the current legal schema for WikiTribune the answer is more obvious than the title “Open Data” implies.

    WikiTribune clearly proposes to license and enforce data on their servers. Fine by me but we must understand how this affects contributions within a public/private framework.

    If WikiTribune would choose to allow open data the terms and conditions and privacy policy could transparently reflect such openness. I’m not proposing WikiTribune support one or the other. Rather, I’m suggesting both could be used effectively.

    However, if WikiTribune wished to provide a method for open data submission it may still be possible while some thinkers still understand the difference between open and semi-open data.

    –/
    Related text from TOS
    1. You License Freely Your Contributions – you generally must license your contributions and edits to our site under a free and open license (unless your contribution is in the public domain).

    Related text from PP
    1. For the purposes of the Data Protection Act 1998, we are the data controller. We are registered as a data controller with the Information Commissioner’s Office under registration number ZA248828.
    /–

    A possible method for contributing truly Open Data would be to separate the data from the data provider. Meaning, a submission could be made on the same page yet personal or private information could be stored on WikiTribune servers while the story or other data could be stored in a decentralized fashion. I realize this is stepping away from the central goals of WikiTribune.

    Could allowing a writer to submit a story anonymously free the data (the story) from copyright if blockchain technology were employed in the submission?

    Writer/submitter
    _data submitted (anonymous, protected with blockchain/vpn etc.)
    ——————————————————–
    Article/submission
    _data submitted (open, free from license or copyright)

    It might help in understanding this issue to consider the importance of the way a person accesses the data. The so-called observer effect has a role in the Open Data discussion as it relates to security and alteration of the data.

    Again, I realize “Open Data” may not be a goal of WikiTribune. But I hope you will consider the fact that data can still be free.

    1. Hello Brent. While I understand your sentiment that data should be neither copyrighted nor licensed, that is not the only regime that allows for open data. Most permissive and share-alike licenses in widespread use also meet the definitions of open data and in addition add protections that the data originator/collector/developer might wish to attach. That is their choice. Moreover, a lot of data comes from public sector sources and, in Europe at least, getting permissive licenses added is normally the best we can hope for. Public domain dedications, unlike in the US, are usually completely out of the question.

      An excellent treatment of the legal and technical issues surrounding open data can be found in Ball (2014):

      Ball, Alex (2014). How to license research data. Edinburgh, United Kingdom: Digital Curation Centre (DCC). http://www.dcc.ac.uk/webfm_send/1735

      1. Thanks Robbie, I agree with you and appreciate your taking the time necessary to understand my position. Data has inalienable rights. Our attempts to separate those rights from data can only lessen the value of both.

        This could become much more interesting as the conversation shifts from digital to biological data.

  6. A very basic and convenient method to display data is using tables, yet inserting tables into WikiTribune is far from easy.

    Furthermore, if it was up to me to decide, I would require at least one comparison table or info table to approve an article for publishing.

  7. This might not be the right section of WT, but I think it would be a good idea to generate regular articles based on the open data. For example, mine the datasets to find newsworthy facts or trends; these could be wrapped lightly to create articles, and/or made available to the appropriate writers to evaluate for possible expansion into a fuller story. WT could even create and update its own indices which would likely get picked up by other news media.

    Examples off the top of my head could be: 1) trends in crime stats, 2) best/worst occupations going forward by education level, 3) climate change trends by city, 4) demographic changes by city, etc. There’s a lot of government data, some of which I’ve analyzed in the past using custom programs.

    1. Please add the list of open data resources in ‘Suggested Resources’ section.

  8. I’d suggest that links / guides to how to submit Freedom of Information requests in various jurisdictions would be useful in encouraging more citizens to take an interest in local issues. The first time I successfully got a document that I thought would be kept “secret”, well – it was a really positive feeling, and the world would be a better place if more people were in the habit of asking questions of officialdom. Let’s help demystify the process for them…

    1. It would be great if you can write a small story on this topic or even elaborate this project, especially the guidelines section.

  9. Would this be a place for a discussion on what is public data and efforts to obtain it from the public sector?

    1. This would definitely be the place for that sort of discussion

      1. Great. I think this topic can cover a wide range of things but my knowledge revolves around student press in the US.

        I think anyone who has worked in student press in the US (I’m not sure about other countries) is aware that public schools (private schools are a different beast) are known for their opposition to a free student press.

        There are outlets that do cover this issue, Student Press Law Center (SPLC) comes to mind, but from what I’ve seen, outside of offering advice about what students should do in response to a school administration stonewalling them and then writing a summary of the events, there really isn’t anywhere that covers this issue.

        Having an independent media outlet report on this issue is needed, in my opinion. From what I’ve seen, unless something actually goes to court, all colleges have to do is wait out the students. So colleges decide to stonewall and delay responding to questions or public information requests. An independent outlet also won’t be subject to retaliation from colleges.

        And while the SPLC is amazing and a great resource, from my experiences, they focus on the legal side of the issue. I think more can be done and WikiTribune seems like the perfect place to do that.

          1. Yeah, it seems really messed up. Like I mentioned, unless something actually goes to court, colleges can just wait the student journalist out because they’re going to leave the college at some point and pursue something else. Add in the fact that student journalists have other things going on (school, part-time jobs), so they can’t dedicate the same amount of time to an issue as other journalists could.

  10. This fits WikiTribune like a glove. Already in Canada, whenever census data is released it gets media coverage because it is the authoritative information that will affect future planning in communities across the country as regards hospitals, funding, social programs, education, etc.

  11. This is a fascinating space with enormous potential. Wikidata + all the Wiki initiatives are powerful of course.
    Also worth looking at OpenStreetMap for open map data – with the spin-off Humanitarian OpenStreetMap Team (HOTOSM) bringing volunteers together to map areas that need humanitarian support.
    WikiTree is another interesting example in the genealogy space – looking to collaboratively build a single global family tree.
    An open data set for living people could be very powerful – although contentious re: privacy. Many interesting identification / verification initiatives already in place / underway (India’s Aadhaar as a particular example).
    Global data standards are hard, but critical to these efforts. Massive cultural challenges too – sharing instinct vs. capturing value of data in proprietary silos…
    Will be great to see some deep coverage on this topic.

    1. Yes, we have a lot of open data sources. But a lot of people are not aware of these efforts. The goal of the project is to create general awareness on this topic.

  12. Oh wow just noticed this – fantastic.

    I think there’s a big opportunity in citizen participation in journalism in this area, and a big part of it will flourish only if we can provide people with pointers to find data that’s already out there, pointers and how-to’s on tools to use on that data, and a community of people to bounce ideas off of…

    Very happy to see this excellent start!
