Where and Why to Get Involved with Open Source Before 2020

As you mosey around the tech space, one of the first terms you will hear is “open source.” At tech conferences, for example, experts cite “open source” as a driver of innovation. Why? Because anyone can combine open source data sets on topics of their choosing with open source software platforms to address problems in our society or develop the next best tool.

Inspired by open source community projects like Tableau’s #MakeoverMonday and RStudio’s #TidyTuesday, I decided to survey the #DataEveryone community during one of our weekly Twitter chats about their favorite open source platforms and why they use open source in general.

Why Open Source?

Open source data, in and of itself, has the power to create social change within communities because it is free and accessible, and it breaks down barriers to entry in what would otherwise be a very exclusive field. Paid software geared towards company teams and conferences can be pricey, but anyone can download free software like R and use it to analyze data on a subject of particular interest.

Many innovators in the field use open source data to make positive changes in their respective environments. “One of the best things we can do with statistics and data is to promote social good,” said @o_gustavo. “Information that is properly created and presented can spark positive change.” @rahatcodes agrees: “The data behind social issues is widely available. Finding ways to present the data in more impactful ways that can help towards change is what we need to work on.”

Moreover, open source software and data offer ample opportunities to find common ground as a group. Through initiatives like R-Ladies and the aforementioned #MakeoverMonday and #TidyTuesday, groups of budding data scientists can develop skills together around a relatable topic. As @daniebrant put it, “Open source data can help activists paint a more complete picture of the issues like gun violence that plague our communities.”

Discover the Data

Below are the sites recommended by members of our chat where you can search for a dataset that interests you:

  1. ipums.org

  2. Kaggle

  3. UCI Machine Learning Repository

  4. data.un.org

  5. data.worldbank.org

  6. Amazon open data

If you’re still looking for a dataset to get started with, see if your city or community has an open data portal. If it does, search the portal for data on a local issue that you want to address. Some of the available sets may have missing data, but that offers a good opportunity to learn how to clean data. In addition, you can be creative with the datasets you combine.
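To make the cleaning and combining steps concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption for illustration: the two small CSV snippets, the neighborhood names, the figures, and the helper function names are all made up and stand in for whatever you download from your own portal. It skips a row with a missing value, tidies inconsistent names, and then joins the two datasets on neighborhood.

```python
import csv
import io

# Hypothetical sample data standing in for two downloads from a city portal.
# All neighborhood names and figures are made up for illustration.
LIBRARY_VISITS_CSV = """neighborhood,visits
Northside,1200
Riverside,
 eastgate ,850
"""

POPULATION_CSV = """neighborhood,population
Northside,10000
Riverside,8000
Eastgate,5000
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_visits(rows):
    """Normalize neighborhood names and drop rows with a missing visit count."""
    cleaned = {}
    for row in rows:
        name = row["neighborhood"].strip().title()
        visits = row["visits"].strip()
        if not visits:
            continue  # skip rows where the count is missing
        cleaned[name] = int(visits)
    return cleaned


def visits_per_100_residents(visits_by_name, population_rows):
    """Join the two datasets on neighborhood name and compute a rate."""
    rates = {}
    for row in population_rows:
        name = row["neighborhood"].strip().title()
        if name in visits_by_name:
            rates[name] = 100 * visits_by_name[name] / int(row["population"])
    return rates


visits = clean_visits(read_rows(LIBRARY_VISITS_CSV))
rates = visits_per_100_residents(visits, read_rows(POPULATION_CSV))
for name, rate in sorted(rates.items()):
    print(f"{name}: {rate:.1f} visits per 100 residents")
```

Dropping incomplete rows is only one way to handle missing data; depending on the question you are asking, you might instead flag those rows or look for the missing figures in another source.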

Like what you see?

Don’t hesitate to join us at #DataEveryone on Twitter every Thursday at 7pm ET. This week’s topic? The data surrounding workplace mental health. Search the hashtag #DataEveryone and/or show up at @dikayodata ready to share your opinions and meet other data-driven folks.

Danielle Oberdier