---
layout: page
title: Data
permalink: /data/
---
## Open Data
Everyone enjoys discovering [interesting datasets](http://rs.io/100-interesting-data-sets-for-statistics/), but useful datasets are even better. The problem is that the open data movement has been too successful by some measures.
We have gone from a relatively data-poor environment in government and nonprofits to an abundance of open data repositories, including [Socrata](https://socrata.com/solutions/open-data-citizen-engagement/) and [CKAN's](https://ckan.org/about/instances/) Open Government data portals, the US Federal portal at [www.data.gov](https://www.data.gov/), and over 80,000 academic datasets posted on [Dataverse](https://dataverse.harvard.edu/).
Data is more useful when it comes with a [vignette](https://ds4ps.github.io/PROG-EVAL-III/Datasets_in_R.html) that provides some information about the nature of the data and potential uses. We will work to update the site with some helpful datasets for public affairs programs in the near future.
In the meantime, enjoy some of these resources:
<br>
**OPEN DATA**
-----------------------
* TOC
{:toc}
-----------------------
<br>
### Overview
* Background on the Open Data Movement [ [link](http://www.urban.org/sites/default/files/alfresco/publication-pdfs/413153-Putting-Open-Data-to-Work-for-Communities.PDF) ]
* Ben Wellington's TED Talk on Open Data in NYC [ [link](https://www.ted.com/talks/ben_wellington_how_we_found_the_worst_place_to_park_in_new_york_city_using_big_data?) ]
### DATA Act
* The Digital Accountability and Transparency Act (DATA Act) [ [overview](/s/Electronic_Version___DATA_Act_Vision_and_Value.pdf) ] [ [link](http://www.datacoalition.org/what-is-data-transparency/data-act/) ] [ [link](http://labs.data.gov/dashboard/offices) ] [ [link](http://www.forbes.com/sites/techonomy/2014/09/12/how-open-data-is-transforming-city-life/) ]
* Keynote Speech on Importance of DATA Act [ [link](https://www.volckeralliance.org/publications/data-act-good-use-scarce-government-resources) ]
* Progress Tracker on Federal Open Data Compliance [ [link](http://labs.data.gov/dashboard/offices) ]
### Impact of Open Data
* I Quant NY [ [budget error](https://iquantny.tumblr.com/post/147446103684/open-data-reveals-791-million-error-in-newly) ] [ [metro fares](https://iquantny.tumblr.com/post/114470101209/i-quant-a-victory-mta-adds-new-button-for) ] [ [parking tickets](https://gizmodo.com/justice-the-fire-hydrant-that-earned-nyc-33-000-a-yea-1585633742) ]
* Realizing the Promise of Big Data: IBM Center for Gov. [ [link](http://www.businessofgovernment.org/sites/default/files/Realizing%20the%20Promise%20of%20Big%20Data_0.pdf) ]
* Data Used in 2017 Public Policy Dissertations [ [link](https://ds4ps.org/2019/03/20/data-for-policy-dissertations.html) ] [ [broken link](http://publicmanagementresearch.com/2017/12/18/data-for-dissertations-december-18-2017/) ]
### Guides & Best Practices
* Project Open Data [ [link](https://project-open-data.cio.gov/) ] [ [principles](https://project-open-data.cio.gov/principles/) ]
* Open North standards [ [link](http://www.opennorth.ca/publications) ]
* Sunlight Foundation's Open Data Guidelines [ [link](http://sunlightfoundation.com/opendataguidelines/) ]
* Global Impact of Open Data Book: GovLab / O'Reilly [ [link](http://www.oreilly.com/data/free/files/the-global-impact-of-open-data.pdf) ]
* The Hidden Cost (and Benefits) of Open Data [ [link](http://www.governing.com/columns/tech-talk/gov-open-data-cost-problems.html) ]
* Data Maturity Framework [ [link](https://dsapp.uchicago.edu/home/resources/datamaturity/) ]
* How to Share Data for Collaboration [ [link](https://peerj.com/preprints/3139v5.pdf) ]
### Government Portal and Resources
* How to Make Government Data Sites Better [ [link](http://flowingdata.com/2014/06/10/how-to-make-government-data-sites-better/) ] [ [link](http://blogs.scientificamerican.com/guest-blog/what-s-wrong-with-open-data-sites-and-how-we-can-fix-them/) ]
* US Cities Open Data Census [ [link](http://us-city.census.okfn.org/) ]
* Statewide Portal Tested in California [ [link](http://www.governing.com/topics/mgmt/tns-california-open-data.html) ]
* Five Largest Cities Now Have Open Data Policies [ [link](http://sunlightfoundation.com/blog/2014/10/15/all-five-of-the-largest-u-s-cities-now-have-open-data-policies/) ]
* 40 Brilliant Open Data Projects for Smart Cities [ [link](https://carto.com/blog/forty-brilliant-open-data-projects) ]
### Machine Learning Training Data
* Top Sources for Machine Learning Datasets [ [link](https://vas3k.com/blog/machine_learning/) ]
<br>
-----------------------
<br>
# Useful Data Sources
### APIs
* Awesome Public Datasets Page [ [GitHub](https://github.com/awesomedata/awesome-public-datasets) ]
* Quandl API (many data sources) [ [link](https://www.quandl.com/) ] [ [r package](https://www.quandl.com/help/r) ]
* Census Data API [ [acs package](http://eglenn.scripts.mit.edu/citystate/wp-content/uploads/2013/02/wpid-working_with_acs_R2.pdf) ] [ [census api](http://rstudio-pubs-static.s3.amazonaws.com/19337_2e7f827190514c569ea136db788ce850.html) ]
* TwitteR Package API [ [link](http://davetang.org/muse/2013/04/06/using-the-r_twitter-package/) ]
* 19 Free Public Datasets (Springboard blog) [ [link](https://www.springboard.com/blog/free-public-data-sets-data-science-project/) ]
* ckanr [ [github](https://github.com/ropensci/ckanr) ] [ [vignette](https://cran.r-project.org/web/packages/ckanr/vignettes/ckanr_vignette.html) ]
* RSocrata [ [github](https://github.com/Chicago/RSocrata) ]
* censusapi Package [ [github](https://github.com/hrecht/censusapi) ] [ [slides](http://urbaninstitute.github.io/R-Trainings/accesing-census-apis/presentation/index.html#/) ] [ [tutorial](/s/CensusAPI_Package.html) ]
* @unitedstates [ [about](https://sunlightfoundation.com/2013/08/20/a-modern-approach-to-open-data/) ] [ [github](https://github.com/unitedstates) ]
* Data USA [ [link](http://datausa.io/) ] [ [documentation](https://gist.github.com/lecy/0aa782a873cd174573f32d243233ca5b) ]
* Data Science Toolkit [ [link](http://www.datasciencetoolkit.org/) ] [ [rpackage](http://files.meetup.com/1696476/DRUG.pdf) ]
* Federal Government APIs [ [link](https://github.com/unitedstates/APIs) ]
* Strava GPS Data of Athletes by City [ [blog](http://www.databrew.cc/posts/strava.html) ]
* rtimes Package: NYTimes API for government data [ [link](https://github.com/ropengov/rtimes) ]
* rsunlight Package: Wrapper for the Open Congress and Open States APIs [ [link](https://github.com/ropengov/rsunlight) ]
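
Several of the portals in this list run on CKAN or Socrata, and both platforms expose HTTP endpoints that return JSON. The sketch below shows one way the request URLs might be assembled in Python using only the standard library. The helper names are hypothetical, `catalog.data.gov` is assumed to be a CKAN instance, and `abcd-1234` is a placeholder dataset id rather than a real one.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def ckan_search_url(portal, query, rows=5):
    """Build a CKAN Action API package_search URL for a portal base URL."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal.rstrip('/')}/api/3/action/package_search?{params}"


def socrata_resource_url(domain, dataset_id, limit=5):
    """Build a Socrata SODA URL that asks for the first `limit` rows as JSON."""
    # Keep "$" unescaped so the SoQL parameter reads as $limit in the URL.
    params = urlencode({"$limit": limit}, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{params}"


def fetch_json(url, timeout=30):
    """Download and decode a JSON response (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Assumed CKAN portal and a placeholder Socrata dataset id.
    print(ckan_search_url("https://catalog.data.gov/", "nonprofit grants", rows=3))
    print(socrata_resource_url("data.cityofchicago.org", "abcd-1234", limit=10))
```

Passing either URL to `fetch_json` would retrieve the results, provided the portal is reachable and the dataset id exists. R users can get the same data through the ckanr and RSocrata packages listed above.
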
### Data for Teaching
* A Little Stats [ [link](http://alittlestats.blogspot.com/p/data-sources.html) ]
* Fun Data for Teaching [ [link](http://bartomeuslab.com/2016/01/21/fun-data-for-teaching-r/) ]
* Forbes: 35 Open Data Sources of Note [ [link](http://www.forbes.com/sites/bernardmarr/2016/02/12/big-data-35-brilliant-and-free-data-sources-for-2016/#2a8d98876796) ]
* 100 Interesting Datasets [ [link](http://rs.io/100-interesting-data-sets-for-statistics/) ]
* Data and Story Library [ [link](https://dasl.datadescription.com/) ]
### Open Data for the Nonprofit Sector
* Urban Institute's NCCS Data [ [link](https://nccs-data.urban.org/index.php) ]
* Nonprofit Open Data Collective [ [link](https://nonprofit-open-data-collective.github.io/overview/) ]
* Bureau of Labor Statistics Employment Data by Sector [ [link](https://www.bls.gov/bdm/nonprofits/nonprofits.htm) ]
* Association of Religion Data Archives Congregation Data by County 1950-2010 [ [link](http://www.thearda.com/archive/browse.asp) ]
### Poverty Action Lab Catalog of Administrative Data
A guide to data sources that have been used as sampling frames for randomized controlled trials (RCTs) in the US.
* JPAL Catalog of Data [ [link](https://www.povertyactionlab.org/sites/default/files/documents/AdminDataCatalog.pdf) ]
* JPAL Publications with Data Available [ [link](https://www.povertyactionlab.org/evaluations?f[0]=field_external_data:title:1) ]
### Data-Driven Journalism Project Portals
Data journalists are making their stories transparent by posting the data and code behind their work so that it can be easily replicated or extended.
* BBC creates graphics cookbook [ [link](https://medium.com/bbc-visual-and-data-journalism/how-the-bbc-visual-and-data-journalism-team-works-with-graphics-in-r-ed0b35693535) ] [ [cookbook](https://bbc.github.io/rcookbook/) ]
* Buzzfeed [ [all projects on GitHub](https://github.com/BuzzFeedNews/everything) ]
* LA Times [ [datadesk on GitHub](https://github.com/datadesk) ]
* Washington Post [ [projects on GitHub](https://github.com/washingtonpost) ]
* Associated Press [ [GitHub](https://github.com/associatedpress) ] [ [project template](https://github.com/associatedpress/datakit-core) ] [ [example](http://data.ap.org/projects/2017/federal-judges/processed/Federal_Judiciary_Diversity.html) ]
* The Economist [ [GitHub](https://github.com/TheEconomist) ]
* Center for Public Integrity [ [GitHub](https://github.com/PublicI) ] [ [Workplace Discrimination Story](https://www.buzzfeednews.com/article/lamvo/eeoc-sexual-harassment-data) ]
### Disaster Management
* SHELDUS Database [ [link](https://cemhs.asu.edu/node/7) ]
### Police Data Initiative
The Police Data Initiative (PDI) is a law enforcement community of practice that includes leading law enforcement agencies, technologists, and researchers committed to engaging their communities in a partnership to improve public safety, built on a foundation of trust, accountability, and innovation. The PDI represents the work and leadership of more than 130 law enforcement agencies, which have released more than 200 datasets to date. It originated from several recommendations of the Task Force on 21st Century Policing that focused on technology and transparency.
[Available Datasets](https://www.policedatainitiative.org/datasets/)
### Data Blogs
* Data is Plural by Jeremy Singer-Vine [ [archive](https://tinyletter.com/data-is-plural/archive) ]
<br>
-----------------------
<br>
# "Awesome Data" Catalog
This is just a sample of some datasets that would be relevant to the public and nonprofit sectors from the larger catalog of open public sources curated and managed by [AwesomeData](https://github.com/awesomedata/awesome-public-datasets).
Agriculture
-----------
* <i class="far fa-check-circle" style="color:lightgray"></i> [Hyperspectral benchmark dataset on soil moisture](https://doi.org/10.5281/zenodo.1227837)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Department of Agriculture's Nutrient Database](https://www.ars.usda.gov/northeast-area/beltsville-md/beltsville-human-nutrition-research-center/nutrient-data-laboratory/docs/sr28-download-files/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Department of Agriculture's PLANTS Database](http://www.plants.usda.gov/dl_all.html)
Climate+Weather
---------------
* <i class="far fa-check-circle" style="color:lightgray"></i> [Actuaries Climate Index](http://actuariesclimateindex.org/data/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Australian Weather](http://www.bom.gov.au/climate/dwo/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Aviation Weather Center - Consistent, timely and accurate weather [...]](https://aviationweather.gov/adds/dataserver)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Brazilian Weather - Historical data (In Portuguese) - Data related to [...]](http://sinda.crn.inpe.br/PCD/SITE/novo/site/historico/index.php)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Canadian Meteorological Centre](http://weather.gc.ca/grib/index_e.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Climate Data from UEA (updated monthly)](http://www.cru.uea.ac.uk/data/)
* <i class="far fa-question-circle" style="color:lightgray"></i> [European Climate Assessment & Dataset](http://eca.knmi.nl/) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//Climate+Weather/European-Climate-Assessment-&-Dataset.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [Global Climate Data Since 1929](http://en.tutiempo.net/climate)
* <i class="far fa-check-circle" style="color:lightgray"></i> [NASA Global Imagery Browse Services](https://wiki.earthdata.nasa.gov/display/GIBS)
* <i class="far fa-check-circle" style="color:lightgray"></i> [NOAA Bering Sea Climate](http://www.beringclimate.noaa.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [NOAA Climate Datasets](http://www.ncdc.noaa.gov/data-access/quick-links)
* <i class="far fa-check-circle" style="color:lightgray"></i> [NOAA Realtime Weather Models](http://www.ncdc.noaa.gov/data-access/model-data/model-datasets/numerical-weather-prediction)
* <i class="far fa-check-circle" style="color:lightgray"></i> [NOAA SURFRAD Meteorology and Radiation Datasets](https://www.esrl.noaa.gov/gmd/grad/stardata.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [The World Bank Open Data Resources for Climate Change](http://data.worldbank.org/developers/climate-data-api)
* <i class="far fa-check-circle" style="color:lightgray"></i> [UEA Climatic Research Unit](http://www.cru.uea.ac.uk/data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [WU Historical Weather Worldwide](https://www.wunderground.com/history/index.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [WorldClim - Global Climate Data](http://www.worldclim.org)
ComplexNetworks
---------------
* <i class="far fa-question-circle" style="color:lightgray"></i> [DBLP Citation dataset](https://kdl.cs.umass.edu/display/public/DBLP) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//ComplexNetworks/DBLP-Citation-dataset.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [DIMACS Road Networks Collection](http://www.dis.uniroma1.it/challenge9/download.shtml)
* <i class="far fa-check-circle" style="color:lightgray"></i> [NBER Patent Citations](http://nber.org/patents/)
DataChallenges
--------------
* <i class="far fa-check-circle" style="color:lightgray"></i> [DrivenData Competitions for Social Good](http://www.drivendata.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Localytics Data Visualization Challenge](https://github.com/localytics/data-viz-challenge)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Netflix Prize](http://netflixprize.com/leaderboard.html)
Economics
---------
* <i class="far fa-check-circle" style="color:lightgray"></i> [American Economic Association (AEA)](https://www.aeaweb.org/resources/data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [EconData from UMD](http://inforumweb.umd.edu/econdata/econdata.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Economic Freedom of the World Data](http://www.freetheworld.com/datasets_efw.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Historical MacroEconomic Statistics](http://www.historicalstatistics.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [INFORUM - Interindustry Forecasting at the University of Maryland](http://inforumweb.umd.edu/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [International Economics Database](https://db.nomics.world/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [International Trade Statistics](http://www.econostatistics.co.za/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Internet Product Code Database](http://www.upcdatabase.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Joint External Debt Data Hub](http://www.jedh.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Jon Haveman International Trade Data Links](http://www.macalester.edu/research/economics/PAGE/HAVEMAN/Trade.Resources/TradeData.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [OpenCorporates Database of Companies in the World](https://opencorporates.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Our World in Data](http://ourworldindata.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [SciencesPo World Trade Gravity Datasets](http://econ.sciences-po.fr/thierry-mayer/data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [The Atlas of Economic Complexity](http://atlas.cid.harvard.edu)
* <i class="far fa-check-circle" style="color:lightgray"></i> [The Center for International Data](http://cid.econ.ucdavis.edu)
* <i class="far fa-check-circle" style="color:lightgray"></i> [The Observatory of Economic Complexity](http://atlas.media.mit.edu/en/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [UN Commodity Trade Statistics](http://comtrade.un.org/db/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [UN Human Development Reports](http://hdr.undp.org/en)
Education
---------
* <i class="far fa-check-circle" style="color:lightgray"></i> [College Scorecard Data](https://collegescorecard.ed.gov/data/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Student Data from Free Code Camp](https://github.com/freeCodeCamp/open-data)
Foundations
-----------
* <i class="far fa-check-circle" style="color:lightgray"></i> International Aid Transparency Initiative (IATI) [ [database of grants](https://iatistandard.org/en/) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> Ford Foundation Grants [ [database](https://www.fordfoundation.org/work/our-grants/grants-database/grants-all) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> Hewlett Foundation Grants [ [database](https://hewlett.org/grants/?sort=date) ]
GIS
---
* <i class="far fa-check-circle" style="color:lightgray"></i> [ArcGIS Open Data portal](http://opendata.arcgis.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Cambridge, MA, US, GIS data on GitHub](http://cambridgegis.github.io/gisdata.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Factual Global Location Data](https://places.factual.com/data/t/places)
* <i class="far fa-check-circle" style="color:lightgray"></i> [IEEE Geoscience and Remote Sensing Society DASE Website](http://dase.grss-ieee.org)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Geo Maps - High Quality GeoJSON maps programmatically generated](https://github.com/simonepri/geo-maps)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Geo Spatial Data from ASU](http://geodacenter.asu.edu/datalist/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Geo Wiki Project - Citizen-driven Environmental Monitoring](http://geo-wiki.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [GeoFabrik - OSM data extracted to a variety of formats and areas](http://download.geofabrik.de/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [GeoNames Worldwide](http://www.geonames.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Global Administrative Areas Database (GADM) - Geospatial data organized [...]](https://gadm.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Homeland Infrastructure Foundation-Level Data](https://hifld-geoplatform.opendata.arcgis.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Landsat 8 on AWS](https://aws.amazon.com/public-data-sets/landsat/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [List of all countries in all languages](https://github.com/umpirsky/country-list)
* <i class="far fa-check-circle" style="color:lightgray"></i> [National Weather Service GIS Data Portal](http://www.nws.noaa.gov/gis/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Natural Earth - vectors and rasters of the world](http://www.naturalearthdata.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [OpenAddresses](http://openaddresses.io/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [OpenStreetMap (OSM)](http://wiki.openstreetmap.org/wiki/Downloading_data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Pleiades - Gazetteer and graph of ancient places](http://pleiades.stoa.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Reverse Geocoder using OSM data](https://github.com/kno10/reversegeocode)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Robin Wilson - Free GIS Datasets](http://freegisdata.rtwilson.com)
* <i class="far fa-check-circle" style="color:lightgray"></i> [TIGER/Line - U.S. boundaries and roads](https://www.census.gov/geo/maps-data/data/tiger-line.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [TZ Timezones shapefiles](http://efele.net/maps/tz/world/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [TwoFishes - Foursquare's coarse geocoder](https://github.com/foursquare/twofishes)
* <i class="far fa-check-circle" style="color:lightgray"></i> [UN Environmental Data](http://geodata.grid.unep.ch/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [World boundaries from the U.S. Department of State](http://geonode.state.gov/layers/?limit=100&offset=0)
* <i class="far fa-check-circle" style="color:lightgray"></i> [World countries in multiple formats](https://github.com/mledoze/countries)
Government
----------
* <i class="far fa-check-circle" style="color:lightgray"></i> [Alberta, Province of Canada](http://open.alberta.ca)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Antwerp, Belgium](http://opendata.antwerpen.be/datasets)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Argentina (non official)](http://datar.noip.me/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Datos Argentina - Portal de datos abiertos de la República Argentina. [...]](http://datos.gob.ar/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Austin, TX, US](https://data.austintexas.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Australia (abs.gov.au)](http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/3301.02009?OpenDocument)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Australia (data.gov.au)](https://data.gov.au/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Austria (data.gv.at)](https://www.data.gv.at/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Baton Rouge, LA, US](https://data.brla.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Belgium](http://data.gov.be/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Brazil](http://dados.gov.br/dataset)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Buenos Aires, Argentina](http://data.buenosaires.gob.ar/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Calgary, AB, Canada](https://data.calgary.ca/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Cambridge, MA, US](https://data.cambridgema.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Canada](http://open.canada.ca/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Chicago](https://data.cityofchicago.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Chile](http://datos.gob.cl/dataset)
* <i class="far fa-check-circle" style="color:lightgray"></i> [China](http://data.stats.gov.cn/english/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Dallas Open Data](https://www.dallasopendata.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [DataBC - data from the Province of British Columbia](http://www.data.gov.bc.ca/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Denver Open Data](http://data.denvergov.org//)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Durham, NC Open Data](https://live-durhamnc.opendata.arcgis.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Edmonton, AB, Canada](https://data.edmonton.ca/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [England LGInform](http://lginform.local.gov.uk/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [EuroStat](http://ec.europa.eu/eurostat/data/database)
* <i class="far fa-check-circle" style="color:lightgray"></i> [EveryPolitician - Ongoing project collating and sharing data on every [...]](http://everypolitician.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Federal Committee on Statistical Methodology (FCSM) (formerly FedStats)](https://nces.ed.gov/FCSM/index.asp)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Finland](https://www.opendata.fi/en)
* <i class="far fa-question-circle" style="color:lightgray"></i> [France](https://www.data.gouv.fr/en/datasets/) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//Government/France.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [Fredericton, NB, Canada](http://www.fredericton.ca/en/citygovernment/Catalogue.asp)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Gatineau, QC, Canada](http://www.gatineau.ca/donneesouvertes/default_fr.aspx)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Germany](https://www-genesis.destatis.de/genesis/online)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Ghent, Belgium](https://data.stad.gent/data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Glasgow, Scotland, UK](https://data.glasgow.gov.uk/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Greece](http://www.data.gov.gr/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Guardian world governments](http://www.guardian.co.uk/world-government-data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Halifax, NS, Canada](https://www.halifax.ca/home/open-data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Helsinki Region, Finland](http://www.hri.fi/en/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Hong Kong, China](https://data.gov.hk/en/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Houston, TX, US](http://data.houstontx.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Indian Government Data](https://data.gov.in/)
* <i class="far fa-question-circle" style="color:lightgray"></i> [Indonesian Data Portal](http://data.go.id/) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//Government/Indonesian-Data-Portal.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [Ireland's Open Data Portal](https://data.gov.ie/data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Italy - Il Portale dati.gov.it è il catalogo nazionale dei metadati [...]](https://www.dati.gov.it/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Japan](http://www.e-stat.go.jp/SG1/estat/eStatTopPortalE.do)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Laval, QC, Canada](http://www.laval.ca/Pages/Fr/Citoyens/donnees.aspx)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Lexington, KY](http://data.lexingtonky.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [London Datastore, UK](http://data.london.gov.uk/dataset)
* <i class="far fa-check-circle" style="color:lightgray"></i> [London, ON, Canada](http://www.london.ca/city-hall/open-data/Pages/default.aspx)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Los Angeles Open Data](https://data.lacity.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Luxembourg - Luxembourgish Open Data Portal](https://data.public.lu/en/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [MassGIS, Massachusetts, U.S.](http://www.mass.gov/anf/research-and-tech/it-serv-and-support/application-serv/office-of-geographic-information-massgis/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Metropolitan Transportation Commission (MTC), California, US](http://mtc.ca.gov/tools-resources/data-tools/open-data-library)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Mexico](https://datos.gob.mx/busca/dataset)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Mississauga, ON, Canada](http://www.mississauga.ca/portal/residents/publicationsopendatacatalogue)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Moldova](http://data.gov.md/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Moncton, NB, Canada](http://open.moncton.ca/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Montreal, QC, Canada](http://donnees.ville.montreal.qc.ca/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Mountain View, California, US (GIS)](http://data-mountainview.opendata.arcgis.com/)
* <i class="far fa-question-circle" style="color:lightgray"></i> [NYC Open Data](https://opendata.cityofnewyork.us/) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//Government/NYC-Open-Data.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [NYC betanyc](http://betanyc.us/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Netherlands](https://data.overheid.nl/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [New Zealand](http://www.stats.govt.nz/browse_for_stats.aspx)
* <i class="far fa-check-circle" style="color:lightgray"></i> [OECD](https://data.oecd.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Oakland, California, US](https://data.oaklandnet.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Oklahoma](https://data.ok.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Open Data for Africa](http://opendataforafrica.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Open Government Data (OGD) Platform India](https://data.gov.in/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [OpenDataSoft's list of 1,600 open data](https://www.opendatasoft.com/a-comprehensive-list-of-all-open-data-portals-around-the-world/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Oregon](https://data.oregon.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Ottawa, ON, Canada](http://data.ottawa.ca/en/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Palo Alto, California, US](http://data.cityofpaloalto.org/home)
* <i class="far fa-check-circle" style="color:lightgray"></i> [OpenDataPhilly - OpenDataPhilly is a catalog of open data in the [...]](https://www.opendataphilly.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Portland, Oregon](https://www.portlandoregon.gov/28130)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Portugal - Pordata organization](http://www.pordata.pt/en/Home)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Puerto Rico Government](https://data.pr.gov//)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Quebec City, QC, Canada](http://donnees.ville.quebec.qc.ca/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Quebec Province of Canada](https://www.donneesquebec.ca/en/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Regina SK, Canada](http://open.regina.ca/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Rio de Janeiro, Brazil](http://www.data.rio/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Romania](http://data.gov.ro/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Russia](http://data.gov.ru)
* <i class="far fa-check-circle" style="color:lightgray"></i> [San Diego, CA](https://data.sandiego.gov)
* <i class="far fa-check-circle" style="color:lightgray"></i> [San Antonio, TX - Community Information Now - CI:Now is a nonprofit [...]](http://cinow.info/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [San Francisco Data sets](http://datasf.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [San Jose, California, US](http://data.sanjoseca.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [San Mateo County, California, US](https://data.smcgov.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Saskatchewan, Province of Canada](http://opendatask.ca/data/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Seattle](https://data.seattle.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Singapore Government Data](https://data.gov.sg/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [South Africa Trade Statistics](http://www.econostatistics.co.za/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [South Africa](http://www.statssa.gov.za/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [State of Utah, US](https://opendata.utah.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Switzerland](http://www.opendata.admin.ch/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Taiwan gov](https://data.gov.tw/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Taiwan](http://data.gov.tw/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Tel-Aviv Open Data](https://opendata.tel-aviv.gov.il/index_en.html#/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Texas Open Data](https://data.texas.gov/)
* <i class="far fa-question-circle" style="color:lightgray"></i> [The World Bank](https://openknowledge.worldbank.org/handle/10986/2124) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//Government/The-World-Bank.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [Toronto, ON, Canada](https://portal0.cf.opendata.inter.sandbox-toronto.ca/)
* <i class="far fa-question-circle" style="color:lightgray"></i> [Tunisia](http://www.data.gov.tn/) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//Government/Tunisia.yml) ]
* <i class="far fa-question-circle" style="color:lightgray"></i> [U.K. Government Data](http://data.gov.uk/data) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//Government/U.K.-Government-Data.yml) ]
* <i class="far fa-question-circle" style="color:lightgray"></i> [U.S. American Community Survey](https://www.census.gov/programs-surveys/acs/data.html) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//Government/U.S.-American-Community-Survey.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. CDC Public Health datasets](https://www.cdc.gov/nchs/data_access/ftp_data.htm)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Census Bureau](http://www.census.gov/data.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Department of Housing and Urban Development (HUD)](http://www.huduser.gov/portal/datasets/pdrdatas.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Federal Government Agencies](http://www.data.gov/metrics)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Federal Government Data Catalog](http://catalog.data.gov/dataset)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Food and Drug Administration (FDA)](https://open.fda.gov/index.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. National Center for Education Statistics (NCES)](http://nces.ed.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Open Government](http://www.data.gov/open-gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [UK 2011 Census Open Atlas Project](https://data.cdrc.ac.uk/product/cdrc-2011-census-open-atlas)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Patent and Trademark Office (USPTO) Bulk Data Products](https://www.uspto.gov/learning-and-resources/bulk-data-products)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Uganda Bureau of Statistics](http://www.ubos.org/unda/index.php/catalog)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Ukraine](https://data.gov.ua/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [United Nations](http://data.un.org/)
* <i class="far fa-question-circle" style="color:lightgray"></i> [Uruguay](https://catalogodatos.gub.uy/) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//Government/Uruguay.yml) ]
* <i class="far fa-question-circle" style="color:lightgray"></i> [Valley Transportation Authority (VTA), California, US](https://data.vta.org/) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//Government/Valley-Transportation-Authority-VTA-California-US.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [Vancouver, BC Open Data Catalog](http://data.vancouver.ca/datacatalogue/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Victoria, BC, Canada](http://opendata.victoria.ca/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Vienna, Austria](https://open.wien.gv.at/site/open-data/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Congressional Research Service (CRS) Reports](https://www.everycrsreport.com/)
Healthcare
----------
* <i class="far fa-check-circle" style="color:lightgray"></i> [Composition of Foods Raw, Processed, Prepared USDA National Nutrient Database for Standard [...]](https://data.nal.usda.gov/dataset/composition-foods-raw-processed-prepared-usda-national-nutrient-database-standard-reference-release-27)
* <i class="far fa-check-circle" style="color:lightgray"></i> [EHDP Large Health Data Sets](http://www.ehdp.com/vitalnet/datasets.htm)
* <i class="far fa-check-circle" style="color:lightgray"></i> [GDC - GDC supports several cancer genome programs for CCG, TCGA, TARGET etc.](https://gdc.cancer.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Gapminder World demographic databases](http://www.gapminder.org/data/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [MeSH, the vocabulary thesaurus used for indexing articles for PubMed](https://www.nlm.nih.gov/mesh/filelist.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Medicare Coverage Database (MCD), U.S.](https://www.cms.gov/medicare-coverage-database/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Medicare Data Engine of medicare.gov Data](https://data.medicare.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Medicare Data File](http://go.cms.gov/19xxPN4)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Number of Ebola Cases and Deaths in Affected Countries (2014)](https://data.humdata.org/dataset/ebola-cases-2014)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Open-ODS (structure of the UK NHS)](http://www.openods.co.uk)
* <i class="far fa-check-circle" style="color:lightgray"></i> [OpenPaymentsData, Healthcare financial relationship data](https://openpaymentsdata.cms.gov)
* <i class="far fa-check-circle" style="color:lightgray"></i> [PhysioBank Databases - A large and growing archive of physiological data.](https://www.physionet.org/physiobank/database/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [The Cancer Imaging Archive (TCIA)](https://www.cancerimagingarchive.net)
* <i class="far fa-check-circle" style="color:lightgray"></i> [The Cancer Genome Atlas project (TCGA)](https://portal.gdc.cancer.gov/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [World Health Organization Global Health Observatory](http://www.who.int/gho/en/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Informatics for Integrating Biology & the Bedside](https://www.i2b2.org/NLP/DataSets/Main.php)
PublicDomains
-------------
* <i class="far fa-check-circle" style="color:lightgray"></i> [Amazon](http://aws.amazon.com/datasets/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Archive.org Datasets](https://archive.org/details/datasets)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Archive-it from Internet Archive](https://www.archive-it.org/explore?show=Collections)
* <i class="far fa-check-circle" style="color:lightgray"></i> [CMU JASA data archive](http://lib.stat.cmu.edu/jasadata/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [CMU StatLab collections](http://lib.stat.cmu.edu/datasets/)
* <i class="far fa-question-circle" style="color:lightgray"></i> [Data.World](https://data.world) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Data.World.yml) ]
* <i class="far fa-question-circle" style="color:lightgray"></i> [Data360](http://www.data360.org/index.aspx) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Data360.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [Enigma Public](https://public.enigma.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Google](http://www.google.com/publicdata/directory)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Grand Comics Database - The Grand Comics Database (GCD) is a nonprofit, [...]](https://www.comics.org)
* <i class="far fa-question-circle" style="color:lightgray"></i> [Infochimps](http://www.infochimps.com/) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Infochimps.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [KDNuggets Data Collections](http://www.kdnuggets.com/datasets/index.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Microsoft Azure Data Market Free DataSets](https://azuremarketplace.microsoft.com/en-us/marketplace/apps?source=datamarket&filters=pricing-free&page=1)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Microsoft Data Science for Research](http://aka.ms/Data-Science)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Microsoft Research Open Data](https://msropendata.com/)
* <i class="far fa-question-circle" style="color:lightgray"></i> [Numbrary](http://numbrary.com/) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Numbray.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [Open Library Data Dumps](https://openlibrary.org/developers/dumps)
* <i class="far fa-question-circle" style="color:lightgray"></i> [Reddit Datasets](https://www.reddit.com/r/datasets) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Reddit-Datasets.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [RevolutionAnalytics Collection](http://packages.revolutionanalytics.com/datasets/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Sample R data sets](http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/00Index.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [StatSci.org](http://www.statsci.org/datasets.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Stats4Stem R data sets (archived)](https://web.archive.org/web/20151024082129/http://www.stats4stem.org:80/data-sets.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [The Washington Post List](http://www.washingtonpost.com/wp-srv/metro/data/datapost.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [UCLA SOCR data collection](http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [UFO Reports](http://www.nuforc.org/webreports.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Wikileaks 911 pager intercepts](https://911.wikileaks.org/files/index.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Yahoo Webscope](http://webscope.sandbox.yahoo.com/catalog.php)
SearchEngines
-------------
* <i class="far fa-check-circle" style="color:lightgray"></i> [Academic Torrents of data sharing from UMB](http://academictorrents.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [DataMarket (Qlik)](https://datamarket.com/data/list/?q=all)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Datahub.io](https://datahub.io/dataset)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Harvard Dataverse Network of scientific data](https://dataverse.harvard.edu/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [ICPSR (UMICH)](http://www.icpsr.umich.edu/icpsrweb/ICPSR/index.jsp)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Institute of Education Sciences](http://eric.ed.gov)
* <i class="far fa-check-circle" style="color:lightgray"></i> [National Technical Reports Library](https://ntrl.ntis.gov/NTRL/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Open Data Certificates (beta)](https://certificates.theodi.org/en/datasets)
* <i class="far fa-check-circle" style="color:lightgray"></i> [OpenDataNetwork - A search engine of all Socrata powered data portals](http://www.opendatanetwork.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Statista.com - statistics and Studies](http://www.statista.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Zenodo - An open dependable home for the long-tail of science](https://zenodo.org/collection/datasets)
SocialNetworks
--------------
* <i class="far fa-check-circle" style="color:lightgray"></i> [72 hours #gamergate Twitter Scrape](http://waxy.org/random/misc/gamergate_tweets.csv)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Ancestry.com Forum Dataset over 10 years](http://www.cs.cmu.edu/~jelsas/data/ancestry.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [CMU Enron Email of 150 users](http://www.cs.cmu.edu/~enron/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Cheng-Caverlee-Lee September 2009 - January 2010 Twitter Scrape](https://archive.org/details/twitter_cikm_2010)
* <i class="far fa-check-circle" style="color:lightgray"></i> [EDRM Enron EMail of 151 users, hosted on S3](https://aws.amazon.com/datasets/enron-email-data/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Facebook Data Scrape (2005)](https://archive.org/details/oxford-2005-facebook-matrix)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Facebook Social Networks from LAW (since 2007)](http://law.di.unimi.it/datasets.php)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Foursquare from UMN/Sarwat (2013)](https://archive.org/details/201309_foursquare_dataset_umn)
* <i class="far fa-check-circle" style="color:lightgray"></i> [GitHub Collaboration Archive](https://www.githubarchive.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Google Scholar citation relations](http://www3.cs.stonybrook.edu/~leman/data/gscholar.db)
* <i class="far fa-check-circle" style="color:lightgray"></i> [High-Resolution Contact Networks from Wearable Sensors](http://www.sociopatterns.org/datasets/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Indie Map: social graph and crawl of top IndieWeb sites](http://www.indiemap.org/)
* <i class="far fa-question-circle" style="color:lightgray"></i> [Mobile Social Networks from UMASS](https://kdl.cs.umass.edu/display/public/Mobile+Social+Networks) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//SocialNetworks/Mobile-Social-Networks-from-UMASS.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [Network Twitter Data](http://snap.stanford.edu/data/higgs-twitter.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Reddit Comments](http://files.pushshift.io/reddit/comments/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Skytrax' Air Travel Reviews Dataset](https://github.com/quankiquanki/skytrax-reviews-dataset)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Social Twitter Data](http://snap.stanford.edu/data/egonets-Twitter.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [SourceForge.net Research Data](http://www3.nd.edu/~oss/Data/data.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Twitter Data for Online Reputation Management](http://nlp.uned.es/replab2013/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Twitter Data for Sentiment Analysis](http://help.sentiment140.com/for-students/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Twitter Graph of entire Twitter site](http://an.kaist.ac.kr/traces/WWW2010.html)
* <i class="far fa-question-circle" style="color:lightgray"></i> [Twitter Scrape Calufa May 2011](http://archive.org/details/2011-05-calufa-twitter-sql) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//SocialNetworks/Twitter-Scrape-Calufa-May-2011.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [UNIMI/LAW Social Network Datasets](http://law.di.unimi.it/datasets.php)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Yahoo! Graph and Social Data](http://webscope.sandbox.yahoo.com/catalog.php?datatype=g)
* <i class="far fa-check-circle" style="color:lightgray"></i> [YouTube Video Social Graph in 2007, 2008](http://netsg.cs.sfu.ca/youtubedata/)
SocialSciences
--------------
* <i class="far fa-check-circle" style="color:lightgray"></i> [ACLED (Armed Conflict Location & Event Data Project)](http://www.acleddata.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Canadian Legal Information Institute](https://www.canlii.org/en/index.php)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Center for Systemic Peace Datasets - Conflict Trends, Polities, State Fragility, etc](http://www.systemicpeace.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Correlates of War Project](http://www.correlatesofwar.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Cryptome Conspiracy Theory Items](http://cryptome.org)
* <i class="far fa-question-circle" style="color:lightgray"></i> [Datacards](https://www.datacards.org/login/) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//SocialSciences/Datacards.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [European Social Survey](http://www.europeansocialsurvey.org/data/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [FBI Hate Crime 2013 - aggregated data](https://github.com/emorisse/FBI-Hate-Crime-Statistics/tree/master/2013)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Fragile States Index](http://fundforpeace.org/fsi/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [GDELT Global Events Database](http://gdeltproject.org/data.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [General Social Survey (GSS) since 1972](http://gss.norc.org)
* <i class="far fa-check-circle" style="color:lightgray"></i> [German Social Survey](http://www.gesis.org/en/home/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Global Religious Futures Project](http://www.globalreligiousfutures.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Gun Violence Data - A comprehensive, accessible database that contains [...]](https://github.com/jamesqo/gun-violence-data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Humanitarian Data Exchange](https://data.humdata.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [INFORM Index for Risk Management](http://www.inform-index.org/Results/Global)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Institute for Demographic Studies](http://www.ined.fr/en/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [International Networks Archive](http://www.princeton.edu/~ina/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [International Social Survey Program ISSP](http://www.issp.org)
* <i class="far fa-check-circle" style="color:lightgray"></i> [International Studies Compendium Project](http://www.isacompendium.com/public/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [James McGuire Cross National Data](http://jmcguire.faculty.wesleyan.edu/welcome/cross-national-data/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [MIT Reality Mining Dataset](http://realitycommons.media.mit.edu/realitymining.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [MacroData Guide by Norsk samfunnsvitenskapelig datatjeneste](http://nsd.uib.no)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Minnesota Population Center](https://www.ipums.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Notre Dame Global Adaptation Index (ND-GAIN)](https://gain.nd.edu/our-work/country-index/download-data/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Open Crime and Policing Data in England, Wales and Northern Ireland](https://data.police.uk/data/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [OpenSanctions - A global database of persons and companies of political, [...]](http://www.opensanctions.org/#downloads)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Paul Hensel General International Data Page](http://www.paulhensel.org/dataintl.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [PewResearch Internet Survey Project](http://www.pewinternet.org/?post_type=dataset)
* <i class="far fa-check-circle" style="color:lightgray"></i> [PewResearch Society Data Collection](http://www.pewresearch.org/data/download-datasets/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Political Polarity Data](http://www3.cs.stonybrook.edu/~leman/data/14-icwsm-political-polarity-data.zip)
* <i class="far fa-check-circle" style="color:lightgray"></i> [StackExchange Data Explorer](http://data.stackexchange.com/help)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Terrorism Research and Analysis Consortium](http://www.trackingterrorism.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Texas Inmates Executed Since 1984](http://www.tdcj.state.tx.us/death_row/dr_executed_offenders.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Titanic Survival Data Set](https://github.com/awesomedata/awesome-public-datasets/tree/master/Datasets)
* <i class="far fa-check-circle" style="color:lightgray"></i> [UCB's Archive of Social Science Data (D-Lab)](http://ucdata.berkeley.edu/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [UCLA Social Sciences Data Archive](https://dataverse.harvard.edu/dataverse/ssda_ucla)
* <i class="far fa-check-circle" style="color:lightgray"></i> [UN Civil Society Database](http://esango.un.org/civilsociety/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [UPJOHN for Labor Employment Research](http://www.upjohn.org/services/resources/employment-research-data-center)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Universities Worldwide](http://univ.cc/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Uppsala Conflict Data Program](http://ucdp.uu.se/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [World Bank Open Data](http://data.worldbank.org/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [WorldPop project - Worldwide human population distributions](http://www.worldpop.org.uk/data/get_data/)
TimeSeries
----------
* <i class="far fa-check-circle" style="color:lightgray"></i> [Databanks International Cross National Time Series Data Archive](http://www.cntsdata.com)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Hard Drive Failure Rates](https://www.backblaze.com/hard-drive-test-data.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Heart Rate Time Series from MIT](http://ecg.mit.edu/time-series/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Time Series Data Library (TSDL) from MU](https://datamarket.com/data/list/?q=provider:tsdl)
* <i class="far fa-check-circle" style="color:lightgray"></i> [UC Riverside Time Series Dataset](http://www.cs.ucr.edu/~eamonn/time_series_data/)
Transportation
--------------
* <i class="far fa-check-circle" style="color:lightgray"></i> [Airlines OD Data 1987-2008](http://stat-computing.org/dataexpo/2009/the-data.html)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Ford GoBike Data (formerly Bay Area Bike Share Data)](https://www.fordgobike.com/system-data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Bike Share Systems (BSS) collection](https://github.com/BetaNYC/Bike-Share-Data-Best-Practices/wiki/Bike-Share-Data-Systems)
* <i class="far fa-check-circle" style="color:lightgray"></i> [GeoLife GPS Trajectory from Microsoft Research](http://research.microsoft.com/en-us/downloads/b16d359d-d164-469e-9fd4-daa38f2b2e13/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [German train system by Deutsche Bahn](http://data.deutschebahn.com/datasets/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Hubway Million Rides in MA](http://hubwaydatachallenge.org/trip-history-data/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Montreal BIXI Bike Share](https://montreal.bixi.com/en/open-data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [NYC Taxi Trip Data 2009-](http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml)
* <i class="far fa-check-circle" style="color:lightgray"></i> [NYC Taxi Trip Data 2013 (FOIA/FOILed)](https://archive.org/details/nycTaxiTripData2013)
* <i class="far fa-check-circle" style="color:lightgray"></i> [NYC Uber trip data April 2014 to September 2014](https://github.com/fivethirtyeight/uber-tlc-foil-response)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Open Traffic collection](https://github.com/graphhopper/open-traffic-collection)
* <i class="far fa-check-circle" style="color:lightgray"></i> [OpenFlights - airport, airline and route data](http://openflights.org/data.html)
* <i class="far fa-question-circle" style="color:lightgray"></i> [Philadelphia Bike Share Stations (JSON)](https://www.rideindego.com/stations/json/) [[fixme](https://github.com/awesomedata/apd-core/tree/master/core//Transportation/Philadelphia-Bike-Share-Stations-JSON.yml) ]
* <i class="far fa-check-circle" style="color:lightgray"></i> [Plane Crash Database, since 1920](http://www.planecrashinfo.com/database.htm)
* <i class="far fa-check-circle" style="color:lightgray"></i> [RITA Airline On-Time Performance data](http://www.transtats.bts.gov/Tables.asp?DB_ID=120)
* <i class="far fa-check-circle" style="color:lightgray"></i> [RITA/BTS transport data collection (TranStat)](http://www.transtats.bts.gov/DataIndex.asp)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Renfe (Spanish National Railway Network) dataset](https://data.renfe.com/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Toronto Bike Share Stations (JSON and GBFS files)](https://www.toronto.ca/city-government/data-research-maps/open-data/open-data-catalogue/#84045f23-7465-0892-8889-7b6f91049b29)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Transport for London (TFL)](https://tfl.gov.uk/info-for/open-data-users/our-open-data)
* <i class="far fa-check-circle" style="color:lightgray"></i> [Travel Tracker Survey (TTS) for Chicago](http://www.cmap.illinois.gov/data/transportation/travel-tracker-survey)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Bureau of Transportation Statistics (BTS)](http://www.rita.dot.gov/bts/)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Domestic Flights 1990 to 2009](http://academictorrents.com/details/a2ccf94bbb4af222bf8e69dad60a68a29f310d9a)
* <i class="far fa-check-circle" style="color:lightgray"></i> [U.S. Freight Analysis Framework since 2007](http://ops.fhwa.dot.gov/freight/freight_analysis/faf/index.htm)
<br>
<br>
--------------------------------------
<style>
p, li {
font-family:system-ui,-apple-system,"Segoe UI",Roboto,Helvetica,Arial,sans-serif;
font-size:calc(0.85em + 0.25vw);
font-weight:300;
line-height:1.7;
-webkit-font-smoothing:antialiased;
-moz-osx-font-smoothing:grayscale;
margin-left:1%;
margin-right:0%;
}
h2{
font-size:calc(2em + 0.25vw) !important;
color: #337ab7;
font-weight:300;
margin-top:60px !important;
margin-bottom:20px;
}
h3{
font-size:calc(1.4em + 0.25vw);
font-weight:300;
margin-top:20px !important;
margin-bottom:10px;}
ul {
list-style-type:none;
margin: 0;
padding: 0;
font-size:calc(0.75em + 0.25vw);
line-height:1.2;
}
ul a {
color: rgba(229,144,42,.7);
font-size:calc(0.75em + 0.25vw);
line-height:1.2;
text-decoration: none;
font-weight: normal;
}
ul a:hover {
color: #337ab7;
text-decoration: none;
font-weight: normal;
}
#markdown-toc ul {
font-size:calc(0.85em + 0.25vw);
line-height:1.2;
font-weight: bold;
}
#markdown-toc ul li {
list-style-type: disc !important;
font-size:calc(0.65em + 0.25vw);
line-height:1.2;
margin-left: 20px;
}
#markdown-toc a {
color: black;
font-size:calc(0.65em + 0.25vw);
line-height:1.2;
font-weight: normal;
}
#markdown-toc a:hover {
color: black;
text-decoration: none;
font-weight: bold;
}
.collapsible {
background-color: #fff;
color: #444;
cursor: pointer;
padding: 18px;
width: 20%;
border: none;
text-align: left;
outline: none;
font-size: 15px;
}
.active, .collapsible:hover {
background-color: #ccc;
}
.content {
display: none;
overflow: hidden;
}
</style>