How to Find a Dataset
How to Find a Dataset#
There are many excellent sources of open data online. You are free to use any of them. I would recommend the following.
Kaggle Datasets#
Kaggle is an online community platform for data scientists and machine learning enthusiasts.
Kaggle allows users to collaborate with other users, find and publish datasets, use GPU-integrated notebooks, and compete with other data scientists to solve data science challenges.
The aim of this online platform (founded in 2010 by Anthony Goldbloom and Jeremy Howard and acquired by Google in 2017) is to help professionals and learners reach their data science goals.
As of 2021, Kaggle has over 8 million registered users.
Also look at Kaggle Competitions.
An advantage of Kaggle is that many of its datasets are prepared with machine learning tasks in mind.
Zenodo#
from IPython.display import IFrame
# Embed the Zenodo "About" page in the notebook output
IFrame('https://about.zenodo.org/', width=800, height=1200)
Zenodo hosts a very large amount of data.
You need to vet the quality of the data, because anyone, including you, can upload their own.
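One way to start vetting a dataset is to run a few quick checks before committing to it. The sketch below is only an illustration, not a complete quality audit: it assumes the data is a CSV file with a header row, and the function name `summarize_csv` and the tiny `sample` table are hypothetical stand-ins for a file you have downloaded. It counts rows, empty cells per column, and exact duplicate rows using only the Python standard library.

```python
import csv
import io


def summarize_csv(text):
    """Return basic quality indicators for CSV text that has a header row."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    missing = {name: 0 for name in columns}
    seen = set()
    duplicates = 0
    rows = 0
    for row in reader:
        rows += 1
        # Count empty or absent cells in each column.
        for name in columns:
            value = row.get(name)
            if value is None or value.strip() == "":
                missing[name] += 1
        # Count rows that repeat an earlier row exactly.
        key = tuple(row.get(name) for name in columns)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"rows": rows, "missing": missing, "duplicate_rows": duplicates}


# Hypothetical sample standing in for a downloaded file.
sample = "id,species,mass_g\n1,finch,14.2\n2,,15.1\n2,,15.1\n3,wren,\n"
report = summarize_csv(sample)
print(report)
```

For a real file, you could pass in the text read from disk, for example `summarize_csv(open("data.csv", encoding="utf-8").read())`. A high share of missing values or duplicate rows does not rule a dataset out, but it is a signal to read its documentation more closely.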
Open Science Framework#
The Open Science Framework (OSF) is a similar service to Zenodo.