Google Dataset Search: This new search engine helps scientists hunt for public data
Google on Wednesday launched a new search engine aimed at scientists, data journalists, data geeks and anyone else looking for specific datasets online.
The service, called Dataset Search, can help scientists and data journalists find the data they need for their work and their stories, or simply satisfy their intellectual curiosity.
The new search engine works similarly to Google Scholar, the company’s popular search engine for academic studies and reports. Dataset Search enables users to find datasets stored across thousands of repositories on the Web, making these datasets universally accessible and useful.
“Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher’s site, a digital library, or an author’s personal web page,” Natasha Noy, Research Scientist, Google AI, said in a blog post.
To create Dataset Search, Google developed guidelines that help dataset providers describe their data in a way that lets Google (and other search engines) better understand the content of their pages.
The approach is based on an open standard for describing datasets maintained by Schema.org, a collaborative community for structured data on the web.
“These guidelines include salient information about datasets: who created the dataset, when it was published, how the data was collected, what the terms are for using the data, etc. We then collect and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset,” Noy said.
“We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem,” added Noy.
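For dataset providers wondering what such a description might look like in practice, here is a minimal sketch in Python that builds a Schema.org `Dataset` record as JSON-LD and wraps it in a script tag for embedding in a web page. The property names (`name`, `description`, `creator`, `datePublished`, `license`, `measurementTechnique`, `url`) come from the Schema.org vocabulary; the helper functions, the dataset itself and every value shown are hypothetical, and this is an illustration rather than Google's own tooling or a complete list of recommended fields.

```python
import json


def build_dataset_jsonld(name, description, creator, date_published,
                         license_url, collection_method, url):
    """Return a Schema.org Dataset description as a plain dict (JSON-LD)."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Who created the dataset
        "creator": {"@type": "Organization", "name": creator},
        # When it was published (ISO 8601 date)
        "datePublished": date_published,
        # The terms for using the data
        "license": license_url,
        # How the data was collected
        "measurementTechnique": collection_method,
        "url": url,
    }


def to_script_tag(metadata):
    """Wrap the metadata in a JSON-LD script tag for a page's HTML."""
    payload = json.dumps(metadata, indent=2)
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical example values, for illustration only.
example = build_dataset_jsonld(
    name="Example Daily Rainfall Readings",
    description="Daily rainfall totals from a hypothetical weather station.",
    creator="Example Weather Institute",
    date_published="2018-09-05",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    collection_method="Automated rain gauge",
    url="https://example.org/datasets/rainfall",
)

if __name__ == "__main__":
    print(to_script_tag(example))
```

The fields mirror the information Noy lists: who created the dataset, when it was published, how the data was collected and what the terms of use are.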
Dataset Search includes content from organizations like NOAA and NASA, as well as from academic repositories such as Harvard’s Dataverse and the Inter-university Consortium for Political and Social Research (ICPSR), along with government data and data provided by news organizations such as ProPublica.
Dataset Search works in multiple languages, with support for additional languages expected soon. More information is available in Google’s official blog post.