5

Does anybody know of any community-driven open data platforms?

For example, consider the object detection task. Two platforms come to mind: Kaggle and Roboflow. However, in my opinion, both have a significant issue that makes them difficult to use as ready-to-go data sources: community members cannot make pull requests to existing datasets to fix their problems (train-test contamination, inaccurate labeling, and so on).

TL;DR: Is there a GitHub-like open data platform? Or did one exist but fail commercially?

1 Answer

6

I have not been using it for very long, but I think DataHub.io might suit your needs. It is open source, integrates with Git (including pull requests), and provides an API for access. On the downside, it does not handle unstructured data well: all it can do is store the files. However, I don't consider that to be a big issue.

Here's a page from the documentation that discusses Git integration:

https://datahub.io/@davidgasquez/handbook/Git

Robert Long