Questions tagged [data-sets]

Questions asking for real-life or benchmarking data sets.

58 questions
29
votes
2 answers

Where to get graphs to test my search algorithms against?

I am implementing a set of path-finding algorithms such as Dijkstra's, depth-first search, etc. At first I used a couple of self-made graphs, but now I'd like to take the challenge a bit further, so I'm looking for either graphs used in…
devoured elysium
22
votes
4 answers

CPU frequency per year

I know that since ~2004, Moore's law stopped working for CPU clock speed. I'm looking for a graph showing this, but am unable to find it: most charts out there show the transistor count or the capacity per year. Where can I find some data showing…
peoro
7
votes
2 answers

Interesting small SAT problems?

I'm noodling around with making a hardware SAT solver on an FPGA, and I'm wondering if there are any interesting SAT problems smaller than, say, 50 variables both to stay within the limits of the FPGA board (namely the number of LEDs) and so I can…
Andrew
4
votes
1 answer

Are there repositories of automatically generated (spam) webpages?

I'm interested to see if I can use machine learning/network analysis methods to automatically detect automatically generated (spam) webpages. I'm particularly interested in webpages that look structurally like a non-spam website, but on closer…
4
votes
1 answer

How is sound input and output data converted to use with machine learning networks?

Suppose one has a couple of .wav files of spoken English words, several for each word, and for each such set there exists a transcription of the correct output: the pronunciation as ASCII text. As far as I know, machine learning neural…
n611x007
3
votes
0 answers

Describe data structure using equations

Good afternoon. At work I'm currently developing a system which takes (well-structured) user input and then stores it in memory to do some processing. The input is basically a dataset formed by a matrix of dimension p×q, with q columns of data and p…
3
votes
1 answer

Seeking "gold standard" to evaluate accuracy of network clustering algorithm

I'm currently looking at network clustering algorithms (for both directed and undirected, unweighted networks). The algorithms we've tried produce visually nice clusters. However, we would like to evaluate them against some…
3
votes
2 answers

How to filter a very, very large file

I have a very large unsorted file, 1000 GB, of ID pairs: ID:ABC123 ID:ABC124 ID:ABC123 ID:ABC124 ID:ABC123 ID:ABA122 ID:ABC124 ID:ABC123 ID:ABC124 ID:ABC126. I would like to filter the file for 1) duplicates, for example ABC123 ABC124 ABC123…
3
votes
1 answer

Environment requirement in training image dataset for classifier

I have a question about preparing the dataset of positive samples for a cascaded classifier that will be used for object detection. As positive samples, I have been given 3 sets of images: a set of colored images in full size (about 1200x600) with…
2
votes
1 answer

Where can I find Pre-Trained CNN Datasets for Facial Emotion Recognition?

I am working on a project that involves recognising emotion from images. There are two parts to the project: one where I will generate features from images using APIs and then classify the emotion, and another where I will first use a CNN…
2
votes
1 answer

Semantic information collections on the web. (Semantic Wikipedia)

There are lots of technologies supporting semantic information markup, including wiki software. I am wondering if there is any kind of broad information collection project like Wikipedia which features semantic markup of that information. Google…
2
votes
1 answer

Where can I find the data of the computer experiments in the book "Neural Networks and Learning Machines"?

The book "Neural Networks and Learning Machines" by Simon Haykin has many computer experiments to which many exercises are related. But there seems to be no data for these experiments available online. Where can I find them?
Strin
2
votes
1 answer

What are good counter-examples when training an apple classifier?

I am working on a project to recognize apples. (I am using Emgucv with Visual Studio 2010 C#, if that's relevant.) The task is a binary classification (is or is not an apple). I have 2000 images of apples but I need images for the second class. I…
2
votes
0 answers

Looking for dynamic network data sets

There are a number of collections of network (or graph) data sets freely available on the web, e.g. http://snap.stanford.edu/data/index.html and http://www.cc.gatech.edu/dimacs10/downloads.shtml. I am looking for dynamic network data sets, i.e.…
cls
2
votes
1 answer

Examples of difficult Hamiltonian Cycle Problems

I am working on implementing algorithms to solve the Hamiltonian Cycle Problem. I need difficult problem graphs to test my implementations, but my Google-fu is weak and I am unable to find any. Please advise where I might find a set of difficult graphs for…
gautam