Questions tagged [data-mining]

Using the techniques of artificial intelligence and machine learning to extract patterns from large data sets and transform those data into a useful, organized form for future processing.

142 questions
47
votes
5 answers

Why has research on genetic algorithms slowed?

While discussing some intro-level topics today, including the use of genetic algorithms, I was told that research has really slowed in this field. The reason given was that most people are focusing on machine learning and data mining. Update: Is…
30
votes
4 answers

What exactly is the difference between supervised and unsupervised learning?

I am trying to understand clustering methods. What I think I understood: In supervised learning, the categories/labels data is assigned to are known before computation. So, the labels, classes or categories are being used in order to "learn" the…
Prot
18
votes
4 answers

Relation and difference between information retrieval and information extraction?

From Wikipedia: Information retrieval is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches can be based on metadata or on full-text indexing. From…
Tim
13
votes
5 answers

Word Frequency with Ordering in O(n) Complexity

During an interview for a Java developer position, I was asked the following: Write a function that takes two params: a String representing a text document and an integer providing the number of items to return. Implement the function such…
user2712937
13
votes
2 answers

Identifying events related to dates in a paragraph

Is there an algorithmic approach to identify that dates given in a paragraph correlate to particular events (phrases) in the paragraph? For example, consider the following paragraph: In June 1970, the great leader took the oath. But it was only after…
check123
12
votes
5 answers

Data Science vs Operations Research

The general question, as the title suggests, is: What is the difference between DS and OR/optimization? On a conceptual level, I understand that DS tries to extract knowledge from the available data and uses mostly statistical, machine learning…
PsySp
10
votes
0 answers

Applying the graph mining algorithm Leap Search in an unlabeled setting

I am reading Mining Significant Graph Patterns by Leap Search (Yan et al., 2008), and I am unclear on how their technique translates to the unlabeled setting, since $p$ and $q$ (the frequency functions for positive and negative examples,…
mitchus
10
votes
1 answer

Looking for a ranking algorithm that favors newer entries

I'm working on a ranking system that will rank entries based on votes that have been cast over a period of time. I'm looking for an algorithm that will calculate a score which is kinda like an average; however, I would like it to favor newer scores…
Logan Besecker
9
votes
2 answers

What are some efficient ways to find the differences between two large corpuses of text that have similar, but differently ordered content?

I have two large files containing paragraphs of English text: The first text is about 200 pages long and has about 10 paragraphs per page (each paragraph is 5 sentences long). The second text contains almost precisely the same paragraphs and text…
9
votes
1 answer

String inputs in Machine Learning

Several popular machine learning algorithms, such as logistic regression or neural networks, require their inputs to be numeric. What I'm interested in is how you make these algorithms work on non-numeric inputs (such as short strings). As an example,…
Martin Konicek
8
votes
1 answer

Machine Learning: Identify Patterns in Time-Series Data

I work in renewable energy. My company gathers a lot of data from equipment. This typically includes process data (such as transformer temperature, line voltages, currents, etc.) and discrete alarms (e.g. breaker trip, inverter alarm values,…
8
votes
1 answer

Difference between multilayer perceptron and linear regression

What is the difference between a multilayer perceptron and a linear regression classifier? I am trying to learn a model with numerical attributes and predict a numerical value. Thanks
user20287
7
votes
0 answers

Guided mining of common substructures in large set of graphs

Disclaimer: I'm not a CS so I basically have no idea what I'm talking about. I have a large (>1000) set of directed acyclic graphs, each with a large (>1000) set of vertices; the vertices are labeled. I want to identify substructures that appear…
7
votes
2 answers

Any hope for a universal automatic parser?

Say you are a program, and you are given some source code but you don't know in what language; it could be C++/Java/Python/Lisp/... All you know is that it is highly structured and LR(1)-parsable, and you want to make some guesses on the…
7
votes
1 answer

Difference between decision tree and rule based reasoner

I am new to this topic, and in some scientific papers I've been reading about prediction in sports I encountered the term rule-based reasoner. Is this term the same as a semantic reasoner (where the two main directions are forward and backwards…
Pio