Questions tagged [supervised-learning]

Supervised learning is a type of machine learning algorithm that learns a mapping function y = f(x) between input variables (x) and output variables (y). The two most common supervised learning tasks are classification and regression.

Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal).
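As a minimal sketch of the definition above, the snippet below infers a function from a handful of labeled (input vector, desired output) pairs. The toy data, the labels "a" and "b", and the choice of a 1-nearest-neighbour rule are all illustrative assumptions; any supervised learner could stand in for it.

```python
from math import dist

# Labeled training data: each example is an (input vector, desired output) pair.
# These points and labels are made up purely for illustration.
training_examples = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((4.0, 4.2), "b"),
    ((3.8, 4.1), "b"),
]

def fit_1nn(examples):
    """Infer a function f(x) from labeled examples using a 1-nearest-neighbour rule."""
    def f(x):
        # Predict the label of the training input closest to x (Euclidean distance).
        _, nearest_label = min(examples, key=lambda pair: dist(pair[0], x))
        return nearest_label
    return f

f = fit_1nn(training_examples)
prediction_low = f((0.9, 1.1))   # an unseen input near the "a" examples
prediction_high = f((4.1, 3.9))  # an unseen input near the "b" examples
print(prediction_low, prediction_high)
```

The learned `f` plays the role of the mapping y = f(x): it is built only from the labeled pairs and is then applied to inputs it has not seen.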

381 questions
46
votes
8 answers

What would I prefer - an over-fitted model or a less accurate model?

Let's say we have two models trained. And let's say we are looking for good accuracy. The first has an accuracy of 100% on training set and 84% on test set. Clearly over-fitted. The second has an accuracy of 83% on training set and 83% on test set.…
25
votes
2 answers

What kinds of learning problems are suitable for Support Vector Machines?

What are the hallmarks or properties that indicate that a certain learning problem can be tackled using support vector machines? In other words, what is it that, when you see a learning problem, makes you go "oh I should definitely use SVMs for…
19
votes
2 answers

Why does data science see class imbalance as a problem for supervised learning when statistics does not?

Why does data science see class imbalance as a problem in supervised learning when statistics says it is not? Data science seems to see class imbalance as problematic and as needing special techniques to remedy it. For instance, this DS.SE…
18
votes
5 answers

Merging sparse and dense data in machine learning to improve the performance

I have sparse features which are predictive, and I also have some dense features which are predictive. I need to combine these features to improve the overall performance of the classifier. Now, the thing is when I try to combine these…
13
votes
1 answer

Supervised learning vs reinforcement learning for a simple self driving rc car

I'm building a remote-controlled self driving car for fun. I'm using a Raspberry Pi as the onboard computer; and I'm using various plug-ins, such as a Raspberry Pi camera and distance sensors, for feedback on the car's surroundings. I'm using OpenCV…
Ryan Zotti
12
votes
2 answers

Why do neural networks not perform well on structured data?

I was recently working on some classification problem where decision trees performed better than neural networks. I had tried various combinations with neural networks altering the number of neurons / hidden layers with an objective to beat the…
12
votes
3 answers

Which supervised learning algorithms are available for matching?

I'm working on a non-profit where we try to help potential university applicants by matching them with alumni that want to share their experience/wisdom and, at the moment, it is happening manually. So I'll have two tables, one with students and one…
11
votes
2 answers

Is max_depth in scikit the equivalent of pruning in decision trees?

I was analyzing the classifier created using a decision tree. There is a tuning parameter called max_depth in scikit's decision tree. Is this equivalent to pruning a decision tree? If not, how could I prune a decision tree using scikit? dt_ap =…
9
votes
1 answer

Neural network with flexible number of inputs?

Is it possible to create a neural network which provides a consistent output given that the input can be in different length vectors? I am currently in a situation where I have sampled a lot of audio files, which are of different length, and have to…
8
votes
1 answer

Ideas for prospect scoring model

I have to think about a model to identify prospects (companies) that have a high chance of being converted into clients, and I'm looking for advice on what kind of model could be of use. The databases I will have are, as far as I know (I don't have…
7
votes
2 answers

When is the sum of models the model of the sum?

The response variable in a regression problem, $Y$, is modeled using a data matrix $X$. In notation, this means: $Y$ ~ $X$ However, $Y$ can be separated out into different components that can be modeled independently. $$Y = Y_1 + Y_2 + Y_3$$ Under…
7
votes
2 answers

Why will the accuracy of a highly unbalanced dataset reduce after oversampling?

I have created a synthetic dataset, with 20 samples in one class and 100 in the other, thus creating an imbalanced dataset. Now the accuracy of classification of the data before balancing is 80% while after balancing (i.e., 100 samples in both the…
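One way to see why accuracy can fall after balancing is to look at a trivial baseline. The sketch below uses the class sizes from the question (20 and 100, then 100 and 100) and a hypothetical predictor that always outputs the majority class. It is not the asker's classifier, only an illustration of how much of the accuracy on the imbalanced set can come from the class ratio alone.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Class sizes taken from the question: 20 vs 100 before balancing, 100 vs 100 after.
imbalanced_labels = [0] * 20 + [1] * 100
balanced_labels = [0] * 100 + [1] * 100

def always_majority(labels, majority_class=1):
    """Hypothetical baseline: predict the majority class for every sample."""
    return [majority_class] * len(labels)

acc_imbalanced = accuracy(imbalanced_labels, always_majority(imbalanced_labels))
acc_balanced = accuracy(balanced_labels, always_majority(balanced_labels))

print(f"majority baseline, imbalanced: {acc_imbalanced:.3f}")
print(f"majority baseline, balanced:   {acc_balanced:.3f}")
```

On the imbalanced set the baseline scores 100/120 without learning anything, while on the balanced set the same baseline scores 100/200, so a drop in raw accuracy after oversampling does not by itself mean the model got worse.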
6
votes
1 answer

Is there any difference between a weak learner and a weak classifier?

While reading about decision tree ensembles (Gradient Boosting, AdaBoost, etc.), I have found the following two concepts: weak learner and weak classifier. Are they the same? If there is any difference, what is it?
6
votes
2 answers

Why does feature scaling improve the convergence speed for gradient descent?

This article says: We can speed up gradient descent by scaling. This is because θ will descend quickly on small ranges and slowly on large ranges, and so will oscillate inefficiently down to the optimum when the variables are very…
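The effect described in the quote can be reproduced on a toy problem. The sketch below runs plain gradient descent on a quadratic loss 0.5 * sum(c_i * θ_i²), where the curvatures c_i stand in for feature scales. The curvature values (1 and 100 for the unscaled case, 1 and 1 for the scaled case), the learning rates, and the function name are assumptions made for illustration; each learning rate is kept below the stability limit 2 / max(c_i).

```python
def gradient_descent_steps(curvatures, learning_rate, tol=1e-6, max_steps=100_000):
    """Minimise 0.5 * sum(c_i * theta_i**2) starting from theta = (1, ..., 1).

    Returns the number of steps taken until every |theta_i| is below tol,
    or max_steps if that never happens.
    """
    theta = [1.0] * len(curvatures)
    for step in range(1, max_steps + 1):
        # The gradient with respect to theta_i is c_i * theta_i.
        theta = [t - learning_rate * c * t for t, c in zip(theta, curvatures)]
        if max(abs(t) for t in theta) < tol:
            return step
    return max_steps

# Unscaled: one direction is 100x steeper, so the step size is capped by it
# (theta_2 flips sign every step) while theta_1 shrinks only slowly.
steps_unscaled = gradient_descent_steps([1.0, 100.0], learning_rate=0.019)

# Scaled: both directions have the same curvature, so a large step suits both.
steps_scaled = gradient_descent_steps([1.0, 1.0], learning_rate=0.9)

print("steps without scaling:", steps_unscaled)
print("steps with scaling:   ", steps_scaled)
```

In the unscaled case the steep direction forces a small learning rate and oscillates around zero, while the flat direction crawls toward the optimum; with equal curvatures a single learning rate works well for both coordinates.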
6
votes
2 answers

Does a NN with no hidden layer behave like a regression?

Does a NN with no hidden layer behave like a regression? What can a NN without a hidden layer tell us? If we have, for instance, 20 inputs and 4 outputs and I have no true labels, is it similar to regression? If it is a regression then it…
user10296606