Questions tagged [cluster]
31 questions
5
votes
2 answers
Efficiently partition tree into clusters of similar diameter
I am looking for a way to split a tree into $k$ clusters so that the cluster with largest diameter is as small as possible. All edges have the same length. I'm hoping for an algorithm that can handle arbitrary trees (with arbitrary branching…
Bartosz
4
votes
1 answer
Finding similar high dimensional real vectors
I have a collection of vectors $v_1,v_2\in [0,1]^n$ and I want to find similar pairs quickly. For similarity, I want to use the Euclidean distance metric $L: [0,1]^n \times [0,1]^n \longrightarrow \mathbb{R}$. $n$ will be around $200$.
I've implemented…
AdrianGW
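For reference, the baseline this question is trying to beat can be sketched as an all-pairs scan. This is a minimal illustration only: the function name `similar_pairs` and the cutoff `threshold` are assumptions, since the excerpt does not say how "similar" is decided, and the scan is quadratic in the number of vectors.

```python
import math
from itertools import combinations

def similar_pairs(vectors, threshold):
    """Return index pairs (i, j), i < j, whose Euclidean distance is <= threshold.

    `threshold` is a hypothetical similarity cutoff, not taken from the question.
    """
    return [
        (i, j)
        for (i, u), (j, v) in combinations(enumerate(vectors), 2)
        if math.dist(u, v) <= threshold
    ]

# Tiny made-up example in [0, 1]^2; the question uses n around 200.
vectors = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]]
pairs = similar_pairs(vectors, threshold=0.2)
```

Anything faster (space partitioning, hashing, and so on) would be compared against this exhaustive scan for correctness.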
4
votes
2 answers
Fast algorithm for clustering groups of elements given their size/time
I don't know if there is a canonical problem reducing my practical problem, so I will just try to describe it the best that I can.
I would like to cluster files into the specified number of groups, where each group's size (= the sum of sizes of files…
gaborous
4
votes
1 answer
How to cluster similar objects into fixed size groups?
I have $n$ people, each of whom can meet on certain days of the week. I want to group them into $\frac{n}{k}$ groups of size $k$ such that all people in a group can meet on a day.
E.g., suppose there are 3 people who are free on (M,T,W,Th,F), (M,W)…
ask
4
votes
1 answer
What is the global function we are trying to optimise with clustering algorithms?
I am doing some reading (and implementation) of some clustering algorithms.
First I started with the well-known k-means algorithm and implemented it directly from a paper.
I got a fairly decent understanding of what is going on there.
What I am…
Frames Catherine White
3
votes
2 answers
The nearest points in a set
I have $N$ points and I have a distance between every pair of points stored in a 2D matrix. The goal is to find the nearest $K$ points among these $N$ points. "Nearest" means the sum of all distances between the $K$ points is smallest. A brute-force…
jackykuo
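The objective stated in this question (choose $K$ of the $N$ points so that the sum of all pairwise distances among them is smallest) can be written down directly as an exhaustive search. The sketch below is only that brute-force baseline, with a hypothetical function name and a made-up distance matrix; it enumerates every $K$-subset, so it is practical only for small $N$.

```python
from itertools import combinations

def nearest_k_subset(dist, k):
    """Exhaustively find the k indices with the smallest sum of pairwise distances.

    `dist` is a symmetric N x N matrix (list of lists). Returns (total, subset).
    """
    n = len(dist)
    best_total, best_subset = float("inf"), None
    for subset in combinations(range(n), k):
        # Sum each unordered pair inside the candidate subset exactly once.
        total = sum(dist[i][j] for i, j in combinations(subset, 2))
        if total < best_total:
            best_total, best_subset = total, subset
    return best_total, best_subset

# Hypothetical 4-point distance matrix, for illustration only.
dist = [
    [0, 1, 4, 5],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [5, 6, 3, 0],
]
best_total, best_subset = nearest_k_subset(dist, 3)
```

The search visits $\binom{N}{K}$ subsets, which is why the question asks for something better than brute force.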
3
votes
1 answer
Optimal way for grouping events
I am creating an event notification system.
Each event has a user and a subject, such that 'user did event to the subject'. Now, when presenting these, the events need to be grouped.
All the events in the same group must either refer to the same…
Optimus
3
votes
0 answers
Document clustering for summarization
I am curious as to what steps one would reasonably need to take to build an extraction-based text summarizer.
I've taken a look at some papers I've found on Google such as this one, which explains that UPGMA is the best clustering algorithm (out…
DaniG2k
3
votes
1 answer
Is an $\mathcal{O}(n\times \text{Number of clusters})$ clustering algorithm useful?
I am a physicist, with little formal training in computer science - please don't assume I know even obvious things about computer science!
Within the context of data analysis, I was interested in identifying clusters within a $d$-dimensional list…
innisfree
3
votes
1 answer
Analysis and classification based on data points
I'm not sure if this is the correct stack exchange or correct tags, but my question is as follows:
I am working on a sort-of ratings system for players in a particular game. After allowing the ratings to develop for many games, I have set up a…
ctlaltdefeat
2
votes
0 answers
Clustering with probabilities / vector quantization with arbitrary distance measures
Suppose I'm given $n$ points $x_1,\dots,x_n$ in some space $\mathcal{S}$ (think: $\mathbb{R}^d$), and probabilities $p_1,\dots,p_n$ that form a probability distribution (so $p_1 + \dots + p_n=1$). Imagine I have a source that outputs a point by…
D.W.
2
votes
2 answers
Community detection in weighted directed graphs for fixed number of communities
I have a weighted directed graph $G=(V,E)$ with positive weights. Say these vertices represent cities and the weight $w : V_1 \rightarrow V_2$ represents the number of students moving into other cities for college. Note here that a city $A$ may have a…
2
votes
3 answers
How to calculate IV, EV and optimal k for K-means?
Could someone explain how to calculate the following 3 evaluative properties:
Intracluster Variability (IV) - How different are the data points within the same cluster
Extracluster Variability (EV) - How different are the data points that are in…
Tesla
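One common way to put a number on "how different are the data points within the same cluster" is the within-cluster sum of squared distances to each cluster's centroid. The sketch below assumes that reading; the question's own definitions of IV and EV are truncated in this listing, so the function name and the choice of measure are assumptions rather than the asker's method.

```python
def within_cluster_ss(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid.

    `points` is a list of equal-length numeric tuples and `labels` gives the
    cluster id of each point.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid = coordinate-wise mean of the cluster's members.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum(
            (p[d] - centroid[d]) ** 2 for p in members for d in range(dim)
        )
    return total

# Made-up data: two nearby points in cluster 0, one isolated point in cluster 1.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
labels = [0, 0, 1]
wcss = within_cluster_ss(points, labels)
```

Plotting this value against the number of clusters is one informal way people look for a reasonable $k$, though it is a heuristic and not a definitive criterion.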
2
votes
0 answers
Clustering images based on timestamp
I want to make folders of images of users in a meaningful way. The images have only the timestamp of creation associated with them. Each folder can have a maximum of k images.
I can use median or mean as a way of deciding which images were taken in…
tanvi
2
votes
0 answers
Footprint finding algorithm
I'm trying to come up with an algorithm to optimize the shape of a polygon (or multiple polygons) to maximize the value contained within that shape.
I have data with 3 columns:
X: the location of the block on the x axis
Y: the location of the block…
gtwebb