
I am writing an N-body simulation in C++ that has to be able to deal with large N ($N \le 10^6$).

Everything has been going well so far, but now that I have started to code in collisions between bodies (which can result in mergers, meaning that two bodies are removed from the system and one is added), I cannot help but wonder whether there is a more suitable data structure than the std::vector I have been using so far.

Given that each body in the system has a unique ID, I have thought of storing all bodies in a std::map (to allow fast lookup by ID). At the same time, I have to repeatedly iterate over all bodies (in direct integration methods) and consider each pairwise interaction ($n_{\text{interactions}}=\frac{N(N-1)}{2}$), for which (I believe) std::vector is faster than std::map.

What would be the best data structure for this, given that I have to iterate repeatedly over all bodies, but also have to be able to add and remove bodies?

4 Answers


Why choose? You can have it both ways:

  1. a vector that contains all the bodies, and,
  2. a map from ID to index in that vector

Any simple modification of the vector (push_back, swap, pop_back) requires only a correspondingly simple modification of the map. These operations also let you remove an element from the middle of the vector with an unordered remove: swap the removed element with the last element, then pop_back.
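A minimal sketch of this vector-plus-map combination might look as follows. The names `Body` and `BodyStore` are hypothetical, and `std::unordered_map` is an assumption made here for average constant-time lookup; a `std::map` would work the same way.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical body type; the real simulation will have its own fields.
struct Body {
    std::uint64_t id;
    double mass;
    double pos[3];
    double vel[3];
};

class BodyStore {
public:
    // Append a body and record its index. Returns false if the ID already exists.
    bool add(const Body& b) {
        if (!index_.emplace(b.id, bodies_.size()).second) return false;
        bodies_.push_back(b);
        return true;
    }

    // Unordered remove: move the last body into the hole, then pop_back.
    // Returns false if the ID is unknown.
    bool remove(std::uint64_t id) {
        auto it = index_.find(id);
        if (it == index_.end()) return false;
        const std::size_t hole = it->second;
        index_.erase(it);
        if (hole != bodies_.size() - 1) {
            bodies_[hole] = std::move(bodies_.back());
            index_[bodies_[hole].id] = hole;  // the moved body has a new index
        }
        bodies_.pop_back();
        return true;
    }

    // Lookup by ID; returns nullptr if the ID is unknown.
    Body* find(std::uint64_t id) {
        auto it = index_.find(id);
        return it == index_.end() ? nullptr : &bodies_[it->second];
    }

    // Contiguous storage for the pairwise loops of the integrator.
    const std::vector<Body>& all() const { return bodies_; }
    std::size_t size() const { return bodies_.size(); }

private:
    std::vector<Body> bodies_;
    std::unordered_map<std::uint64_t, std::size_t> index_;
};
```

The integrator keeps iterating over `all()` as before, while collision handling uses `find` and `remove`. Note that indices and pointers into the vector are invalidated by a removal, so they should not be held across one.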

user555045

With $N=10^6$ bodies, the obvious method, treating their motion as a system of simultaneous differential equations that costs $\mathcal{O}(N^2)$ operations per step with a rather large constant, will take you ages.

There is a faster method, as far as I know, that takes $\mathcal{O}(N \log N)$ operations; I've only read about it, so you'll have to look it up yourself. You then design your data structures so that the implementation is as fast as possible.

gnasher729

If you want to find collision pairs for $10^6$ objects, I would suggest using a spatial data structure (as suggested by @D.W. in the comments).

For collision detection, some useful structures are quadtrees, R-trees and BVH.

For collision detection you usually want window queries (rectangular query windows) and not nearest neighbor queries (which technically can be used but are usually much slower).
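To illustrate how a window query feeds collision detection, here is a rough sketch. It assumes 2D spherical bodies and a known maximum radius, and all names (`Body2D`, `Window`, `window_query`, `colliding_pairs`) are hypothetical. The linear scan in `window_query` is only a stand-in: a real spatial index would answer the same query without visiting every body.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct Body2D { double x, y, r; };                  // centre and radius (assumed shape)
struct Window { double xmin, ymin, xmax, ymax; };   // axis-aligned query rectangle

// Window that must contain the centre of any body overlapping b,
// given that no body has a radius larger than max_radius.
Window window_around(const Body2D& b, double max_radius) {
    const double h = b.r + max_radius;
    return {b.x - h, b.y - h, b.x + h, b.y + h};
}

// Stand-in for the spatial index: returns indices of bodies whose centre
// lies inside w. A quadtree or R-tree would replace this linear scan.
std::vector<std::size_t> window_query(const std::vector<Body2D>& bodies,
                                      const Window& w) {
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < bodies.size(); ++i) {
        const Body2D& b = bodies[i];
        if (b.x >= w.xmin && b.x <= w.xmax && b.y >= w.ymin && b.y <= w.ymax)
            hits.push_back(i);
    }
    return hits;
}

// Exact overlap test applied to the candidates returned by the window query.
bool overlaps(const Body2D& a, const Body2D& b) {
    const double dx = a.x - b.x, dy = a.y - b.y, rr = a.r + b.r;
    return dx * dx + dy * dy <= rr * rr;
}

// Each colliding pair is reported once, as (smaller index, larger index).
std::vector<std::pair<std::size_t, std::size_t>>
colliding_pairs(const std::vector<Body2D>& bodies, double max_radius) {
    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    for (std::size_t i = 0; i < bodies.size(); ++i) {
        const Window w = window_around(bodies[i], max_radius);
        for (std::size_t j : window_query(bodies, w)) {
            if (j > i && overlaps(bodies[i], bodies[j])) pairs.emplace_back(i, j);
        }
    }
    return pairs;
}
```

The window only produces candidates; the exact distance test afterwards discards bodies that sit in the corners of the rectangle without actually touching.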

As far as I can tell, R-trees (e.g. Boost R-Tree) have the best performance for window queries, but they are quite slow to update. Quadtrees are much faster to update.

For example, the Point Cloud Library and libSpatialIndex both provide several implementations of spatial indexes.

A special type of quadtree is the PH-tree for C++ (disclaimer: self advertisement). For a 1M point dataset it allows around 1M updates per second and 500K small window queries per second on my 5 year old desktop. Smaller datasets are obviously faster: for 10K points it would be 5-10M updates per second and 5M queries per second.

TilmannZ

Presumably all the accesses to the bodies that you need to perform are sequential, so a linked list would be a good candidate. (By the way, I would not delete two bodies and insert a new one, but update one and delete the other.)

Anyway, before rushing to replace the data structure, I would compare the cost of the integration step ($O(n^2)$) and that of the vector update ($O(n)$ or less) to check if it is worth the effort.
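As a rough worked estimate, assuming direct pairwise integration at the upper bound $N = 10^6$ and an ordered erase from the middle of a vector that shifts at most $N$ elements:

$$\frac{N(N-1)}{2} = \frac{10^6\,(10^6 - 1)}{2} \approx 5 \times 10^{11} \text{ pair evaluations per step}, \qquad N = 10^6 \text{ element moves per erase (worst case)}.$$

Under these assumptions a single erase costs on the order of $1/(5 \times 10^5)$ of one integration step, so it would take several hundred thousand mergers per step before the vector updates became comparable to the force computation.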

Should I add that, if order does not matter, insertions/deletions in an array are made in constant time?
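A small sketch of "update one and delete the other" combined with a constant-time unordered deletion could look like this. The `Body` layout and the function name `merge_bodies` are hypothetical, and the merge rule (masses add, position becomes the centre of mass, momentum is conserved) is an assumption about the physics, not something stated in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical body type for illustration.
struct Body {
    double mass;
    double x, y, z;
    double vx, vy, vz;
};

// Merge body j into body i, then delete j in constant time by overwriting
// it with the last element. Returns the index of the merged body afterwards.
std::size_t merge_bodies(std::vector<Body>& bodies, std::size_t i, std::size_t j) {
    assert(i != j && i < bodies.size() && j < bodies.size());
    Body& a = bodies[i];
    const Body& b = bodies[j];
    const double m = a.mass + b.mass;

    // Assumed merge rule: centre of mass position, momentum-conserving velocity.
    a.x = (a.mass * a.x + b.mass * b.x) / m;
    a.y = (a.mass * a.y + b.mass * b.y) / m;
    a.z = (a.mass * a.z + b.mass * b.z) / m;
    a.vx = (a.mass * a.vx + b.mass * b.vx) / m;
    a.vy = (a.mass * a.vy + b.mass * b.vy) / m;
    a.vz = (a.mass * a.vz + b.mass * b.vz) / m;
    a.mass = m;

    // Unordered delete: the last element fills slot j, order is not preserved.
    const std::size_t last = bodies.size() - 1;
    bodies[j] = bodies[last];
    bodies.pop_back();

    // If the merged body was the last element, it now lives in slot j.
    return (i == last) ? j : i;
}
```

Because the deletion reorders the vector, any loop that is iterating by index while merging needs to re-examine slot `j` after the call.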


Though the difference between direct and indirect accesses will be tiny, I would expect a non-vector solution to be slower.