8

Is there a book or tutorial that teaches how to apply the common algorithms (sorting, searching, etc.) efficiently to large data (i.e. data that cannot be fully loaded into main memory), taking into account the cost of block transfers from external memory? For example, almost all algorithm textbooks say that B-trees and B+-trees can be used to store data on disk. However, they do not explain how this is actually done, in particular how to handle pointers when the data resides on disk. Similarly, although many books teach searching techniques, they do not consider data held in secondary memory.

I have checked Knuth's book. Although it discusses these ideas, I still do not understand how to actually apply them in a high-level language. Is there any reference that discusses these details?

Gilles 'SO- stop being evil'
Arani

4 Answers

2

Database books are a good example. However, also have a look at the field of I/O-efficient data structures and algorithms. To my knowledge, there are some courses on this topic, but very few books.

Check this book: U. Meyer, P. Sanders, and J. Sibeyn (eds.), Algorithms for Memory Hierarchies, Lecture Notes in Computer Science 2625, Springer, 2003.

Check these courses: http://www.win.tue.nl/~hermanh/teaching/2IL35/ http://www.daimi.au.dk/~large/ioS12/

and these slides: algo2.iti.kit.edu/sanders/courses/algen09-10/rdslides.pdf
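
To give a flavour of what "I/O-efficient" means in practice, here is a minimal sketch of external merge sort, the standard example in this area. It is only an illustration under stated assumptions: the input is a text file with one integer per line and no blank lines, and the names `external_sort` and `run_size` are made up for this example rather than taken from any of the references above.

```python
import heapq
import os
import tempfile
from itertools import islice


def external_sort(in_path, out_path, run_size=100_000):
    """Sort a file of one integer per line while holding at most
    run_size values in memory at a time (hypothetical helper)."""
    run_paths = []

    # Phase 1: read the input in chunks that fit in memory, sort each
    # chunk, and write it out as a sorted "run" in a temporary file.
    with open(in_path) as src:
        while True:
            chunk = [int(line) for line in islice(src, run_size)]
            if not chunk:
                break
            chunk.sort()
            fd, path = tempfile.mkstemp(suffix=".run")
            with os.fdopen(fd, "w") as run:
                run.writelines(f"{x}\n" for x in chunk)
            run_paths.append(path)

    # Phase 2: k-way merge. Each run is read sequentially, so only a
    # small buffer per run is in memory, not the runs themselves.
    runs = [open(p) for p in run_paths]
    try:
        streams = [(int(line) for line in f) for f in runs]
        with open(out_path, "w") as dst:
            dst.writelines(f"{x}\n" for x in heapq.merge(*streams))
    finally:
        for f in runs:
            f.close()
        for p in run_paths:
            os.remove(p)
```

Both phases read and write the files sequentially, which is what keeps the number of block transfers low. A real implementation would also limit how many runs are merged at once and would work on binary records instead of text lines.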

AJed
1

Ramakrishnan and Gehrke's database book (Database Management Systems) discusses these things in some detail.
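
On the specific question of pointers: the usual idea is that an on-disk "pointer" is a page number, not a memory address, and you turn it into a file offset by multiplying by the page size. Below is a rough sketch of that idea for searching a B+-tree stored in a file. The page layout, `PAGE_SIZE`, and all function names are assumptions made up for this illustration; they are not taken from the book or from any particular database.

```python
import bisect
import struct

PAGE_SIZE = 4096                # assumed page size in bytes
HEADER = struct.Struct("<BH")   # assumed layout: is_leaf flag, key count


def pack_node(is_leaf, keys, children):
    """Serialize one node into a fixed-size page.
    Keys are int64; children are uint32 page numbers (the 'pointers')."""
    body = HEADER.pack(is_leaf, len(keys))
    body += struct.pack(f"<{len(keys)}q", *keys)
    body += struct.pack(f"<{len(children)}I", *children)
    return body.ljust(PAGE_SIZE, b"\0")


def unpack_node(page):
    """Inverse of pack_node: rebuild (is_leaf, keys, children) from a page."""
    is_leaf, n = HEADER.unpack_from(page, 0)
    off = HEADER.size
    keys = list(struct.unpack_from(f"<{n}q", page, off))
    off += 8 * n
    m = 0 if is_leaf else n + 1  # internal nodes have n + 1 children
    children = list(struct.unpack_from(f"<{m}I", page, off))
    return bool(is_leaf), keys, children


def read_page(f, page_id):
    """Follow a 'pointer': page number -> byte offset -> one block read."""
    f.seek(page_id * PAGE_SIZE)
    return f.read(PAGE_SIZE)


def write_page(f, page_id, data):
    f.seek(page_id * PAGE_SIZE)
    f.write(data)


def search(f, root_id, key):
    """Descend from the root; each loop iteration costs one page read."""
    page_id = root_id
    while True:
        is_leaf, keys, children = unpack_node(read_page(f, page_id))
        if is_leaf:
            return key in keys
        # Keys equal to a separator live in the right-hand child.
        page_id = children[bisect.bisect_right(keys, key)]
```

A search then costs one page read per tree level, which is why nodes are sized to fill a whole page. Insertion, node splitting, and a buffer pool that caches hot pages are left out here.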

Arani
1

Probably what you are looking for, in one neat book: Algorithms and Data Structures for External Memory by Jeffrey Scott Vitter.

Evil
0

Nowadays this field is often called big data. It is evolving rapidly, driven in part by its strong connection with virtualization, and relational database technology is now seen as only a subset of it. As the comments note, key/value databases and NoSQL are where much of the new innovation and momentum is. From your comments, though, you seem to be more interested in relational database design principles and techniques. Try the following refs:

peterh
vzn