Questions tagged [databases]

220 questions
20
votes
6 answers

Why must uncommitted transactions be undone in backwards order?

I have a database log where some transactions win (they are committed before crash) and some lose (not committed yet). We learned in class that the losers' actions have to be undone backwards. Is there any reason for doing this backwards? Can anyone…
prjctdth
  • 201
  • 2
  • 3
14
votes
3 answers

Who needs linearizability?

I've been reading about the differences between serializability and linearizability, which are both consistency criteria for replicated systems such as replicated databases. However, I don't know in which cases linearizability would be needed, even…
Eduardo Bezerra
  • 245
  • 2
  • 8
12
votes
2 answers

Good snapshottable data structure for an in-memory index

I'm designing an in-memory object database for a very specific use case. It is single writer, but must support efficient concurrent reads. Reads must be isolated. There is no query language, the database only supports: get object/-s by…
dm3
  • 223
  • 1
  • 6
9
votes
1 answer

Fixed-length decision-tree-like feature selection to minimize average search performance

I have a complex query $Q$ used to search a dataset $S$ to find $H_\text{exact} = \{s \in S \mid \text{where $Q(s)$ is True}\}$. Each query takes on average time $t$ so the overall time in the linear search is $t\cdot |S|$. I can break a query down…
9
votes
1 answer

What is the difference between an R-tree and a BVH?

I've just read about R-Trees: The key idea of the data structure is to group nearby objects and represent them with their minimum bounding rectangle in the next higher level of the tree; the "R" in R-tree is for rectangle. This seems to be a…
Martin Thoma
  • 2,360
  • 1
  • 22
  • 41
8
votes
1 answer

A relational algebra extended to model the full DML ("CRUD") domain

There are multiple references about the relational algebra for modeling queries (SELECT) but I have found very very little on the expanded algebra that would include concepts in all of DML such as INSERT, UPDATE, DELETE and maybe even MERGE,…
Jason Kleban
  • 557
  • 3
  • 12
8
votes
2 answers

Linearizability and Serializability in context of Software Transactional Memory

I've been trying to grasp serializability and linearizability in the context of software transactional memory. However, I think both notions can be applied to transactional memory in general. At this point, the following is my understanding of both…
6
votes
2 answers

Optimizing a join where each table has a selection

Consider the following query: SELECT Customer.Name FROM Customer INNER JOIN Order on Order.CustomerId = Customer.Id WHERE Customer.Preferred = True AND Order.Complete = False Let's suppose all of the relevant attributes (Customer.Preferred,…
Xodarap
  • 1,538
  • 1
  • 10
  • 17
6
votes
1 answer

Confused between 2 phase locking and 2 phase commit

I understand that both algorithms are very different, but what I don't understand is whether they achieve the same thing in the end. 2PC is for atomic commits and 2PL is for serializable isolation. But don't they both achieve the two things? Don't…
nestor556
  • 171
  • 3
5
votes
1 answer

Differences between the OS and DB buffer pool?

How does the DB manage a pool of buffers when the operating system ends up controlling what's really in memory? For example, couldn't the operating system decide to evict a page frame from a DB's buffer?
Jae
  • 151
  • 3
5
votes
1 answer

Time Complexity of Sort-Merge Join

According to this German Wikipedia article, the time required to merge relations $R$ and $S$ is in $\mathcal{O}(|R| + |S|)$ if both relations are already sorted. [Note: You don't really need to read the text and the link jumps right to where the…
UTF-8
  • 197
  • 1
  • 11
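As a rough illustration of the merge step this question asks about, here is a minimal Python sketch. It assumes both inputs are lists of tuples already sorted by their first field; the name `merge_join` and the row layout are hypothetical, not taken from the question. The linear bound $\mathcal{O}(|R| + |S|)$ holds when the join key is unique in at least one relation; with duplicate keys on both sides, the output itself can be as large as $|R| \cdot |S|$.

```python
def merge_join(r, s, key=lambda row: row[0]):
    """Join two lists that are ALREADY sorted by `key`.

    Returns a list of (r_row, s_row) pairs with equal keys.
    """
    out = []
    i = j = 0
    while i < len(r) and j < len(s):
        kr, ks = key(r[i]), key(s[j])
        if kr < ks:
            i += 1          # r is behind: advance r
        elif kr > ks:
            j += 1          # s is behind: advance s
        else:
            # Find the end of the run of rows in s sharing this key.
            j_end = j
            while j_end < len(s) and key(s[j_end]) == kr:
                j_end += 1
            # Pair every r row with this key against that run.
            while i < len(r) and key(r[i]) == kr:
                for t in range(j, j_end):
                    out.append((r[i], s[t]))
                i += 1
            j = j_end
    return out


if __name__ == "__main__":
    R = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
    S = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
    for pair in merge_join(R, S):
        print(pair)
```

Each pointer only moves forward, which is where the linear scan cost comes from; the inner loop over the run in `s` is the part that grows when keys repeat on both sides.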
5
votes
1 answer

blockchain database - why so redundant

I've become interested in blockchain databases. Everywhere I've read, it says that each user has to have his/her own copy of the database. Why can't a distributed blockchain database be 'distributed' to reside partly on many computers to achieve say fixed…
Alex Martian
  • 151
  • 3
5
votes
1 answer

Anonymization of dataset preserving unique identities

The $k$-anonymization paradigm (and its refinements) aims to create datasets where every tuple is identical to $k-1$ others. However, I'm in a situation where people are in the dataset many times, and I want to follow their progress through the…
The Unfun Cat
  • 1,803
  • 2
  • 19
  • 29
5
votes
1 answer

How to find a base which is guaranteed to need 9 or fewer characters to represent a 12-digit number?

I'm trying to map a 12-digit number into a fixed-width file. For a number of reasons, it must be compressed in such a way that it is guaranteed to be less than or equal to 9 characters (alphanumeric is fine). My first thought was a change of…
user34961
  • 59
  • 1
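One way to see that a change of base can meet the 9-character limit: with the 36 symbols `0-9A-Z`, eight positions already cover $36^8 = 2{,}821{,}109{,}907{,}456 > 10^{12}$ values, so every 12-digit number fits. The Python sketch below assumes non-negative inputs below $10^{12}$ and an uppercase alphanumeric alphabet; the function names and the zero-padding to a fixed width are illustrative choices, not part of the question.

```python
import string

ALPHABET = string.digits + string.ascii_uppercase  # 36 symbols: 0-9, A-Z


def encode_base36(n, width=9):
    """Encode 0 <= n < 10**12 as a fixed-width base-36 string."""
    if not 0 <= n < 10**12:
        raise ValueError("expected a non-negative number of at most 12 digits")
    chars = []
    while n:
        n, rem = divmod(n, 36)
        chars.append(ALPHABET[rem])
    # Digits were produced least-significant first; pad on the left with zeros.
    return "".join(reversed(chars)).rjust(width, "0")


def decode_base36(text):
    """Inverse of encode_base36; int() accepts base 36 directly."""
    return int(text, 36)


if __name__ == "__main__":
    code = encode_base36(123456789012)
    print(code, decode_base36(code))
```

Any base $b$ with $b^9 \ge 10^{12}$ works, which means $b \ge 22$; base 36 is simply the largest that stays within case-insensitive alphanumerics.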
5
votes
1 answer

n closest points in a set of lat/long coordinates

Here's my problem: I have a website where people can search based on their location (which is converted to lat/long coordinates). I have many products stored in a database with their lat/long coordinates. I also have a function to calculate the…
Wouter Florijn
  • 215
  • 1
  • 6
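For the setup this question describes (a distance function plus a table of products with coordinates), a brute-force baseline is to compute the distance to every product and keep the $n$ smallest. The Python sketch below assumes great-circle distance via the haversine formula and products given as `(name, lat, lon)` tuples; the names `haversine_km` and `n_closest` and the sample data are hypothetical. It scans all rows, so it is a baseline rather than an indexed solution.

```python
import heapq
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points (degrees)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


def n_closest(origin, products, n):
    """Return the n products nearest to origin=(lat, lon), nearest first."""
    lat, lon = origin
    return heapq.nsmallest(
        n, products, key=lambda p: haversine_km(lat, lon, p[1], p[2])
    )


if __name__ == "__main__":
    products = [
        ("a", 52.00, 5.00),
        ("b", 48.85, 2.35),
        ("c", 52.37, 4.90),
    ]
    print(n_closest((52.37, 4.89), products, 2))
```

`heapq.nsmallest` keeps only `n` candidates at a time, so the scan costs $\mathcal{O}(|P| \log n)$ for $|P|$ products instead of a full sort.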