Questions tagged [databases]
220 questions
20
votes
6 answers
Why must uncommitted transactions be undone in backwards order?
I have a database log where some transactions win (they committed before the crash) and some lose (they had not committed yet). We learned in class that the losers' actions have to be undone backwards.
Is there any reason for doing this backwards? Can anyone…
prjctdth
- 201
- 2
- 3
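A minimal sketch of why the order can matter, assuming a hypothetical undo log that stores a before-image per write (the `undo` helper and the log layout are illustrative, not taken from the question). If a loser transaction wrote the same item twice, only a backward pass ends on the oldest before-image:

```python
def undo(db, log, backwards=True):
    """Restore before-images from an undo log.

    `log` is a list of (key, before_image) pairs in the order the
    loser transaction performed its writes (hypothetical layout).
    """
    records = reversed(log) if backwards else log
    for key, before_image in records:
        db[key] = before_image
    return db


# Hypothetical loser transaction: x went 0 -> 1 -> 2 before the crash.
log = [("x", 0), ("x", 1)]

backward_result = undo({"x": 2}, log, backwards=True)   # ends on before-image 0
forward_result = undo({"x": 2}, log, backwards=False)   # ends on before-image 1
```

With these inputs the backward pass restores the value from before the transaction started, while the forward pass leaves an intermediate value behind.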
14
votes
3 answers
Who needs linearizability?
I've been reading about the differences between serializability and linearizability, which are both consistency criteria for replicated systems such as replicated databases. However, I don't know in which cases linearizability would be needed, even…
Eduardo Bezerra
- 245
- 2
- 8
12
votes
2 answers
Good snapshottable data structure for an in-memory index
I'm designing an in-memory object database for a very specific use case. It is single writer, but must support efficient concurrent reads. Reads must be isolated. There is no query language; the database only supports:
get object/-s by…
dm3
- 223
- 1
- 6
9
votes
1 answer
Fixed-length decision-tree-like feature selection to minimize average search performance
I have a complex query $Q$ used to search a dataset $S$ to find $H_\text{exact} = \{s \in S \mid \text{where $Q(s)$ is True}\}$. Each query takes on average time $t$ so the overall time in the linear search is $t\cdot |S|$. I can break a query down…
Andrew Dalke
- 221
- 1
- 4
9
votes
1 answer
What is the difference between an R-tree and a BVH?
I've just read about R-Trees:
The key idea of the data structure is to group nearby objects and represent them with their minimum bounding rectangle in the next higher level of the tree; the "R" in R-tree is for rectangle.
This seems to be a…
Martin Thoma
- 2,360
- 1
- 22
- 41
8
votes
1 answer
A relational algebra extended to model the full DML ("CRUD") domain
There are many references on the relational algebra for modeling queries (SELECT), but I have found very little on an expanded algebra that would include all of the DML concepts, such as INSERT, UPDATE, DELETE and maybe even MERGE,…
Jason Kleban
- 557
- 3
- 12
8
votes
2 answers
Linearizability and Serializability in context of Software Transactional Memory
I've been trying to grasp serializability and linearizability in the context of software transactional memory. However, I think both notions can be applied to transactional memory in general.
At this point, the following is my understanding of both…
Christophe De Troyer
- 273
- 2
- 10
6
votes
2 answers
Optimizing a join where each table has a selection
Consider the following query:
SELECT Customer.Name FROM Customer
INNER JOIN Order on Order.CustomerId = Customer.Id
WHERE Customer.Preferred = True AND
Order.Complete = False
Let's suppose all of the relevant attributes (Customer.Preferred,…
Xodarap
- 1,538
- 1
- 10
- 17
6
votes
1 answer
Confused between 2 phase locking and 2 phase commit
I understand that the two algorithms are very different, but what I don't understand is whether they achieve the same thing in the end. 2PC is for atomic commits and 2PL is for serializable isolation. But don't they both achieve the two things? Don't…
nestor556
- 171
- 3
5
votes
1 answer
Differences between the OS and DB buffer pool?
How does the DB manage a pool of buffers when the operating system ends up controlling what's really in memory? For example, couldn't the operating system decide to evict a page frame from a DB's buffer?
Jae
- 151
- 3
5
votes
1 answer
Time Complexity of Sort-Merge Join
According to this German Wikipedia article, the time required to merge relations $R$ and $S$ is $\in \mathcal{O}(|R| + |S|)$ if both relations are already sorted.
[Note: You don't really need to read the text and the link jumps right to where the…
UTF-8
- 197
- 1
- 11
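A rough sketch of the merge phase the excerpt refers to, assuming both inputs are lists of `(key, value)` pairs already sorted by key (the `merge_join` name and tuple layout are illustrative). Each pointer only moves forward, which is where the $|R| + |S|$ term comes from; runs of duplicate keys on both sides add the size of the output on top of that:

```python
def merge_join(r, s):
    """Merge-join two lists of (key, value) pairs sorted by key."""
    out = []
    i = j = 0
    while i < len(r) and j < len(s):
        if r[i][0] < s[j][0]:
            i += 1
        elif r[i][0] > s[j][0]:
            j += 1
        else:
            key = r[i][0]
            # Find the end of the run of equal keys in s.
            j_end = j
            while j_end < len(s) and s[j_end][0] == key:
                j_end += 1
            # Pair every r-tuple with this key against that run.
            while i < len(r) and r[i][0] == key:
                for k in range(j, j_end):
                    out.append((key, r[i][1], s[k][1]))
                i += 1
            j = j_end
    return out


r = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
s = [(2, "x"), (3, "y"), (4, "z")]
joined = merge_join(r, s)
```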
5
votes
1 answer
Blockchain database: why so redundant?
I've become interested in blockchain databases. Everything I've read says that each user has to have his/her own copy of the database.
Why can't a distributed blockchain database be 'distributed' to reside partly on many computers to achieve say fixed…
Alex Martian
- 151
- 3
5
votes
1 answer
Anonymization of dataset preserving unique identities
The $k$-anonymization paradigm (and its refinements) means to create datasets where every tuple is identical with $k-1$ others.
However, I'm in a situation where people are in the dataset many times. And I want to follow their progress through the…
The Unfun Cat
- 1,803
- 2
- 19
- 29
5
votes
1 answer
How to find a base which is guaranteed to need 9 or fewer characters to represent a 12-digit number?
I'm trying to map a 12-digit number into a fixed-width file. For a number of reasons, it must be compressed in such a way that it is guaranteed to be less than or equal to 9 characters (alphanumeric is fine). My first thought was a change of…
user34961
- 59
- 1
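One way to sanity-check the change-of-base idea, as a sketch only: a base $b$ fits every 12-digit number into 9 characters when $b^9 \ge 10^{12}$, which holds from base 22 upward. The example below assumes base 36 (digits plus uppercase letters) and a zero-padded field; the `encode` and `decode` names and the padding choice are illustrative, not part of the question.

```python
import string

ALPHABET = string.digits + string.ascii_uppercase  # 36 symbols: 0-9, A-Z
BASE = len(ALPHABET)


def encode(n, width=9):
    """Encode 0 <= n < 10**12 as a zero-padded base-36 string."""
    if not 0 <= n < 10**12:
        raise ValueError("expected a non-negative number of at most 12 digits")
    chars = []
    while n:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars)).rjust(width, "0")


def decode(text):
    """Invert encode(); int() accepts base 36 and leading zeros."""
    return int(text, BASE)


largest = 999_999_999_999
packed = encode(largest)
```

Since $36^8 > 10^{12}$, base 36 actually needs at most 8 significant characters, so the ninth is always padding under this assumption.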
5
votes
1 answer
n closest points in a set of lat/long coordinates
Here's my problem:
I have a website where people can search based on their location (which is converted to lat/long coordinates). I have many products stored in a database with their lat/long coordinates. I also have a function to calculate the…
Wouter Florijn
- 215
- 1
- 6
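A brute-force baseline for this kind of search, sketched under assumptions: products are held in memory as hypothetical `(name, lat, lon)` tuples, distance is the haversine great-circle distance with a mean Earth radius of 6371 km, and `heapq.nsmallest` picks the `n` nearest. It scans every product, so it is a reference point rather than an indexed solution.

```python
import heapq
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/long points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def n_closest(origin, products, n):
    """Return the n products nearest to origin = (lat, lon)."""
    return heapq.nsmallest(
        n,
        products,
        key=lambda p: haversine_km(origin[0], origin[1], p[1], p[2]),
    )


# Hypothetical sample data (approximate city coordinates).
products = [
    ("Utrecht", 52.09, 5.12),
    ("Paris", 48.86, 2.35),
    ("Rotterdam", 51.92, 4.48),
]
nearest = n_closest((52.37, 4.90), products, 2)
```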