Introduction to ACID

ACID

ACID

ACID is the acronym for four relational database properties know as Atomicity, Consistency, Isolation, and Durability. Relational Databases provide these features and we will see why these features are needed and what they are.

What is Transaction in database?

The transaction is a collection of SQL statements that act as a single. Consider this transaction.

#Transfer amount of 100 from Alice to Bob

Begin Transaction
Statement 1: Select amount from user where user_name = 'Alice'
Statement 2: amount > 100
Statement 3: Update table customer set amount = amount-100 where user_name ='Alice'
Statement 4: Update table customer set amount = amount + 100 where user_name = 'Bob'

End Transaction

Here, we want those following operations to act as a single unit of work this is known as a transaction. When all the statements execute successfully we say that transaction has been complete or committed else we say that transaction is rolled back. Here committed means the changes has been updated to the database. The transaction follows ACID properties

Atomicity

In the above example, even if one statement fails and another one executes there will be inconsistency in the system. If anyone fails all should roll back to the original state, this is what atomicity means. The transaction should be atomic in nature.

Consider in the above example, after statement 3 the DB fails, 100 dollars has been deducted from Alice Account but Bob won’t receive it because of failure in the system. Ideally, if the payment fails Alice should get the 100 dollars back.

Isolation

We will come to Consistency later because Isolation is a prerequisite for that module.

Isolation tells us whether an inflight transaction can see other transaction changes. To understand this concept we can divide this into 2 parts.

1. Read phenomena, which indicates if there are no isolation

2. Isolation degree, Isolation of a transaction can be done in degree. Like to some extent or entirely.

Read Phenomena

Read Phenomena are the problems that can be caused if there is no Isolation. They are

  1. Dirty Read
  2. Non Repeatable Reads
  3. Phantom Reads

Dirty Read

Dirty Read happens when a transaction reads a data of another transaction which is not committed yet.

Consider a following example of sales table.

Sales Table

Here the table have 2 products, and one transaction is calculating the total sell price of the product by multiplying quantity * price. That is

Product 1 -> 10 * 5 = 50
Product 2 -> 20 * 4 = 80

Meanwhile a second transaction begins, which updates the quantity of product 1

Dirty Read

Ideally as per the transaction 1, the answer should be 50 + 80 = 130 but we got 155$ because transactions 2 have updated the quantity which is not yet committed yet and later got rolled back. This is called Dirty Read because no such value exists.

Non-Repeatable Read

Non-Repeatable read is a phenomena when reading a same value twice in a transaction gives us different result each time. This is same as Dirty read except the fact the transaction 2 is committed.

Non Repeatable Read

Phantom Reads

Phantom read happens when during a transaction we add a new records. This generally happens in range queries. Dirty read and un repeatable reads can be solved by locking the row which taking part in transaction but controlling new row which got added recently is difficult.

Isolation levels

  1. Read uncommitted : No Isolation, any change from the outside is visible to the transaction.
  2. Read committed: Each query in the transaction only sees committed stuff.
  3. Repeatable Read: Each query in a transaction only sees committed updates at the beginning of transaction.
  4. Serializable: Transactions are serialized.

Isolation level and read phenomena comparision

Consistency

Consistency (or Correctness) refers to the requirement that any given database transaction must change affected data only in allowed ways. Lack of Atomcity and Isolation leads to in consistency. There are two types of consistency we should talk about.

  1. Consistency in Data

The definition is defined by the user, consider sales system which has 2 table, one which maintains record of each order and another maintain day wise total sales.

Table 1
|Product code|Date|Price|
-------------------------
|Product 1|2021-09-1|76|
|Product 2|2021-09-1|89|
------------------------

Table 2
|Day    |Sales|
---------------
|2021-09-1|165|

As per the user, the consistency will be sum of total sales on a day should be equal to the entry in table.

If we violate Atomicity or Isolation, we may store wrong data which will lead to inconsistency.

Consistency in Read

Consistency in Read means, If we update a value of X in the database, the next transaction which will be reading the X should get the updated value. Looks Obvious but not happens often.

  1. If a transaction committed a change will new transaction immediately see the change.?

The problem happens where the multiple DB servers. SQL allow to lock the row where we are updating so the next read will always gets the updated value provided we only have one server. While Scaling Mysql we use Master and Salve node which don’t provide consistency but provide eventual consistency. That is eventually master and salve become sync.

Durability

Committed transaction must be persisted even if the database node fails. The value should be persisted in durable and non volatile storage.

These were the ACID properties provided by SQL I hope the article was helpful to you. This blog is a part of the series Introduction to Database
you can check related post in that link.

Related Post