What is DBMS?
A database management system (DBMS) is system software for creating and managing databases. The DBMS provides users and programmers with a systematic way to create, retrieve, update and manage data.
Objectives of DBMS The following objectives are used in DBMS
- Data availability-The data availability is responsible for the cost performance and the query update. Availability functions make the database available to users helps in defining and creating a database and getting the data in and out of a database.
- Data integrity-The data integrity provides protection for the existence of the database and maintaining the quality of the database.
- Data independence- DBMS provides two types of data independences. First is a physical data independence program, which remains unaffected from the changes in the storage structure or access method, and the second is the logical data independence program, which remains unaffected from the changes in the schema.
Data Abstraction refers to the process of hiding irrelevant details from the user. So, what is the meaning of irrelevant details? Let’s understand this with one example. Example: If we want to access any mail from our Gmail then we don’t know where that data is physically stored i.e is the data present in India or USA or what data model has been used to store that data? We are not concerned about these things. We are only concerned with our email. So, information like these i.e. location of data and data models are irrelevant to us and in data abstraction, we do this only. Apart from the location of data and data models, there are other factors that we don’t care of. We hide the unnecessary data from the user and this process of hiding unwanted data is called Data Abstraction.
There are mainly three levels of data abstraction and we divide it into three levels in order to achieve Data Independence. Data Independence means users and data should not directly interact with each other. The user should be at a different level and the data should be present at some other level. By doing so, Data Independence can be achieved. So, let’s see in details what are these three levels of data abstraction:
- View Level
- Conceptual Level
- Physical Level
View Level or External Schema
This level tells the application about how the data should be shown to the user. Example: If we have a login-id and password in a university system, then as a student, we can view our marks, attendance, fee structure, etc. But the faculty of the university will have a different view. He will have options like salary, edit marks of a student, enter attendance of the students, etc. So, both the student and the faculty have a different view. By doing so, the security of the system also increases. In this example, the student can’t edit his marks but the faculty who is authorized to edit the marks can edit the student’s marks. Similarly, the dean of the college or university will have some more authorization and accordingly, he will has his view. So, different users will have a different view according to the authorization they have.
Conceptual Level or Logical Level
This level tells how the data is actually stored and structured. We have different data models by which we can store the data. Example: Let us take an example where we use the relational model for storing the data. We have to store the data of a student, the columns in the student table will be student_name, age, mail_id, roll_no etc. We have to define all these at this level while we are creating the database. Though the data is stored in the database but the structure of the tables like the student table, teacher table, books table, etc are defined here in the conceptual level or logical level. Also, how the tables are related to each other are defined here. Overall, we can say that we are creating a blueprint of the data at the conceptual level.
Physical Level or Internal Schema
As the name suggests, the Physical level tells us that where the data is actually stored i.e. it tells the actual location of the data that is being stored by the user. The Database Administrators(DBA) decide that which data should be kept at which particular disk drive, how the data has to be fragmented, where it has to be stored etc. They decide if the data has to be centralized or distributed. Though we see the data in the form of tables at view level the data here is actually stored in the form of files only. It totally depends on the DBA, how he/she manages the database at the physical level.
So, the Data Abstraction provides us with a different view and help in achieving Data Independence. That’s it for this blog.
Data Independence in DBMS
Definition: The acquired skill to change a conceptual pattern by not altering the conceptual pattern of the next superior level is defined as the data independence.
The conventional data processing does not provide data independence in application programs. So, any kind of changes in the information, layouts, or arrangements need the change in application programs also.
But in the database system the data independence becomes easy because of it’s multilayer feature and DBMS furnish interface in application programs and data to have data independency. Also DBMS manages separate files for managing data and for programs which utilizes information. Sometimes it becomes to retrieve data for other programs.
If alterations are done in the representation of information then metadata which is maintained by DBMS is only changed and the DBMS keeps on stipulating data to the application programs. Well, metadata has architecture of layers and in this if the information of one layer is changed then it does not alter the information in the next layer. Thus, data has the parameter of not only independence but also mapping to each other. So, the operation of data manipulation is handled by Database management system i.e. DBMS.
Advantages of Data independence
The advantages of data independency in DBMS are as follows:
- Ability of improving performance
- Alterations in data structure does not requires alterations in application programs
- Implementation details can be hidden from the users
- Reduction of incongruity
- Tractability in improvement of system
- Affordable prices of maintaining system
- Providing the best services to the users
- Permit users to focus on general structure
- Enforcement of standards
- Improvement of security
- The state of being undamaged or undivided can be improved
Types of Data independence
The data independency is classified as two types and they are as below:
- Logical Data Independence
- Physical Data Independence
The diagrammatic representation of the logical and physical data independence is as a shown below:
Logical Data Independence
Logical data independence points out that the conceptual pattern can be altered by undamaging the current external patterns or schemas. The external level and conceptual level has mapping in between them and it takes all the made alterations. It also protects and isolates application programs from actions like combination of dual records into a single record or separating a single record into two or more records.
Logical data independence is a complex parameter to succeed when compared to the physical data independence because it needs pliancy in the scheme of database. For example, two users A and B select the same fields ‘student name’ and ‘student roll number’ then user A adds a new column to the field i.e. ‘student age’ then this does not affect the external view of user B but the internal patterns of both the users will be changed.
Physical Data Independence
Physical data independence points out the physical storing patterns changes by undamaging conceptual structures or arrangements. The presence of internal level in the architecture of database and the operation of changes from the conceptual level to internal level achieves the physical data independence.
Mapping between conceptual level and internal level provides a way to propagate from conceptual records to physical or stored records. If sophistication is made in the physical devices then likewise changes should be made in mapping of conceptual level and internal level which maintains conceptual level unchanged. To make conceptual schema as physically independent of data then external patterns defined on conceptual schema should be physical data independent
ACID properties are an important concept for databases. The acronym stands for Atomicity, Consistency, Isolation, and Durability. The ACID properties of a DBMS allow safe sharing of data. Without these ACID properties, everyday occurrences such using computer systems to buy products would be difficult and the potential for inaccuracy would be huge. The following basic concepts are used in database management systems (DBMS)
The first ACID property is atomicity. When an update occurs to a database, either all or none of the update becomes available to anyone beyond the user or application performing the update. This update to the database is called a transaction and it either commits or aborts. This means that only a fragment of the update cannot be placed into the database, should a problem occur with either the hardware or the software involved. Features to consider for atomicity
- It is maintained in the presence of deadlocks.
- It is maintained in the presence of database software failures.
- It is maintained in the presence of application software failures.
- It is maintained in the presence of CPU failures.
- It can be turned off at the system level.
- It can be turned off at the session level.
Consistency is the ACID property that ensures that any changes to values in an instance are consistent with changes to other values in the same instance. A consistency constraint is a predicate on data which serves as a precondition, post-condition, and transformation condition on any transaction.
Any reads or writes performed on the database will not be impacted by other reads and writes of separate transactions occurring on the same database. The isolation portion of the ACID Properties is needed when there are concurrent transactions. Concurrent transactions are transactions that occur at the same time, such as shared multiple users accessing shared objects. A global order is created with each transaction queueing up in line to ensure that the transactions complete in their entirety before another one begins.
This doesn’t mean two operations can’t happen at the same time at all. Multiple transactions can occur as long as those transactions have no possibility of impacting the other transactions occurring at the same time. Doing this can have impacts on the speed of transactions as it may force many operations to wait before they can initiate. However, this tradeoff is worth the added data security provided by isolation
The final aspect of the ACID approach to database management is durability. Durability ensures that changes made to the database (transactions) that are successfully committed will survive permanently, even in the case of system failures. Durability refers to the ability of the system to recover committed transaction updates if either the system or the storage media fails. Features to consider for durability. This ensures that the data within the database will not be corrupted by service outages, crashes, or other cases of failure. This is achieved through the use of changelogs that are referenced when databases (or portions of the database) are restarted.
- Recovery to the most recent successful commit after an application software failure.
- Recovery to the most recent successful backup after a disk failure.
- Recovery to the most recent successful commit after a CPU failure.
- Recovery to the most recent successful commit after a data disk failure.
- Recovery to the most recent successful backup after a disk failure.
Concurrency Control and Locking Concurrency control and locking is the mechanism used by DBMSs for the sharing of data. Atomicity, consistency, and isolation are achieved through concurrency control and locking. The amount of data that can be locked with the single instance or groups of instances defines the granularity of the lock. The types of granularity are illustrated here are
- Page Locking
- Cluster Locking
- Class or Table Locking
- Object or Instance Locking
Page Locking Page locking (or page-level locking) concurrency control is shown in the figure below. In this situation, all the data on a specific page are locked. A page is a common unit of storage in computer systems and is used by all types of DBMSs. In this figure, each rectangle represents a page. Locking for objects is on the left and page locking for relational tuples is on the right. If the concept of pages is new to you, just think of a page as a unit of space on the disk where multiple data instances are stored.
Cluster Locking Cluster locking or container locking for concurrency control is illustrated in the figure below. In this form of locking, all data clustered together (on a page or multiple pages) will be locked simultaneously. This applies only to clusters of objects in ODBMSs. Note that in this example, the cluster of objects spans portions of three pages.
Class or Table Locking Class or table locking means that all instances of either a class or table are locked, as is illustrated below. This shows one form of concurrency control. Note the circle at the lower left. It represents all instances of a class, regardless of the page where they are stored.
Object or Instance Locking Instance locking locks a single relational tuple in an RDBMS or a single object in an ODBMS. This type of concurrency control is illustrated below.