What is the database paradigm, should it be strictly followed, and under what circumstances should it not be followed?
This is the back-end small class of the monastery. Each article is shared from
[background introduction] [knowledge analysis] [common problems] [solutions] [coding practice] [extended thinking] [more discussion] [References]
Eight aspects of in-depth analysis of back-end knowledge / skills. This article shares:
[what is the database paradigm? Should we strictly follow the paradigm? Under what circumstances should we not follow the paradigm?]
[small java class of Xiuzhen academy] what is the database paradigm? Should we strictly abide by the paradigm? Under what circumstances should we not abide by the paradigm?
Hello, I'm Zhang Quanliang, a student of Zhengzhou branch of it Academy. I'm an honest, pure and kind java programmer. Today I'd like to share with you,
Task 3 of Java on the official website of Xiuzhen academy, knowledge points in deep thinking - what is the database paradigm, whether it should strictly abide by the paradigm, and under what circumstances it should not abide by the paradigm?
1. Background:
(1) Relational database
Relational database is a database created according to relational model
Relational model refers to two-dimensional table model, so a relational database is a data organization composed of two-dimensional tables and their relationships
The relational model includes data structure (data storage problem, two-dimensional table), operation instruction set (SQL statement) and integrity constraints
(intra table data constraints, constraints between tables)
(2) Common relational databases:
Oracle、DB2、Postgresql、Microsoft sql Server、Microsoft Access、MysqL
Secure (because stored on disk)
Easy to understand (based on relational model)
But not space saving)
The rules generated in the middle are paradigms and constraints
2. Knowledge analysis:
(1) What is paradigm
When designing a relational database, it is necessary to comply with different specification requirements and design a reasonable relational database,
These different specification requirements are called normal forms,
The higher the paradigm, the smaller the database redundancy. Applying the database paradigm can bring many benefits,
But the main purpose is to eliminate duplicate data, reduce data redundancy and make the data in the database better organized,
Make more efficient use of disk space.
Disadvantages of the paradigm: the paradigm makes the query quite complex and requires more connections during the query,
The columns of some composite indexes are divided into different tables due to the need of normalization, resulting in poor index strategy.
(2) What is the first, second, third, BC paradigm?
The so-called "paradigm" refers to a certain level of relationship, so a relationship R is often called the paradigm.
At present, relational database has six paradigms: the first paradigm (1NF), the second paradigm (2NF), the third paradigm (3NF)
Bath Codd paradigm (BCNF), the fourth paradigm (4NF) and the fifth paradigm (5NF, also known as perfect paradigm).
The paradigm that meets the minimum requirements is the first paradigm (1NF). The paradigm that further meets more specification requirements on the basis of the first paradigm is called the second paradigm (2NF),
The other paradigms are followed by analogy. The prerequisite for meeting the high-level paradigm is that the low-level paradigm must be met first.
4NF: it is required to delete many to many relationships in the same table
5NF: rebuild the original structure from the final structure
(3) Some common concepts used in the paradigm
a. Entity: refers to the things to be described with data in practical application. It is an objective thing that exists in the real world and can be distinguished
b. Data item: that is, fields, which can also be called fields, attributes and columns. Data item is the smallest indivisible unit of data.
c. Data element: data element is the basic unit of data. Data element is also called element, row, ancestor and record.
A data element can consist of several data items. A row in a table is a tuple.
d. Code: also known as key, it is a basic concept in the database system. The so-called code is an attribute that can uniquely identify an entity. It is the property of the whole entity set, not the property of a single entity. It includes super code, candidate code and master code.
e. Super Code: a super code is a collection of one or more attributes. The combination of these attributes can uniquely identify an entity in an entity set.
If K is a super code, any superset of K is also a super code, that is, if K is a super code, all sets containing K are also super codes.
f. Candidate code: a super code may contain irrelevant attributes. If any true subset of some super codes cannot become a super code, such a minimum super code is called a candidate code.
g. Master code: select a combination of the least keys from the candidate codes, which is called the master code (primary key).
Each master code should have the following characteristics: 1 The only one. 2. Smallest (select the combination of the least keys as far as possible). 3. Non empty. 4. Non updatable
h. Full code: if a code contains all attributes, the code is all key.
Primary attribute: as long as an attribute appears in any candidate code, this attribute is the primary attribute.
Non primary attribute: Contrary to the primary attribute, it does not appear in any candidate code. This attribute is non primary attribute or non code attribute
g. Foreign code: if an attribute or attribute group x in relationship mode R is not the code of R, but x is the code of another relationship mode, X is the foreign code of R, also known as foreign key.
k. Dependent table: also known as weak entity, it is a child table that needs to be identified by the parent table.
m. Associative table: it is a child table of two parent tables in a many to many relationship.
n. Function dependency: function dependency refers to that the value of one or a group of attributes in a relationship can determine the value of other attributes.
Functional dependency: X → y. Functional dependency does not refer to the constraints satisfied by one or some relationships of relational pattern R,
It refers to the constraints that all relations of R must meet.
o. Full functional dependency: in a relationship, if a non primary attribute data item depends on all keywords, it is called full functional dependency.
p. Transfer function dependency: if there is a decision relationship of "a → B → C", the C transfer function depends on a.
3. Frequently asked questions:
(1) How to better distinguish the three paradigms
The first paradigm and the second paradigm lie in whether there are two tables,
The second paradigm is that if a table contains many different entity attributes, it must be divided into multiple tables,
The third paradigm requires that multiple tables have been divided, so one table can only have the ID (primary key) of another table,
There can be no other information (all other information is queried in another table with the primary key).
(2) Notes on foreign keys
Only InnoDB's database engine supports foreign keys
The data type of the foreign key must be the same as that of the reference column (the numeric type requires the same length and unsigned, the string type requires the same type, and the length can be different)
Constraint foreign key name foreign key references reference table (reference field)
When cascade deletes or updates the reference field of the reference table, the records of the foreign key table are deleted and updated synchronously
Set null when deleting or updating the reference field of the reference table, the foreign key of the foreign key table is set to null
(3) What is anti paradigm? What situations require anti paradigm design?
Denormalization, literally, is the antonym of paradigm. In fact, it is also.
Generally speaking, following the paradigm is to ensure data integrity and reduce redundancy,
However, intuitively, we can know that a database designed completely according to the paradigm with extremely low redundancy,
It is likely that the performance will be lost to databases with relatively more redundancy (for example, 3NF databases have many tables, complex relationships, and many IO times of the database, which will affect the performance).
The most typical example is that not only the user as the foreign key is stored in some data tables_ ID, also save user_ Name, which violates the database paradigm and adds user_ Name field,
But it improves the efficiency and reduces the number of users_ ID and then go to the user table to get the user name
4. Coding practice:
5. Expand thinking:
(1) : what are the database constraints
Primary key: sets the primary key constraint
Unique: sets the uniqueness constraint. Duplicate values are not allowed
Default default value constraint, height double (3,2) default 1.2 if height is not entered, the default value is 1.2
Not null: set a non NULL constraint. This field cannot be empty;
Foreign key: set foreign key constraints.
(2). Purpose of the paradigm:
Reduce redundancy and exceptions (delete, update, insert)
Make the data organization more harmonious
Through the continuous upgrading of paradigms, we will find that the higher the level of applied paradigms, the more tables. Many tables can cause many problems:
Multiple tables should be connected during query, which increases the complexity of query
Multiple tables need to be connected during query, which reduces the performance of database query
6. References:
https://blog.csdn.net/sylvanasgq/article/details/79368159
https://blog.csdn.net/apt1203jn/article/details/80314605
https://yq.aliyun.com/ziliao/495680
7. More discussion:
Q1: what is "cascade" and what does cascade operation mean?
A1:
Cascading is used to design one to many relationships. For example, a table stores the teacher's information: table a (name, gender, age), and the name is the primary key.
There is also a table to store the class information taught by teachers: Table B (name, class). They cascade through names. Cascade operations include cascade update and cascade deletion.
After enabling a cascading update option, you can change a primary key value on the premise that there are matching foreign key values. All matching foreign key values are updated accordingly. If the record named Zhang San is changed to Li Si in table a, all records named Zhang San in table B will be changed to Li Si. Cascading deletion is similar to updating. If the record named Zhang San is deleted in table a, all records named Zhang San in table B will also be deleted.
Q2: how to add data after setting foreign keys?
A2: (1) when inserting data into the primary table first, ensure that the foreign key table associated with the foreign key has the data to be inserted into the primary table,
For example, in the class information table, the teacher's name is a foreign key. When we insert it into the class information, we should ensure that the corresponding name in the teacher table exists
(2) Temporarily turning off foreign key constraints may result in inconsistent data
SET sql_ SAFE_ UPDATES=0; SET FOREIGN_ KEY_ CHECKS=0;
Q3: how to query the data of two associated tables without establishing foreign keys and table connections?
A3: Well, you can use where statements, such as select s.name, c.name, where s.id = c.id
8. Acknowledgement:
9. Conclusion:
That's all for today's sharing. You are welcome to like, forward, leave messages and make bricks~
Ppt link video link