Database. SQL – Primary Key

Translation

I am posting a continuation of the translation of a series of articles for beginners.
In the present and subsequent ones - more information essentially.
Start - .

4. TABLES AND PRIMARY KEYS

As you already know from previous parts, data is stored in tables, which contain lines or in another way records. Earlier I gave an example of a table containing information about lessons. Let's take a look at it again.

There are 6 lessons in the table. All 6 are different, but for each lesson the values of the same fields are stored in the table, namely: tutorial_id (lesson identifier), title (title) and category (category). Tutorial_id – primary key lesson tables. A primary key is a value that is unique for each record in a table.
In the customer table below, customer_id is the primary key. IN in this case primary key is also unique value(number) for each entry.

Primary keys in everyday life

In a database, primary keys are used for identification. In life, primary keys are everywhere around us. Any time you encounter a unique number, that number can serve as a primary key in the database (can, but does not necessarily have to, be used as such. All databases are capable of automatically generating a unique value for each record as a number that is automatically incremented and inserted along with each new entry[So-called synthetic or surrogate primary key – approx. transl.]).

A few examples

The order number that you receive when purchasing in an online store can be the primary key of some order table in the database of this store, because it is a unique value.
The social security number can be the primary key in some table in the database government agency, because it is also unique, as in the previous example.
The invoice number can be used as a primary key in a database table that stores invoices issued to customers.
A numeric customer number is often used as a primary key in a customer table.

What do these examples have in common? The fact that in all of them a unique, non-repeating value for each record is selected as the primary key. Again. The value of the database table field selected as the primary key is always unique.

What characterizes a primary key? Characteristics of a primary key.

The primary key is used to identify records.

The primary key is used for identification records in the table, so that each record becomes unique. Another analogy... When you call customer service technical support, the operator usually asks you to provide a number (contract, telephone, etc.) by which you can be identified in the system.
If you have forgotten your number, the technical support operator will ask you to provide some other information that will help uniquely identify you. For example, a combination of your birthday and last name. They can also be a primary key, or rather a combination of them.

The primary key is unique.

The primary key always has a unique value. Imagine that its meaning is not unique. Then it could not be used to identify the data in the table. This means that any primary key value can appear only once in the column that is selected as the primary key. RDBMSs are designed in such a way that they will not allow you to insert duplicates into a primary key field, you will get an error.
One more example. Imagine that you have a table with fields first_name and last_name and there are two records:

Those. there are two Vasyas. You want to select a specific Vasya from the table. How to do it? The entries are no different from each other. This is where the primary key helps. Add an id column (the classic version of a synthetic primary key) and...

Now every Vasya is unique.

Types of primary keys.

Typically the primary key is numeric value. But it can also be any other data type. It is not common practice to use a string as a primary key (a string is a piece of text), but it is theoretically and practically possible.
Composite primary keys.
Often the primary key consists of one field, but it can also be a combination of several columns, for example two (three, four...). But you remember that the primary key is always unique, which means that the combination of the nth number of fields, in this case 2, must be unique. I'll tell you more about this later.

Auto numbering.

The primary key field is often, but not always, processed by the database itself. You can basically tell the database to automatically assign a unique numeric value to each record when it is created. The database usually starts numbering at 1 and increases this number by one for each record. Such a primary key is called auto-incrementing or auto-numbering. Using auto-incrementing keys – good way to set unique primary keys. The classic name for such a key is a surrogate primary key [As mentioned above. – approx. transl.]. This key does not contain useful information, relating to the entity (object), information about which is stored in the table, which is why it is called surrogate.

5. LINKING TABLES USING FOREIGN KEYS

When I started developing databases, I often tried to store information that seemed related in a single table. I could, for example, store order information in a customer table. After all, the orders belong to the customers, right? No. Customers and orders are separate entities in the database. Both need their own table. And the records in these two tables can be linked in order to establish a relationship between them. Database design is a solution to two issues:

defining which entities you want to store in it
what connections exist between these entities?

One-to-many.

Customers and orders have a connection (are in a relationship) one-to-many because one the client may have a lot of orders, but each specific order (their a bunch of) issued only one client, i.e. can only have one client. Don't worry if this moment understanding of this connection is vague. I'll talk more about connections in future parts.

One thing that is important now is that for one-to-many communication it is necessary two separate tables. One for customers, the other for orders. Let's have some practice creating these two tables.

What information will we store? Let's solve the first question.

To begin with, we will determine what information about orders and about clients we will store. To do this, we must ask ourselves the question: “What units of information relate to customers, and what units of information relate to orders?”

Designing a customer table.

Orders really belong to the clients, but the order is not minimum block of information, which relates to customers (i.e. this block can be divided into smaller ones: order date, order delivery address, etc., for example).
The fields below are minimal blocks information that relates to clients:

customer_id (primary key) – customer identifier
first_name - name
last_name - middle name
address - address
zip_code – postal code
country - country
birth_date – date of birth
username – user registration name (login)
password – password

Let's move on to directly creating this table in SQLyog (of course, you can use any other program). Below is an example of what a table might look like in SQLyog once created. All graphic applications for database management have approximately the same interface structure. You can also create a table using command line without using a graphical utility.

Creating a table in SQLyog. Notice that the primary key (PK) checkbox for the customer_id field is selected. The customer_id field is the primary key. The Auto Incr checkbox is also selected, which means that the database will automatically insert a unique numeric value that, starting at zero, will increment by one each time.

Designing an orders table.
What are the minimum pieces of information we need for an order?

order_id (primary key) – order identifier
order_date – order date and time
customer – the client who made the order

Below is an example of a table in SQLyog.

These two tables ( clients And orders) are connected because the field customer in the orders table refers to the primary key ( customer_id) customer tables. This connection is called foreign key relationship. You should think of a foreign key as a simple copy (a copy of the value) of another table's primary key. In our case, the field value customer_id from the table clients copied to table orders each time a record is inserted. Thus, with us, each order is linked to the client. And each client can have many orders, as mentioned above.

Creating a foreign key relationship.

You may be wondering, “How can I make sure or how can I see that the customer field in the orders table references the customer_id field in the customers table.” The answer is simple - you cannot do this because I have not yet shown you how to create a connection.
Below is the SQLyog window with the window I used to create the relationship between the tables.

Creating a foreign key relationship between the orders and customers tables.

In the window above, you can see how the customer field of the orders table on the left is linked to the primary key (customer_id) of the customer table on the right.

Now when you look at the data that might be in the tables, you will see that the two tables are related.

Orders are associated with customers through the customer field, which references the customers table.

In the image you can see that the client mary placed three orders, customer pablo placed one, and the client john- no one.
You may ask: “A What Is that what all these people ordered?” That's a good question. You might have expected to see ordered items in the orders table. But this bad example design. How would you fit multiple products into a single entry? Goods are separate entities that must be stored in a separate table. And the relationship between tables orders And goods will be a one-to-many relationship. I'll talk about this further.

6. CREATE AN ENTITY-RELATIONSHIP DIAGRAM

Previously, you learned how records from different tables are linked to each other in relational databases. Before creating and linking tables, it is important that you think about entities that exist on your system (for which you are creating a database) and decide how these entities would contacted together. In database design, entities and their relationships are typically provided in entity-relationship diagram (ERD). This diagram is the result of the database design process.

Entities.

You may be wondering what an entity is. Well... it's a “thing” in the system. There. My Mom always wanted me to become a teacher because I am very good at explaining things.

In the context database design essence is something that deserves your own table in your database model. When you design a database, you must define these essence on the system for which you are creating the database. It is more a matter of dialogue with the client or with yourself in order to find out what data your system will work with.

Let's take an online store as an example. Online store sells goods. Product could become an obvious entity in the online store system. Goods are being orderedclients. So you and I saw two more obvious entities: orders And clients.

The order is paid for by the client... this is interesting. We're going to create separate table for payments in the database of our online store? Maybe. But are payments really the minimum piece of information that relates to orders? This is also possible.

If you're not sure, just think about what payment information you want to store. You may want to store payment method or payment date. But these are still minimal pieces of information that could relate to order. You can change the wording. Payment method - payment method for the order. Payment date – date of payment of the order. So I don't see the need to endure payments into a separate table, although conceptually you could distinguish payments as an entity, because you could think of payments as a container of information (payment method, payment date).

Let's not get too academic.

As you can see, there is a difference between an entity and the table itself in the database, i.e. it's not the same thing. Industry specialists information technologies can be VERY academic and pedantic in this matter. I'm not that kind of specialist. This difference depends on your point of view on your data, your information. If you look at data modeling from the point of view software, then you may end up with a lot of entities that cannot be transferred directly to the database. IN this manual we look at data strictly from a database perspective and in our small world entity is a table.

Hang in there, you're really close to getting your database degree.

As you can see, determining what entities your system has is a bit of an intellectual process that requires some experience and is often a subject for change, revision, reflection, but of course it is not rocket science.

An entity-relationship diagram can be quite large if you are working on complex application. Some charts may contain hundreds or even thousands of tables.

Connections

The second step in database design is choosing what relationships exist between the entities in your system. This may be a little difficult to understand now, but again, it's not rocket science. With some experience and rethinking of the work done, you will complete the next database model correctly or almost correctly.

So. I told you about the connection one-to-many and I'll tell you more about connections in later parts of this guide, so I won't dwell on it any further for now. Just remember that deciding what relationships your entities will have is an important part database design and these connections are displayed in the diagram entity-relationship.

Translation

I am posting a continuation of the translation of a series of articles for beginners.
These and subsequent ones contain more information on the merits.
Start - .

4. TABLES AND PRIMARY KEYS

There are 6 lessons in the table. All 6 are different, but for each lesson the values of the same fields are stored in the table, namely: tutorial_id (lesson identifier), title (title) and category (category). Tutorial_id – primary key lesson tables. A primary key is a value that is unique for each record in a table.
In the customer table below, customer_id is the primary key. In this case, the primary key is also a unique value (number) for each record.

Primary keys in everyday life

In a database, primary keys are used for identification. In life, primary keys are everywhere around us. Any time you encounter a unique number, that number can serve as a primary key in the database (can, but does not necessarily have to, be used as such. All databases are capable of automatically generating a unique value for each record as a number that is automatically incremented and inserted along with each new record [So-called synthetic or surrogate primary key – approx. translation]).

A few examples

The order number that you receive when purchasing in an online store can be the primary key of some order table in the database of this store, because it is a unique value.
A social security number can be a primary key in some table in a government database because... it is also unique, as in the previous example.
The invoice number can be used as a primary key in a database table that stores invoices issued to customers.
A numeric customer number is often used as a primary key in a customer table.

What characterizes a primary key? Characteristics of a primary key.

The primary key is used to identify records.

The primary key is used for identification records in the table, so that each record becomes unique. Another analogy... When you call technical support, the operator usually asks you to give some number (contract, phone, etc.) by which you can be identified in the system.
If you have forgotten your number, the technical support operator will ask you to provide some other information that will help uniquely identify you. For example, a combination of your birthday and last name. They can also be a primary key, or rather a combination of them.

The primary key is unique.

Now every Vasya is unique.

Types of primary keys.

Typically the primary key is a numeric value. But it can also be any other data type. It is not common practice to use a string as a primary key (a string is a piece of text), but it is theoretically and practically possible.
Composite primary keys.
Often the primary key consists of one field, but it can also be a combination of several columns, for example two (three, four...). But you remember that the primary key is always unique, which means that the combination of the nth number of fields, in this case 2, must be unique. I'll tell you more about this later.

Auto numbering.

The primary key field is often, but not always, processed by the database itself. You can basically tell the database to automatically assign a unique numeric value to each record when it is created. The database usually starts numbering at 1 and increases this number by one for each record. Such a primary key is called auto-incrementing or auto-numbering. Using auto-incrementing keys is a good way to define unique primary keys. The classic name for such a key is a surrogate primary key [As mentioned above. – approx. transl.]. Such a key does not contain useful information related to the entity (object), information about which is stored in the table, which is why it is called surrogate.

5. LINKING TABLES USING FOREIGN KEYS

defining which entities you want to store in it
what connections exist between these entities?

One-to-many.

Customers and orders have a connection (are in a relationship) one-to-many because one the client may have a lot of orders, but each specific order (their a bunch of) issued only one client, i.e. can only have one client. Don't worry if your understanding of this connection is unclear at this point. I'll talk more about connections in future parts.

What information will we store? Let's solve the first question.

Designing a customer table.

Orders really belong to the clients, but the order is not minimum block of information, which relates to customers (i.e. this block can be divided into smaller ones: order date, order delivery address, etc., for example).
The fields below are the minimum pieces of information that apply to clients:

customer_id (primary key) – customer identifier
first_name - name
last_name - middle name
address - address
zip_code – postal code
country - country
birth_date – date of birth
username – user registration name (login)
password – password

Let's move on to directly creating this table in SQLyog (of course, you can use any other program). Below is an example of what a table might look like in SQLyog once created. All graphical database management applications have approximately the same interface structure. You can also create a table using the command line without using a graphical utility.

Designing an orders table.
What are the minimum pieces of information we need for an order?

order_id (primary key) – order identifier
order_date – order date and time
customer – the client who made the order

Below is an example of a table in SQLyog.

Creating a foreign key relationship.

Creating a foreign key relationship between the orders and customers tables.

In the window above, you can see how the customer field of the orders table on the left is linked to the primary key (customer_id) of the customer table on the right.

Now when you look at the data that might be in the tables, you will see that the two tables are related.

Orders are associated with customers through the customer field, which references the customers table.

In the image you can see that the client mary placed three orders, customer pablo placed one, and the client john- no one.
You may ask: “A What Is that what all these people ordered?” That's a good question. You might have expected to see ordered items in the orders table. But this is a bad example of design. How would you fit multiple products into a single entry? Goods are separate entities that must be stored in a separate table. And the relationship between tables orders And goods will be a one-to-many relationship. I'll talk about this further.

6. CREATE AN ENTITY-RELATIONSHIP DIAGRAM

Entities.

You may be wondering what an entity is. Well... it's a “thing” in the system. There. My Mom always wanted me to become a teacher because I am very good at explaining things.

The order is paid for by the client... this is interesting. Are we going to create a separate table for payments in our online store database? Maybe. But are payments really the minimum piece of information that relates to orders? This is also possible.

Let's not get too academic.

As you can see, there is a difference between an entity and the table itself in the database, i.e. it's not the same thing. IT industry professionals can be VERY academic and pedantic about this. I'm not that kind of specialist. This difference depends on your point of view on your data, your information. If you look at data modeling from a software perspective, you may end up with a lot of entities that cannot be transferred directly to the database. In this tutorial, we're looking at data strictly from a database perspective, and in our little world, the entity is a table.

Hang in there, you're really close to getting your database degree.

The entity-relationship diagram can be quite large if you are working on a complex application. Some charts may contain hundreds or even thousands of tables.

Connections

Last update: 07/02/2017

Databases can contain tables that are related to each other various connections. A relationship represents an association between entities of different types.

When selecting a relationship, a primary or parent table (primary key table / master table) and a dependent, child table (foreign key table / child table) are selected. The child table depends on the parent table.

Foreign keys are used to organize communication. A foreign key represents one or more columns from one table that is also a potential key from another table. The foreign key does not have to match the primary key from the main table. Although, as a rule, a foreign key from a dependent table points to a primary key from the main table.

Relationships between tables are of the following types:

One to one

One to many

many to many(Many to many)

One to one communication

This type of connection is not found often. In this case, an object of one entity can be associated with only one object of another entity. For example, on some sites a user can only have one blog. That is, a relationship arises: one user - one blog.

Often this type of relationship involves splitting one large table into several small ones. The primary parent table in this case continues to contain frequently used data, while the child dependent table typically stores data that is used less frequently.

In this regard, the primary key of the dependent table is at the same time a foreign key that references the primary key of the main table.

For example, the Users table represents users and has the following columns:

UserId(id, primary key)

Name (username)

And the Blogs table represents user blogs and has the following columns:

BlogId (identifier, primary and foreign key)

Name (blog name)

In this case, the BlogId column will store the value from the UserId column from the users table. That is, the BlogId column will act as both a primary and a foreign key.

One to many relationship

This is the most common type of connection. In this type of relationship, multiple rows from the child table depend on one row in the parent table. For example, one blog can have several articles. In this case, the blogs table is the parent and the articles table is the child. That is, one blog - many articles. Or another example, several football players can play on a football team. And at the same time, one football player can only play in one team at a time. That is, one team - many players.

For example, let's have a table called Articles that represents blog articles and has the following columns:

ArticleId(id, primary key)

BlogId (foreign key)

Title (article title)

Text (article text)

In this case, the BlogId column from the articles table will store the value from the BlogId column from the blogs table.

many to many relationship

With this type of relationship, one row from table A can be associated with many rows from table B. In turn, one row from table B can be associated with many rows from table A. A typical example is students and courses: one student can take several courses , and accordingly several students can enroll in one course.

Another example is articles and tags: several tags can be defined for one article, and one tag can be defined for several articles.

But in SQL Server at the database level, we cannot establish a direct many-to-many relationship between two tables. This is done through an auxiliary staging table. Sometimes the data from this staging table represents a separate entity.

For example, in the case of articles and tags, let there be a Tags table that has two columns:

TagId(identifier, primary key)

Text (tag text)

Also let there be an intermediate table ArticleTags with the following fields:

TagId (identifier, primary and foreign key)

ArticleIdId (identifier, primary and foreign key)

Technically, we will get two one-to-many relationships. The TagId column from the ArticleTags table will reference the TagId column from the Tags table. And the ArticleId column from the ArticleTags table will refer to the ArticleId column from the Articles table. That is, the TagId and ArticleId columns in the ArticleTags table represent a composite primary key and are also foreign keys for the relationship with the Articles and Tags tables.

Referential data integrity

When changing primary and foreign keys, the following aspect should be observed: referential data integrity(referential integrity). Its basic idea is for two tables in a database that store the same data to maintain consistency. Data integrity represents correctly constructed relationships between tables with correct installation links between them. In what cases can data integrity be violated:

Deletion anomaly(deletion anomaly). Occurs when a row is deleted from the main table. In this case, the foreign key from the dependent table continues to reference deleted line from the main table

Insertion Anomaly(insertion anomaly). Occurs when a row is inserted into a dependent table. In this case, the foreign key from the dependent table does not match the primary key of any of the rows from the main table.

Update anomalies(update anomaly). With such an anomaly, several rows of the same table may contain data that belongs to the same object. When you change data in one row, it may conflict with data in another row.

Deletion anomaly

To resolve a deletion anomaly, you must set one of two constraints on a foreign key:

If a row from a dependent table necessarily requires a row from the main table, then a cascade delete is set for the foreign key. That is, when a row is deleted from the main table, the associated row(s) are deleted from the dependent table.

If a row from a dependent table allows no relationship with a row from the main table (that is, such a relationship is optional), then the foreign key is set to NULL when the related row is deleted from the main table. The foreign key column must be nullable.

Insertion Anomaly

To resolve an insert anomaly when adding data to a dependent table, the column that represents the foreign key must be nullable. And thus, if the added object has no connection with the main table, then the foreign key column will contain a NULL value.

Update anomalies

To solve the update anomaly problem, normalization is used, which will be discussed later.

Three main entity classes are defined:

1) Rod– an independent entity. The names are placed in a rectangle.

2) Associative– a many-to-many relationship between two or more entities. Associations are treated as entities in their own right. They can participate in other associations and have a set of attributes.

a. Designations (denoting an entity) are many-to-one or one-to-one relationships between two entities. It differs from a characteristic in that it does not depend on the designating entity.

3) Characteristics(characteristic) – a many-to-one or one-to-one relationship between two entities. It is a special case of association. The sole purpose of a characteristic is to describe or clarify some other entity. The existence of a characteristic depends entirely on the entity being characterized.

A key or potential key is just a set of attributes whose values can be used to uniquely find the required instance of an entity.

Minimality means that lexically from a set of any attribute does not allow identifying the entity by the remaining ones.

One of the keys is taken as the primary key and the rest are called alternative keys. Potentially, a key consisting of one attribute is called simple. It is not allowed for the primary key of a core entity to take on an undefined value, otherwise a contradictory situation arises - a non-individual and, therefore, non-existent instance of the core entity will appear. For the same reasons, it is necessary to ensure the uniqueness of the primary key.

If entity C links entities A and B, then it must include foreign keys corresponding to the primary keys of entities A and B.

For each foreign key, three questions need to be addressed:

1) Can an additional foreign key accept null values, in short, can there be some instance of an entity for which the target entity specified by the foreign key is known.

2) What should happen when you try to delete target entity, which is referenced by the foreign key.

There are three possibilities to resolve this issue:

· Cascading

· Restriction

· Setting to a specific value

3) What should happen when trying to update the primary key of a target entity that is referenced by some foreign key.

Thus, for each foreign key in a database project, it is necessary to specialize not only the field or combination of fields that make up that foreign key, but also the answers to the above questions.

Data types and domains.

The relational data model is characterized by a simple data structure and user-friendly presentation.

The relational model is designed to organize data in the form of two-dimensional tables. A relational table is two-dimensional array and has the following properties:

1) Each table element is one data element

2) All columns in the table are homogeneous - all elements in the column have same type and data length

3) Each column has unique name

4) There are no identical rows in the table

5) The order of the row of columns can be arbitrary

Data types

Any data used in programming has its own data types. The relational model requires that the data types used be simple.

Typically, data types are divided into three groups:

1) Simple data types

2) Structured types data

3) reference data types

Simple (atomic) data types have no internal structure. This type of data is called scalar. These include logical, numeric, and string data types. The concept of atomicity is quite relative. Thus, the string data type can be considered as a one-dimensional array of characters, and whole type data as a set of bits. The only important thing here is that when switching to such low level semantics, that is, the meaning of the data, is lost.

Structuring data types is intended to specify complex structures data that is constructed from constituent elements, which in turn may have an internal structure (arrays, records, structures).

A reference data type is designed to provide the ability to point to other data. This data type is intended for procedural languages that have memory areas for storing data.

For relational model data, the type of data used is not that important. The requirement that the data type be simple must be understood to mean that it should not be taken into account in relational operations internal structure data.

Domain porno.ru

In the relational data model, the concept of data type is closely related to the concept of domain, which can be considered a clarification of the data type.

Domain – semantic concept. It can be thought of as a subset of the values of some data type.

Domain properties:

1) the domain has a unique name within the database

2) the domain is defined on some simple type data or on another domain

3) a domain may have some logical condition that allows one to describe a subset of data that is valid for a given domain.

4) the domain carries some semantic meaning

For example, some domain D, meaning “employee age”, can be described as some subset of the set of natural numbers

D=(nϵN: n ≥ 18 and n ≤ 60)

The difference between the domain of the concept of a subset is precisely that the domain reflects the semantics of a certain subject area. There may be several domains that coincide as a subset, but carry different meanings. For example, the domains “part weight” and “quantity available” can equally be described as a set of non-negative integers, but the meaning of these domains will be different, and they will be different domains. The main meaning of domain is that domains limit comparisons. It is not logically correct to compare values of different domains, even if they are of the same type. It is syntactically correct to produce a list of all parts whose part weight is greater than the available quantity does not correspond to the meaning of the concepts of quantity and weight.

5. Relations and their properties, attributes and tuples.
The concept of a relationship is a fundamental concept in the relational data model. Relationship attribute:<Имя_атрибута: Имя_домена>. Attribute names must be unique within the relationship. Often the attribute names are the same as the corresponding domain names. Some relation R, defined on the set of domains D 1 , D 2 , ... D n, contains two parts: a header and a body. The relation head contains a fixed number of relation attributes.

(,,…)

The body of a relationship contains many relationship maps. Each relationship map represents a set of pairs of the form

<Имя_атрибута: Значение_атрибута>

(,,… ).

In this case, the value Val i belongs to the attribute A i D i . the value is written:

R ( ,,…).

The number of attributes in a relation is called the degree or arity of the relation. The number of relation cards is called the cardinality of the relation. The relation head describes the Cartesian product of domains on which the relation is defined. The header is static. It does not change while working with the database. If attributes are changed, added, or removed from a relation, the result is a different relation. The body of a relation is a set of cartes, that is, a subset of the Cartesian product of domains and is a relation in the mathematical sense of the word. The body of the relationship can change while working with the database, that is, cards can be changed, added, and so on.

A relational database is a set of relationships. A relational database schema is a set of relationship headers included in a database.

Although any relationship can be represented as a table, a relationship is not a table. These are close but not corresponding concepts. The terms that the relational data model operates have corresponding “tabular” synonyms.

Properties of relationships

The properties of relations mainly consist of differences between relations

1) Regarding unequal cards.

The body of a relation is a set of cartes and, like any set, cannot contain indistinguishable elements. Tables, unlike relationships, can contain identical rows.

2) The maps are not ordered (from top to bottom) since the body of the relation is a set.

The same attitude cannot be depicted different tables, in which the lines are in a different order

3) Attributes are not ordered from left to right. Since each attribute has a unique name within the relationship, the order of the attributes does not matter. The same relationship can be represented by different tables in which the columns are in different orders.

4) All attribute values are atomic.

From the properties of a relation it follows that not every table can define relations. To do this she needs to have simple structure, do not contain identical rows, any of its columns must contain data of only one type, and all data types used must be simple.

The challenge in logical design of a relational database is making informed decisions about what relationships the database should consist of and what attributes those relationships should have.

The relational data model fixes two basic integrity requirements that must be supported in any relational DBMS.

1) The requirement for the integrity of entities, which is that any map of any relationship must be distinguishable from any other map of this relationship, that is, any relationship must contain a primary key.

2) The referential integrity requirement (foreign key integrity requirement) is that for each foreign key value in the referenced relation,. There must be a map with the same primary key value, or the foreign key value must be undefined.

The Figure shows a table (a ratio of degree 5) containing some information about the employees of a hypothetical enterprise. Table rows correspond to tuples. Each row is actually a description of one real-world object (in this case, an employee), the characteristics of which are contained in the columns. Relational relationships correspond to sets of entities, and tuples correspond to entities. The columns in a table representing a relational relationship are called attributes.

Each attribute is defined on a domain, so a domain can be thought of as a set acceptable values of this attribute. Multiple attributes of the same relationship, and even attributes of different relationships, can be defined on the same domain.

An attribute whose value uniquely identifies tuples is called key (or simply key). The key is the "Personnel number" attribute, since its value is unique for each employee of the enterprise. If tuples are identified only by concatenating the values of several attributes, then the relation is said to have a composite key.

Primary key- in a relational data model, one of the potential keys of a relationship, selected as the primary key (or default key).

A relation can contain multiple keys. One of the keys is always declared primary, its values cannot be updated. All other relation keys are called possible keys.

From a theoretical point of view, all potential (possible) relation keys are equivalent, that is, they have the same uniqueness and minimality properties. However, the primary key is usually selected from the potential keys that is most convenient for certain practical purposes, for example, for creating external keys in other respects or to create a clustered index. Therefore, as a rule, the primary key is chosen to be the one that has smallest size(physical storage) and/or includes the least number of attributes.

If primary key consists of a single attribute, it is called with a simple key.

If primary key consists of two or more attributes, it is called compound key. So, first name, last name, patronymic, passport number, passport series cannot be primary keys individually, since they may be the same for two or more people. But there are no two personal documents of the same type with the same series and number. Therefore, in a relation containing data about people, the primary key can be a subset of attributes consisting of the personal document type, its series and number.

Unlike hierarchical and network models In relational data there is no concept of group relationship. To reflect associations between tuples of different relations, duplication of their keys is used.

Attributes that are copies of keys of other relationships are called foreign keys.

For example, the relationship between the DEPARTMENT and EMPLOYEE relationships is created by copying the primary key "Department_number" from the first relationship to the second. Thus, in order to obtain a list of employees of a given department, it is necessary: 1) From the DEPARTMENT table, set the attribute value "Department_number" , corresponding to this “Department_Name”. 2) select all records from the EMPLOYEE table, attribute value "Department_number" which is equal to that obtained at the previous step. In order to find out in which department an employee works, you need to perform the reverse operation: 1) Determine "Department_number" from the EMPLOYEE table. 2) Using the obtained value, we find the entry in the DEPARTMENT table.

18. Normalization in relational databases, concept normal shape when designing databases.

Normal form - a property of a relationship in a relational data model, characterizing it from the point of view of redundancy, which can potentially lead to logically erroneous results of sampling or changing data. Normal form is defined as a set of requirements that a relation must satisfy.

The process of converting a database to normal form is called normalization . Normalization is intended to bring the database structure to a form that provides minimal redundancy, that is, normalization is not intended to reduce or increase work productivity or reduce or increase the volume of the database. The ultimate goal of normalization is to reduce the potential inconsistency of information stored in the database.

Elimination of redundancy is carried out, as a rule, by decomposing relations in such a way that only primary facts are stored in each relation (that is, facts that are not inferred from other stored facts).

Functional dependencies.

Relational base data contains both structural and semantic information. The structure of a database is determined by the number and type of relationships it contains, and the one-to-many relationships that exist between the tuples of these relationships. The semantic part describes the set of functional dependencies that exist between the attributes of these relationships. Let us define functional dependence.

19. 1NF: Basic definitions and transformation rules.

To discuss first normal form, two definitions are necessary:

Simple attribute - an attribute whose values are atomic (indivisible).

Complex attribute - is obtained by combining several atomic attributes that can be defined on one or different domains(also called a vector or data aggregate).

Definition of first normal form:

a relation is in 1NF if the values of all its attributes are atomic. . Otherwise, it is not a table at all and such attributes must be decomposed.

Let's look at an example:

In the database of the enterprise's HR department, it is necessary to store information about employees that can be attempted to be presented in relation to

EMPLOYEE(EMPLOYEE_NUMBER, NAME, DATE OF BIRTH, WORK_HISTORY, CHILDREN).

From a careful consideration of this relationship it follows that the attributes "work_history" And "children" are complex, moreover, the attribute "work_history" includes another complex attribute "salary_history".
These units look like this:

 JOB_HISTORY (RECEPTION_DATE, NAME, SALARY_HISTORY),

 SALARY_HISTORY (APPOINTMENT_DATE, SALARY),

 CHILDREN (CHILD_NAME, BIRTH_YEAR).

Their connection is shown in Fig. 3.3.

Fig.3.3. Initial attitude.

To bring the original relation SERVANT to the first normal form, it is necessary to decompose it into four relations, as this is shown in the following figure:

Fig.3.4. Normalized set of relations.

Here, the primary key of each relationship is highlighted with a blue frame, the names of the foreign keys are in font of blue color. Recall that foreign keys are used to represent functional dependencies that exist in the original relation. These functional dependencies are indicated by lines with arrows.

The normalization algorithm is described by E.F. Codd as follows:

Starting with the relation at the top of the tree (Figure 3.3.), its primary key is taken, and each immediately subordinate relation is expanded by inserting a domain or combination of domains of that primary key.
The Primary Key of each relation expanded in this way consists of the Primary Key that the relation had before the extension and the added Primary Key of the parent relation.
After this, all non-simple domains are deleted from the parent relation, the top node of the tree is removed, and the same procedure is repeated for each of the remaining subtrees.

20. 2NF: Basic definitions and transformation rules.

Very often the primary key of a relationship includes several attributes (in which case it is called composite) - see, for example, the relation CHILDREN shown in Fig. 3.4 question 19. At the same time, the concept is introduced full functional dependence.

Definition:

a non-key attribute is functionally fully dependent on a composite key if it is functionally dependent on the entire key as a whole, but is not functionally dependent on any of its constituent attributes.

Example:

Let there be a relation SUPPLY (N_SUPPLIER, PRODUCT, PRICE).
A supplier may supply different products, and the same product may be supplied by different suppliers. Then the relation key is "N_supplier + product". Let all suppliers supply goods at the same price. Then we have the following functional dependencies:

N_supplier, product -> price
product -> price

The incomplete functional dependence of the “price” attribute on the key leads to the following anomaly: when the price of a product changes, it is necessary full view relationship in order to change all records about its suppliers. This anomaly is a consequence of the fact that two semantic facts are combined in one data structure. The following expansion gives the relations in 2NF:

DELIVERY (N_SUPPLIER, PRODUCT)
PRODUCT_PRICE (PRODUCT, PRICE)

So you can give

Definition of second normal form: A relation is in 2NF if it is in 1NF and each non-key attribute is fully functionally dependent on the key.

21. 3NF: Basic definitions and transformation rules.

Before discussing third normal form, it is necessary to introduce the concept: transitive functional dependence.

Definition:

Let X, Y, Z be three attributes of some relation. In this case, X --> Y and Y --> Z, but there is no reverse correspondence, i.e. Z -/-> Y and Y -/-> X. Then Z depends transitively on X.
Let there be a relation STORAGE ( FIRM, WAREHOUSE, VOLUME), which contains information about companies receiving goods from warehouses and the volumes of these warehouses. Key attribute - "firm". If each company can receive goods from only one warehouse, then in this regard there are the following functional dependencies:

firm -> stock
stock -> volume

In this case, anomalies arise:

if at the moment no company receives goods from the warehouse, then data on its volume cannot be entered into the database (since the key attribute has not been defined)
if the warehouse volume changes, it is necessary to view the entire relationship and change the cards for all companies associated with this warehouse.

To eliminate these anomalies, it is necessary to decompose the original relation into two:

STORAGE ( FIRM, STOCK)
STORAGE_VOLUME ( STOCK, VOLUME)

Definition of third normal form:

A relation is in 3NF if it is in 2NF and each non-key attribute does not depend transitively on the primary key.