Multidimensional data representation. General scheme for organizing a data warehouse


Characteristics of an OLTP system:

  • large volume of information;
  • often separate databases for different departments;
  • normalized schema, no duplication of information;
  • intensive data changes;
  • transactional mode of operation;
  • transactions affect a small amount of data;
  • processing of current data (a snapshot);
  • many clients;
  • short response time: a few seconds.

Characteristics of an OLAP system:

  • large volume of information;
  • information from various databases synchronized using common classifiers;
  • unnormalized database schema with duplicates;
  • data changes rarely; changes occur through batch loading;
  • complex ad hoc queries over large amounts of data, with extensive use of groupings and aggregate functions;
  • time-dependency analysis;
  • small number of users: analysts and managers;
  • longer, but still acceptable, response time: several minutes.






Codd's rules for relational databases:

  1. The information rule.
  2. The guaranteed access rule.
  3. Systematic treatment of null values.
  4. Dynamic online catalog based on the relational model.
  5. The comprehensive data sublanguage rule.
  6. The view updating rule.
  7. High-level insert, update, and delete.
  8. Physical data independence.
  9. Logical data independence.
  10. Integrity independence.
  11. Distribution independence.
  12. The nonsubversion rule.


Codd's rules for OLAP:

  1. Conceptual multidimensional representation.
  2. Transparency.
  3. Accessibility.
  4. Consistent reporting performance.
  5. Client-server architecture.
  6. Generic dimensionality.
  7. Dynamic handling of sparse matrices.
  8. Multi-user support.
  9. Unrestricted cross-dimensional operations.
  10. Intuitive data manipulation.
  11. Flexible reporting options.
  12. Unlimited dimensions and aggregation levels.


Implementation of OLAP. Types of OLAP servers:

  • MOLAP (Multidimensional OLAP): both detailed data and aggregates are stored in a multidimensional database.
  • ROLAP (Relational OLAP): detailed data is stored in a relational database; aggregates are stored in the same database in specially created service tables.
  • HOLAP (Hybrid OLAP): detailed data is stored in a relational database, and aggregates are stored in a multidimensional database.








Features of ROLAP: the star schema.

  1. One fact table, which is highly denormalized.
  2. Several dimension tables, which are also denormalized.
  3. The primary key of the fact table is composite and has one column for each dimension.
  4. Aggregated data is stored together with the original data.

Disadvantage: if aggregates are stored together with the source data, then the dimensions must carry an additional attribute, the hierarchy level.
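A minimal star schema can be sketched with Python's built-in sqlite3 module; the table and column names below are illustrative, not taken from the text:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Dimension tables (denormalized: the region is stored with the city)
    CREATE TABLE dim_store   (store_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    -- Fact table: composite primary key, one column per dimension
    CREATE TABLE fact_sales (
        store_id   INTEGER REFERENCES dim_store(store_id),
        product_id INTEGER REFERENCES dim_product(product_id),
        amount     REAL,
        PRIMARY KEY (store_id, product_id)
    );
""")
con.executemany("INSERT INTO dim_store VALUES (?,?,?)",
                [(1, "Boston", "East"), (2, "Denver", "West")])
con.executemany("INSERT INTO dim_product VALUES (?,?,?)",
                [(10, "Laptop", "Electronics"), (11, "Desk", "Furniture")])
con.executemany("INSERT INTO fact_sales VALUES (?,?,?)",
                [(1, 10, 1200.0), (1, 11, 300.0), (2, 10, 900.0)])

# A typical ROLAP query: aggregate facts at the 'region' hierarchy level
rows = con.execute("""
    SELECT s.region, SUM(f.amount)
    FROM fact_sales f JOIN dim_store s ON f.store_id = s.store_id
    GROUP BY s.region ORDER BY s.region
""").fetchall()
print(rows)  # [('East', 1500.0), ('West', 900.0)]
```

Storing the region directly in the store dimension (rather than in a separate, normalized region table) is exactly the denormalization the star schema relies on: one join per dimension suffices for any aggregation level.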











[Diagram: storage structure in the Oracle DBMS. SQL and MOLAP clients access the server through Java API, JDBC, OCI, ODBC, and OLE DB. Data resides either in an OLAP store (a BLOB in a relational table) or in a star schema, registered through CWM or CWM2 multidimensional metadata. A multidimensional engine (a process in the Oracle core) is driven through OLAP DML and the SQL interface to OLAP (DBMS_AW, OLAP_TABLE, ...).]

Data warehouses are formed from data recorded over a long period of time: snapshots of the operational databases of an information system and, possibly, various external sources. Data warehouses use database technologies, OLAP, deep data analysis (data mining), and data visualization.

Main characteristics of data warehouses:

  • contains historical data;
  • stores detailed information, as well as partially and completely summarized data;
  • the data is mostly static;
  • an ad hoc, unstructured and heuristic way of processing data;
  • medium and low transaction processing intensity;
  • unpredictable way of using data;
  • intended for analysis;
  • focused on subject areas;
  • support for strategic decision making;
  • serves a relatively small number of management employees.

The term OLAP (On-Line Analytical Processing) is used to describe the model for presenting data and, accordingly, the technology for processing it in data warehouses. OLAP uses a multidimensional representation of aggregated data to provide quick access to strategically important information for the purpose of in-depth analysis. OLAP applications must have the following basic properties:

  • multidimensional data presentation;
  • support for complex calculations;
  • correct consideration of the time factor.

Advantages of OLAP:

  • increasing the productivity of production personnel and application program developers; timely access to strategic information.
  • providing sufficient opportunity for users to make their own changes to the schema.
  • OLAP applications rely on data warehouses and OLTP systems, receiving current data from them, which preserves integrity control over corporate data.
  • reducing the load on OLTP systems and data warehouses.

OLAP and OLTP. Characteristics and main differences

  • OLAP: the data warehouse should include both internal corporate data and external data, since data analysis requires drawing on external sources of information (for example, statistical reports). OLTP: the main source of information entering the operational database is the corporation's own activities.
  • OLAP: the volume of analytical databases is at least an order of magnitude larger than that of operational ones; reliable analysis and forecasting in a data warehouse require information about the corporation's activities and market conditions over several years. OLTP: operational processing requires data only for the last few months.
  • OLAP: the data warehouse must contain uniformly presented and consistent information that is as close as possible to the content of the operational databases; a component is needed to extract and "clean" information from different sources. OLTP: in many large corporations several operational information systems with their own databases coexist (for historical reasons); operational databases may contain semantically equivalent information presented in different formats, with different indications of the time of its arrival, sometimes even contradictory.
  • OLAP: the set of queries to an analytical database cannot be predicted; data warehouses exist to answer ad hoc queries from analysts. One can only count on requests arriving not too often and involving large amounts of information; the size of the analytical database encourages queries with aggregates (sum, minimum, maximum, average, etc.). OLTP: data processing systems are built to solve specific tasks; information is selected from the database frequently and in small portions; the set of queries to an operational database is usually known already at design time.
  • OLAP: since analytical databases change rarely (only when data is loaded), ordered arrays, faster indexing methods for mass retrieval, and storage of pre-aggregated data are reasonable. OLTP: data processing systems are by nature highly variable, which is taken into account in the DBMSs used (normalized database structure, rows stored out of order, B-trees for indexing, transaction support).
  • OLAP: analytical database information is so critical for a corporation that finer-grained protection is required (individual access rights to particular rows and/or columns of a table). OLTP: for data processing systems, protection at the table level is usually sufficient.

Codd's rules for OLAP systems

In 1993, Codd published "Providing OLAP to User-Analysts: An IT Mandate". In it he outlined the basic concepts of online analytical processing and identified 12 rules that must be met by products that provide online analytics capabilities.

  1. Conceptual multidimensional representation. An OLAP model must be multidimensional at its core. A multidimensional conceptual schema or user view facilitates modeling, analysis, and calculation.
  2. Transparency. The user is able to obtain all the necessary data from the OLAP engine without even knowing where it comes from. Whether or not the OLAP product is part of the user's tools should be invisible to the user. If OLAP is provided by client-server computing, this fact too should, if possible, be invisible to the user. OLAP must be provided within the context of a truly open architecture, allowing the user, wherever he is, to communicate with the server through an analytical tool. In addition, transparency should also be achieved when the analytical tool interacts with homogeneous and heterogeneous database environments.
  3. Accessibility. OLAP must present its own logical schema for access in a heterogeneous database environment and perform the appropriate transformations to deliver data to the user. It is necessary to decide in advance where, how, and which types of physical data organization will actually be used. An OLAP system should access only the data that is actually required, rather than taking a general "kitchen sink" approach that brings in unnecessary input.
  4. Consistent reporting performance. The performance of report generation should not degrade significantly as the number of dimensions and the size of the database grow.
  5. Client-server architecture. The product must not only be client-server; the server component must be intelligent enough to allow different clients to connect with a minimum of effort and programming.
  6. Generic dimensionality. All dimensions must be equal: each dimension must be equivalent both in structure and in operational capabilities. Additional operational capabilities may be granted to individual dimensions (presumably time is implied), but such additional functionality must be available to any dimension. Basic data structures, computational formats, and reporting formats should not be more specific to any one dimension.
  7. Dynamic handling of sparse matrices. An OLAP system should automatically adapt its physical schema depending on the type of model, the data volumes, and the sparsity of the database.
  8. Multi-user support. An OLAP tool must provide concurrent access (query and update), integrity, and security.
  9. Unrestricted cross-dimensional operations. All types of operations must be allowed between any dimensions.
  10. Intuitive data manipulation. Data manipulation should be carried out through direct actions on cells in viewing mode, without menus or multi-step operations.
  11. Flexible reporting options. Dimensions should be placed in the report however the user needs them.
  12. Unlimited dimensions and aggregation levels.

Today, among the tools offered by the information technology market for processing and visualizing data in support of management decisions, OLTP and OLAP technologies are the most suitable. OLTP technology is focused on operational data processing, while the more modern OLAP technology is focused on interactive data analysis. Systems built on these technologies make it possible to understand the processes occurring at a managed facility through prompt access to various data slices (views of database contents organized to reflect different aspects of the enterprise's activities). In particular, by providing a graphical representation of the data, OLAP makes processing results easy to grasp.

OLTP (Online Transaction Processing) is real-time transaction processing: a way of organizing a database in which the system handles small transactions arriving in a large stream, while the client requires the fastest possible response time from the system.

In modern DBMSs, transaction serialization is organized through a locking mechanism: while a transaction is executing, the DBMS locks the database, or the part of it accessed by the transaction, and the lock is held until the transaction is committed. If, during parallel processing, another transaction attempts to access the locked data, its processing is suspended and resumed only after the transaction holding the lock completes and the lock is released. The smaller the object being locked, the greater the efficiency of the database. A transaction that updates data on multiple network nodes is called distributed; a transaction that works with a database located on a single node is called local. From the user's point of view, local and distributed transactions should be processed in the same way: the DBMS must organize the execution of a distributed transaction so that all the local transactions it comprises are committed synchronously on all nodes of the distributed system affected by them. A distributed transaction should be committed only if all of its constituent local transactions are committed; if at least one of the local transactions is aborted, the entire distributed transaction must be aborted. To implement these requirements in practice, DBMSs use a two-phase transaction commit mechanism.
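The effect of lock-based serialization can be sketched with a thread lock standing in for the DBMS lock; the bank-balance scenario is invented for illustration:

```python
import threading

balance = 100
lock = threading.Lock()  # stands in for the DBMS lock on the data being updated

def withdraw(amount):
    """A tiny 'transaction': read a value, then write back an updated value."""
    global balance
    with lock:                      # a second transaction touching the same
        current = balance           # data blocks here until the lock is freed,
        balance = current - amount  # so read-modify-write pairs never interleave

threads = [threading.Thread(target=withdraw, args=(10,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balance)  # 50: five withdrawals of 10 applied without lost updates
```

Without the lock, two concurrent withdrawals could both read 100 and both write 90, losing one update; the lock forces the same serial order of execution the text describes.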

1. The database server committing a distributed transaction sends the command "Prepare to commit" to all network nodes registered for the transaction. If at least one of the servers does not respond that it is ready, the distributed-database server rolls back the local transactions on all nodes.

2. If all local DBMSs report that they are ready to commit, the server coordinating the distributed transaction completes the commit by sending a commit command to all local servers.
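The two-phase commit just described can be sketched as follows; the Node class and its methods are hypothetical stand-ins for local DBMS servers, not a real DBMS API:

```python
class Node:
    """A hypothetical local DBMS node taking part in a distributed transaction."""
    def __init__(self, name, healthy=True):
        self.name, self.healthy, self.committed = name, healthy, False

    def prepare(self):       # Phase 1: respond to "Prepare to commit"
        return self.healthy  # vote yes only if the local transaction can commit

    def commit(self):        # Phase 2: commit the local transaction
        self.committed = True

    def rollback(self):      # Phase 2 (failure path): roll back locally
        self.committed = False

def two_phase_commit(nodes):
    # Phase 1: ask every registered node whether it is ready to commit
    if not all(node.prepare() for node in nodes):
        for node in nodes:   # at least one node is not ready:
            node.rollback()  # the whole distributed transaction is aborted
        return False
    for node in nodes:       # Phase 2: every node voted yes, so commit on all
        node.commit()
    return True

ok = two_phase_commit([Node("A"), Node("B")])
failed = two_phase_commit([Node("A"), Node("B", healthy=False)])
print(ok, failed)  # True False
```

The sketch captures the all-or-nothing property: the coordinator commits only when every participant has voted yes in phase 1.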

OLAP (online analytical processing) is an information-processing technology that includes the compilation and dynamic publication of reports and documents. It is used by analysts to process complex queries against a database quickly. It serves for preparing business reports on sales and marketing for management purposes, and for so-called data mining (a method of analyzing information in a database in order to find anomalies and trends without interpreting the semantic meaning of the records).

OLAP takes a snapshot of a relational database and structures it into a multidimensional model for queries. The claimed processing time for queries in OLAP is about 0.1% of that of similar queries in a relational database.

An OLAP structure created from operational data is called an OLAP cube. A cube is created by joining tables using a star schema or a snowflake schema. At the center of the star schema is a fact table, which contains the key facts on which queries are made. Multiple dimension tables are joined to a fact table. These tables show how aggregated relational data can be analyzed. The number of possible aggregations is determined by the number of ways in which the original data can be hierarchically displayed.

For example, all clients can be grouped by city or by region of the country (West, East, North, etc.); thus 50 cities, 8 regions, and 2 countries make up 3 hierarchy levels with 60 members. Customers can also be grouped with respect to products; with 250 products in 2 categories, 3 product groups, and 3 production divisions, the number of aggregates reaches 16,560. As dimensions are added to the schema, the number of possible combinations quickly reaches tens of millions or more.
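The combinatorics behind this growth can be illustrated with a small calculation. Counting each dimension's hierarchy levels plus an "all" (fully collapsed) level, the number of distinct aggregation grains in a cube is the product over all dimensions; the dimension depths below are illustrative and independent of the figures in the text:

```python
from math import prod

# Illustrative hierarchy depths per dimension: geography (city -> region ->
# country) has 3 levels, product (product -> group -> category) has 3, time 4.
hierarchy_levels = {"geography": 3, "product": 3, "time": 4}

# Each dimension can be grouped at any of its levels or fully collapsed
# ("all"), so the cube has prod(levels + 1) distinct aggregation grains.
grains = prod(levels + 1 for levels in hierarchy_levels.values())
print(grains)  # 4 * 4 * 5 = 80
```

Adding a fourth dimension with even a 3-level hierarchy would multiply the count by 4, which is why the number of potential aggregates explodes so quickly.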

An OLAP cube contains the base data and information about the dimensions (aggregates). The cube potentially contains all the information that might be needed to answer any query. Because of the huge number of aggregates, a full calculation is often performed only for some dimensions, while for the rest it is done "on demand".

The challenge in using OLAP lies in creating queries, selecting reference data, and developing a schema, which is why most modern OLAP products come with a large number of pre-configured queries. Another problem lies in the underlying data: they must be complete and consistent.

The first product to perform OLAP queries was Express (IRI). However, the term OLAP itself was coined by Edgar Codd, "the father of relational databases". Codd's work was funded by Arbor, a company that had released its own OLAP product, Essbase (later acquired by Hyperion, which was in turn acquired by Oracle in 2007), a year earlier.

Other well-known OLAP products include Microsoft Analysis Services (formerly OLAP Services, a component of SQL Server), Oracle OLAP Option, DB2 OLAP Server from IBM (essentially Essbase with IBM additions), SAP BW, SAS OLAP Server, and products from Brio, BusinessObjects, Cognos, MicroStrategy, and other vendors.

OLAP is most commonly used in business planning and data warehouse products.


The objectives of an OLTP system are to collect information quickly and place it optimally in the database, as well as to ensure its completeness, relevance, and consistency. However, such systems are not designed for the most efficient, fast, multidimensional analysis.

Of course, it is possible to build reports based on the collected data, but this requires the business analyst either to constantly interact with an IT specialist, or to have special training in programming and computer technology.

What does the traditional decision-making process look like in a Russian company using an information system built on OLTP technology?

The manager assigns a task to an information-department specialist in accordance with his understanding of the issue. The specialist, having understood the task in his own way, builds a query to the operational system, receives an electronic report, and brings it to the manager's attention. This scheme for making critically important decisions has the following significant shortcomings:

  • a negligible amount of data is used;
  • the process takes a long time, since composing queries and interpreting an electronic report are rather tedious operations, while the manager may need to make a decision immediately;
  • the cycle must be repeated if the data needs to be clarified or considered in a different context, and also if additional questions arise; moreover, this slow cycle has to be repeated, as a rule, several times, with even more time spent on data analysis;
  • the difference in professional training and fields of activity between the information-technology specialist and the manager has a negative effect; they often think in different categories and, as a result, do not understand each other;
  • an unfavorable effect is also exerted by the complexity of electronic reports for perception. The manager has no time to pick the figures of interest out of the report, especially since there may be too many of them. Clearly, the work of preparing data most often falls on specialists in information departments. As a result, a competent specialist is distracted by the routine, ineffective work of compiling tables, diagrams, etc., which naturally does not help improve his skills.

There is only one way out of this situation, and it was formulated by Bill Gates in the expression "Information at your fingertips". The initial information must be directly available to its consumer, the analyst. The task of the information-department employees is to create a system for collecting, accumulating, storing, and protecting information and to ensure its availability to analysts.

The global industry has long been familiar with this problem: for almost 30 years there have been OLAP technologies designed specifically to enable business analysts to operate on accumulated data and participate directly in its analysis. Such analytical systems are the opposite of OLTP systems in the sense that they eliminate information redundancy ("collapse" information). At the same time, it is precisely the redundancy of primary information that determines the effectiveness of analysis. A DSS combining these technologies makes it possible to solve a whole range of tasks:

  • Analytical tasks: calculating specified indicators and statistical characteristics of business processes based on retrospective information located in data warehouses.
  • Data visualization: presentation of all available information in user-friendly graphical and tabular form.
  • Obtaining new knowledge: determining the relationship and interdependence of business processes based on existing information (testing statistical hypotheses, clustering, finding associations and temporal patterns).
  • Simulation tasks: mathematical modeling of the behavior of complex systems over an arbitrary period of time. In other words, these are tasks related to the need to answer the question: "What will happen if...?"
  • Control synthesis: determination of acceptable control actions that ensure the achievement of a given goal.
  • Optimization problems: integration of simulation, management, optimization and statistical methods of modeling and forecasting.

Enterprise managers using OLAP tools, even without special training, can independently and quickly obtain all the information needed to study business patterns, in the most varied combinations and cross-sections of business analysis. A business analyst sees before him a list of the dimensions and measures of a business system. With such a simple interface the analyst can build any report and rearrange dimensions (say, build cross-tabs by superimposing one dimension on another). In addition, he can create his own functions based on existing measures and conduct "what if" analysis, obtaining a result by specifying how the indicators of business functions depend on one another. In this case, the maximum response time for any report does not exceed 5 seconds.
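The cross-tab operation mentioned above (superimposing one dimension on another) can be sketched in pure Python; the sample facts are invented for illustration:

```python
from collections import defaultdict

# (region, product category, sales amount) -- invented sample facts
facts = [
    ("East", "Electronics", 1200.0),
    ("East", "Furniture", 300.0),
    ("West", "Electronics", 900.0),
    ("West", "Electronics", 100.0),
]

# Cross-tab: rows = region, columns = category, cells = total sales
crosstab = defaultdict(lambda: defaultdict(float))
for region, category, amount in facts:
    crosstab[region][category] += amount

print(crosstab["East"]["Electronics"])  # 1200.0
print(crosstab["West"]["Electronics"])  # 1000.0
```

Swapping the two dimensions (rows = category, columns = region) is just a matter of reversing the two keys, which is why rearranging dimensions in an OLAP tool is so cheap for the analyst.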

To solve problems of data analysis and search for solutions, it is necessary to accumulate and store sufficiently large volumes of data. Databases (DBs) serve these purposes.

To store data according to any domain model, the database structure must correspond as much as possible to this model. The first such structure used in a DBMS was a hierarchical structure, which appeared in the early 60s of the last century.

The hierarchical structure involved storing data in the form of a tree structure.

An attempt to improve on the hierarchical structure produced the network database structure, which represents the data structure in the form of a network.

Relational databases are the most common nowadays. To store more complex information, post-relational models in the form of object-oriented data storage structures have been proposed. The general approach is to store any information as objects, and the objects themselves can be organized within a hierarchical model. Unfortunately, this approach, unlike the relational structure, which rests on relational algebra, is not sufficiently formalized, which prevents its wide use in practice.

In accordance with Codd's rules, the DBMS must ensure the execution of operations on the database while allowing multiple users (from multiple computers) to work simultaneously and ensuring data integrity. To implement these rules, the DBMS uses a transaction management mechanism.

A transaction is a sequence of operations on a database, considered by the DBMS as a single whole. A transaction moves the database from one integral state to another.

As a rule, a transaction consists of operations that manipulate data belonging to different tables and logically related to each other. If, when executing a transaction, operations are performed that modify only part of the data, and the remaining data is not changed, then integrity will be violated. Therefore, either all operations included in a transaction must be completed, or none of them must be completed. The process of undoing a transaction is called transaction rollback. Saving changes made as a result of transaction operations is called committing a transaction.
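The commit/rollback behavior just described can be demonstrated with the sqlite3 module from the standard library; the account table and amounts are illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (id INTEGER PRIMARY KEY,"
            " balance REAL CHECK (balance >= 0))")
con.execute("INSERT INTO account VALUES (1, 100.0), (2, 50.0)")
con.commit()

def transfer(con, src, dst, amount):
    """Move money between accounts: both updates succeed or neither does."""
    try:
        with con:  # 'with' commits on success and rolls back on any exception
            con.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                        (amount, src))
            con.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                        (amount, dst))
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected a negative balance

print(transfer(con, 1, 2, 30.0))   # True: balances become 70.0 / 80.0
print(transfer(con, 1, 2, 500.0))  # False: rolled back, balances unchanged
print(con.execute("SELECT balance FROM account ORDER BY id").fetchall())
```

The failed transfer violates the CHECK constraint mid-transaction, so the first UPDATE of that pair is undone as well: the database moves only between consistent states, exactly as the definition above requires.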

The property of a transaction to transfer a database from one consistent state to another allows the transaction to be used as the unit of user activity. When users access the database simultaneously, transactions initiated by different users are not executed in parallel (which is impossible for a single database) but, in accordance with some plan, are queued and executed sequentially. Thus, for the user on whose initiative a transaction was created, the transactions of other users are invisible, apart from some slowdown compared with single-user mode.


There are several basic algorithms for scheduling transactions. In centralized DBMSs, the most common are algorithms based on locking database objects.

When using any algorithm, situations of conflicts between two or more transactions to access database objects are possible. In this case, one or more transactions must be rolled back to maintain the plan. This is one of the cases when a user of a multi-user DBMS can actually feel the presence of other users' transactions in the system.

The history of DBMS development is closely related to the improvement of approaches to solving problems of data storage and transaction management. The developed transaction management mechanism in modern DBMSs has made them the main means of building OLTP systems, the main task of which is to ensure the execution of database operations.

3.1.3. Using OLTP technology in decision support systems

OLTP online transaction processing systems are characterized by a large number of changes and by the simultaneous access of many users to the same data to perform various operations: reading, writing, deleting, or modifying data. Locks and transactions are used to ensure the normal operation of multiple users. Efficient transaction processing and lock support are among the most important requirements for online transaction processing systems.

Incidentally, the first DSSs, management information systems, also belong to this class of systems. Such systems are usually built on the basis of relational DBMSs, include subsystems for collecting, storing, and performing information-retrieval analysis of information, and also contain a predefined set of queries for daily work. Every new query not foreseen when designing such a system must first be formally described and coded by a programmer, and only then executed. The waiting time in this case can be hours or days, which is unacceptable for prompt decision-making.

The practice of using OLTP systems has shown the ineffectiveness of their use for comprehensive information analysis. Such systems quite successfully solve the problems of collecting, storing and retrieving information, but they do not meet the requirements for modern DSS. Approaches related to increasing the functionality of OLTP systems have not yielded satisfactory results. The main reason for the failure is the contradictory requirements for OLTP and DSS systems.

The main requirements for OLTP and DSS systems are the following:

1. Degree of detail of the stored data. A typical query in an OLTP system tends to selectively affect individual records in tables, which are efficiently retrieved using indexes.

2. Data quality. OLTP systems, as a rule, store information entered directly by system users (computer operators). The presence of the "human factor" during data entry increases the likelihood of erroneous data and can create local problems in the system.

3. Data storage format. OLTP systems serving different areas of work are not interconnected; they are often implemented on different software and hardware platforms. The same data in different databases can be represented in various forms and may not match (for example, data about a client who interacted with different departments of the company may differ across those departments' databases).

4. Allowing redundant data. The structure of the database serving an OLTP system is usually quite complex: it may contain many dozens or even hundreds of tables referring to one another. The data in such a database is highly normalized to save storage resources. Analytical queries against such a database are very difficult to formulate and extremely inefficient to execute, since they involve joins over a large number of tables.

5. Data management. The main requirement for OLTP systems is to ensure that modification operations are performed on the database. They are assumed to be performed in real time, and often very intensively.

6. Amount of data stored. As a rule, analysis systems are designed to analyze time dependencies, while OLTP systems usually deal with the current values ​​of some parameters.

7. The nature of data queries. In OLTP systems, because the database is normalized, composing queries is quite complex work and requires appropriate qualifications.

8. Processing time for data requests. OLTP systems typically operate in real time, so they have stringent data processing requirements.

9. The nature of the computing load on the system. As noted earlier, work with OLTP systems is usually performed in real time.

10. Priority of system characteristics. For OLTP systems, the priorities are high performance and data availability, since they operate in real time. For analysis systems, the higher-priority tasks are system flexibility and user independence, i.e., the capabilities analysts need in order to analyze data.

It should be noted that the contradictory requirements of OLTP systems and systems focused on deep analysis of information complicate the task of integrating them as subsystems of a single DSS. Currently, the most popular solution to this problem is the data warehousing approach.

The general idea of data warehouses is to separate the database for OLTP systems from the database for performing analysis, and then to design each of them taking into account the corresponding requirements.

A DSS solves three main tasks: collecting, storing, and analyzing information. In general, the analysis task may include information retrieval analysis, online analytical analysis, and predictive analysis.

The subsystems for collecting and storing information, and for solving information retrieval analysis tasks, are currently implemented successfully using DBMSs. To implement subsystems performing online analytical analysis, the concept of multidimensional data representation is used. The data mining subsystem implements methods of intelligent data analysis.

To simplify the development of application programs that use databases, database management systems (DBMSs) were created: software for data management, storage, and security.

DBMSs have a well-developed transaction management mechanism, which has made them the main means of building online transaction processing (OLTP) systems. Such systems include the first DSSs, which solved information retrieval analysis problems — information storage and retrieval (ISR) systems.

OLTP systems cannot be used effectively to solve problems of online analytical and intelligent information analysis. The main reason is the contradictory requirements placed on OLTP systems and DSSs.

Currently, to increase the efficiency of online analytical and intelligent analysis, the data warehouse concept is used to combine OLTP subsystems and analysis subsystems within one system. The general idea is to allocate separate databases for the OLTP subsystems and for performing analysis, which allows data processing in decision support systems to be organized optimally.

Questions for self-control

1. List the main tasks that decision support systems solve.

2. Outline the conceptual directions for building data warehouses in decision support systems.

3. Specify the types of structures for organizing data warehouses in the DSS. What are the advantages and disadvantages of each type of structure?

4. Justify the feasibility of using the post-relational model of the subsystem for collecting and processing information in the DSS.

5. How is the concept of transaction interpreted in data processing systems?

6. What is the main property of a transaction in data processing systems?

7. Briefly describe the mechanism for managing transactions in OLTP systems.

8. Specify the role and place of OLTP systems for online transaction processing. Why are OLTP systems ineffective for solving operational analytical and predictive analysis problems?

9. What are the basic requirements for OLTP systems? Which requirements for OLTP systems are contradictory?

10. Name ways to increase the efficiency of operational analytical and intellectual analysis in DSS.

In the previous subsection, it was noted that for an adequate representation of the subject area and for ease of database development and maintenance, relations should be reduced to third normal form (higher-order normal forms exist but are rarely used in practice), that is, they should be strongly normalized. At the same time, weakly normalized relations have their own advantages, the main one being this: if the database is accessed mostly by read queries, while modifications and additions are made very rarely, then data retrieval is much faster. This is because in weakly normalized relations the joins have, in effect, already been performed, so no processor time is spent on them at query time. These two cases correspond to two classes of systems, for which strongly and weakly normalized relations, respectively, are better suited.
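The trade-off above can be shown directly: the same question answered against a normalized pair of relations (a join is needed at query time) and against a weakly normalized relation where the join has been pre-materialized. This is a sketch only; the tables `dept`, `emp`, and `emp_wide` are invented for the illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- strongly normalized: two relations
CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE emp  (id INTEGER PRIMARY KEY, dept_id INTEGER, salary REAL);
-- weakly normalized: the join is done in advance, dept name is duplicated
CREATE TABLE emp_wide (id INTEGER PRIMARY KEY, dept_name TEXT, salary REAL);
INSERT INTO dept VALUES (1, 'R&D'), (2, 'Sales');
INSERT INTO emp  VALUES (1, 1, 100.0), (2, 1, 120.0), (3, 2, 90.0);
INSERT INTO emp_wide VALUES (1, 'R&D', 100.0), (2, 'R&D', 120.0), (3, 'Sales', 90.0);
""")

# Normalized form: the relations must be joined at run time.
q1 = con.execute("""SELECT d.name, SUM(e.salary) FROM emp e
                    JOIN dept d ON d.id = e.dept_id
                    GROUP BY d.name ORDER BY d.name""").fetchall()
# Weakly normalized form: no join, a plain scan with grouping.
q2 = con.execute("""SELECT dept_name, SUM(salary) FROM emp_wide
                    GROUP BY dept_name ORDER BY dept_name""").fetchall()
assert q1 == q2  # same answer; redundancy is traded for read speed
print(q2)  # [('R&D', 220.0), ('Sales', 90.0)]
```

The price of the second form is the duplicated department name, which is harmless only because analysis databases are updated rarely and in controlled batches.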

Strongly normalized data models are well suited for OLTP (On-Line Transaction Processing) applications. Typical examples of OLTP applications are warehouse accounting systems, ticket booking systems, operational banking systems, and others. The main function of such systems is to execute a large number of short transactions. The transactions themselves are quite simple, but there are very many of them, they run concurrently, and if an error occurs, a transaction must be rolled back, returning the system to the state it was in before the transaction began. Almost all database queries in OLTP applications consist of insert, update, and delete commands; selection queries mainly serve to present users with data from various directories. Most of the queries are thus known in advance, at the system design stage. What is critical for OLTP applications is the speed and reliability of short data update operations, and the higher the level of data normalization, the faster and more reliable they are. Deviations from this rule are possible when some frequently occurring queries requiring joins are already known at the development stage, and the applications' performance depends significantly on how fast those queries execute.
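The rollback behavior described above can be sketched with a short transaction. This is a minimal illustration using SQLite's connection context manager (commit on success, rollback on exception); the `account` table and the `transfer` helper are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE account (id INTEGER PRIMARY KEY,
                                     balance REAL NOT NULL CHECK (balance >= 0))""")
con.execute("INSERT INTO account VALUES (1, 100.0), (2, 50.0)")
con.commit()

def transfer(con, src, dst, amount):
    """A short OLTP-style transaction: either both updates apply, or neither."""
    try:
        with con:  # commits on success, rolls back on exception
            con.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                        (amount, src))
            con.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                        (amount, dst))
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK constraint violated -> transaction rolled back

transfer(con, 1, 2, 30.0)    # succeeds: 100 -> 70, 50 -> 80
transfer(con, 1, 2, 1000.0)  # fails: balance would go negative, rolled back

balances = dict(con.execute("SELECT id, balance FROM account ORDER BY id"))
print(balances)  # {1: 70.0, 2: 80.0}
```

After the failed transfer the system is back in the state it was in before that transaction began, which is exactly the rollback guarantee the text describes.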

Another type of application is OLAP (On-Line Analytical Processing) applications for online analytical data processing. This is a generalized term covering the principles behind decision support systems (Decision Support Systems, DSS), data warehouses (Data Warehouses), and data mining systems. Such systems are designed to find dependencies in data, to conduct dynamic analysis based on the "what if..." principle, and to solve similar tasks. OLAP applications operate on large amounts of data accumulated in the enterprise or taken from other sources. Such systems are characterized by the following features:

New data is added to the system relatively rarely in large blocks, for example, once a month or quarter;

Data added to the system is usually never deleted;

Before loading, data undergoes various preparatory procedures related to bringing them to certain formats;

Queries to the system are unregulated and quite complex;

The speed of query execution is important, but not critical.

The databases of OLAP applications are usually represented as one or more hypercubes, whose dimensions correspond to reference data and whose cells store the values of the data itself. Physically, a hypercube can be built on a special multidimensional data model (Multidimensional OLAP, MOLAP) or represented by means of the relational data model (Relational OLAP, ROLAP).
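The hypercube idea can be sketched in a few lines: cells are keyed by tuples of dimension members, and typical OLAP operations (slicing by a dimension, rolling up over the rest) reduce to filtering and summing. This is a toy MOLAP-flavored model, not a real engine; the dimensions (`year`, `region`, `product`) and the `slice_cube` helper are invented for the illustration.

```python
from collections import defaultdict

# Cube cells keyed by (year, region, product); values are the measure (sales).
cube = defaultdict(float)
facts = [
    (2023, "North", "Books", 10.0),
    (2023, "South", "Tools", 7.0),
    (2024, "North", "Books", 12.0),
    (2024, "North", "Tools", 4.0),
]
for year, region, prod, sales in facts:
    cube[(year, region, prod)] += sales

def slice_cube(cube, **fixed):
    """Fix some dimensions ('slicing'); sum over the remaining ones ('roll-up')."""
    order = ["year", "region", "product"]  # position of each dimension in the key
    total = 0.0
    for key, value in cube.items():
        if all(key[order.index(dim)] == member for dim, member in fixed.items()):
            total += value
    return total

print(slice_cube(cube, region="North"))              # 26.0 (all years, all products)
print(slice_cube(cube, year=2023))                   # 17.0
print(slice_cube(cube, year=2024, product="Tools"))  # 4.0
```

A MOLAP server stores such a cube (with aggregates) natively; a ROLAP server would answer the same questions with GROUP BY queries over relational tables.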

In OLAP systems that use the relational data model, it is advisable to store data in the form of weakly normalized relations containing pre-computed basic totals. Data redundancy and its attendant problems are not an issue here, since the data is updated quite rarely, and the totals are recalculated together with each data update.
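The "recalculate totals at each batch load" pattern can be sketched as follows. This is an illustrative ROLAP-style setup, not a production design; the `sales_fact` and `sales_totals` tables and the `batch_load` helper are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sales_fact   (region TEXT, category TEXT, amount REAL);
CREATE TABLE sales_totals (region TEXT, total REAL);  -- service table of aggregates
""")

def batch_load(con, rows):
    """Load a batch of facts, then rebuild the pre-computed totals table."""
    con.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)
    con.execute("DELETE FROM sales_totals")
    con.execute("""INSERT INTO sales_totals
                   SELECT region, SUM(amount) FROM sales_fact GROUP BY region""")
    con.commit()

batch_load(con, [("North", "Books", 10.0),
                 ("North", "Tools", 20.0),
                 ("South", "Books", 5.0)])

# Analytical queries read the pre-computed totals directly, with no joins
# and no run-time aggregation over the fact table.
totals = dict(con.execute("SELECT region, total FROM sales_totals"))
print(totals)  # {'North': 30.0, 'South': 5.0}
```

Because loads are rare and batched, fully rebuilding the aggregate table after each load is acceptable, and readers always see totals consistent with the detailed data.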

