Hypertext markup language. Hypertext markup languages. SGML

In 1989, hypertext was a promising new technology. On the one hand, it already had a relatively large number of implementations; on the other, attempts were being made to build formal models of hypertext systems, which were largely descriptive and were inspired by the success of the relational approach to describing data.

HTML is a hypertext markup language used to encode documents. The HTML language is a set of commands according to which the browser displays the contents of a document; HTML commands are not displayed. The HTML language implements a hypertext linking mechanism that allows one document to be linked to others. These documents may be located on the same server as the page from which they are linked, or they may be hosted on a different server.

The HTML idea is an example of an extremely successful solution to the problem of building a hypertext system using a special display control tool.

Contextual hypertext links were recognized as the most effective form of hypertext organization. In addition, a distinction was drawn between links associated with the document as a whole and links associated with its individual parts.


All HTML documents have the same structure, defined by a fixed set of structure tags. An HTML document should always start with the <HTML> tag and end with the corresponding closing tag (</HTML>). There are two main sections within a document: the head section and the body of the document, in that order. The head section contains information that describes the document as a whole and is delimited by the <HEAD> and </HEAD> tags. In particular, the head section should contain the general title of the document, delimited by the paired <TITLE> tag.

It is not recommended to omit structure tags when creating an HTML document. The simplest valid HTML document containing all the tags that define the structure might look like this:

<HTML>
<HEAD>
<TITLE>Document title</TITLE>
</HEAD>
<BODY>
Document text
</BODY>
</HTML>

HTML elements.

For paired tags, the scope is defined by the portion of the document between the opening and closing tags. This part of the document is considered an element of the HTML language. So, we can talk about a "BODY element" that includes the <BODY> tag, the body of the document and the closing </BODY> tag. The entire HTML document can be thought of as an "HTML element." For unpaired tags, the element is the same as the tag that defines it.

Most HTML elements describe parts of the document's content and are placed between the <BODY> and </BODY> tags, that is, inside the BODY structural element. Such elements are divided into block and text elements. Block elements refer to paragraph-level pieces of text. Text elements describe the properties of individual phrases and even smaller parts of text.

Now we can formulate rules for nesting elements.

Elements must not intersect. In other words, if the opening tag is located inside an element, then the corresponding closing tag must be located inside the same element.

Block elements can contain nested block and text elements.

Text elements can contain nested text elements.

Text elements cannot contain nested block elements.
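
The nesting rules above can be illustrated with a short fragment. This is a minimal sketch; the choice of the P, B and I elements and the sample text are assumptions made for illustration.

```html
<!-- Correct: the I element is opened and closed inside the B element,
     and both text elements sit inside the block element P. -->
<P>Block text with <B>bold and <I>bold italic</I></B> phrases.</P>

<!-- Incorrect: the B and I elements intersect, because </B> appears
     before the I element that was opened inside B has been closed. -->
<P>Block text with <B>bold and <I>bold italic</B></I> phrases.</P>

<!-- Incorrect: a text element (B) contains a block element (P). -->
<B><P>A paragraph inside a text element.</P></B>
```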

Functional block elements.

In most documents, the main functional elements are headings and paragraphs. HTML supports six levels of headings. They are specified using paired tags from <H1> to <H6>. When the document is displayed on the computer screen, these elements are shown using fonts of different sizes.

Regular paragraphs are specified using the paired <P> tag. HTML does not contain a means for creating a paragraph indent ("red line"), so when displayed on a computer screen, paragraphs are separated by a blank line. The closing </P> tag is considered optional. It is understood to come before the tag that begins the next paragraph of the document. For example:

<H1>Heading</H1>

<P>First paragraph<P>Second paragraph

<H2>Second level heading</H2>

A consequence of having a special tag that defines a paragraph is that the usual end-of-line character, entered by pressing the ENTER key, is not enough to start a new paragraph. HTML treats end-of-line characters and spaces in a special way. Any sequence consisting only of spaces and end-of-line characters is treated as a single space when the document is displayed. This means, in particular, that an end-of-line character does not even start a new line (a text element specified by the unpaired <BR> tag is used for this purpose).
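
The handling of spaces and end-of-line characters can be seen in a small fragment. This is an illustrative sketch; the sample text is an assumption.

```html
<!-- The line breaks and runs of spaces below are each displayed
     as a single space, so the words flow as one continuous text. -->
<P>This    text
is   typed   on
three lines.</P>

<!-- A forced line break requires the unpaired BR tag. -->
<P>First line<BR>Second line</P>
```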

A horizontal rule can also be used as a paragraph delimiter. This element is specified by the unpaired <HR> tag. When a document is displayed on the screen, the rule separates parts of the text from each other. Its length and thickness are specified by the tag attributes, for example:

<HR ALIGN="RIGHT" SIZE="10" WIDTH="50%">

This tag creates a horizontal rule 10 pixels thick that takes up half the width of the window and is positioned to the right.


Website creation is one of the widely available opportunities in the modern Internet industry. The actual creation of websites is, in principle, not much more difficult than creating personal email accounts and electronic business cards.

To create a website, you first need a server connected to the Internet on which you can place the necessary hypertexts. In addition, you need to register the site name with the provider serving the selected server.

On the Internet you can find providers offering free opening of websites on their servers. Free sites can be opened on domestic servers narod.ru, boom.ru, hotmail.ru and on foreign servers, for example geocities.com, tripod.com.

On these servers you can register domain names like:

<name>.narod.ru

<name>.boom.ru

Examples of registered domain names:

wdu.da.ru - website of the electronic university;

wduniv.newmail.ru - website of a distributed university.

After registering a site's domain name, you can host hypertexts on it. Hypertexts are placed on the site using special programs that allow you to create, edit, accumulate and copy a wide variety of hypertexts. Immediately after the placement of the very first (main) hypertext page, its information can be read using a browser in any country from any computer connected to the Internet. To do this, enter the website address on the Internet in the browser window. For example: http://bak.boom.ru

All posted files must be hypertexts written in HTML format and named in the form <name>.html.

HTML is a hypertext markup language.

By structure, hypertext is text with links to other hypertexts located on this server or on other servers. When you click on such a link, the browser automatically loads a hypertext page onto your computer screen, regardless of what server it is on and in what country it is located.

Using these tools and programs on the Internet, a wide variety of information sites and systems can be created - personal sites, company sites, electronic newspapers, magazines, electronic books, encyclopedias, as well as electronic archives and libraries.

The difference between sites is the amount of information, their structure and updating procedures. In general, for Internet sites, as for any organization, we can talk about the life cycles of their creation, development, modernization and liquidation.

The volume of information is determined by the owners - people or organizations that created sites and post their information on them. The amount of information on websites can range from several kilobytes to several gigabytes (millions of kilobytes).

The structure of sites can be very diverse. The simplest structure is a main page with links to a set of texts. These links can be in the text of the main page or highlighted in the table of contents at the beginning of it.

Each page of the site can be provided with a title, which appears on the top line of the screen when the site is loaded by the browser.

In addition, on the main page of the site you can specify a list of keywords for search engines.
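
A page title and a keyword list are both placed in the head section of the page. The fragment below is a minimal sketch: the title text and the keyword values are hypothetical, and name="keywords" is the conventional META name that search engines have read, which is an assumption about what the author has in mind here.

```html
<HEAD>
<!-- Shown in the title bar of the browser window -->
<TITLE>Distributed university: courses and admissions</TITLE>
<!-- Keyword list intended for search engines -->
<META name="keywords" content="university, distance learning, courses">
</HEAD>
```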

Search engines scan servers on the Internet weekly and record the addresses of the sites and hypertexts they find, along with the keywords highlighted in them. As a result, within about a week, information published on the Internet can be found using the keywords it contains.

Hypertext markup language (HTML) is most often used to create electronic training programs.

This choice is due to the fact that, along with the ease of creating this type of document, hypertext markup language has enormous capabilities, such as outputting formatted text, using graphic objects of almost all known formats, using a background image, and inserting objects such as background sound and video.

In addition, HTML makes it easy to organize links to other objects or fragments of text in the document itself.

The great advantage of HTML is that most modern tools (such as text and graphic editors, visual programming languages, Internet Explorer...) support working with and saving documents in HTML format.

Therefore, HTML is often used to create such software products. However, creating various types of demos, testing procedures and surveys, in my opinion, is still made easier with the help of visual programming languages.

Therefore, this thesis examines the integration of various tools for creating training, testing programs and electronic textbooks.

However, the use of HTML documents greatly facilitates the writing of the theoretical part of the program and makes it more lively. Let's look at a few issues related to creating HTML documents. You can work on the Web without knowing the HTML language, since HTML texts can be created by various special editors and converters.

However, it is better to write directly in HTML, or at least monitor and modify the HTML code occasionally. Writing directly in HTML is not difficult. It may even be easier than learning an HTML editor or converter, which are often limited in their capabilities, buggy, or produce bad HTML that doesn't work on various platforms.

The first version of HTML was developed in the early 1990s by Tim Berners-Lee; the once popular Mosaic browser, released in 1993, later helped to spread it. But in those early days, neither browsers nor the language itself had yet found a worthy use. HTML+ appeared in 1993, and this version also went virtually unnoticed. The widespread use of hypertext began with version 2.0, which appeared in June 1994.

This was the moment when WWW began to grow in popularity around the world. The elements included in version 2 are, for the most part, still in use today.

Version 3.0 of HTML, which appeared a year later, introduced the ability to render mathematical notation (integral signs, infinity signs, fractions, parentheses, etc.) using language elements. A browser (Arena) was also developed for this version. But this project turned out to be a dead end and did not spread further.

In 1996, HTML version 3.2 appeared. It standardized widely deployed features such as tables, applets, and text flow around images. Frames, which became very popular among Web page developers, were supported by browsers at the time but were not part of this specification.

Even now, very good design solutions can be implemented based on this specification. Almost all modern browsers fully support version 3.2, so the authors have no doubts about the functionality of the declared elements.

Along with the official language specifications that were developed by the W3C organization (W3 Consortium), browser manufacturing companies created their own elements (extensions).

Subsequently, some of these elements, after gaining general acceptance, were included in the specification of the next version of the language. It is interesting, for example, that an innovative solution - frames - that many developers loved, was not included in the 3.2 specification.

But browsers supported frames, and many books on HTML included descriptions of frames without mentioning that they were non-standard elements. And it was right, because frames became the de facto standard. They were already included in language version 4 for good reason.

Conversely, the APPLET and SCRIPT elements needed to extend HTML with other code did not play the role they were intended to play in version 3.2.

This was explained by the fact that browsers of different versions interpreted programs in Java, JavaScript, and VBScript differently. As a result, it was not possible to obtain code that worked reliably enough, and these languages were used by HTML enthusiasts mainly for experiments.

The official HTML 4 (Dynamic HTML) specification appeared in 1997. At this time, it was already obvious that the further development of hypertext would be carried out through script programming. This turned out to be much more effective than introducing new elements into the language.

The browsers that appeared at that time (Netscape Navigator 4, Microsoft Internet Explorer 4, etc.) already interpreted the program code quite reliably (a certain level of standardization was achieved). However, the developers still have problems. As an example, it can be noted that many scripts begin by determining the browser version in order to then use this or that piece of code.

Obviously, the programmer is responsible for testing pages on all currently popular browsers. In addition, the problem of using old or not very popular programs remains relevant. Microsoft and Netscape are rightfully considered the leaders in browser development, but there are also other companies.

As a result, using all the capabilities of Dynamic HTML has become the responsibility of programmers in fairly large organizations, where there are conditions for developing complex programs and their comprehensive testing. Creators of personal Web pages sometimes have to compromise between reliability and innovation in order to obtain sufficiently competent HTML code.

Anatomy of a Web Page

Below is a sample of a typical Web document. In this example, we will look at the structure of HTML pages.

Example (template) of a Web page

<TITLE>Web page structure</TITLE>
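
A minimal sketch of what a complete page template of this kind might contain is given below. It assumes only the elements discussed in this section (HTML, HEAD, TITLE, META, STYLE, BODY, Hn, P, HR); the style rules, META contents, colors, and body text are placeholders chosen for illustration and are not taken from any particular file.

```html
<HTML>
<HEAD>
<TITLE>Web page structure</TITLE>
<META name="keywords" content="HTML, template">
<STYLE type="text/css">
  /* Placeholder style rules for two elements */
  H2   { font-family: Arial, sans-serif; }
  CODE { font-family: "Courier New", monospace; }
</STYLE>
</HEAD>
<BODY bgcolor="#FFFFFF" text="#000000">
<H2 align="center">Heading</H2>
<P>Paragraph text with a <CODE>code fragment</CODE>.</P>
<HR>
</BODY>
</HTML>
```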

If you look at the source texts of various Web pages, you can easily see the similarity of their structures. This is explained by the fact that documents are created according to certain rules.

The syntax of the HTML language is based on the ISO 8879:1986 standard "Information processing. Text and office systems. Standard Generalized Markup Language (SGML)". True, there is a big difference between the official standard and the actual standard. HTML is constantly evolving, supplemented with new elements, and it should be studied not from official primary sources, but in practice, turning to the latest developments of leading companies and specialists.

To understand the structure of a Web page, we need to consider each of the elements included in the listing above. When considering language elements, we will show both tags, start and end. This emphasizes that in most cases the developer should use two tags for each element. The number of cases where only a start tag is allowed (some elements have no end tag at all) is small, and they are noted specifically. For tag names, you can use both uppercase and lowercase letters of the Latin alphabet.

Some users write starting tags in uppercase letters and ending tags in lowercase letters. This helps you understand the source text of a Web page.

HTML syntax.

The HTML element: HTML document notation. It was mentioned above that one of the principles of the language is multi-level nesting of elements. The HTML element is the outermost one, since the entire Web page must be located between its start and end tags.

In principle, this element can be considered a formality. It has the version, lang and dir attributes, which are rarely used in this case, and allows the nesting of HEAD, BODY, FRAMESET and other elements that determine the overall structure of the Web page. Naturally, all such documents end with the </HTML> end tag.

The HEAD element: the header area of a Web page, in other words, its first part. Just like the previous element, HEAD serves only to form the overall structure of the document. This element can have the lang and dir attributes, must include a TITLE element, and allows nesting of the BASE, META, LINK, OBJECT, SCRIPT and STYLE elements.

The TITLE element: an element for placing the title of a Web page. The line of text located inside this element is displayed not in the document but in the title bar of the browser window. This string is often used when organizing searches on the WWW. Therefore, authors creating Web pages for posting on the Internet must ensure that this line, without being too long, accurately reflects the purpose of the document.

The STYLE element: a description of the style of some elements of the Web page. The Strukt.htm file assigns fonts to the H2 and CODE elements.

Naturally, for each element there is a default style, so the use of the STYLE element is not necessary, but desirable.

It's interesting how HTML syntax reflects the history of computing. For example, the old, now defunct BLINK element reminds us of the times when people used displays that only had a text mode. Given this state of affairs, blinking text was probably the only achievable visual effect.

In contrast, the STYLE element, introduced relatively recently, evokes associations with Windows programs, which popularized text styles. Styles are now so widespread that it is hard to imagine working without them in applications such as Word or Excel.

The META element contains service information that is not displayed when viewing a Web page. It contains no text in the usual sense, so it has no end tag. Each META element has two main attributes, the first of which defines the data type and the second the content.

In addition, the META element can contain a URL. The template for the corresponding attribute is:

URL="http://address"
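
The two-attribute pattern of META can be sketched as follows. The attribute names name, http-equiv and content are standard HTML; the values are hypothetical, and placing the URL inside the content value of a Refresh entry is the common usage assumed here.

```html
<!-- name + content: what kind of data this is, and its value -->
<META name="author" content="Page author">

<!-- http-equiv + content: the URL is written inside the content value;
     here the browser is asked to load it after 5 seconds -->
<META http-equiv="Refresh" content="5; URL=http://address">
```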

The BODY element contains the hypertext that defines the Web page itself. This is the free-form part of the document that is developed by the page author and displayed by the browser. Accordingly, the end tag of this element should be found at the end of the HTML file. Inside the BODY element, you can use all the elements intended for Web page design. Within the start tag of the BODY element, you can place a number of attributes that provide settings for the entire page. Let's look at them in order.

One of the most useful for design is the attribute that defines the background of the page. Its appearance can be likened to a small revolution in the WWW, as the same gray Web pages suddenly blossomed with bright color patterns:

background="Path to background file"

A simpler background design comes down to setting its color:

bgcolor="#RRGGBB"

The background color is specified by three two-digit hexadecimal numbers that determine the intensity of red, green and blue colors, respectively. Setting colors will be described in more detail below. Both of the above attributes are not alternative and are often used together: if for some reason a background pattern cannot be found, the color is used.

Because the page background can change, you need to be able to choose a matching text color. For this purpose there is the following attribute:

text="#RRGGBB"

To set the text color of hyperlinks, use the following attribute:

link="#RRGGBB"

In the same way, you can set the color for viewed hyperlinks:

vlink="#RRGGBB"

You can also specify a color change for the user's last selected hyperlink:

alink="#RRGGBB"

Hypertext located inside the BODY element can have any structure. It is determined, first of all, by the purpose of the Web page and the imagination of the developer.
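
The BODY attributes described above can be combined in one start tag. The fragment below is a sketch: the file names and color values are hypothetical, and link, vlink and alink are the HTML attribute names for unvisited, visited and active hyperlinks.

```html
<!-- bgcolor is used if the background image cannot be found.
     Each color is RRGGBB: two hexadecimal digits for red, green, blue. -->
<BODY background="images/pattern.gif"
      bgcolor="#FFFFCC"
      text="#000000"
      link="#0000FF"
      vlink="#800080"
      alink="#FF0000">
<P>Page text with a <A href="page2.html">hyperlink</A>.</P>
</BODY>
```

Here #FFFFCC is a pale yellow (full red and green, slightly reduced blue), chosen so that black text and blue links stay readable against it.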

Hn: heading elements. There are six levels of headings, designated H1...H6. A level 1 heading is the largest, and a level 6 heading is the smallest. For headings, you can use the align attribute, which specifies left, center, or right alignment.

HR: the horizontal rule is a very commonly used element. First, it is a convenient way to divide a page into parts. Second, the page author has a very small selection of such design elements. Indeed, HTML has practically no similar constructs; for some reason an exception was made only for the horizontal line. Still, despite this sparseness of the language, you can create many standard graphic images that diversify the look of pages.

The element does not have an end tag, but it accepts a number of attributes. The align attribute positions the rule left, center, right, or justified.

You can set the line thickness:

size="thickness in pixels"

You can control the line length:

width="length in pixels"

width="length in percent%"

You can choose the color:

color="color"
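
Put together, the horizontal rule attributes might be used as in the sketch below. The values are illustrative; note that color on HR is a browser extension rather than part of the HTML 4 specification, so it may be ignored by some browsers.

```html
<P>Text above the rule.</P>
<!-- 4 pixels thick, half the window width, centered -->
<HR align="center" size="4" width="50%">
<!-- 2 pixels thick, 300 pixels long, left-aligned, colored -->
<HR align="left" size="2" width="300" color="green">
<P>Text below the rule.</P>
```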

An HTML document can be very large, in which case the user should be able to quickly navigate to the desired section of the document. To do this, you can use the hyperlink mechanism. It is also necessary to place appropriate marks in the right places in the text. Here we will look only at the template for creating labels:

<A name="label">Free text</A>

In this case, the given line of the document is given a name, and therefore a hyperlink leading to that point can be created in another part of the document, or even in another document. For example, to navigate within a document, you can use the following construct:

<P>Go to <A href="#label">label</A>

Several similar lines can form a kind of table of contents for a Web page, which can be placed at the beginning and end of the document.
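
A small table of contents built from labels might look like the sketch below. The label names (intro, rules), the headings, and the file name page.html are hypothetical.

```html
<!-- Table of contents at the top of the page -->
<P><A href="#intro">Introduction</A></P>
<P><A href="#rules">Syntax rules</A></P>

<!-- Labels placed at the start of each section -->
<H2><A name="intro">Introduction</A></H2>
<P>Section text.</P>

<H2><A name="rules">Syntax rules</A></H2>
<P>Section text.</P>

<!-- In another document, the label is appended to the file name -->
<P><A href="page.html#rules">Syntax rules on another page</A></P>
```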

BASE: an element for specifying the base address (URL) for links. This allows you to omit the initial part of the address in document links. To use this element, use the following construction:

<BASE href="protocol://address/path/">

The path // address fragment is optional.

When a full address is generated, it will be discarded.

So, if the document text contains the relative link

<A href="path2/document name.htm">Visible link text</A>,

then it will correspond to the URL formed by appending path2/document name.htm to the base address.

When you need to set the base address for a local disk (for example, D:), a construction with the file:// access scheme is used in the same way.

Then, when specifying a relative link, you can specify not only the file name, but also the names of the folders in which it is located. In other words, the path to files can be divided into two parts: absolute and relative. This is useful when the files specified in the document have a common starting path fragment. You can also omit the access scheme reference (file://) in an absolute link expression. In this case, only the left part of the absolute link up to the first left character "", that is, the local drive name, will be taken into account.
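
How a base address combines with a relative link can be sketched as follows. The host www.example.com and the paths are hypothetical placeholders.

```html
<HTML>
<HEAD>
<TITLE>Base address example</TITLE>
<!-- All relative links in this document are resolved against this URL -->
<BASE href="http://www.example.com/docs/">
</HEAD>
<BODY>
<!-- Resolves to http://www.example.com/docs/chapter1/intro.htm -->
<P><A href="chapter1/intro.htm">Introduction</A></P>
</BODY>
</HTML>
```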

Syntax rules

Now that we know what the code for a Web page looks like, we can make some general conclusions about HTML syntax. When using each element, it is important to know which elements can be located inside it and which elements it itself can be located inside.

Thus, the relative position of the HTML, HEAD, TITLE and BODY elements should be standard on any page, at least in cases where frames are not used. If the page is a frame layout document, then the FRAMESET element is used instead of the BODY element.
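
A frame layout document, in which FRAMESET takes the place of BODY, might be sketched like this. The file names menu.html and content.html and the column widths are assumptions made for illustration.

```html
<HTML>
<HEAD>
<TITLE>Frame layout</TITLE>
</HEAD>
<!-- FRAMESET replaces BODY: two columns, 25% and 75% of the window -->
<FRAMESET cols="25%,75%">
  <FRAME src="menu.html" name="menu">
  <FRAME src="content.html" name="content">
  <!-- Shown by browsers that do not support frames -->
  <NOFRAMES>
  <BODY>
  <P>This page uses frames. <A href="content.html">Open the content page</A>.</P>
  </BODY>
  </NOFRAMES>
</FRAMESET>
</HTML>
```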

There are groups of elements that are used together. These include elements for creating tables, lists, and frames.

In this case, the order of nesting elements is determined by the logic of creating a particular object on the page: here you need to remember simple design rules.

Tables and frames are often used to arrange page details (pictures, text, etc.) in a specific order.

For example, by placing a picture inside a table cell, you can achieve a certain position.
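
Positioning a picture with a table might look like the sketch below. The image file name, sizes, and cell widths are hypothetical.

```html
<!-- A borderless one-row table: text on the left,
     picture pinned to the top right corner -->
<TABLE width="100%" border="0">
<TR>
  <TD width="70%" valign="top">Text placed in the left cell.</TD>
  <TD width="30%" align="right" valign="top">
    <IMG src="picture.gif" alt="Illustration" width="120" height="90">
  </TD>
</TR>
</TABLE>
```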

In such cases, the nesting of elements is determined by the Web page developer, and much depends on his imagination and skill.

The large number of elements that are used to format text allows for a wide variety of nesting options. And they themselves must necessarily be located inside certain elements.

Here we must be guided by common sense: each element performs a given function and has a specific scope.

The example below has two paragraphs (the first one in a green box) and a table:

<P style="border: 3px solid green">Paragraph 1 text</P>

<TABLE>. . .</TABLE>

<P>Text of paragraph 2</P>

The table in this case is an independent element. It can, for example, be aligned independently of the rest of the text.

You can use other code:

<P style="border: 3px solid green">Paragraph 1 text

<TABLE>. . .</TABLE>

<P>Text of paragraph 2</P>

The end tag of the first paragraph has disappeared. The table is now part of the first paragraph, and the green border will enclose the table and text. Conversely, a P element can be located within a table: for example, one cell TD element can contain several P paragraphs.

Violating nesting rules is one of the most common mistakes when creating Web pages. To avoid such errors, it helps to use hypertext editors that automatically check the syntax rules. Below is a fragment containing a typical element nesting error:

<H1>Heading 1<H2>Heading 2</H1>

Heading 3</H2>

It should be noted that browsers are built in such a way that they “try” to respond to hypertext markup errors. If the page can be displayed, it is displayed without any warning messages.

The program interprets erroneously placed tags in a certain way and generates an image, following the logic built into it by the developers. At the same time, the appearance of the page may not correspond to the author’s intention. And only in case of very serious errors or obvious contradictions does the browser display a message stating that the page cannot be displayed.

An indirect sign of a markup error can be the appearance of fragments of HTML code on the page. Users who work a lot with the Internet have probably encountered this situation.

Syntax rules also apply to the use of start and end tags, attributes, and element content. Do not confuse the concepts of "element" and "tag". An element is a container containing attributes within the start tag and useful information between the start and end tags. A tag is a construct enclosed in angle brackets and used to indicate the scope of an element.
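
The difference between an element and a tag can be seen on a single line. The heading text and the align value are illustrative.

```html
<!-- start tag with an attribute | content | end tag -->
<H1 align="center">Page heading</H1>
<!-- The whole line above is one H1 element;
     <H1 align="center"> and </H1> are its two tags. -->
```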

Some elements do not have an end tag. Obviously, the BR element denoting the end of a line does not need an end tag. Some elements can be used with or without an end tag. The most striking example is the paragraph element P.

It can have an end tag, but if that tag is not specified, the paragraph is ended by the next element that can logically mark the end of the current paragraph: another P element, a heading element Hn, a UL list element, a TABLE table element, etc.

Thus, the payload of one element must be located either between the start and end tags of that element, or between the start tag of this element and the start tag of the next element.

Any arbitrary text entered onto a page is interpreted by the browser as text to be displayed on the screen and is therefore formatted according to the elements surrounding that text. The division of text into lines made in a text editor is not taken into account. In theory, an entire Web page could be contained in one long line. End-of-line characters entered in Notepad, for example, can make HTML code easier to read, but are not displayed by the browser.

The latter, when displaying a page on the screen, can break a line in accordance with the arrangement of the Hn, P or BR elements, and in other cases it formats paragraphs arbitrarily, depending on the amount of text, font size and the current window size.

Therefore, Web pages must be arranged in such a way that their appearance does not change dramatically for different monitor resolution modes, screen size, browser window size, and also for full-screen or windowed modes.

A very important rule, which has no exceptions, is to place element attributes inside the start tag.

Modern Web technologies are rapidly changing our world. This technological revolution has greatly affected not only business but also personal and professional life. The latest Web technologies penetrate all spheres of society, change the methods of communication and the way modern companies run their Web projects, and often determine the fate of those companies. Despite their internal complexity, modern Web technologies are extremely simple to use, which makes them accessible to everyone who encounters them daily in their professional activities.

Both in everyday life and in business, in correspondence and trade, people and organizations use the Web, create their own Web sites where they offer information, goods and services. Tools for creating Web resources are developing rapidly and without stopping, allowing you to create complex Web documents without requiring special knowledge about their structure and appearance, freeing up time for productive creative activity. The main advantage of Web technologies in modern conditions is their simplicity and, as a result, increasing the efficiency of their use.

Hypertext Markup Language HTML

The popularity of the Internet is largely due to the emergence of the World Wide Web (WWW), the first network technology to give the user a simple, modern interface for accessing a variety of network resources. Simplicity and ease of use led to an increase in the number of WWW users and attracted the attention of commercial organizations. The growth in the number of users then became an avalanche, and it continues to this day. The need to bring together a multitude of information resources led to the development of a technology that defines a system of hypertext navigation. That technology was the HTML language. At the initial stage HTML was extremely simple, and almost all network users had the opportunity to try themselves as both creators and readers of information materials published on the World Wide Web. The reason is that the various components of the technology were designed on the assumption that the authors of information resources would have minimal qualifications and minimal computer equipment.

HTML (HyperText Markup Language) is one of the so-called markup languages. The term "markup" refers to general service information that is not displayed along with the document, but defines what certain fragments of the document should look like. For example, you might want a word to be bold or italic, a paragraph to be shown in a specific font, or headings to be in a larger font.

There are many different markup languages available these days. For example, in communications programs, a special form of markup determines the meaning of each packet of ones and zeroes sent to the Internet. However, any markup language must solve two important problems:

1) the language defines the markup syntax;

2) the language determines the meaning of the markup.

The most common markup language for Web pages is HTML. This markup language was created and promoted as an application of SGML. First proposed in 1974 by Charles Goldfarb and subsequently adopted as an official ISO standard after significant refinement, SGML (Standard Generalized Markup Language) is a metalanguage, that is, a system for describing other languages.

The emergence of the SGML standard was driven by the need to share data between different applications and operating systems. Even back in the 60s, computer users had many compatibility problems. After analyzing the shortcomings of many non-standard markup languages, three IBM scientists - Charles Goldfarb, Ed Mosher and Ray Lorie - formulated three general principles that ensure the ability to collaborate on documents across different operating systems.

1) Use of uniform formatting principles in all programs that process documents. This is a completely logical requirement: we all know how difficult it is for people speaking different languages to agree with each other. The presence of a single set of syntactic structures and common semantics significantly simplifies the interaction between programs.

2) Specialization of formatting languages. Thanks to the ability to build a specialized language based on a set of standard rules, the programmer ceases to depend on external implementations and their ideas about the needs of the end user.

3) Clear definition of the document format. The rules that define the document format specify the number and marking of language constructs used in the document. Using a standard format ensures that the user knows exactly the structure of the document's content. Please note: this is not about the display format of the document, but about its structural format. The set of rules that describe this format is called a document type definition (DTD).

These three rules were the basis for SGML's predecessor, GML (Generalized Markup Language). Research and development of GML continued for about ten years until the SGML standard emerged as a result of an agreement reached by an international group of developers.

HTML (Hypertext Markup Language) is the computer language that underlies the World Wide Web. It is a hypertext markup language for presenting documents on the Web and is based on the SGML standard. The HTML standards, among the key Web standards, are developed and maintained by the W3C consortium, an international consortium founded by Tim Berners-Lee. In addition to creating formatting standards, the consortium is the center for development of the Semantic Web. HTML provides format markup for documents and defines hyperlinks between documents and/or their fragments.

A regular text file was chosen as the basis for writing the HTML code. Thus, a hypertext database in the WWW concept is a set of text files marked up in HTML, which determines the form of information presentation (markup) and the structure of connections between these files and other information resources (hypertext links).

HTML developers were able to solve two problems:

  • provide hypertext database designers with a simple means of creating documents;
  • make this tool powerful enough to reflect the ideas of the time about the user interface of hypertext databases.

The first problem was solved by choosing a tag model for describing documents. HTML lets you mark up an electronic document so that it is displayed on screen with print-quality design; the resulting document can contain a wide variety of captions, illustrations, audio and video fragments, and so on. The language includes well-developed means for creating headings of different levels, font emphasis, various kinds of lists, tables and much more.

The second important point that influenced the fate of HTML was the choice of a regular text file as its basis. The HTML editing environment occupies the middle ground between a simple text file and a WYSIWYG (what you see is what you get) application, so it keeps all the benefits of text editing.

Hypertext links, which at first connected text documents, gradually came to unite a wide variety of information resources, including sound and video. The HTML hyperlink system lets you build a system of interconnected documents according to various criteria. HTML contains commands (tags) that control the style and size of fonts and the size and placement of illustrations, and that let the reader move from a fragment of text or an illustration to other HTML documents; such a transition is called a hypertext link. A document in HTML format is a text file containing all the necessary information about what is displayed on the screen. Client-side languages such as JavaScript and VBScript, as well as Java applets, can be used to script how a website (a hypertext database hosted on the World Wide Web) is browsed. User input collected through forms, and other information, can be processed by special server programs (written, for example, in PHP or Perl). Hypertext links and clickable buttons connect your Web pages to other pages in the same Web site, as well as to other Web sites around the world.
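As a minimal sketch of the hypertext links described above, the fragment below shows a link to another page on the same site, a link to a different site, and a link to a fragment of the current document. The file name, the URL and the id value are placeholders chosen for illustration and are not taken from any real site.

```html
<!-- Link to another document on the same server (hypothetical file name) -->
<p>See the <a href="contents.html">table of contents</a>.</p>

<!-- Link to a document on a different server (example.com is a placeholder) -->
<p>Read more at <a href="http://www.example.com/">another site</a>.</p>

<!-- Link to a fragment of the current document, identified by its id -->
<p>Jump to the <a href="#summary">summary</a>.</p>

<h2 id="summary">Summary</h2>
<p>The target of the fragment link above.</p>
```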

HTML is a text markup language, not a programming language; more precisely, it is a page description language and only one of the tools used to create Web pages. Its text formatting capabilities are limited compared with those of publishing programs, especially for text rich in complex elements.

There are still no HTML editors convenient enough to let authors do without a text editor and manual placement of tags. This complicates work with the language and forces authors to master tasks that are entirely unfamiliar to them.

Analyzing the features of HTML and assessing its level of development, we can conclude that the coming years should bring more advanced versions of the language, new languages and new application packages for working with Web pages.

Dynamic and static HTML documents

There are two types of HTML documents: static and dynamic. Static documents are stored as files in the file system used by the web server (or by the browser, when local files are viewed). When publishing information on a web server, you can also use dynamic documents, which do not exist permanently as files but are generated at the moment a client requests them. For the end user it makes no difference whether a document is static or dynamic.

To generate a dynamic HTML document, you need a program written according to the rules defined by the web server. When planning how to place information on a web server and deciding which type of document to use, take into account how often the data is updated, how large it is and how frequently it is accessed.

The dynamic method presupposes that the data is stored in formalized form, for example in a database.

Static documents can also be produced from data stored in formalized form: they are generated from document templates into which the changes are inserted. Any reporting tool available in the database management system (DBMS) used to process the data can serve to generate such static documents.

Prospects for HTML

After HTML 4.01, further development of the language took the form of XHTML (Extensible Hypertext Markup Language) rather than a new version of HTML. XHTML is comparable to HTML in capabilities but has stricter syntax requirements. HTML is an application of SGML, whereas XHTML conforms to the XML specification (XML is itself a subset of SGML). XHTML 1.0 was approved as a Recommendation by the World Wide Web Consortium (W3C) on January 26, 2000. One important detail must be kept in mind, however: a very large number of information resources have been created in HTML, and web browsers will continue to "understand" them and use them in their original form for a long time. In addition, all new formats will be developed (and are already being developed, XML for example) with support for HTML technologies.
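To illustrate what the stricter syntax means in practice, here is a small hedged comparison. The first fragment is the kind of markup HTML 4 browsers accept; the second writes the same content to XHTML 1.0 rules (the Transitional variant is assumed here, since it still permits the align attribute). The text content is invented for the example.

```html
<!-- Accepted by HTML 4 browsers, but not well-formed XHTML:
     uppercase names, an unquoted attribute, unclosed P and BR. -->
<P ALIGN=center>First line<BR>
Second line

<!-- The same content written to XHTML 1.0 rules:
     lowercase names, quoted attribute values, every element closed. -->
<p align="center">First line<br />
Second line</p>
```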

The way we work is changing, and so are the means of accessing content. HTML was originally created as a platform-independent language. New technologies are being adopted almost everywhere, and quite soon the World Wide Web will no longer belong only to users of desktop personal computers. Already some users rely on voice browsers for the blind or on browsers that output Braille, and content is often displayed not on a computer monitor but on a television connected to a networked set-top box, on a teletype, or on the monochrome display of a pager or organizer.

The IETF (Internet Engineering Task Force) published a draft proposal for an HTML standard.

HTML Document Structure

An HTML 4 document consists of three parts:

  • a line containing HTML version information,
  • a declarative header section (bounded by the HEAD element),
  • the body, which contains the document's actual content.

The body can be implemented by the BODY element or the FRAMESET element. White space (spaces, newlines, tabs) and comments may appear before or after each section.

Simple page

Hello world!
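The markup of the listing above did not survive; only its visible text did. Judging by the description that follows, it corresponds to a document along the lines of the sketch below. The HTML 4.01 Strict doctype is an assumption (any HTML 4 doctype would fit the description); the title and heading text come from the listing itself.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <!-- Document metadata; the title is shown in the browser's title bar -->
    <title>Simple page</title>
  </head>
  <body>
    <!-- The actual page content: a single first-level heading -->
    <h1>Hello world!</h1>
  </body>
</html>
```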

The document begins with a document type declaration, or doctype. It states which type of HTML is used, so that the user's client application can determine how to interpret the document and check whether it follows the rules it claims to follow.

After this, you can see the opening tag of the html element. This is a wrapper around the entire document. The closing html tag is the last object in any HTML document.

Inside the html element there is a head element. It contains information about the document (metadata). Inside head is a title element, which defines the "Simple page" title shown in the browser's title bar.

After the head element comes the body element, the wrapper that contains the actual content of the page; in this case, just a first-level heading element (h1) containing the text "Hello world!".

Elements often contain other elements. The body of a document almost always contains many nested elements.

Page sections create the overall structure of the document, and can contain subsections. They can also contain headings, paragraphs, lists, etc. Paragraphs can contain elements that create links to other elements, quotes, highlights, etc.

HTML element syntax

A basic element in HTML consists of two tags surrounding a block of text. There are elements that do not wrap text, and in almost every case elements can contain subelements (just as html contains head and body in the example above).

Elements may also have attributes, which can modify the behavior of the element or give it additional meaning.

Basics HTML
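The tags of this example were lost, leaving only its visible text. A reconstruction consistent with the description that follows might look like this; the exact markup of the original is an assumption.

```html
<div id="masthead">
  <!-- h1: first-level heading; the title attribute of abbr expands the abbreviation -->
  <h1>Basics <abbr title="Hypertext Markup Language">HTML</abbr></h1>
</div>
```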

In this example, the div element (a division of the page, the means by which documents are broken into logical blocks) has an id attribute set to masthead. The div element contains an h1 element (the first, or most important, level of heading), which in turn contains some text. Part of this text is wrapped in an abbr element (used to mark up an abbreviation and give its expansion), which has a title attribute whose value is Hypertext Markup Language.

Many attributes in HTML are common to all elements, but some are specific to a given element or elements. They all have the form:

keyword="value"

The value must be placed in single or double quotes (in some situations the quotes may be omitted, but doing so makes the markup less predictable and harder to read).

Attributes and their possible values are defined primarily by the HTML specifications (http://www.w3.org/TR/html401/index/attributes.html), so you cannot create your own attributes. The only real exceptions are the id and class attributes, whose values are entirely yours to choose and are intended to add your own meaning and semantics to documents.

An element inside another element is called a "child" of that element. In the example above, abbr is a child of h1, which in turn is a child of div. Conversely, div is the "parent" of the h1 element.

Block-level elements and inline elements

There are two main categories of elements in HTML, which correspond to the types of content and the structure that these elements represent - block level elements and inline elements.

Block-level elements are higher-level elements that usually convey the structure of the document. They can be thought of as elements that start on a new line, breaking away from what came before. Common block-level elements are paragraphs, list items, headings and tables.

Inline elements are contained within block-level structural elements and cover only parts of the document text, not entire areas. An inline element does not start a new line in the document; it appears within a paragraph of text. Common inline elements are hypertext links, emphasized words or phrases and short quotations.
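A short sketch may make the distinction concrete. In the fragment below the heading, paragraph and list item are block-level, while the link, the emphasized phrase and the quotation are inline and flow within the paragraph. The text and the #top target are invented for the example.

```html
<!-- Block-level elements: each starts on a new line -->
<h2>Block-level heading</h2>
<p>
  A paragraph is block-level, but it can contain inline elements such as
  <a href="#top">a hypertext link</a>, <em>an emphasized phrase</em> and
  <q>a short quotation</q>, none of which starts a new line.
</p>
<ul>
  <li>A list item is block-level too.</li>
</ul>
```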

Document head

The head of an HTML document is an optional markup element. Originally, the head existed because the browser window needed a name. This was achieved with the TITLE markup element:

<TITLE>This is the title</TITLE> ... ...

Another function of the HTML document head is to control HTTP exchange by means of the META markup element. Company Web sites are now commonly hosted on providers' servers, so the administrators of these sites may be unable to configure the server program. In that case the only remaining way to control the exchange is through the head of the HTML document.

The head of an HTML document is also intended to describe the document's search profile, which search engine robots need in order to index it. The META element can hold lists of keywords and a document description; these are used to build the search engine index and are shown as the description of the document when a link to it is returned by a keyword search.
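As an illustration of the keyword and description metadata just mentioned, a head section might be sketched as follows. The keyword list and description text are hypothetical, and how much weight a given search engine gives them is not something this example can promise.

```html
<head>
  <title>Document title</title>
  <!-- Keywords a search robot may add to its index (hypothetical values) -->
  <meta name="keywords" content="HTML, hypertext, markup language">
  <!-- Short text a search engine may show next to the link (hypothetical value) -->
  <meta name="description" content="An introduction to the HTML markup language.">
</head>
```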

Basic header tags are the HTML markup elements most often found in the head of an HTML document, that is, inside the HEAD markup element:

  • TITLE (document title);
  • BASE (URL base);
  • ISINDEX (search pattern);
  • META (meta information);
  • LINK (general links);
  • STYLE (style descriptors);
  • SCRIPT (scripts).

The most commonly used elements are TITLE, SCRIPT, STYLE. The use of the META element indicates the author's awareness of the rules for indexing documents in search engines and the ability to manage HTTP data exchange. BASE and ISINDEX have hardly been used lately. LINK is specified only when using style sheet descriptors external to the document.

The HEAD markup element contains the head of the HTML document. The element is optional. If its start tag is present, it is advisable to use its end tag as well. By default, the HEAD element is closed when the start tag of either the BODY container or the FRAMESET container is encountered.

The header container is used to contain information related to the document as a whole.

The TITLE markup element is used to name a document on the World Wide Web. When choosing the text for the TITLE container, bear in mind that it is displayed in the system font, since it serves as the title of the browser window.

The general syntax of the TITLE container is as follows:

<TITLE>document's name</TITLE>

Browsers do not insist on the TITLE container, so it can be omitted in practice, although the HTML 4 specification requires it. The robots of many search engines use the contents of the TITLE element to create a search profile of the document, and words from TITLE are included in the search engine index. For these reasons, it is always recommended to use the TITLE element on Web site pages.

The BASE markup element defines the base URL for the document's hypertext links that are specified in incomplete (relative) form. In addition, BASE lets you define the default target window in which documents open when a hypertext link in the current document is followed. BASE is most often found on pages of sites that have “mirrors”. Some documents from the main server are not transferred to the “mirror” server for various reasons. In this case, a document with a forced base URL will always link to the main server.

The container start tag contains one required HREF attribute, and can contain one optional TARGET attribute. The general syntax of a BASE container is as follows:
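The syntax listing is missing at this point. As a hedged sketch based on the HTML 4 definition of BASE (an empty element, so it has no end tag), with a placeholder URL and a hypothetical window name:

```html
<head>
  <title>Mirrored page</title>
  <!-- HREF: base URL for relative links; TARGET: default window (optional) -->
  <base href="http://www.example.com/docs/" target="main">
</head>
<body>
  <!-- Resolves to http://www.example.com/docs/intro.html and opens in "main" -->
  <a href="intro.html">Introduction</a>
</body>
```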

The ISINDEX markup element is used to specify a search pattern and is inherited from earlier versions of HTML. In HTML 4 it is deprecated.

META markup element

META contains control information that the browser uses to display and process the body of the document correctly. For example, an http-equiv attribute set to Content-Type lets you specify the character encoding that the client should use for the document.
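A minimal example of such a declaration is shown below. The charset value is only an example; the right value is whatever encoding the file is actually saved in.

```html
<!-- Tells the browser the media type and character encoding of the document -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
```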

You can also specify other HTTP header fields using META, for example to disable caching of the document. To do so, just insert a META tag like this into the head:
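The original tag is missing here. Since the text that follows refers to Pragma, the tag was presumably of the following form; treat it as a sketch, and note that browsers and proxies differ in how far they honor cache directives given through META rather than through real HTTP headers.

```html
<!-- Asks HTTP/1.0 clients not to cache the document -->
<meta http-equiv="Pragma" content="no-cache">
```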

In the newer version of the HTTP protocol (HTTP 1.1), caching is controlled through the Cache-Control header field. To obtain the same result as with Pragma, it is enough to specify the following in the head of the HTML document:

You can also prohibit storing a document after it has been transferred.
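The tag itself did not survive extraction. Assuming the usual no-cache directive of HTTP 1.1, it would look like this:

```html
<!-- HTTP/1.1 counterpart of Pragma: no-cache -->
<meta http-equiv="Cache-Control" content="no-cache">
```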

I. Basic information about HTML.

In recent years, developments for the Internet have evolved from static pages to dynamic information systems. Some time ago, creating modern Web pages required little more than a perfect command of Hypertext Markup Language (HTML).

HTML is a simple text markup language; a set of tags is used to create a document that can be viewed with a special Web viewer (a browser).

HTML is not a programming language in the same sense as C++ or Visual Basic; it is more like a document formatter using escape sequences. HTML coding is often compared to creating a Microsoft Word document by typing formatting codes directly into Notepad. Obviously, this has very little functionality.

A hypertext document is a document that contains links to other documents. All of this is implemented through the Hypertext Transfer Protocol (HTTP).

Information in Web documents can be found using keywords. This means that each Web document contains specific references, which form the hyperlinks that let millions of Internet users search for information around the world.

Hypertext documents are created with HTML (HyperText Markup Language). The language is very simple; its control codes, which the browser interprets in order to display the document on the screen, consist of ASCII text. Links, lists, headings, pictures and forms are called elements of the HTML language.

Currently, there are a lot of Web page editors that do not require you to know the basics of HTML. But in order to be able to professionally prepare hypertext documents, you must know their internal structure, that is, the HTML document code.

HTML allows you to generate various hypertext information based on structured documents.

The browser identifies the generated links and, through the HyperText Transfer Protocol (HTTP), makes your document available to other Internet users. Of course, to successfully implement all this, you need software that is fully compatible with the WWW and supports HTML.

II. HTML Description

An HTML document is a regular text file. You can view the result of your work in any Web browser by simply loading into it a text file created using HTML syntax.

The hypertext language presents information as read-only. This means that Web pages can be edited only by the person who created them, not by an ordinary Internet user.

The most important element of the hypertext language is the link. On the World Wide Web, you simply click on a link and instantly find yourself at another point in the world, on the page of your choice.

Tags.

A tag is a formatted unit of HTML code.

An HTML tag consists of the following elements, in this order:

  • a left angle bracket < (the same as the "less than" symbol)
  • an optional slash /, which means the tag is an end tag that closes some structure; in this context you can read the / symbol as "end of ..."
  • tag name, such as TITLE or PRE
  • attributes, which are optional even when the tag allows them; a tag may have no attributes or be accompanied by one or more, for example ALIGN=CENTER
  • a right angle bracket > (the same as the "greater than" symbol).
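The parts listed above can be seen in a short sketch. It uses the PRE tag name and the ALIGN=CENTER attribute mentioned in the list; the text content is invented for the example.

```html
<!-- Start tag: "<", tag name PRE, ">" -->
<PRE>
preformatted   text
</PRE>
<!-- End tag above: "<", slash, tag name, ">" -->

<!-- Start tag with one attribute (name ALIGN, value CENTER) -->
<P ALIGN=CENTER>Centered paragraph</P>
```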

Most tags come in pairs: an opening tag and a closing tag. Between them are the codes that the Web browser recognizes.

In such cases the two tags and the part of the document they enclose form a block called an HTML element. Some tags are HTML elements in themselves, and for them a corresponding end tag is not allowed.

A set of possible attributes is defined for each tag. Most tags allow one or more attributes, though a tag may also have none at all. An attribute specification consists of the following:

  • attribute name, such as WIDTH
  • equal sign (=)
  • attribute value, which is specified by a character string, for example, "80".

It is always good practice to enclose the attribute value in quotes, using either single quotes ('80') or double quotes ("80"). A quoted string must not contain the kind of quote that encloses it. So, if the value is enclosed in double quotes, use single quotes inside it, and vice versa. You can also omit the quotes for attribute values that consist only of the following characters:

  • English alphabet characters (A - Z, a - z)
  • digits (0 - 9)
  • periods (.)
  • hyphens (-)
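The quoting rules above can be sketched as follows. The WIDTH value "80" comes from the text; the other attribute values are invented to show each case.

```html
<!-- Quoted values: double or single quotes are both accepted -->
<TABLE WIDTH="80" BORDER='1'>
  <!-- Single quotes inside a double-quoted value -->
  <TR><TD TITLE="the 'first' cell">one</TD></TR>
  <!-- Double quotes inside a single-quoted value -->
  <TR><TD TITLE='the "second" cell'>two</TD></TR>
</TABLE>

<!-- Unquoted values: allowed because they use only letters, digits, periods, hyphens -->
<P ALIGN=center CLASS=note-1.a>Unquoted attribute values</P>
```

Quoting every value, as in the first table, avoids having to remember which characters are safe to leave unquoted.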