What a character means in ASCII. ASCII encoding (American Standard Code for Information Interchange): the basic text encoding for the Latin alphabet

A computer handles text by encoding it, that is, by converting it into a form that is more convenient to transmit, store or process automatically. Various tables are used for this purpose. ASCII is one such table: it was developed in the United States for working with English-language text and subsequently became widespread throughout the world. This article describes ASCII, its features and properties, and how it is used today.

Display and storage of information in a computer

Characters on a computer monitor or a mobile device are drawn from sets of vector shapes (fonts) and a code that identifies which character should be inserted in a given place. This code is a sequence of bits. Each character must therefore correspond uniquely to a set of zeros and ones arranged in a specific order.

How it all started

Historically, the first computers worked with English only. To encode character information in them, 7 bits were enough, although a full byte of 8 bits was allocated for each character. The number of characters the computer could distinguish was therefore 128. These characters included the English alphabet, punctuation marks, digits and some special characters. The seven-bit English-language encoding with its corresponding table (code page), developed in 1963, was called the American Standard Code for Information Interchange. The abbreviation ASCII was, and still is, commonly used to refer to it.

Transition to multilingualism

Over time, computers came into wide use in non-English-speaking countries, which created a need for encodings that support national languages. It was decided not to reinvent the wheel and to take ASCII as the basis. The new, extended encoding tables grew significantly: using the 8th bit made it possible to encode 256 characters.

Description

The extended ASCII table is divided into 2 parts. Only its first half (codes 0 to 127) is a generally accepted international standard. The whole table breaks down as follows:

  • Characters with serial numbers from 0 to 31, encoded as 00000000 to 00011111. They are reserved for control characters that control how text is output to a screen or printer, sound an audible signal, and so on.
  • Characters with serial numbers from 32 to 127, encoded as 00100000 to 01111111, which form the standard part of the table. These include the space (No. 32), the letters of the Latin alphabet (lowercase and uppercase), the ten digits from 0 to 9, punctuation marks, brackets of different styles and other symbols.
  • Characters with serial numbers from 128 to 255, encoded as 10000000 to 11111111. These include letters of national alphabets other than Latin. It is this alternative part of the table that is used to encode Russian characters.
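The three ranges above can be sketched in a few lines of Python. The function name `classify` and the labels it returns are illustrative choices, not part of any standard, and the boundaries simply follow the ranges as the article gives them.

```python
def classify(code: int) -> str:
    """Return which part of the 8-bit table a code (0-255) falls in."""
    if not 0 <= code <= 255:
        raise ValueError("code must fit in one byte")
    if code <= 31:
        return "control"    # 00000000 - 00011111
    if code <= 127:
        return "standard"   # 00100000 - 01111111 (127, DEL, is itself a control code)
    return "national"       # 10000000 - 11111111

# Print the serial number, the 8-bit code and the range for a few characters.
for ch in ["\n", " ", "A", "z"]:
    code = ord(ch)
    print(f"{code:3d}  {code:08b}  {classify(code)}")
```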

Some properties

One feature of the ASCII encoding is that the lowercase and uppercase forms of each letter from "A" to "Z" differ in only one bit. This greatly simplifies case conversion, as well as checking whether a character belongs to a given range of values. In addition, every letter is represented by its sequence number in the alphabet, written with 5 binary digits, preceded by 011 (binary) for lowercase letters and 010 (binary) for uppercase letters.
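A minimal Python sketch of this property follows. The helper names `toggle_case` and `alphabet_position` are made up for illustration; the only facts relied on are that the case bit has the value 32 and that the low five bits hold the letter's position in the alphabet.

```python
CASE_BIT = 0b0010_0000  # the single bit (value 32) that separates "A" from "a"

def toggle_case(letter: str) -> str:
    """Flip the case of a single ASCII letter by inverting one bit."""
    if len(letter) != 1 or not (letter.isascii() and letter.isalpha()):
        raise ValueError("expected a single ASCII letter")
    return chr(ord(letter) ^ CASE_BIT)

def alphabet_position(letter: str) -> int:
    """Return 1 for A/a up to 26 for Z/z, read from the low five bits."""
    return ord(letter) & 0b0001_1111

print(toggle_case("a"), toggle_case("Q"))
print(f"{ord('A'):08b} {ord('a'):08b}")
print(alphabet_position("E"))
```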

Another feature of the ASCII encoding is the representation of the 10 digits "0" to "9". In binary, each code starts with 0011 and ends with the binary value of the digit. For example, 0101 in binary is the decimal number five, so the character "5" is written as 0011 0101. This makes it easy to convert binary-coded decimal (BCD) numbers to an ASCII string: prepend the bit sequence 0011 to each nibble.
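The conversion can be sketched as follows. This assumes packed BCD input (two decimal digits per byte), which is one common layout; the function name `bcd_to_ascii` is illustrative.

```python
def bcd_to_ascii(packed: bytes) -> str:
    """Convert packed BCD (two decimal digits per byte) to an ASCII string."""
    out = bytearray()
    for byte in packed:
        for nibble in (byte >> 4, byte & 0x0F):
            if nibble > 9:
                raise ValueError("not a valid BCD digit")
            out.append(0x30 | nibble)  # 0x30 is 0011 0000: prepend 0011 to the nibble
    return out.decode("ascii")

print(bcd_to_ascii(bytes([0x19, 0x63])))  # prints 1963
```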

"Unicode"

As you know, thousands of characters are required to display texts in the languages of the Southeast Asian group. That many characters cannot be described in one byte of information, so even extended versions of ASCII could no longer meet the growing needs of users in different countries.

Thus the need arose for a universal text encoding. Its development was undertaken by the Unicode Consortium in collaboration with many leaders of the global IT industry. The most straightforward Unicode encoding is UTF-32, in which 32 bits (4 bytes) are allocated to each character. Its main disadvantage is that the memory required grows as much as fourfold compared with a one-byte encoding, which entails many problems.

At the same time, for most countries whose official languages belong to the Indo-European group, a code space of 2^32 values is far more than necessary.

As a result of further work, specialists from the Unicode Consortium introduced the UTF-16 encoding. It was a compromise between the amount of memory required and the number of characters that can be encoded: most characters take 2 bytes, and less common ones take 4 bytes (a surrogate pair). UTF-16 was adopted as the default in a number of systems.

Even this fairly advanced and successful Unicode encoding had drawbacks: after the transition from an extended version of ASCII to UTF-16, the size of a document doubled.

For this reason, the variable-length UTF-8 encoding came into use. In it, each character of the source text is encoded as a sequence of 1 to 4 bytes (the original design allowed sequences of up to 6 bytes).
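To make the size trade-off concrete, the sketch below encodes the same English sample string in several encodings and compares the byte counts. The sample text is arbitrary, and the little-endian codec variants are used only so that no byte order mark is added to the count.

```python
text = "Hello, ASCII"  # 12 characters, all from the basic Latin range

sizes = {}
for name in ("ascii", "utf-8", "utf-16-le", "utf-32-le"):
    sizes[name] = len(text.encode(name))
    print(f"{name:10s} {sizes[name]:3d} bytes")
```

For text like this, UTF-8 takes the same space as ASCII, UTF-16 twice as much and UTF-32 four times as much.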

Relationship to the American Standard Code for Information Interchange

All ASCII characters, including the basic Latin letters, are encoded in UTF-8 as 1 byte, exactly as in the ASCII encoding.

A special feature of UTF-8 is that text written in basic Latin characters only can still be read by programs that do not understand Unicode. In other words, the base ASCII text encoding simply becomes part of the new variable-length UTF. Cyrillic characters in UTF-8 occupy 2 bytes, and Georgian characters, for example, occupy 3 bytes. The creation of UTF-16 and UTF-8 solved the main problem of establishing a single code space for fonts. Since then, font manufacturers only need to fill the table with the vector shapes of the characters they need.
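These byte counts are easy to check with Python's built-in codecs. The three sample characters below are arbitrary picks from each script.

```python
samples = {"Latin": "A", "Cyrillic": "Я", "Georgian": "ა"}

lengths = {}
for script, ch in samples.items():
    encoded = ch.encode("utf-8")
    lengths[script] = len(encoded)
    print(f"{script:9s} U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")

# Text limited to ASCII characters is byte-for-byte identical in both encodings.
same_bytes = "plain text".encode("ascii") == "plain text".encode("utf-8")
print(same_bytes)
```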

Different operating systems prefer different encodings. To read and edit text typed in a different encoding, text conversion programs are used. Some text editors contain built-in transcoders and let you read text regardless of its encoding.

Now you know how many characters the ASCII encoding contains and how and why it was developed. Today, of course, the Unicode standard is the most widespread in the world. However, it is built on ASCII, so the contribution of ASCII's developers to the IT field should be appreciated.

The set of characters with which text is written is called an alphabet.

The number of characters in an alphabet is called its power (size).

Formula relating the power of an alphabet to the information weight of a character: N = 2^b,

where N is the power of the alphabet (number of characters),

b is the number of bits (the information weight of one character).

An alphabet with 256 characters can accommodate almost all the necessary characters. Such an alphabet is called sufficient.

Since 256 = 2^8, the weight of 1 character is 8 bits.

The unit of 8 bits was given the name byte:

1 byte = 8 bits.

In such an encoding, the binary code of each character in computer text takes up 1 byte of memory.
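The formula N = 2^b can be turned around to find the information weight of a character for a given alphabet size. A small sketch, with the hypothetical helper name `bits_per_character`:

```python
def bits_per_character(alphabet_size: int) -> int:
    """Smallest b such that 2**b >= alphabet_size (for N >= 2)."""
    if alphabet_size < 2:
        raise ValueError("an alphabet needs at least two characters")
    # (N - 1).bit_length() is an exact integer version of ceil(log2(N)).
    return (alphabet_size - 1).bit_length()

for n in (128, 256, 65536):
    print(n, "characters ->", bits_per_character(n), "bits")
```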

How is text information represented in computer memory?

The convenience of byte-by-byte character encoding is obvious: a byte is the smallest addressable unit of memory, so the processor can access each character separately when processing text. At the same time, 256 characters is enough to represent a wide variety of symbolic information.

The question now is which eight-bit binary code to assign to each character.

Clearly this is a matter of convention; many encoding methods can be devised.

All characters of the computer alphabet are numbered from 0 to 255. Each number corresponds to an eight-bit binary code from 00000000 to 11111111. This code is simply the serial number of the character written in the binary number system.

A table that assigns a serial number to every character of the computer alphabet is called an encoding table.

Different types of computers use different encoding tables.

The ASCII table (pronounced "ask-ee"), the American Standard Code for Information Interchange, has become the international standard for PCs.

The ASCII code table is divided into two parts.

Only the first half of the table is the international standard, i.e. the characters with numbers from 0 (00000000) to 127 (01111111).

ASCII encoding table structure

Serial number   Code                   Symbol
0 - 31          00000000 - 00011111    Control symbols
32 - 127        00100000 - 01111111    Standard part of the table (English)
128 - 255       10000000 - 11111111    Alternative part of the table (Russian)

0 - 31: Symbols with numbers from 0 to 31 are usually called control symbols. Their function is to control the process of displaying text on the screen or printing it, sounding an audible signal, marking up text, and so on.

32 - 127: The standard part of the table (English). This includes the lowercase and capital letters of the Latin alphabet, the decimal digits, punctuation marks, all kinds of brackets, commercial and other symbols. Character 32 is a space, i.e. an empty position in the text. All the others are displayed as visible signs.

128 - 255: The alternative part of the table (Russian). The second half of the ASCII code table, called a code page (128 codes, from 10000000 to 11111111), can have various versions, and each version has its own number. The code page is primarily used to accommodate national alphabets other than Latin. In Russian national encodings, the characters of the Russian alphabet are placed in this part of the table.

First half of the ASCII code table

Note that in the encoding table the letters (uppercase and lowercase) are arranged in alphabetical order, and the digits are arranged in ascending order. This preservation of lexicographic order in the arrangement of characters is called the principle of sequential coding of the alphabet.

For letters of the Russian alphabet, the principle of sequential coding is also observed.
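The principle of sequential coding can be checked directly. The sketch below assumes the CP1251 code page for the Russian letters (any of the Cyrillic code pages could be substituted) and uses arbitrary sample words.

```python
import string

# Codes of A..Z form an unbroken run, so each letter is the previous code + 1.
latin_codes = [ord(c) for c in string.ascii_uppercase]
is_sequential = latin_codes == list(range(ord("A"), ord("Z") + 1))
print(is_sequential)

# Because codes follow alphabetical order, comparing codes sorts words.
words = ["pear", "apple", "fig"]
print(sorted(words))

# The same holds for Russian letters in CP1251 (the letter Ё sits outside the run).
russian_codes = list("АБВГД".encode("cp1251"))
print(russian_codes)
```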

Second half of the ASCII code table

Unfortunately, there are currently five different Cyrillic encodings (KOI8-R, Windows, MS-DOS, Macintosh and ISO). Because of this, problems often arise when transferring Russian text from one computer to another or from one software system to another.

Chronologically, one of the first standards for encoding Russian letters on computers was KOI8 ("Code for Information Exchange, 8-bit"). This encoding was used as early as the 1970s on computers of the ES series, and from the mid-1980s it was used in the first Russified versions of the UNIX operating system.

The CP866 encoding ("CP" stands for "Code Page") dates from the early 1990s, when the MS-DOS operating system was dominant.

Apple computers running the Mac OS operating system use their own Mac encoding.

In addition, the International Organization for Standardization (ISO) has approved another encoding, ISO 8859-5, as a standard for the Russian language.

The most common of these at present is the Microsoft Windows encoding, abbreviated CP1251.

Since the late 1990s, the problem of standardizing character encoding has been addressed by a new international standard called Unicode. In its original form it is a 16-bit encoding, i.e. it allocates 2 bytes of memory to each character. This doubles the amount of memory occupied, but such a code table can include up to 65,536 characters. Later versions of the standard extended the code space beyond this limit. The complete specification of the Unicode standard includes all the existing, extinct and artificially created alphabets of the world, as well as many mathematical, musical, chemical and other symbols.

Let's use the ASCII table to picture what words look like in the computer's memory.

Internal representation of words in computer memory
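As a rough illustration of this internal representation, the sketch below prints the serial number and the eight-bit code of each letter of a word. The word "file" and the helper name `memory_view` are arbitrary choices for the example.

```python
def memory_view(word: str) -> list:
    """Return the 8-bit binary code of each character of an ASCII word."""
    return [f"{byte:08b}" for byte in word.encode("ascii")]

for ch, bits in zip("file", memory_view("file")):
    print(ch, ord(ch), bits)
```

Each character occupies one byte, so the four-letter word takes four bytes of memory.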

Sometimes a text consisting of letters of the Russian alphabet received from another computer cannot be read: some kind of gibberish appears on the monitor screen. This happens because computers use different character encodings for the Russian language.
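The effect is easy to reproduce. The sketch below uses Python's built-in codecs for the five Cyrillic encodings; the codec names are Python's own spellings, and the sample word is arbitrary.

```python
word = "Привет"
codecs_to_try = ["koi8_r", "cp1251", "cp866", "mac_cyrillic", "iso8859_5"]

# The same word produces different bytes in every encoding.
for name in codecs_to_try:
    print(f"{name:12s} {word.encode(name).hex()}")

# Bytes written as CP1251 but read as KOI8-R come out garbled.
garbled = word.encode("cp1251").decode("koi8_r")
print(garbled)

# Decoding with the encoding that was actually used restores the text.
restored = garbled.encode("koi8_r").decode("cp1251")
print(restored)
```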

Insert an ASCII or Unicode character into a document

If you only need to enter a few special characters or symbols, you can use keyboard shortcuts. For a list of ASCII characters, see the following tables or the article Inserting letters from national alphabets using keyboard shortcuts.

Inserting ASCII characters

To insert an ASCII character, press and hold the ALT key while typing the character code. For example, to insert a degree symbol (°), press and hold the ALT key, then type 0176 on the numeric keypad.

To type the digits, use the numeric keypad rather than the number keys on the main keyboard, and make sure the NUM LOCK indicator is on.

Inserting Unicode Characters

To insert a Unicode character, type the character code, then press ALT and X. For example, to insert a dollar symbol ($), type 0024 and then press ALT+X.

Important: Some Microsoft Office programs, such as PowerPoint and InfoPath, do not support converting Unicode codes to characters. If you need to insert a Unicode character in one of these programs, use the symbol table.

Notes:

    If the wrong Unicode character appears after you press ALT+X, select the correct code, and then press ALT+X again.

    In addition, when the code immediately follows other text, type "U+" before the code. For example, if you type "1U+B5" and press ALT+X, the text "1µ" is displayed, whereas if you type "1B5" and press ALT+X, the symbol "Ƶ" is displayed.
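The same code-to-character lookup can be sketched in Python. The helper name `from_code` is illustrative. For the ALT+0176 example, the sketch assumes a Western Windows system where such codes are read from code page 1252.

```python
def from_code(hex_code: str) -> str:
    """Turn a hexadecimal Unicode code such as '0024' or 'U+B5' into a character."""
    if hex_code.upper().startswith("U+"):
        hex_code = hex_code[2:]
    return chr(int(hex_code, 16))

print(from_code("0024"), from_code("U+B5"), from_code("1B5"))

# ALT+0176: decimal code 176 looked up in code page 1252 (an assumption, see above).
degree = bytes([176]).decode("cp1252")
print(degree)
```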

Using the symbol table

The symbol table (Character Map) is a program built into Microsoft Windows that lets you view the characters available in a selected font.

Using the symbol table, you can copy individual characters or a group of characters to the clipboard and paste them into any program that can display them.

Opening the symbol table

    In Windows 10, type the word "character" in the search box on the taskbar and select the symbol table from the search results.

    In Windows 8, type the word "character" on the Start screen and select the symbol table from the search results.

    In Windows 7, click the Start button, select All Programs, Accessories, System Tools, and then click Character Map.

Characters are grouped by font. Click the font list to select the appropriate character set. To select a character, click it, then click the Select button. To insert the character, right-click the desired location in the document and select Paste.

Frequently used character codes

For a full list of characters, see the symbol table on your computer, an ASCII character code table, or the Unicode character tables organized by set.

[Tables of glyphs and their codes: currency, legal symbols, mathematical symbols, fractions, punctuation and dialect symbols, shape symbols, commonly used diacritics]

Non-printing ASCII control characters

The characters numbered 0 to 31 in the ASCII table are used to control some peripheral devices, such as printers. For example, character number 12 is the form feed/new page character. It tells the printer to move to the beginning of the next page.

Table of non-printing ASCII control characters

Decimal number  Sign                    Decimal number  Sign
0               Null                    16              Data link escape
1               Start of heading        17              Device control 1
2               Start of text           18              Device control 2
3               End of text             19              Device control 3
4               End of transmission     20              Device control 4
5               Enquiry                 21              Negative acknowledgment
6               Acknowledgment          22              Synchronous idle
7               Bell                    23              End of transmission block
8               Backspace               24              Cancel
9               Horizontal tab          25              End of medium
10              Line feed/new line      26              Substitute
11              Vertical tab            27              Escape
12              Form feed/new page      28              File separator
13              Carriage return         29              Group separator
14              Shift out               30              Record separator
15              Shift in                31              Unit separator

Dec Hex Symbol       Dec Hex Symbol
000 00  NUL          128 80  Ђ
001 01  SOH          129 81  Ѓ
002 02  STX          130 82  ‚
003 03  ETX          131 83  ѓ
004 04  EOT          132 84  „
005 05  ENQ          133 85  …
006 06  ACK          134 86  †
007 07  BEL          135 87  ‡
008 08  BS           136 88  €
009 09  TAB          137 89  ‰
010 0A  LF           138 8A  Љ
011 0B  VT           139 8B  ‹
012 0C  FF           140 8C  Њ
013 0D  CR           141 8D  Ќ
014 0E  SO           142 8E  Ћ
015 0F  SI           143 8F  Џ
016 10  DLE          144 90  ђ
017 11  DC1          145 91  ‘
018 12  DC2          146 92  ’
019 13  DC3          147 93  “
020 14  DC4          148 94  ”
021 15  NAK          149 95  •
022 16  SYN          150 96  (en dash)
023 17  ETB          151 97  (em dash)
024 18  CAN          152 98  (not used)
025 19  EM           153 99  ™
026 1A  SUB          154 9A  љ
027 1B  ESC          155 9B  ›
028 1C  FS           156 9C  њ
029 1D  GS           157 9D  ќ
030 1E  RS           158 9E  ћ
031 1F  US           159 9F  џ
032 20  SP (space)   160 A0  NBSP (non-breaking space)
033 21  !            161 A1  Ў
034 22  "            162 A2  ў
035 23  #            163 A3  Ј
036 24  $            164 A4  ¤
037 25  %            165 A5  Ґ
038 26  &            166 A6  ¦
039 27  '            167 A7  §
040 28  (            168 A8  Ё
041 29  )            169 A9  ©
042 2A  *            170 AA  Є
043 2B  +            171 AB  «
044 2C  ,            172 AC  ¬
045 2D  -            173 AD  SHY (soft hyphen)
046 2E  .            174 AE  ®
047 2F  /            175 AF  Ї
048 30  0            176 B0  °
049 31  1            177 B1  ±
050 32  2            178 B2  І
051 33  3            179 B3  і
052 34  4            180 B4  ґ
053 35  5            181 B5  µ
054 36  6            182 B6  ¶
055 37  7            183 B7  ·
056 38  8            184 B8  ё
057 39  9            185 B9  №
058 3A  :            186 BA  є
059 3B  ;            187 BB  »
060 3C  <            188 BC  ј
061 3D  =            189 BD  Ѕ
062 3E  >            190 BE  ѕ
063 3F  ?            191 BF  ї
064 40  @            192 C0  А
065 41  A            193 C1  Б
066 42  B            194 C2  В
067 43  C            195 C3  Г
068 44  D            196 C4  Д
069 45  E            197 C5  Е
070 46  F            198 C6  Ж
071 47  G            199 C7  З
072 48  H            200 C8  И
073 49  I            201 C9  Й
074 4A  J            202 CA  К
075 4B  K            203 CB  Л
076 4C  L            204 CC  М
077 4D  M            205 CD  Н
078 4E  N            206 CE  О
079 4F  O            207 CF  П
080 50  P            208 D0  Р
081 51  Q            209 D1  С
082 52  R            210 D2  Т
083 53  S            211 D3  У
084 54  T            212 D4  Ф
085 55  U            213 D5  Х
086 56  V            214 D6  Ц
087 57  W            215 D7  Ч
088 58  X            216 D8  Ш
089 59  Y            217 D9  Щ
090 5A  Z            218 DA  Ъ
091 5B  [            219 DB  Ы
092 5C  \            220 DC  Ь
093 5D  ]            221 DD  Э
094 5E  ^            222 DE  Ю
095 5F  _            223 DF  Я
096 60  `            224 E0  а
097 61  a            225 E1  б
098 62  b            226 E2  в
099 63  c            227 E3  г
100 64  d            228 E4  д
101 65  e            229 E5  е
102 66  f            230 E6  ж
103 67  g            231 E7  з
104 68  h            232 E8  и
105 69  i            233 E9  й
106 6A  j            234 EA  к
107 6B  k            235 EB  л
108 6C  l            236 EC  м
109 6D  m            237 ED  н
110 6E  n            238 EE  о
111 6F  o            239 EF  п
112 70  p            240 F0  р
113 71  q            241 F1  с
114 72  r            242 F2  т
115 73  s            243 F3  у
116 74  t            244 F4  ф
117 75  u            245 F5  х
118 76  v            246 F6  ц
119 77  w            247 F7  ч
120 78  x            248 F8  ш
121 79  y            249 F9  щ
122 7A  z            250 FA  ъ
123 7B  {            251 FB  ы
124 7C  |            252 FC  ь
125 7D  }            253 FD  э
126 7E  ~            254 FE  ю
127 7F  DEL          255 FF  я
Table of ASCII and Windows (CP1251) character codes.
Description of special (control) characters

Initially, the control characters of the ASCII table were used to support data exchange by teletype, data entry from punched tape and simple control of external devices.
Today most of the ASCII control characters no longer serve this purpose and can be used for other purposes.

Code     Description
NUL, 00  Null, empty
SOH, 01  Start of Heading
STX, 02  Start of Text
ETX, 03  End of Text
EOT, 04  End of Transmission
ENQ, 05  Enquiry, request for confirmation
ACK, 06  Acknowledgment, confirmation
BEL, 07  Bell, audible signal
BS, 08   Backspace, go back one character
TAB, 09  Tab, horizontal tab
LF, 0A   Line Feed. In most programming languages today it is written as \n
VT, 0B   Vertical Tab
FF, 0C   Form Feed, page feed, new page
CR, 0D   Carriage Return. In most programming languages today it is written as \r
SO, 0E   Shift Out, change the color of the ink ribbon in the printing device
SI, 0F   Shift In, return the color of the ink ribbon in the printing device back
DLE, 10  Data Link Escape, switching the channel to data transmission
DC1, 11  Device Control 1
DC2, 12  Device Control 2
DC3, 13  Device Control 3
DC4, 14  Device Control 4 (DC1 to DC4 are device control symbols)
NAK, 15  Negative Acknowledgment, no confirmation
SYN, 16  Synchronous Idle, synchronization symbol
ETB, 17  End of Transmission Block
CAN, 18  Cancel, cancels what was previously transmitted
EM, 19   End of Medium
SUB, 1A  Substitute. Placed in place of a symbol whose value was lost or corrupted during transmission
ESC, 1B  Escape, starts a control sequence
FS, 1C   File Separator
GS, 1D   Group Separator
RS, 1E   Record Separator
US, 1F   Unit Separator
DEL, 7F  Delete, erase the last character
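Several of these control characters have escape sequences in Python string literals, much as \n and \r do in many languages. The sketch below lists the ones Python supports directly; the dictionary name `escapes` is just for the example.

```python
escapes = {
    "BEL": "\a", "BS": "\b", "TAB": "\t", "LF": "\n",
    "VT": "\v", "FF": "\f", "CR": "\r", "ESC": "\x1b",
}

# Print the abbreviation, decimal code, hex code and the escape form.
for name, ch in escapes.items():
    print(f"{name:3s} {ord(ch):3d} 0x{ord(ch):02X} {ch!r}")
```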

To use ASCII correctly, it helps to know more about this encoding and what it can do.

What is it?

ASCII is an encoding table for the printable characters typed on a computer keyboard (see screenshot No. 1), together with some control codes, used to transmit information. In other words, the letters of the alphabet and the decimal digits are assigned corresponding codes that represent and carry the necessary information.


ASCII was developed in America, so the standard character set includes the English alphabet and the digits, for a total of 128 characters. But then a fair question arises: what should be done if a national alphabet needs to be encoded?

Other versions of the ASCII table have been developed to address such issues. For example, for languages with a different alphabet, letters of the English alphabet were either replaced or supplemented with additional characters from the national alphabet. Thus, the ASCII encoding may contain Russian letters for national use (see screenshot No. 2).

Where is the ASCII coding system used?

This coding system is needed not only for typing text on the keyboard. It is also used in graphics. For example, in the ASCII Art Maker program, images in various file formats are rendered as a range of ASCII characters (see screenshot No. 3).


As a rule, such programs can be divided into those that work as graphic editors, turning an image into text, and those that convert an image into ASCII graphics. The well-known emoticon (also called a "smiley face") is another example of characters being used this way.

This encoding method can also be used when writing or creating an HTML document. For example, you enter a specific set of characters (a character code), and when the page is viewed, the symbol corresponding to that code is displayed on the screen.
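One way to see this in practice is with HTML numeric character references, which Python's standard library can decode and produce. The sample strings below are arbitrary; this is a sketch of the idea rather than a description of how any particular browser works.

```python
import html

# Decode character codes the way a page viewer would display them.
decoded = html.unescape("&#169; &#x20AC; &amp;")
print(decoded)

# Replace every non-ASCII character with its numeric code so the text stays pure ASCII.
ascii_only = "café №5".encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_only)
```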

Among other things, this type of encoding is necessary when creating a multilingual website, because characters that are not included in a particular national table need to be replaced with ASCII codes. Readers who work directly with information and communication technologies (ICT) may find it useful to become familiar with topics such as:

  • Portable character set;
  • Control characters;
  • EBCDIC;
  • VISCII;
  • YUSCII;
  • Unicode;
  • ASCII art;
  • KOI-8.

ASCII Table Properties

Like any systematic scheme, ASCII has its own characteristic properties. For example, numbers in the decimal system (digits 0 to 9) map directly onto the binary system: each decimal digit is stored as its own four-bit binary value, so 288 becomes 0010 1000 1000.

Uppercase and lowercase letters differ from each other by only one bit, which makes checking and changing the case of a letter much simpler.

With all these properties, ASCII encoding works as an eight-bit encoding, although it was originally designed as seven-bit.

Using ASCII in Microsoft Office programs:

If necessary, this way of encoding information can be used in Microsoft Notepad and Microsoft Office Word. Within these applications a document can be saved in ASCII format, but in that case some functions will not be available when typing text.

In particular, bold and other font styles will not be available, because the encoding preserves only the content of the typed information, not its general appearance and form. You can add such codes to a document using the following software applications.