Character encoding
Character encoding is a method of representing the characters and symbols of written language in digital form. An encoding maps each character in a specific character set to a numerical representation that computers can store and process.
For example, ASCII (American Standard Code for Information Interchange) maps 128 characters, including the letters of the English alphabet, digits, punctuation marks, and control characters, to unique numbers between 0 and 127. The Unicode standard is a superset of ASCII: its first 128 code points match ASCII, and it adds a much larger character set covering many different writing systems and symbols used worldwide. In Unicode, each character is assigned a unique number, known as a code point.
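As a rough illustration of the character-to-number mapping, the Python sketch below uses the built-in `ord` and `chr` functions to look up code points. The helper name `code_point_label` and the sample characters are chosen here for illustration only.

```python
def code_point_label(ch: str) -> str:
    """Return the conventional U+XXXX label for a single character."""
    return f"U+{ord(ch):04X}"

# Two characters inside the ASCII range and two outside it.
samples = ["A", "z", "ß", "€"]
labels = {ch: code_point_label(ch) for ch in samples}

for ch, label in labels.items():
    # Code points below 128 coincide with the ASCII values.
    in_ascii = ord(ch) < 128
    print(ch, ord(ch), label, "ASCII" if in_ascii else "outside ASCII")

# chr() is the inverse of ord(): it maps a code point back to its character.
assert chr(ord("A")) == "A"
```

Python strings are sequences of Unicode code points, so `ord` returns the code point directly, without reference to any particular byte encoding.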
For example, the character ‘A’ is represented in ASCII as the number 65 and has the same value in Unicode, where the code point is written as U+0041 (41 in hexadecimal equals 65 in decimal). When a document or text file is saved, its characters are encoded using a specific character encoding, and the resulting bytes are stored on a computer or transmitted over a network. When the file is later read, those bytes are decoded back into the original characters.
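The encode-then-decode round trip described above can be sketched in Python with `str.encode` and `bytes.decode`. The choice of UTF-8 as the second encoding and the euro sign as a sample character are assumptions made for this example, not something the text prescribes.

```python
text = "A"

# Encoding turns characters into bytes; 'A' is the single byte 65 in ASCII.
ascii_bytes = text.encode("ascii")
# UTF-8 encodes ASCII-range characters as the same single byte.
utf8_bytes = text.encode("utf-8")

# A character outside the ASCII range needs more than one byte in UTF-8.
euro = "€"
euro_bytes = euro.encode("utf-8")

# Decoding with the same encoding recovers the original character.
decoded = euro_bytes.decode("utf-8")

# ASCII has no number for this character, so encoding it fails.
try:
    euro.encode("ascii")
    ascii_can_encode_euro = True
except UnicodeEncodeError:
    ascii_can_encode_euro = False

print(list(ascii_bytes), list(utf8_bytes), euro_bytes.hex(), decoded)
```

The round trip only recovers the original text when the reader decodes with the same encoding the writer used.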