The ISO Latin 1 character repertoire - a description with usage notes, section 3


Detailed descriptions of the characters

SPACE

This is the well-known space character, or blank. The abbreviation SP is often used for the name of the character. The ISO 8859-1 standard defines this character formally as follows:

This character may be interpreted as a graphic character, a control character or as both. As a graphic character it has the visual representation consisting of the absence of a graphic symbol.

In different programs for processing and displaying texts, spaces in data may be handled in different ways. In particular, the inter-word gaps can be of different widths in visual presentation. In the HTML language, spaces are treated as "collapsible".
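As a rough illustration of what "collapsible" means, the following Python sketch replaces each run of white space by a single space. The function name collapse_spaces is made up for this example, and the sketch only approximates what a browser does with ordinary text; actual HTML white space handling has more rules than this.

```python
import re

def collapse_spaces(text: str) -> str:
    """Replace each run of spaces, tabs and line breaks by one space."""
    return re.sub(r"[ \t\n\r\f]+", " ", text)

sample = "inter-word   gaps \n may   vary"
print(collapse_spaces(sample))
```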

EXCLAMATION MARK  

This character (!) is basically used as a punctuation character at the end of an exclamation. It is also used in mathematics to denote a factorial (as in "5!" which denotes 1×2×3×4×5). Many other special usages exist; e.g. in the C programming language, the exclamation mark denotes a "not" operator!

Cf. inverted exclamation mark (¡).

In the orthography of some African languages, the character Latin letter retroflex click (U+01C3) is used to denote a sound, e.g. in the name "!Kung" (denoting a people in southern Africa). The glyph of the character is typically similar to exclamation mark, but in principle the two characters are distinct.

QUOTATION MARK 

This character (") is a "symmetric" quotation mark as opposed to "smart" or "asymmetric" quotation marks. That is, when this character is used to mark quotations, the opening quote is identical with the closing quote. Its glyph is "neutral" (vertical) to reflect this.

In Unicode, there are several pairs of asymmetric quotation marks, but of them, only the double angle quotation marks « and » belong to ISO Latin 1. Notice in particular that left and right double quotation marks (U+201C, U+201D) do not belong to ISO Latin 1 (although they belong to the so-called Windows character set).
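One way to check such membership claims is to try encoding a character. The sketch below assumes Python's "latin-1" codec as a stand-in for ISO 8859-1 and its "cp1252" codec as a stand-in for the Windows character set; the helper name in_latin1 is invented for the example.

```python
def in_latin1(ch: str) -> bool:
    """Return True if the character can be encoded in ISO 8859-1."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

# Neutral quotes, angle quotation marks, and the "smart" double quotes.
for ch in ['"', "'", "\u00ab", "\u00bb", "\u201c", "\u201d"]:
    print(f"U+{ord(ch):04X}", in_latin1(ch))

# The smart quotes do have positions in the Windows code page.
windows_bytes = "\u201c\u201d".encode("cp1252")
print(windows_bytes)
```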

The rules for using quotation marks vary greatly from one language to another and even within a language. But when ISO Latin 1 is used, there are not many choices: you have to live with " and ' and « and ». It is much better to use these characters for quotations, even if they are regarded as typographically inferior, than to try to "construct" smart quotes from characters which are not quotes. See general reasons for being strict about meanings of characters. For example, section Quotation Marks in A Handbook for Technical Writers and Editors (by Mary K. McCaskill) should be read with caution in this respect. Please also notice that even in English there are styles different from the one described there; for example, single quotes (to be presented using apostrophes in ISO Latin 1) might be used as normal quotes and quotation marks as inner quotes.

The Unicode standard explicitly says that APL quote is identical with the quotation mark. In addition to that, the quotation mark is used in many other programming and command languages, typically to delimit string constants. In some of such languages, a string can be delimited using either quotation marks or apostrophes with no change in meaning, whereas in some others there is a definite difference; for example, in the C language, quotation marks delimit string constants whereas apostrophes delimit character constants.

In practice, the quotation mark is also widely used as the following symbols, although they are in principle distinct from it (and each other) in Unicode:

In ASCII, the quotation mark was intended to have secondary usage as diaeresis. See notes on diacritics.

NUMBER SIGN 

In English and some other natural languages, this character (#) is sometimes used in conjunction with ordinal numbers, as in "item #42" (meaning "item number 42"). Such usage is not very common; more often, abbreviations like "nr.", "no.", "n.", or "Nº" are used instead.

In programming languages, markup languages, etc., this character has many different uses. In some of these uses, # relates to ordinal numbers (e.g. in HTML, &#n; denotes the character which occupies code position n in Unicode) while in others it might be just a separator or have some special meaning assigned to it more or less arbitrarily.
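The HTML notation can be tried out with the unescape function in Python's standard html module. This is only a small sketch; the variable names are arbitrary, and the hexadecimal form with "x" is part of HTML although not mentioned above.

```python
import html

# Code position 35 decimal (23 hexadecimal) is the number sign itself.
from_decimal = html.unescape("&#35;")
from_hex = html.unescape("&#x23;")
e_acute = html.unescape("&#233;")

print(from_decimal, from_hex, e_acute)
```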

The number sign has also been used as a surrogate for the music sharp sign (U+266F), due to some similarity in appearance.

The number sign character unambiguously occupies code position 23 hexadecimal in ISO 8859-1 and in Unicode, although the Unicode standard confusingly mentions "pound sign" as an alternative name to it. Probably "pound" here means a unit of weight, not currency. However, in ASCII that code position was primarily assigned to the pound sterling sign, and some programs and devices might reflect this in their behavior (displaying £ when the data contains #). The ASCII standard says:

The symbol £ is assigned to position 23 [hexadecimal] - -. In a situation where there is no requirement for the symbol £ the symbol # (number sign) may be used in position 23. - - The chosen allocation of [a symbol to this position] for international information exchange shall be agreed between the interested parties.

Notice that the pound sign (as a currency symbol) belongs to ISO Latin 1 as a completely independent symbol in its own code position.
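The two code positions can be inspected directly. The sketch below again assumes Python's "latin-1" codec as the implementation of ISO 8859-1; the variable names are chosen just for this example.

```python
number_sign = "#"
pound_sign = "\u00a3"  # the currency symbol, a separate character

# Code position in Unicode, and the single octet used in ISO 8859-1.
print(hex(ord(number_sign)), number_sign.encode("latin-1"))
print(hex(ord(pound_sign)), pound_sign.encode("latin-1"))
```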

For notes on different names and usages for the number sign, see section names of "&", "@", and "#" in alt.usage.english FAQ.

DOLLAR SIGN 

This character ($) is a famous currency symbol, but its exact meaning is not quite clear. The Unicode standard explicitly says:

this code is unambiguously dollar sign, not "currency" sign or any other currency symbol

But this is obviously to be interpreted mainly as a warning against the use of the sign to denote a currency generically (cf. the (general) currency sign, which belongs to ISO Latin 1 as a completely independent symbol in its own code position). It is not intended to limit the use to denote only those currencies which are named "dollar", still less US dollar only. In fact, the English word "dollar" has a rather general meaning, covering "taler" as well as numerous coins patterned after the taler (such as the Spanish peso). The Unicode standard mentions "milreis" and "escudo" as alternative names for dollar sign, so obviously the symbol can be used to denote those currencies, too.

For historical notes on the origin of the $ character, see section Origin of the dollar sign in alt.usage.english FAQ.

In computing, this character has secondary uses which may have nothing to do with any currency. It can, for example, be a character allowed in identifiers and used to signal a reserved or otherwise special identifier.

According to the Unicode standard, a glyph for the dollar sign may have one or two vertical bars.

The dollar sign unambiguously occupies code position 24 hexadecimal in ISO 8859-1 and in Unicode. However, in ASCII the situation was more vague, and some programs and devices might reflect that in their behavior (e.g. displaying ¤ when the data contains $). The ASCII standard says:

The - - symbol $ is assigned to position 24 [hexadecimal] - - Where there is no requirement for the symbol $ the symbol ¤ (currency sign) may be used in position 24. The chosen allocation of [a symbol to this position] for international information exchange shall be agreed between the interested parties.

PERCENT SIGN

This character (%) is basically used after numbers, in the meaning 'in the hundred' or 'of each hundred'. In programming languages, for example, it has very different secondary uses.

Per mille sign (U+2030) and per ten thousand sign (U+2031) do not belong to ISO Latin 1. For the per mille sign, misconceptions about this may have arisen from confusion with the so-called Windows character set.

AMPERSAND

In natural languages, this character (&) normally means just 'and'. In other contexts, it has many other uses.

APOSTROPHE

This character has mixed usage. Most commonly, it is used either as an apostrophe as in English "don't" or as a single quote. (In Unicode version 1.0, this character was named "apostrophe-quote" to reflect this.) As regards its use as a single quote, see notes on the use of the quotation mark.

In the future, as support for Unicode becomes wider, the use of this character should mostly be replaced by the use of more specific characters.

Version 2.0 of the Unicode standard says that "the preferred character for apostrophe" is the character modifier letter apostrophe (U+02BC); but an official change ("corrigendum") to the Unicode standard changes this to the following:

  • U+02BC modifier letter apostrophe is preferred where the character is to represent a modifier letter (for example, in transliterations to indicate a glottal stop). In the latter case, it is also referred to as a letter apostrophe.
  • U+2019 right single quotation mark is preferred where the character is to represent a punctuation mark, as in "We've been here before." In the latter case, U+2019 is also referred to as a punctuation apostrophe.

The Unicode standard also says that the preferred characters for opening and closing single quotation mark are U+2018 and U+2019.

The rules for using the apostrophe vary from one language to another, and even from one authority to another. For a good summary of one usage style in English, see section Apostrophe in A Handbook for Technical Writers and Editors.

Unicode defines modifier letter prime (U+02B9) and prime (U+2032) as distinct characters. The former is used mainly in linguistics to denote primary stress or palatalization (e.g. when transliterating Cyrillic soft sign). The latter is used to denote minutes or feet. When only the ISO Latin 1 character repertoire is available, the apostrophe can be used as a surrogate for those characters. It might look natural to use acute accent for some such purposes, but since the whole idea is to use a replacement due to character repertoire restrictions, it is best to use a replacement that works most widely (due to being an ASCII character).

In ASCII, the apostrophe was intended to have secondary usage as acute accent. See notes on diacritics.

LEFT PARENTHESIS

Used as an opening delimiter for parenthetic remarks in natural languages. The rules for using such vary from one language to another, and even from one authority to another. For a good summary of one usage style in English, see section Parentheses in A Handbook for Technical Writers and Editors.

In other languages, there are various uses such as an opening delimiter for a list of parameters. Called opening parenthesis in Unicode version 1.0.

RIGHT PARENTHESIS 

Used as a closing delimiter for parenthetic remarks in natural languages. In other languages, there are various uses such as a closing delimiter for a list of parameters. Called closing parenthesis in Unicode version 1.0.

ASTERISK

This character (*) has various uses, especially in programming languages, command languages, etc. Often it is used as a wildcard character which matches any string of characters.
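As one concrete instance of the wildcard usage, Python's standard fnmatch module implements shell-style patterns in which the asterisk matches any string of characters. The file names below are invented for the example.

```python
from fnmatch import fnmatchcase

names = ["notes.txt", "notes.html", "readme.txt"]

# "*.txt" matches any name that ends in ".txt".
matches = [name for name in names if fnmatchcase(name, "*.txt")]
print(matches)
```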

In many programming languages, asterisk is the multiplication symbol. When writing or quoting expressions in such languages, the asterisk must of course be preserved. On the other hand, such usage should not be extended to other contexts; in ISO Latin 1 there is a separate multiplication sign.

PLUS SIGN

The well-known plus sign, primarily used to denote addition and as a unary plus. Notice that in ISO Latin 1 the combination of plus and minus is available as a separate character, plus-minus sign (±).

COMMA

Primarily this character (,) is a punctuation symbol in natural languages. The rules for using it vary from one language to another, and even from one authority to another. For a good summary of one usage style in English, see section Comma in A Handbook for Technical Writers and Editors.

Notice that in numbers, some languages (mainly English) use comma as thousands separator (e.g. "1,234" means one thousand two hundred thirty-four) whereas in many other languages it is used as a decimal point (e.g. "1,234" means the same as "1.234" in English).
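The ambiguity means that a program cannot interpret a numeral like "1,234" without being told which convention applies. The following Python sketch makes the convention an explicit parameter. The function name parse_number is made up, and the sketch assumes that in the decimal-comma convention a full stop is the thousands separator, which holds for some languages only.

```python
from decimal import Decimal

def parse_number(text: str, decimal_comma: bool) -> Decimal:
    """Parse a numeral under an explicitly stated convention."""
    if decimal_comma:
        # Comma is the decimal separator; full stop groups thousands.
        text = text.replace(".", "").replace(",", ".")
    else:
        # Comma groups thousands; full stop is the decimal separator.
        text = text.replace(",", "")
    return Decimal(text)

print(parse_number("1,234", decimal_comma=False))
print(parse_number("1,234", decimal_comma=True))
```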

In ASCII, the comma was intended to have secondary usage as cedilla. See notes on diacritics.

HYPHEN, MINUS SIGN (HYPHEN-MINUS)

This character (-) is a dual-purpose character: it can be used as a hyphen or as a minus sign. It can usually be called "hyphen" or "minus" depending on the context, but when referred to as a character in a character repertoire, the best term is probably hyphen-minus.

The rules for using the hyphen vary from one language to another, and even from one authority to another. For a good summary of one usage style in English, see section Hyphen in A Handbook for Technical Writers and Editors.

Hyphens are also widely used as a replacement for various dashes when dashes themselves are not in the available character repertoire; see my document On the use of some MS Windows characters in HTML for the use of hyphens as a replacement for em dash and en dash.

The Unicode standard mentions "hyphen or minus sign" and "hyphus" as synonyms for this character. It is best to avoid these synonyms, since the former makes statements ambiguous and the latter is just an invented word which is hardly ever used in reality.

In situations where sufficient support for Unicode can be safely assumed, it is best to replace the hyphen-minus by the Unicode hyphen (U+2010), non-breaking hyphen (U+2011), or minus sign (U+2212), or, if hyphen-minus had been used e.g. in place of a dash symbol, by some other Unicode character such as en dash (U+2013) or em dash (U+2014).
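The code positions mentioned above can be checked against the Unicode character database, for instance with Python's standard unicodedata module. This is just a lookup sketch; it says nothing about which replacement is right in a given context.

```python
import unicodedata

candidates = ["-", "\u2010", "\u2011", "\u2212", "\u2013", "\u2014"]

# Print each code position with its official Unicode name.
for ch in candidates:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```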

Cf. soft hyphen.

FULL STOP

This character (.) is probably better known under the name "period" (which was the name used for it in Unicode version 1.0). The Unicode standard uses (in section 3.2) this character to illustrate that "a character may have a broader range of use than the most literal interpretation of its name might indicate". It says:

U+002E full stop can represent a sentence period, an abbreviation period, a decimal number separator in English, a thousands number separator in German, and so on.

The rules for using the period vary from one language to another, and even from one authority to another. For a good summary of one usage style in English, see section Period in A Handbook for Technical Writers and Editors.

There is a separate character horizontal ellipsis (U+2026) in Unicode, but it does not belong to ISO Latin 1. One may therefore wish to use three full stops (...) instead as three points of ellipsis.

SOLIDUS

This character (/) is much more widely known as "slash" (which was its name in Unicode version 1.0). It is sometimes called "virgule" or even "shilling".

Solidus is used for many different purposes, typically as a separator of some kind. For example, a date notation like 3/4 might mean the 3rd of April - or the 4th of March; in ISO 8601 notation, the solidus is used when expressing a time interval (e.g. 1998-03-04/04-03 unambiguously means 'from 4th of March to 3rd of April in 1998'). Sometimes the solidus separates alternatives, e.g. in a fill-out form, with the suggestion to strike out the inapplicable alternative(s). In natural languages, it seems to be fashionable to use it instead of the word "or", perhaps because the solidus symbol is less definite.
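To illustrate how the solidus works as a separator in the ISO 8601 interval notation, here is a minimal Python sketch. The function name parse_interval is invented, and the sketch handles only two shapes: a full end date, or an end date abbreviated to month and day as in the example above, in which case the year is taken from the start date.

```python
from datetime import date

def parse_interval(text: str) -> tuple:
    """Parse 'YYYY-MM-DD/MM-DD' or 'YYYY-MM-DD/YYYY-MM-DD'."""
    start_text, end_text = text.split("/")
    start = date.fromisoformat(start_text)
    parts = [int(part) for part in end_text.split("-")]
    if len(parts) == 2:
        # Abbreviated end: the year carries over from the start date.
        end = date(start.year, parts[0], parts[1])
    else:
        end = date(*parts)
    return start, end

start, end = parse_interval("1998-03-04/04-03")
print(start.isoformat(), "to", end.isoformat())
```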

Unicode defines fraction slash (U+2044) and division slash (U+2215) as characters distinct from solidus and from each other. (Notice that rules for using the solidus in various languages do not yet make this distinction. See e.g. section Slash in A Handbook for Technical Writers and Editors.) When only the ISO Latin 1 character repertoire is available, solidus can be used as a surrogate for fraction slash. For division slash, the division sign is perhaps preferable.

Notice that for three commonly used fractions there are separate "vulgar fraction" characters in ISO Latin 1.

DIGIT ZERO  

A digit. Definitely distinct from the letter O.

DIGIT ONE

A digit. Definitely distinct from the letter l (el). Cf. superscript one (¹).

DIGIT TWO

A digit. Cf. superscript two (²).

DIGIT THREE 

A digit. Cf. superscript three (³).

DIGIT FOUR  

A digit.

DIGIT FIVE  

A digit.

DIGIT SIX

A digit.

DIGIT SEVEN 

A digit.

DIGIT EIGHT 

A digit.

DIGIT NINE  

A digit.

COLON 

This character (:) is used as a punctuation symbol in natural and other languages. The rules for using it vary from one language to another, and even from one authority to another. For a good summary of one usage style in English, see section Colon in A Handbook for Technical Writers and Editors.

The colon is also used when presenting ratios (proportions) as in "2:3", but in Unicode the character ratio (U+2236) should be used instead in such contexts.

SEMICOLON

This character (;) is used as a punctuation symbol in natural and other languages. The rules for using it vary from one language to another, and even from one authority to another. For a good summary of one usage style in English, see section Semicolon in A Handbook for Technical Writers and Editors.

LESS-THAN SIGN 

This character (<) basically denotes a mathematical relation. It is used for some secondary purposes as well, such as an angle bracket. See also notes on using < and > as brackets.

ISO Latin 1 does not contain a "less than or equal to" character; Unicode does (U+2264). The usual workaround is to use the character pair <= (less-than sign followed by equals sign).

EQUALS SIGN 

This character (=) is used to denote equality both in mathematics (as in 2+2=4) and in other areas. It is distinct from the Unicode character identical to (U+2261).

GREATER-THAN SIGN 

This character (>) basically denotes a mathematical relation. It is used for some secondary purposes as well, such as an angle bracket. See also notes on using < and > as brackets.

ISO Latin 1 does not contain a "greater than or equal to" character; Unicode does (U+2265). The usual workaround is to use the character pair >= (greater-than sign followed by equals sign).

QUESTION MARK

This character (?) is basically used as a punctuation character at the end of a direct question. The rules for using it vary from one language to another, and even from one authority to another. For a good summary of one usage style in English, see section Question in A Handbook for Technical Writers and Editors.

Cf. inverted question mark (¿).

COMMERCIAL AT

This character (@) was originally used in English in conjunction with unit prices in the meaning 'each'; its name still reflects such usage, which is relatively rare. It has become most widely known as a separator in Internet E-mail addresses. It has other special uses, too.

In several national variants of ASCII, there is some other character in the code position of this character.

CAPITAL LETTER A  

A basic Latin letter.

CAPITAL LETTER B  

A basic Latin letter. As regards using B in place of script capital B (U+212C, denoting Bernoulli function), see notes on letterlike symbols.

CAPITAL LETTER C  

A basic Latin letter. Notice that the copyright sign (©), appearing as C in a circle, is a separate symbol. As regards using C e.g. in place of double-struck capital C (U+2102, denoting the set of complex numbers), see notes on letterlike symbols.

CAPITAL LETTER D  

A basic Latin letter.

CAPITAL LETTER E  

A basic Latin letter. As regards using E e.g. in place of script capital E (U+2130, denoting electromotive force), see notes on letterlike symbols.

CAPITAL LETTER F  

A basic Latin letter. As regards using F e.g. in place of script capital F (U+2131, denoting Fourier transform), see notes on letterlike symbols.

CAPITAL LETTER G  

A basic Latin letter.

CAPITAL LETTER H  

A basic Latin letter. As regards using H e.g. in place of script capital H (U+210B, denoting Hamiltonian function), see notes on letterlike symbols.

CAPITAL LETTER I  

A basic Latin letter. As regards using I e.g. in place of black letter capital I (U+2111, denoting imaginary part), see notes on letterlike symbols.

CAPITAL LETTER J  

A basic Latin letter.

CAPITAL LETTER K  

A basic Latin letter. Also used to denote the temperature unit kelvin. (Although Unicode also has character kelvin sign, U+212A, it is equivalent to letter K.)
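The equivalence can be observed through Unicode normalization, for example with Python's standard unicodedata module. In this small sketch, normalizing the kelvin sign yields the ordinary capital letter K.

```python
import unicodedata

kelvin_sign = "\u212a"

print(unicodedata.name(kelvin_sign))
# Normalization replaces the kelvin sign by the plain letter K.
normalized = unicodedata.normalize("NFC", kelvin_sign)
print(normalized == "K")
```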

CAPITAL LETTER L  

A basic Latin letter. Notice that the pound sign (£), historically a variant of L, is a separate symbol. As regards using L in place of script capital L (U+2112, denoting Laplace function), see notes on letterlike symbols.

CAPITAL LETTER M  

A basic Latin letter. As regards using M in place of script capital M (U+2133, denoting M-matrix), see notes on letterlike symbols.

CAPITAL LETTER N  

A basic Latin letter. As regards using N e.g. in place of double-struck capital N (U+2115, denoting the set of natural numbers), see notes on letterlike symbols.

CAPITAL LETTER O  

A basic Latin letter. Naturally, this character is distinct from the digit zero (0).

CAPITAL LETTER P  

A basic Latin letter. Notice that the sound recording copyright symbol (U+2117, appearing as P in a circle) does not belong to ISO Latin 1. As regards using P e.g. in place of script capital P (U+2118, denoting e.g. power set), see notes on letterlike symbols.

CAPITAL LETTER Q  

A basic Latin letter. As regards using Q e.g. in place of double-struck capital Q (U+211A, denoting the set of rational numbers), see notes on letterlike symbols.

CAPITAL LETTER R  

A basic Latin letter. Notice that the registered sign (®), appearing as R in a circle, is a separate symbol. As regards using R e.g. in place of black letter capital R (U+211C, denoting real part), see notes on letterlike symbols.

CAPITAL LETTER S  

A basic Latin letter.

CAPITAL LETTER T  

A basic Latin letter.

CAPITAL LETTER U  

A basic Latin letter.

CAPITAL LETTER V  

A basic Latin letter.

CAPITAL LETTER W  

A basic Latin letter.

CAPITAL LETTER X  

A basic Latin letter.

CAPITAL LETTER Y  

A basic Latin letter.

CAPITAL LETTER Z  

A basic Latin letter. As regards using Z e.g. in place of double-struck capital Z (U+2124, denoting the set of integers), see notes on letterlike symbols.

LEFT SQUARE BRACKET  

This character ([) is sometimes used as an opening delimiter for parenthetic remarks of some special kind in natural languages, especially when such remarks are nested or they present editorial insertions, corrections, and comments in quoted material and in reference citations. In other languages, there are various uses such as an opening delimiter for an array subscript list. Called opening square bracket in Unicode version 1.0.

In several national variants of ASCII, there is some other character in the code position of this character.

REVERSE SOLIDUS

This character (\) has various uses in technical contexts, e.g. as a separator in hierarchical file names. Notice, however, that in Unicode it is regarded as distinct from set minus (U+2216), which is used as an operator on sets (meaning set complement).

Called backslash in Unicode version 1.0 and very widely in actual practice.

In several national variants of ASCII, there is some other character in the code position of this character.

RIGHT SQUARE BRACKET 

This character (]) is sometimes used as a closing delimiter for special parenthetic remarks in natural languages. In other languages, there are various uses such as a closing delimiter for an array subscript list. Called closing square bracket in Unicode version 1.0.

In several national variants of ASCII, there is some other character in the code position of this character.

CIRCUMFLEX ACCENT 

This character (^) is a spacing character which basically represents a diacritic mark. As such, it has little use: it can be used in order to mention the diacritic. Due to its spacing nature, it cannot be used to construct a character with a circumflex accent (such as â).
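The difference between a spacing character and a combining one can be demonstrated with Unicode normalization. The sketch below uses Python's standard unicodedata module and the Unicode character combining circumflex accent (U+0302), which does not belong to ISO Latin 1 and is brought in here only for contrast.

```python
import unicodedata

spacing = "a" + "^"          # letter followed by the spacing circumflex
combining = "a" + "\u0302"   # letter followed by the combining circumflex

# Only the combining sequence composes into the single character â.
composed_spacing = unicodedata.normalize("NFC", spacing)
composed_combining = unicodedata.normalize("NFC", combining)
print(len(composed_spacing), len(composed_combining))
```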

In practice, circumflex accent is used for a variety of technical purposes e.g. in programming and command languages. It might, for example, be used as an exponentiation operator.

In ASCII, this character had the primary name "upward arrow head" and "circumflex accent" as secondary name only. See notes on diacritics.

Called spacing circumflex in Unicode version 1.0.

In several national variants of ASCII, there is some other character in the code position of this character.

LOW LINE 

Probably the most typical use of this character (_) is to make long identifiers more readable in programming languages. Due to their general syntax, such languages generally do not allow spaces in identifiers; but several programming languages allow underscores in identifiers. For example, one could write number_of_events in such languages.
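Python is one such language, and its str.isidentifier method can be used to test the point. This is a small sketch; the value assigned to the variable is arbitrary.

```python
# An underscore is allowed inside an identifier; a space is not.
number_of_events = 3

with_underscores = "number_of_events".isidentifier()
with_spaces = "number of events".isidentifier()
print(with_underscores, with_spaces)
```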

In plain text, e.g. in Usenet discussions, it is customary to use a low line before and after a word or phrase to denote emphasis (e.g. "this is _very_ important") due to lack of better methods.

Called spacing underscore in Unicode version 1.0. The most usual name in practice is probably just "underscore".

This is a spacing character, so it cannot be used to underline text (except through specific processing which goes beyond simple text presentation).

GRAVE ACCENT

This character (`) is a spacing character which basically represents a diacritic mark. As such, it has little use: it can be used in order to mention the diacritic. Due to its spacing nature, it cannot be used to construct a character with a grave accent (such as à).

Sometimes the grave accent is used as a single quote, especially to create the appearance of "smart" (asymmetric) quotes, using grave accent instead of an opening single quote and acute accent instead of a closing single quote. Such usage is definitely incorrect. In ISO Latin 1, the apostrophe is the only adequate surrogate for a single quote.

In different notation systems, the grave accent may have various technical uses which have nothing to do with accents. For example, in many Unix shells, the grave accent is a quoting character with a special meaning, "command substitution" (sometimes even called "grave command"!).

In several national variants of ASCII, there is some other character in the code position of this character.

SMALL LETTER a 

A basic Latin letter. Cf. feminine ordinal indicator (ª).

SMALL LETTER b 

A basic Latin letter.

SMALL LETTER c 

A basic Latin letter. Cf. cent sign (¢) and c with cedilla (ç).

SMALL LETTER d 

A basic Latin letter.

SMALL LETTER e 

A basic Latin letter. Notice that the estimated symbol (U+212E, similar in appearance to "e" but typically larger), used in European packaging, does not belong to ISO Latin 1. As regards using e e.g. in place of script small e (U+212F, "error"), see notes on letterlike symbols.

SMALL LETTER f 

A basic Latin letter.

SMALL LETTER g 

A basic Latin letter. As regards using g e.g. in place of script small g (U+210A, used as real number symbol), see notes on letterlike symbols.

SMALL LETTER h 

A basic Latin letter. The Planck constant h exists as a separate symbol (U+210E) in Unicode but it can be presented as an h in italics (e.g. using the markup <I>h</I> in HTML).

SMALL LETTER i 

A basic Latin letter.

SMALL LETTER j 

A basic Latin letter.

SMALL LETTER k 

A basic Latin letter.

SMALL LETTER l 

A basic Latin letter. As regards using l e.g. in place of script small l (U+2113, used as symbol for litre), see notes on letterlike symbols.

SMALL LETTER m 

A basic Latin letter.

SMALL LETTER n 

A basic Latin letter.

SMALL LETTER o 

A basic Latin letter. Cf. masculine ordinal indicator (º). As regards using o e.g. in place of script small o (U+2134, used to denote "order"), see notes on letterlike symbols.

The letter o has often been used as a "list bullet". However, it might be read - especially by an automatic speech generator - as a word ("oh"), and in any case the use of a letter for such a purpose is illogical. Therefore, use e.g. a hyphen-minus or an asterisk instead. Notice that in the HTML language, you can just use logical elements (UL and LI) to set up lists and leave it up to browsers to present them.

SMALL LETTER p 

A basic Latin letter.

SMALL LETTER q 

A basic Latin letter.

SMALL LETTER r 

A basic Latin letter.

SMALL LETTER s 

A basic Latin letter. Cf. sharp s (ß).

SMALL LETTER t 

A basic Latin letter.

SMALL LETTER u 

A basic Latin letter.

SMALL LETTER v 

A basic Latin letter.

SMALL LETTER w 

A basic Latin letter.

SMALL LETTER x 

A basic Latin letter.

The letter x has often been used as a multiplication sign when ASCII characters only are available. In ISO Latin 1, there is no reason to do so; use multiplication sign instead.

SMALL LETTER y 

A basic Latin letter.

SMALL LETTER z 

A basic Latin letter.

LEFT CURLY BRACKET

This character ({) is (rarely) used as an opening delimiter for parenthetic remarks in natural languages, especially when such remarks are nested. In other languages, there are various uses such as an opening delimiter for a comment or a parameter list.

Called opening curly bracket in Unicode version 1.0. In practice, the word "brace" is often used instead of "curly bracket".

In several national variants of ASCII, there is some other character in the code position of this character.

VERTICAL LINE

This character (|) is probably most typically used in formal languages (such as BNF) between alternatives, corresponding to the word "or". However, many other usages exist. In Unix shells, for example, this character is used to denote "piping" (e.g. ls | more means "execute the ls program directing its output to the more program as input").

Called vertical bar in Unicode version 1.0 and in most contexts in practice. However, the word "line" is preferable to "bar", since in Unicode there are several vertical bar symbols, and even light vertical bar (U+2758) is intended to be thicker than vertical line!

In some old fonts (and keyboards), this character appears as a broken vertical line. But notice that in ISO Latin 1, broken bar (¦) is a completely distinct character.

In several national variants of ASCII, there is some other character in the code position of this character.

RIGHT CURLY BRACKET  

This character (}) is (rarely) used as a closing delimiter for parenthetic remarks in natural languages, especially when such remarks are nested. In other languages, there are various uses such as a closing delimiter for a comment or a parameter list.

Called closing curly bracket in Unicode version 1.0. In practice, the word "brace" is often used instead of "curly bracket".

In several national variants of ASCII, there is some other character in the code position of this character.

TILDE 

This character (~) is a spacing character which basically represents a diacritic mark. As such, it has little use: it can be used in order to mention the diacritic. Due to its spacing nature, it cannot be used to construct a character with tilde (such as ã).

In practice, tilde is used for a variety of technical purposes e.g. in programming and command languages. For example, in many Unix shells ~ denotes the user's home directory. Reflecting this tradition, on many Web servers people's Web pages are named in a manner which involves the tilde character. This should be avoided, since it causes various problems.

Tilde is often used as a symbol for negation in formal logic, but for that purpose, not sign should be used instead.

Tilde is not the same as tilde operator (U+223C), which is used in meanings like 'varies with', 'is proportional to', 'is similar to', etc. Typically, the glyphs for tilde operator and tilde look rather similar but the latter is positioned higher with respect to the baseline, reflecting its nature as a diacritic.

In ASCII, this character had the primary name "overline" (and a corresponding appearance; cf. MACRON); "tilde" was a secondary name only. See notes on diacritics.

In several national variants of ASCII, there is some other character in the code position of this character.


NO-BREAK SPACE

This character is used in place of a normal space character as a "binding space", to prevent a line break between words or other expressions. The reason is that programs which process texts, even if the processing is otherwise quite simple, very often reformat the text as regards division into lines. This means that normal spaces may be replaced by line breaks. In some cases, e.g. when a statement ends with an expression like "number 7.", such processing would lead to unesthetic results. The use of no-break space instead of normal space between "number" and "7." is expected to prevent that.

The ISO 8859-1 standard says this in technical language as follows:

NO-BREAK SPACE (NBSP)

A graphic character the visual representation of which consists of the absence of a graphic symbol, for use when a line break is to be prevented in the text as presented.

In the HTML language, no-break spaces may have other meanings, too.
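As a rough illustration of the "number 7." case, the following Python sketch reformats the same sentence twice, once with a normal space and once with a no-break space. It assumes the behaviour of the textwrap module in current CPython versions, which does not treat U+00A0 as a place to break; other programs may behave differently. The sample sentence and the width of 29 are arbitrary choices for the demonstration.

```python
import textwrap

NBSP = "\u00a0"  # NO-BREAK SPACE

plain = "The answer is given in number 7."
bound = "The answer is given in number" + NBSP + "7."

# Reformat both strings into lines of at most 29 characters.
plain_lines = textwrap.wrap(plain, width=29)
bound_lines = textwrap.wrap(bound, width=29)

# Show the lines with escapes so that the no-break space is visible.
for line in plain_lines + bound_lines:
    print(ascii(line))
```

With the normal space, "7." may end up alone on the last line; with the no-break space, "number" and "7." are expected to stay together.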

INVERTED EXCLAMATION MARK

This character (¡) is used in Spanish at the beginning of an exclamation (which is terminated by a "normal" exclamation mark). Example:

¡Buenos días, señor!

CENT SIGN

This character (¢) is a currency symbol used in many countries. It is most widely known as the symbol for "cent" as one hundredth of the US dollar.

The currency unit euro will be divided into 100 cents; there seems to be no indication that the cent sign would be recommended as a symbol for cent in that meaning.

POUND SIGN

This character (£) is a currency symbol, also called "pound sterling". It is distinct from the lira sign (U+20A4), which is used as the symbol for Italian lira and Turkish lira. The pound sign has one crossbar whereas the lira sign has two.

CURRENCY SIGN

This character (¤) is a currency symbol to which no definite semantics seems to be assigned. It is used very rarely. The most natural semantics for it is probably that it is a generic currency symbol: a placeholder for actual currency symbols.

It is advisable to avoid using this character, since its code position is occupied by another character in ISO Latin 9 (alias ISO 8859-15), which will probably widely replace ISO Latin 1 at least in European usage.
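To see which code positions are affected, one can compare how each octet is decoded under the two encodings. The sketch below assumes Python's built-in codecs named "iso-8859-1" and "iso-8859-15"; the function name differing_positions is just an illustrative helper.

```python
def differing_positions(first="iso-8859-1", second="iso-8859-15"):
    """Return (code, char_in_first, char_in_second) for octets decoded differently."""
    result = []
    for code in range(0xA0, 0x100):
        raw = bytes([code])
        a, b = raw.decode(first), raw.decode(second)
        if a != b:
            result.append((code, a, b))
    return result

# Print code positions and Unicode numbers only, to avoid depending on
# what the output device can display.
for code, latin1_char, latin9_char in differing_positions():
    print(f"0x{code:02X}: U+{ord(latin1_char):04X} -> U+{ord(latin9_char):04X}")
```

The positions listed are those where a document labelled as ISO Latin 1 would be shown with a different character if interpreted as ISO Latin 9.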

YEN SIGN

This character (¥) is a currency symbol, with an alternative name "yuan", reflecting its dual use for the currencies of Japan and China. A glyph for the character may have one or two crossbars (with no difference in meaning).

BROKEN BAR 

In some old fonts (and keyboards), the vertical line character appears as a broken line. But in ISO Latin 1, broken bar (¦) is a completely distinct character. Its Unicode 1.0 name is "vertical broken bar". There seems to be no good information about the intended or actual usage of this character.

It is advisable to avoid using this character, since its code position is occupied by another character in ISO Latin 9 (alias ISO 8859-15), which will probably widely replace ISO Latin 1 at least in European usage.

PARAGRAPH SIGN, SECTION SIGN (SECTION SIGN)

This character (§) is used as a paragraph sign in some usage, especially when referring to paragraphs in laws. (For that reason, § is sometimes used to symbolize law in general.)

DIAERESIS

This character (¨) is a spacing character which basically represents a diacritic mark. As such, it has little use: it can be used in order to mention the diacritic.

The official spelling "diaeresis" conforms to British English; the American spelling "dieresis" is often used in practice. In Unicode 1.0, the name is spacing diaeresis.

The name "umlaut" or "Umlaut" is often used, especially when referring to the use of diaeresis in languages like German where it reflects a phonetic phenomenon called Umlaut. For more information on this, see a news article with subject Umlaut, ablaut, etc. by Christian Weisgerber. As regards appearance, especially when used to denote Umlaut in handwritten text, the diaeresis often takes a form which looks like a tilde or macron.

It is advisable to avoid using this character, since its code position is occupied by another character in ISO Latin 9 (alias ISO 8859-15), which will probably widely replace ISO Latin 1 at least in European usage.

COPYRIGHT SIGN

This character (©) consists of the letter C in a circle, and it means "copyright". It can be used instead of or in addition to the word "copyright", partly because the character is in principle language-neutral and universal. Example:

© 1996 - 1998 Jukka Kalervo Korpela

The example is a copyright notice which satisfies the formalities for copyright protection in some countries. In most countries, there is no such formality requirement, but a notice might still be useful; see 10 Big Myths about copyright explained.

For sound recording copyright, a different symbol (P within a circle) is used; it does not belong to ISO Latin 1 but belongs to Unicode as U+2117.

FEMININE ORDINAL INDICATOR

This character (ª) looks like the letter "a" used as a superscript, often underlined. It is used in Spanish when denoting the feminine ending (-a) of an ordinal number, e.g. in "1ª", read "primera". Cf. to masculine ordinal indicator (º).

LEFT ANGLE QUOTATION MARK (LEFT-POINTING DOUBLE ANGLE QUOTATION MARK)

This character («) is a quotation mark which is usually used as an opening quotation mark, sometimes as closing.

Angle quotation marks, namely this character and right-pointing double angle quotation mark (»), are often called guillemets. They are mainly used in books. They are used in either "symmetric" or "asymmetric" way, i.e. the opening mark can be similar to the closing mark, or one of the marks can be the opening mark and the other the closing mark. This mainly depends on language. Some examples:

This character is not the same as the much less-than sign (U+226A); the latter, if needed when only ISO Latin 1 is available, can be simulated using two less-than signs (<<).

In Unicode 1.0, the name is left pointing guillemet.

NOT SIGN

This character (¬) denotes logical negation, or "not" operator. It is probably used mainly in sentential logic, and even there, the tilde sign is probably more often used to denote negation.

SOFT HYPHEN

The soft hyphen character, for which the abbreviation SHY is often used, seems to have different and contradictory meanings in ISO 8859-1 and in Unicode. In the former, SHY is a hyphen-like graphic character to be used at the end of line to indicate that word division has occurred; in the latter, it is also, and more essentially, an invisible hyphenation hint, a "discretionary hyphen" (which is an alternative name for the character in Unicode). There seems to be little if any support in widely used programs for soft hyphen in either meaning. Thus, you should probably just forget this character. But if you want to get even more confused with it, see my essay Soft hyphen (SHY) - a hard problem?.
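The Unicode side of this contradiction can be observed in the character properties. The following Python sketch, which relies on the standard unicodedata module, shows that Unicode data classifies soft hyphen as a format character rather than as punctuation; stripping it from text, as done in the last lines, is only one possible way of handling it and is an assumption of this example.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"
HYPHEN_MINUS = "\u002d"

# General category: "Cf" means format character, "Pd" means dash punctuation.
shy_category = unicodedata.category(SOFT_HYPHEN)
hyphen_category = unicodedata.category(HYPHEN_MINUS)
print(shy_category, hyphen_category)

# Removing the hyphenation hints before comparing or searching text.
hinted = "hy" + SOFT_HYPHEN + "phen" + SOFT_HYPHEN + "a" + SOFT_HYPHEN + "tion"
stripped = hinted.replace(SOFT_HYPHEN, "")
print(stripped)
```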

REGISTERED TRADE MARK SIGN (REGISTERED SIGN)

This character (®) consists of the letter R in a circle. Used after a name or expression, it indicates that the name or expression is a registered trade mark (at least in some country). In some countries, the law may require the acknowledgement of (registered) trade marks when mentioning product names. See Trademarks: The Official Media Guide by INTA.

Trade mark sign (letters TM in superscript style, U+2122), used for trade marks which have not been registered but established by continuous use, does not belong to ISO Latin 1. Some confusion has been caused by the fact that the trade mark sign belongs to the so-called Windows character set.

MACRON 

This character (¯) is a spacing character with a rather indefinite meaning.

The Unicode standard mentions "overline" and "APL overbar" as synonyms for this character. The latter is not problematic: it simply refers to use in the APL programming language. The former is strange, since Unicode also contains a character with the primary name overline (U+203E), in the General Punctuation block. Probably this is to be interpreted so that macron is distinct from overline. Notice that combining (nonspacing) macron and overline are also distinct characters (U+0304 and U+0305), in the Combining Diacritical Marks block; and combining overline is shown in the Unicode standard with a longer glyph than combining macron (despite the latter having the synonym "long"!), with the explicit statement that combining overline "connects on left and right".

Thus, it might seem that macron is intended to be a (spacing) diacritic mark (in addition to its special use in APL). However, in Unicode there is the separate character modifier letter macron (U+02C9), which is classified under "miscellaneous phonetic modifiers"!

In Unicode 1.0, the name is spacing macron.

As a note mostly of historical value, it needs to be remarked that in ISO 646, the primary name for tilde is "overline" and the primary glyph for it looks like an overline. Luckily, such usage seems to be rare if not nonexistent.

RING ABOVE, DEGREE SIGN (DEGREE SIGN)

This character (°) denotes degrees. It is used both for temperature degrees (e.g. 100°F, 38°C) and when expressing angles in degrees (e.g. 90° angle). Notice that when a temperature is expressed in kelvins, the degree sign is not used; the symbol of kelvin is simply K (e.g. 311 K).

Despite the name of this character in the ISO 8859-1 standard, and despite some fonts showing it as a ring above, it must not be regarded as a diacritic mark, or as anything else than the degree sign for that matter. The reason is that the Unicode standard, in addition to specifying "degree sign" as the only name for it, specifically distinguishes it from ring above (U+02DA) which is listed under "spacing clones of diacritics". It is also distinct from ring operator (U+2218).

This character is not the same as masculine ordinal indicator (º) although the glyphs for the two characters may look similar.
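Since the glyphs are easily confused, it may help to check the Unicode names of the look-alike characters mentioned above. This is a small sketch using Python's standard unicodedata module; the list of look-alikes is taken from the text of this section.

```python
import unicodedata

# Degree sign, masculine ordinal indicator, ring above, ring operator.
lookalikes = ["\u00b0", "\u00ba", "\u02da", "\u2218"]

names = {f"U+{ord(ch):04X}": unicodedata.name(ch) for ch in lookalikes}
for code, name in names.items():
    print(code, name)
```

Of these four characters, only the first two belong to ISO Latin 1.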

PLUS-MINUS SIGN 

This character (±) means "plus or minus". It is used to refer to two quantities at the same time, as in "the solutions of the equation x²-4=0 are ±2", meaning that the solutions are +2 and -2. It is also used to indicate an interval of uncertainty in measurements and estimates, as in "according to the measurements, the weight is 42.4 ± 0.5 kg"; this means that the weight is expected to be between 42.4 - 0.5 and 42.4 + 0.5 kilograms. Typically, this does not specify absolute limits; the quantity after the ± sign is often some statistical measure like standard deviation. Yet another (informal) usage seems to be to let ± denote 'about, circa' (e.g. "he is ±50 years old"), which can be quite confusing.

In Unicode 1.0, the name is plus-or-minus sign.

SUPERSCRIPT TWO 

This character (²) is digit 2 as superscript. Alternative name: "squared". Example of use: m² (square meter). In Unicode 1.0, the name is superscript digit two.

SUPERSCRIPT THREE

This character (³) is digit 3 as superscript. Alternative name: "cubed". Example of use: m³ (cubic meter). In Unicode 1.0, the name is superscript digit three.
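Programs do not necessarily treat the superscript digits as ordinary digits. The following Python sketch, based on the standard unicodedata module, shows how the digit value can be obtained and how compatibility normalization (NFKC) replaces the superscript by a normal digit, losing the distinction between "m²" and "m2". Whether such folding is desirable depends on the application.

```python
import unicodedata

SUPERSCRIPT_TWO = "\u00b2"
SUPERSCRIPT_THREE = "\u00b3"

# The characters carry a digit value although they are not decimal digits.
values = [unicodedata.digit(ch) for ch in (SUPERSCRIPT_TWO, SUPERSCRIPT_THREE)]
is_decimal = SUPERSCRIPT_TWO.isdecimal()

# Compatibility normalization turns the superscript into a plain digit.
folded = unicodedata.normalize("NFKC", "m" + SUPERSCRIPT_TWO)

print(values, is_decimal, folded)
```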

ACUTE ACCENT

This character (´) is a spacing character which basically represents a diacritic mark. As such, it has little use: it can be used in order to mention the diacritic. Due to its spacing nature, it cannot be used to construct a character with an acute accent (such as á).

In Unicode 1.0, the name is spacing acute.

Sometimes the acute accent is used as a single quote, especially to create the appearance of "smart" (asymmetric) quotes, using grave accent instead of an opening single quote and acute accent instead of a closing single quote. Such usage is definitely incorrect. In ISO Latin 1, the apostrophe is the only adequate surrogate for a single quote.

The acute accent should not be used instead of the apostrophe in expressions like "don't" or "Jim's" or "o'clock". See also notes on the use of apostrophe (rather than acute accent) as a surrogate for various characters not in ISO Latin 1.

It is advisable to avoid using this character, since its code position is occupied by another character in ISO Latin 9 (alias ISO 8859-15), which will probably widely replace ISO Latin 1 at least in European usage.

MICRO SIGN

This character (µ) corresponds to the prefix "micro-". It is used in the metric system and, more generally, in the SI system of units to denote 'millionth of'. More exactly, it corresponds to a numeric multiplier of ten to the power -6. For example, "µm" means 'micrometer', i.e. one millionth of a metre. (The old, unsystematic name for that unit, "micron", with an old abbreviation consisting of the "µ" character only, is still used sometimes.)

Although this character is historically based on the Greek letter mu (my), it is regarded as a distinct character. The Greek letter mu has its own code position (U+03BC) in Unicode. The glyphs of micro sign and mu are similar to each other and might be identical.
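The distinction matters in practice when text is compared or searched. The sketch below, using Python's standard unicodedata module, shows that the two characters are different code points, that only compatibility normalization (NFKC) maps the micro sign to the Greek letter, and that the Greek letter cannot be represented in ISO Latin 1. The helper name fits_latin1 is made up for this example.

```python
import unicodedata

MICRO_SIGN = "\u00b5"
GREEK_SMALL_MU = "\u03bc"

def fits_latin1(text):
    """Return True if the text can be encoded in ISO 8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False

same_code_point = MICRO_SIGN == GREEK_SMALL_MU
kept = unicodedata.normalize("NFC", MICRO_SIGN)     # canonical: unchanged
folded = unicodedata.normalize("NFKC", MICRO_SIGN)  # compatibility: becomes mu

print(same_code_point, kept == MICRO_SIGN, folded == GREEK_SMALL_MU)
print(fits_latin1(MICRO_SIGN), fits_latin1(GREEK_SMALL_MU))
```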

PILCROW SIGN

This character (¶) is a section sign in some usage. Many text processing programs display paragraph breaks as ¶ when requested to "show formatting". In Unicode 1.0, the name is paragraph sign.

MIDDLE DOT

This character (·) has alternative names "Georgian comma" and "Greek middle dot". Otherwise its semantics seem vague. In any case, it is distinct from the following characters: bullet (U+2022), one dot leader (U+2024), bullet operator (U+2219), dot operator (U+22C5), hyphenation point (U+2027). None of these characters belongs to ISO Latin 1. (The bullet appears in the so-called Windows character set, but see my document On the use of some MS Windows characters in HTML.)

Notice that the middle dot character would not be even a visually good surrogate e.g. for bullet as a list bullet, since the glyph for middle dot is typically a rather small dot. In HTML authoring, there is no need for a list bullet character, since you simply present an unordered list using the UL and LI elements, leaving it to browsers to present them (using bullets or otherwise).

CEDILLA

This character (¸) is a spacing character which basically represents a diacritic mark. As such, it has little use: it can be used in order to mention the diacritic. Due to its spacing nature, it cannot be used to construct a character with a cedilla. Notice that in ISO Latin 1, the only letter with cedilla which you can use is c with cedilla (Ç and ç). There does not seem to be much secondary use for the cedilla character either. In Unicode 1.0, the name is spacing cedilla.

It is advisable to avoid using this character, since its code position is occupied by another character in ISO Latin 9 (alias ISO 8859-15), which will probably widely replace ISO Latin 1 at least in European usage.

SUPERSCRIPT ONE 

This character (¹) is digit 1 as superscript. In Unicode 1.0, the name is superscript digit one.

MASCULINE ORDINAL INDICATOR 

This character (º) looks like the letter "o" used as a superscript, often underlined. It is used in Spanish when denoting the masculine ending (-o) of an ordinal number, e.g. in "1º", read "primero". Cf. to feminine ordinal indicator (ª).

This character is definitely not superscript 0 or degree sign.

RIGHT ANGLE QUOTATION MARK (RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK)

This character (») is a quotation mark which is usually used as a closing quotation mark, sometimes as an opening one. See usage notes in the description of the left-pointing double angle quotation mark («). Cf. to quotation mark (").

This character is not the same as the much greater-than sign (U+226B); the latter, if needed when only ISO Latin 1 is available, can be simulated using two greater-than signs (>>).

In Unicode 1.0, the name is right pointing guillemet.

VULGAR FRACTION ONE QUARTER 

This character (¼) denotes "1/4" as one character. See notes on vulgar fractions. In Unicode 1.0, the name is fraction one quarter.

It is advisable to avoid using this character, since its code position is occupied by another character in ISO Latin 9 (alias ISO 8859-15), which will probably widely replace ISO Latin 1 at least in European usage.

VULGAR FRACTION ONE HALF 

This character (½) denotes "1/2" as one character. See notes on vulgar fractions. In Unicode 1.0, the name is fraction one half.

It is advisable to avoid using this character, since its code position is occupied by another character in ISO Latin 9 (alias ISO 8859-15), which will probably widely replace ISO Latin 1 at least in European usage.

VULGAR FRACTION THREE QUARTERS

This character (¾) denotes "3/4" as one character. See notes on vulgar fractions. In Unicode 1.0, the name is fraction three quarters.

It is advisable to avoid using this character, since its code position is occupied by another character in ISO Latin 9 (alias ISO 8859-15), which will probably widely replace ISO Latin 1 at least in European usage.

INVERTED QUESTION MARK  

This character (¿) is used in Spanish at the beginning of a question (which is terminated by a "normal" question mark). Example:

¿Cómo está usted?

A synonym for the character is "turned question mark".

CAPITAL LETTER A WITH GRAVE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER A WITH ACUTE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER A WITH CIRCUMFLEX ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER A WITH TILDE

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER A WITH DIAERESIS  

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER A WITH RING ABOVE 

This is a separate character composed of a basic Latin letter and a diacritic mark. It is used in Danish, Norwegian, and Swedish. It is equivalent to the angstrom sign (U+212B) used in physics; notice that the angstrom unit itself should be replaced by regular SI units: 1 Å is 0.1 nanometres.
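The equivalence with the angstrom sign is reflected in Unicode normalization, as the following Python sketch suggests. It uses the standard unicodedata module; converting the angstrom sign to the letter before encoding into ISO Latin 1 is one possible approach, not a required one.

```python
import unicodedata

ANGSTROM_SIGN = "\u212b"
A_WITH_RING = "\u00c5"

# Canonical normalization replaces the angstrom sign by the letter.
normalized = unicodedata.normalize("NFC", ANGSTROM_SIGN)

# After normalization the text fits into ISO Latin 1 as a single octet.
encoded = normalized.encode("iso-8859-1")

print(normalized == A_WITH_RING, encoded)
```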

CAPITAL DIPHTHONG A WITH E (LATIN CAPITAL LETTER AE)

This character (Æ) is a separate character which historically originated as a ligature of the basic Latin letters A and E. See usage notes in the description of the corresponding small letter, æ.

In Unicode 1.0, the name is "Latin capital ligature AE".

CAPITAL LETTER C WITH CEDILLA

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER E WITH GRAVE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER E WITH ACUTE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER E WITH CIRCUMFLEX ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER E WITH DIAERESIS

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER I WITH GRAVE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER I WITH ACUTE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER I WITH CIRCUMFLEX ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER I WITH DIAERESIS

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL ICELANDIC LETTER ETH (LATIN CAPITAL LETTER ETH)

This character (Ð) is a letter which was included into ISO Latin 1 due to its use in Icelandic and Faeroese. Although its appearance is typically that of the letter D with stroke, it is not regarded as a letter with a diacritic. It is also distinct from Latin capital letter D with stroke (U+0110), which appears in some other ISO Latin alphabets, and from Latin capital letter African D (U+0189), although these letters may all look similar.

See usage notes in the description of the corresponding small letter, ð.

In Unicode 1.0, the name is "Latin capital letter ETH".

CAPITAL LETTER N WITH TILDE

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER O WITH GRAVE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER O WITH ACUTE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER O WITH CIRCUMFLEX ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER O WITH TILDE

This is a separate character composed of a basic Latin letter and a diacritic mark. Used in Portuguese and in phonetic writing to denote nasal "o". Also used in Estonian but denoting a different (non-nasal) vowel.

CAPITAL LETTER O WITH DIAERESIS  

This is a separate character composed of a basic Latin letter and a diacritic mark.

MULTIPLICATION SIGN  

This character (×) is a mathematical symbol denoting multiplication. Examples: "2×2 makes 4", where "×" can be read as "times"; "a 5×10 metres area", where "×" can be read as "by".

CAPITAL LETTER O WITH OBLIQUE STROKE (LATIN CAPITAL LETTER O WITH STROKE)

This character (Ø) is classified as a letter. It is used e.g. in Danish. Cf. to the corresponding small letter, ø. Despite its name--which reflects its origin--it is not regarded as a letter with a diacritic mark.

This letter is not a suitable symbol for the empty set (for which there is a separate symbol in Unicode, namely U+2205).
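The statement that this letter is not regarded as a letter with a diacritic mark can be compared with Unicode data: letters with a diacritic have a decomposition into a base letter and a combining mark, whereas Ø has none. The Python sketch below lists the code positions in the range C0 to FF (hexadecimal) that lack a decomposition; the variable names are illustrative only.

```python
import unicodedata

# O with diaeresis decomposes into "O" plus a combining diaeresis.
o_diaeresis = unicodedata.decomposition("\u00d6")

# O with stroke has no decomposition at all.
o_stroke = unicodedata.decomposition("\u00d8")

# Code positions C0..FF whose characters have no decomposition.
undecomposed = [
    code for code in range(0xC0, 0x100)
    if unicodedata.decomposition(chr(code)) == ""
]

print(repr(o_diaeresis), repr(o_stroke))
print([f"{code:02X}" for code in undecomposed])
```

Besides the letters AE, eth, o with stroke, thorn and sharp s, the list contains the multiplication and division signs, which are not letters at all.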

CAPITAL LETTER U WITH GRAVE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER U WITH ACUTE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER U WITH CIRCUMFLEX ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER U WITH DIAERESIS  

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL LETTER Y WITH ACUTE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

CAPITAL ICELANDIC LETTER THORN (LATIN CAPITAL LETTER THORN)

This character (Þ) is the capital letter corresponding to the Latin small letter thorn, þ.

SMALL GERMAN LETTER SHARP s (LATIN SMALL LETTER SHARP S)

This character (ß) is a letter used in German, and it denotes an "s" sound (unvoiced). It is definitely not the Greek letter beta! A synonym for the name is "ess-zed", reflecting an assumed origin of the letter as a ligature of "s" and "z", although the origin might more properly be regarded as a ligature of "long s" and "short s".

When converting German text into uppercase, this letter is converted to the character pair "SS" (two normal "S" letters).
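As an illustration, Python's built-in string method upper() follows this rule, so uppercasing may change the length of a string and cannot be undone by lowercasing. The sample word is an arbitrary choice.

```python
word = "Stra\u00dfe"  # "Strasse" written with sharp s

upper = word.upper()        # sharp s becomes the pair "SS"
round_trip = upper.lower()  # lowercasing does not restore the sharp s

print(upper, len(word), len(upper))
print(round_trip == word)
```

Programs that assume that uppercasing preserves the number of characters may thus fail on German text.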

The use of this character will be affected (reduced, in favor of "ss") by the German orthography reform (to be carried out in 1998 - 2005).

SMALL LETTER a WITH GRAVE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

This character sometimes appears in languages which do not usually have accented letters, since they use the loanword (French preposition) "à" in a punctuation-like manner, e.g. "5 à 7" meaning '5 to 7', '5--7'.

SMALL LETTER a WITH ACUTE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER a WITH CIRCUMFLEX ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER a WITH TILDE

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER a WITH DIAERESIS  

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER a WITH RING ABOVE 

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL DIPHTHONG a WITH e (LATIN SMALL LETTER AE)

This character (æ) is a separate character which historically originated as a ligature of the basic Latin letters a and e. Cf. to the corresponding capital letter, Æ.

The word "diphthong" is misleading in this context, since the character does not necessarily, or even usually, denote a combination of vowels pronounced as a diphthong.

This character is used

In Unicode 1.0, the name is "Latin small ligature ae".

SMALL LETTER c WITH CEDILLA

This is a separate character composed of a basic Latin letter and a diacritic mark. It is used e.g. in French to denote an "s" sound. It is also used in the international phonetic alphabet by IPA to denote an unvoiced palatal fricative.

SMALL LETTER e WITH GRAVE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER e WITH ACUTE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER e WITH CIRCUMFLEX ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER e WITH DIAERESIS

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER i WITH GRAVE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark. Used in Italian and Malagasy.

SMALL LETTER i WITH ACUTE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER i WITH CIRCUMFLEX ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER i WITH DIAERESIS

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL ICELANDIC LETTER ETH (LATIN SMALL LETTER ETH)

This character (ð) is a letter which was included into ISO Latin 1 due to its use in Icelandic and Faeroese. It is also used in Old English and in the international phonetic alphabet by IPA. It denotes the voiced sound which is denoted by "th" in modern English (as in the word "the").

This character is distinct from Latin small letter d with stroke (U+0111), which appears in some other ISO Latin alphabets.

Cf. to the corresponding capital letter, Ð.

SMALL LETTER n WITH TILDE

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER o WITH GRAVE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER o WITH ACUTE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER o WITH CIRCUMFLEX ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER o WITH TILDE

This is a separate character composed of a basic Latin letter and a diacritic mark. Used in Portuguese and in phonetic writing to denote nasal "o". Also used in Estonian but denoting a different (non-nasal) vowel.

SMALL LETTER o WITH DIAERESIS  

This is a separate character composed of a basic Latin letter and a diacritic mark.

DIVISION SIGN

This character (÷) is a mathematical symbol denoting division. Its intended scope of use is somewhat unclear, but in ISO Latin 1, which lacks the Unicode character division slash (described as "generic division operator") one could probably use it as the normal division operator (as in "100÷5 makes 20"). Cf. to the discussion of slashes in the description of the solidus.

SMALL LETTER o WITH OBLIQUE STROKE (LATIN SMALL LETTER O WITH STROKE)

This character (ø) is classified as a letter. It is used e.g. in Danish and in the international phonetic alphabet by IPA. Cf. to the corresponding capital letter, Ø. Despite its name--which reflects its origin--it is not regarded as a letter with a diacritic mark.

SMALL LETTER u WITH GRAVE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER u WITH ACUTE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER u WITH CIRCUMFLEX ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER u WITH DIAERESIS  

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL LETTER y WITH ACUTE ACCENT

This is a separate character composed of a basic Latin letter and a diacritic mark.

SMALL ICELANDIC LETTER THORN (LATIN SMALL LETTER THORN)

This character (þ), originally a runic letter, was included into ISO Latin 1 due to its use in Icelandic. It is also used in Old English. It denotes the unvoiced sound which is denoted by "th" in modern English (as in the word "mouth").

The Unicode standard also mentions IPA in the usage notes for this character, but this is probably a mistake, for the following reasons: This character is not listed in the cross-reference section of the description of the IPA extensions block (which lists e.g. latin small letter eth). The International phonetic alphabet by IPA does not contain any character resembling thorn. On the other hand, there are two characters for dental fricatives, unvoiced and voiced; the latter is obviously eth while the former looks like greek small letter theta (U+03B8), which is actually listed in the cross-reference section mentioned above.

Cf. to the corresponding capital letter, Þ.

SMALL LETTER y WITH DIAERESIS

This is a separate character composed of a basic Latin letter and a diacritic mark. It is used in French in some names like L'Haÿ; the diaeresis indicates, as usual in French, that each vowel keeps its own pronunciation. Sometimes it is also (incorrectly) used in Dutch in place of ij ligature (U+0133).

Notice that the corresponding capital letter (U+0178) does not belong to ISO Latin 1. (See some related notes in my document On the use of some MS Windows characters in HTML.)
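A practical consequence is that converting text to uppercase can produce a character which cannot be stored in ISO Latin 1. The following Python sketch demonstrates this; the helper name fits_latin1 is made up for the example.

```python
def fits_latin1(text):
    """Return True if the text can be encoded in ISO 8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False

small = "\u00ff"         # small letter y with diaeresis
capital = small.upper()  # capital letter y with diaeresis, U+0178

print(f"U+{ord(capital):04X}")
print(fits_latin1(small), fits_latin1(capital))
```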

