How to cope with international standards for the thousands separator (digit grouping symbol)

by William S. Statler, originally published 2005-02-22, updated 2015-07-29

UPDATE: This article is 10 years old and just a bit out-of-date! In particular, the use of Unicode character U+202F, the Narrow No-Break Space, has become more common — I noticed it used as a thousands separator just today on JAMA's website. See the note below under Impractical solutions.

The problem:

What is the "right" way to write long numbers in scientific and technical reports?

It is customary and recommendable to group digits in long numbers into groups of three digits. But the method of separating the groups depends on cultural conventions and even personal style. This typically means using spaces, commas, periods, or apostrophes as separators.

It is safest to use spaces, since the other alternatives could be misinterpreted. For example, in English 1,005 would mean one thousand and five and 1.005 would mean one and five thousandths; in French, and in several other languages, it's just the reverse!

Jukka Korpela (author of the preceding explanation) presents a nice discussion of how to handle the thousands separator in web page design. My interest here is how to handle it when writing a report using common software such as Microsoft Word and Excel, where the problem is somewhat different.

Let's review the standards first. Here is what the Bureau International des Poids et Mesures has to say about numbers in SI:

In numbers, the comma (French practice) or the dot (British practice) is used only to separate the integral part of numbers from the decimal part. Numbers may be divided in groups of three in order to facilitate reading; neither dots nor commas are ever inserted in the spaces between groups.

[Reference: BIPM's SI Brochure, 7th edition.]

ISO standard 31-0 says:

To facilitate the reading of numbers with many digits, these may be separated into suitable groups, preferably of three, counting from the decimal sign towards the left and the right; the groups should be separated by a small space, and never by a comma or a point, nor by any other means.

[ISO standards are available from ISO for a fee.]

In the US, NIST says that:

...digits should be separated into groups of three, counting from the decimal marker towards the left and right, by the use of a thin, fixed space. However, this practice is not usually followed for numbers having only four digits on either side of the decimal marker except when uniformity in a table is desired. [It is also] not usually followed in certain specialized applications, such as engineering drawings and financial statements.

[Reference: NIST's Guide for the Use of the International System of Units (SI).]

So, the thousands separator must be a space, preferably a "small" or "thin" space. It must also be a no-break space, to prevent the number from being word-wrapped across two lines of text. Ideally, it should be a single Unicode character which can be entered into a spreadsheet program as the default thousands separator, and which will also be displayed correctly in a word-processing program.
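
To make the grouping rules concrete, here is a minimal Python sketch (an illustration, not something the standards themselves supply). It assumes the number is already a plain decimal string. The function name group_digits and its parameters are made up for this example. It groups digits in threes, counting from the decimal sign in both directions, and leaves any run of four or fewer digits alone, which is one reading of the NIST guidance quoted above. The separator defaults to the ordinary No-Break Space, U+00A0, but any other space character can be passed in.

```python
NBSP = "\u00A0"  # No-Break Space


def group_digits(number_text: str, sep: str = NBSP, decimal_sign: str = ".") -> str:
    """Group the digits of a plain decimal string in threes.

    Hypothetical helper: counts from the decimal sign towards the left
    and towards the right, and leaves runs of four or fewer digits alone.
    """
    sign = ""
    if number_text and number_text[0] in "+-":
        sign, number_text = number_text[0], number_text[1:]
    integer_part, found_sign, fraction_part = number_text.partition(decimal_sign)

    def split_left(digits: str) -> str:
        # Integer part: groups are counted from the right-hand end.
        if len(digits) <= 4:
            return digits
        head = len(digits) % 3 or 3  # length of the leftmost group
        groups = [digits[:head]]
        groups += [digits[i:i + 3] for i in range(head, len(digits), 3)]
        return sep.join(groups)

    def split_right(digits: str) -> str:
        # Fractional part: groups are counted from the left-hand end.
        if len(digits) <= 4:
            return digits
        return sep.join(digits[i:i + 3] for i in range(0, len(digits), 3))

    return sign + split_left(integer_part) + found_sign + split_right(fraction_part)
```

For example, group_digits("1234567.891011") gives 1 234 567.891 011 with a No-Break Space between the groups, while group_digits("1005") comes back unchanged.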

A practical, though not entirely satisfactory, solution:

Unfortunately, as of 2005 there is no single character that fits all the above requirements and is commonly supported by existing software and fonts. So we need to improvise.

If you are simply using word-processing software to write a report, this is fairly easy to do: you can enter a non-breaking space, and reduce its width. Here's how.

The standard-width No-Break Space character, U+00A0 in Unicode terms, should be present in even your oldest fonts. You can enter it from a Windows keyboard by holding down Alt and typing 0160 on the numeric keypad. Your word processor probably has a shortcut key for it as well (in Word, it's Ctrl+Shift+Space).

In Word 97 (and I assume in later versions), character width is adjusted using the strangely-named "Scale" or "Scaling" setting. To set this manually, select Format/Font from the menu, click the Character Spacing tab, and enter a percentage-of-normal in the Scale box. 66% seems to look about right for a thousands-separator space. (If you are using another brand of software, check its documentation for how to adjust character width.)

In most software, you'll be able to automate this with a macro. Here is the macro I'm using with Word 97. It inserts a single No-Break Space at 66% of normal width. I've assigned it to the key combination "Ctrl+Space", which makes it very easy to use.

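Sub NarrowNoBreakSpace()
'
' NarrowNoBreakSpace Macro
' 66% scale non-breaking space, use as thousands separator.
'
    ' Narrow the character width at the insertion point to 66% of normal.
    Selection.Font.Scaling = 66
    ' Insert a No-Break Space (decimal 160 = U+00A0).
    Selection.InsertSymbol CharacterNumber:=160, Unicode:=True, Bias:=0
    ' Return to normal width for whatever is typed next.
    Selection.Font.Scaling = 100
End Sub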

Here are instructions from Microsoft on how to create a macro and how to assign a shortcut key in Word 2003. (The procedure is similar in earlier versions of Word.)

If you are using a spreadsheet, it can be much more difficult to get the desired narrow non-breaking space into your numbers. Microsoft Excel only recognizes one character for use as a thousands separator: the one you've selected for the entire system. To change this under Windows XP, you must open the Control Panel, select Regional and Language Options, click the "Customize" button, and set the "Digit Grouping Symbol" to be the character you want (Alt+0160). Of course, this affects every Windows program on your computer. Also, there's no practical way to make this character narrow, so you're stuck with the full-width No-Break Space. (More on this below.)

(Although Excel also lets you design custom formats for numbers, it is impractical to use this feature to emulate a thousands separator for numbers with many digits.)

If you are generating numbers in a spreadsheet and copying them into a word-processor document to create your report, you should be able to perform a "slimming exercise" on your No-Break Space using the Find-and-Replace function.

Here's how to do it in Microsoft Word 97: Select Edit/Replace and click More. Type a No-Break Space (Ctrl+Shift+Space) in the "Find what" box. In the "Replace with" box, enter another No-Break Space, then click the Format button, select Font, and on the Character Spacing tab, enter 66 (or your preferred width) in the Scale box. Click OK, then click Replace All. Note that this narrows every No-Break Space in the document, including any that are not inside numbers.

Impractical solutions:

You might think that Unicode would include a single character that could be used as a thousands separator. Unfortunately, as of 2005 and Unicode 4.0, this is not the case. There are a few options that might seem suitable, but aren't really.

For example, there is Unicode character U+202F, the Narrow No-Break Space. But this character was introduced fairly recently, and you are very unlikely to have any font on your computer that contains it. What's worse, the purpose of this character is poorly documented: apparently, it is only used in Mongolian and related languages, so even if you find this character in a font, its width may be unsuitable for use as a thousands separator. The other available non-breaking space characters are either too wide (U+2007, Figure Space) or too narrow (U+FEFF, Zero Width No-Break Space).

UPDATE: Unicode character U+202F, the Narrow No-Break Space, is more widely supported as of 2015. (Possibly this is because it was found helpful for use with certain French and Russian punctuation, so it's no longer exclusively for Mongolians!) It's still rather inconvenient to type from a standard keyboard: you'll have to consult your specific software and operating system instructions, and use the hexadecimal code 202f or the decimal code 8239. Verify that you are using a font that contains this character (i.e., try typing it and see what you get). If you expect to use Narrow No-Break Space regularly as a thousands separator, you'll probably want to assign it to an unused shortcut key such as Ctrl+Space.
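
If your numbers are in plain text rather than a word-processor document, the swap can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text already uses U+00A0 as its thousands separator, and the function name narrow_digit_separators is invented for this example. It replaces a No-Break Space only where there is a digit on both sides, so other uses of the character (for instance between a number and its unit) are left alone.

```python
import re

NBSP = chr(0x00A0)   # No-Break Space
NNBSP = chr(0x202F)  # Narrow No-Break Space (hex 202f, decimal 8239)

# A No-Break Space with an ASCII digit immediately before and after it.
_BETWEEN_DIGITS = re.compile("(?<=[0-9])" + NBSP + "(?=[0-9])")


def narrow_digit_separators(text: str) -> str:
    """Hypothetical helper: swap U+00A0 for U+202F between digits only."""
    return _BETWEEN_DIGITS.sub(NNBSP, text)
```

Whether the result looks right still depends on the font, so it is worth checking the output in the font you actually intend to use.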

There are also quite a few space characters that are about the right width but are not non-breaking (for example: U+2005, Four-Per-Em Space; U+2006, Six-Per-Em Space; or U+205F, Medium Mathematical Space). These aren't safe to use as a thousands separator, because word-wrapping might split your numbers across a line break. (In theory, you could combine one of these with U+2060, Word Joiner, which ought to prevent a line break. But as of 2005, software support for Word Joiner is very poor.)
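
One rough way to check which of these characters Unicode treats as a no-break variant of a space is to look at its decomposition in the Unicode Character Database, which Python exposes through the standard unicodedata module. The sketch below is a proxy only: it does not consult the line-breaking property itself (which unicodedata does not expose), and the function name is_no_break_space is made up for this example. Format characters such as U+2060 and U+FEFF are not spaces at all, so they are not flagged by this test.

```python
import unicodedata

# The characters discussed in this section, by code point.
CANDIDATES = [0x00A0, 0x2005, 0x2006, 0x2007, 0x202F, 0x205F, 0x2060, 0xFEFF]


def is_no_break_space(code_point: int) -> bool:
    """Hypothetical helper: True if the character database records the
    character as a no-break variant of an ordinary space."""
    return unicodedata.decomposition(chr(code_point)) == "<noBreak> 0020"


for cp in CANDIDATES:
    name = unicodedata.name(chr(cp))
    print(f"U+{cp:04X}  {name:<28}  no-break space: {is_no_break_space(cp)}")
```

By this test, U+00A0, U+2007, and U+202F are the no-break spaces, while U+2005, U+2006, and U+205F are ordinary spaces that allow a line break.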

Future solutions:

The Common Locale Data Repository (CLDR) is a new project of the Unicode Consortium that aims to provide "a general XML format for the exchange of locale information" that isn't tied to one vendor's proprietary formats. This is intended as a standard way to encode language and regional differences that affect "the formatting of numbers, dates, times, and currency values, as well as ... differences in measurement units, text sorting order, and other services."

With luck, perhaps CLDR will support the formatting needs of scientists and engineers. This will take some time (and, of course, the purchase of new CLDR-compatible software).

My personal wish is for Unicode to add a new character, called perhaps "Narrow Figure Space" or "No-Break Medium Mathematical Space", or even better, "Thin Digit Grouping Separator", specifically for this purpose. Once we have the character, and a small selection of fonts that include it, it would be quite easy to use it as a thousands separator even with old software. However, I'm not aware of any plans to introduce such a character.

So, for the near future at least, the only practical option is good old No-Break Space plus formatting to make it narrow.


Comments and corrections requested. If you are able to try out these ideas on other operating systems and software, I'd like to hear your results -- at the moment, I only have access to MS Word/Excel 97 running under Windows XP. E-mail me at billstatler(at)bentonrea(dot)com. Many thanks to Jukka Korpela for helpful comments and information.

Copyright © 2005 William S. Statler. This work is licensed under a Creative Commons License, which grants limited rights of non-commercial distribution and reuse. Please read http://creativecommons.org/licenses/by-nc/2.0/ for details. All other rights reserved.
