Aug 30, 2011
tom

Does omitting the dash in utf-8 in email messages make the charset unreadable?

Question

Can omitting the dash in utf-8 in email message headers cause email clients to display text incorrectly?

Subject: Newsletter
MIME-Version: 1.0
From: <>
Reply-To: <>
Content-Type: text/plain; **charset=utf8**
Message-Id: <>
Sender: www-data <>
Date: Mon, 29 Aug 2011 12:19:37 +0200
X-SmarterMail-Spam: SPF_None

As opposed to:

Return-Path: <>
Received: from g with SMTP;
   Tue, 30 Aug 2011 17:19:03 +0200
Received: from www-data by serwis with local (Exim: PJ server v1.0 
    id <>
    for <>; Tue, 30 Aug 2011 17:18:53 +0200
To: <>
Subject: <>
From: WWW <>
MIME-Version: 1.0
Content-type: text/plain; **charset=utf-8**
Message-Id: <>
Sender: www-data <>
Date: Tue, 30 Aug 2011 17:18:53 +0200
X-SmarterMail-Spam: SPF_None

I’m asking because we’ve noticed that in some emails where the charset is given as utf8, Polish characters are unreadable.

Answer

From the Wikipedia entry on UTF-8:

The official name is “UTF-8”. All letters are upper-case, and the name is hyphenated. This spelling is used in all the documents relating to the encoding.
Alternatively, the name “utf-8” may be used by all standards conforming to the Internet Assigned Numbers Authority (IANA) list (which include CSS, HTML, XML, and HTTP headers),[15] as the declaration is case insensitive.

Other descriptions that omit the hyphen or replace it with a space, such as “utf8” or “UTF 8”, are not accepted as correct by the governing standards. Despite this, most agents such as browsers can understand them, and so standards intended to describe existing practice (such as HTML5) may effectively require their recognition.

So basically utf8 is technically incorrect (The Worst Kind of Incorrect™), and programs are under no obligation to accept it and Do The Right Thing (though many may do so out of the goodness of their hearts).
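Since the receiving client can't be relied on to be forgiving, the safer fix is on the sending side: emit the hyphenated name. As a rough illustration only, here is a minimal Python sketch that reads the declared charset from a message like the first one in the question and rewrites it. The `normalize_charset` helper and its alias table are hypothetical, made up for this example; `message_from_string`, `get_content_charset` and `set_param` are from Python's standard `email` package.

```python
from email import message_from_string

# Hypothetical alias table (an assumption for this sketch, not a standard list):
# maps hyphen-less spellings to the registered, hyphenated name.
_ALIASES = {"utf8": "utf-8", "utf 8": "utf-8", "utf_8": "utf-8"}


def normalize_charset(label):
    """Return the hyphenated name for known misspellings, else the lowercased label."""
    key = label.strip().strip('"').lower()
    return _ALIASES.get(key, key)


# Headers modelled on the first message in the question.
raw = (
    "Subject: Newsletter\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf8\n"
    "\n"
    "body\n"
)

msg = message_from_string(raw)
declared = msg.get_content_charset()   # charset exactly as declared, lowercased
fixed = normalize_charset(declared)
msg.set_param("charset", fixed)        # rewrites the charset parameter of Content-Type

print(declared, "->", msg.get_content_charset())
```

In practice it is simpler to correct whatever script or mailer configuration generates the `Content-Type` header in the first place, so that it writes `charset=utf-8` directly.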
