Linux Archive


Ed Greshko 07-30-2008 10:53 PM

Curious characters in Thunderbird on Linux...
 
Kevin Martin wrote:
> I get strange characters in some emails that I receive in Thunderbird on
> F8. Things like (I hope this comes thru):
>
> *Uptown Theatre buyer calls city requirements ‘onerous’*
> <http://e.ccialerts.com/a/hBIkMy4AFS8nrB7Q2vpAUpPTuv5/ccb37>
>
> and
>
> Version:Â Â Â [GA
>
> Any idea why I would be seeing this? Apparently, when I send, the
> recipients see strange characters such as these as well. Is it an
> encoding issue, d'ya think?


Yes, it is character encoding.

If the message is sent using UTF-8 as the encoding and you force it to
display as ISO-8859-1 you will indeed get what you show above.


Also, if a message containing ‘onerous’ (with the curly quotation marks used
in the article) is encoded as UTF-8 but labelled as ISO-8859-1, you will see
what you show. In that case, you can force UTF-8 as the charset and it will
display properly.
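
The effect is easy to reproduce outside a mail client. The following is
only a minimal Python sketch, not anything Thunderbird itself runs. The
function names are made up for illustration, and it assumes the client
treats "ISO-8859-1" as Windows-1252 (cp1252), which is what the euro and
trademark signs in the garbled text suggest.

```python
def garble(text, wrong_charset="cp1252"):
    """Encode text as UTF-8, then decode the bytes with the wrong charset."""
    return text.encode("utf-8").decode(wrong_charset)


def ungarble(garbled, wrong_charset="cp1252"):
    """Undo garble(): recover the original bytes and decode them as UTF-8."""
    return garbled.encode(wrong_charset).decode("utf-8")


headline = "\u2018onerous\u2019"       # curly quotes, as in the article
print(garble(headline))                # each curly quote becomes three characters
print(ungarble(garble(headline)))      # round-trips back to the original
print(repr(garble("\u00a0")))          # a no-break space gains a leading A-circumflex
```

Each curly quote is three bytes in UTF-8, so it turns into three
characters when every byte is read as a character of its own.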


--
fedora-list mailing list
fedora-list@redhat.com
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list

Kevin Martin 07-31-2008 01:54 AM

Curious characters in Thunderbird on Linux...
 
Ed Greshko wrote:
> Kevin Martin wrote:
> > I get strange characters in some emails that I receive in Thunderbird
> > on F8. Things like (I hope this comes thru):
> >
> > *Uptown Theatre buyer calls city requirements ‘onerous’*
> > <http://e.ccialerts.com/a/hBIkMy4AFS8nrB7Q2vpAUpPTuv5/ccb37>
> >
> > and
> >
> > Version:Â Â Â [GA
> >
> > Any idea why I would be seeing this? Apparently, when I send, the
> > recipients see strange characters such as these as well. Is it an
> > encoding issue, d'ya think?
>
> Yes, it is character encoding.
>
> If the message is sent using UTF-8 as the encoding and you force it to
> display as ISO-8859-1 you will indeed get what you show above.
>
> Also, if a message containing ‘onerous’ (with the curly quotation
> marks used in the article) is encoded as UTF-8 but labelled as
> ISO-8859-1, you will see what you show. In that case, you can force
> UTF-8 as the charset and it will display properly.


So if messages are sent using an encoding that you are not using, this
will happen? Crud, how do you get around /that/?


Kevin



Tim 07-31-2008 03:35 AM

Curious characters in Thunderbird on Linux...
 
On Wed, 2008-07-30 at 20:54 -0500, Kevin Martin wrote:
> So if messages are sent using an encoding that you are not using, this
> will happen? Crud, how do you get around /that/?

Your client should automatically display the text correctly, transcoding
if it has to. Of course, that will only work if:

1. The message correctly identifies which encoding it used.
2. It's an encoding that your client understands.
3. You have fonts that can provide the characters needed.
4. You haven't forced your client to use a particular encoding.
5. The message hasn't been mangled in transit.
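
Points 1 and 4 can be seen with Python's standard email package. This is
just a sketch of what a client might do, not how any particular client
works. The helpers build_message() and decode_body() are invented for
the example, and the message is a bare-bones one with a base64 body.

```python
import base64
import email
from email import policy


def build_message(charset_label):
    """Hypothetical helper: UTF-8 body bytes sent under the given charset label."""
    head = (
        "MIME-Version: 1.0\r\n"
        f"Content-Type: text/plain; charset={charset_label}\r\n"
        "Content-Transfer-Encoding: base64\r\n\r\n"
    ).encode("ascii")
    body = base64.b64encode("\u2018onerous\u2019".encode("utf-8"))
    return head + body + b"\r\n"


def decode_body(raw_message):
    """Return (declared charset, body text) for a single-part text message."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    charset = msg.get_content_charset() or "us-ascii"  # fallback when unlabelled
    payload = msg.get_payload(decode=True)             # undo the base64 layer
    return charset, payload.decode(charset, errors="replace")


print(decode_body(build_message("utf-8")))       # label matches the bytes
print(decode_body(build_message("iso-8859-1")))  # same bytes, wrong label
```

The bytes are identical in both messages. Only the label differs, and
the second one decodes to junk because the label is wrong.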

Point 1 is a common problem, because authors will cut and paste text
using different encodings, using systems that don't transcode as they do
that. And they may be forcing their client to use a particular encoding
scheme, one that's not what they're actually using.

Point 2 is probably fine, in this day and age. Point 3 is usually
fine, too, but can still be a problem area.

Point 4 can be a common problem. Generally, you shouldn't force your
client to always work in a particular encoding; leave it to work things
out automatically. Setting a default to start off with when authoring
messages is fine (your system's default would be best). Setting a
default for reading unidentified text is almost fine, too, though the
RFC default for unlabelled text in mail is US-ASCII (ISO-8859-1 was the
default for HTTP, and the two are equivalent as far as the ASCII portion
goes).

Point 5 is a common problem when messages pass through some services
that want to transcode a message into their own schemes, and make a
pig's breakfast of it. Some mailing list software was notorious for
doing that sort of thing.

When replying to a message, there are two common defaults:

a. Reply in the same scheme as the original message.
b. Reply in your usual encoding scheme.

Point a is fine, so long as your client can make use of that encoding
(it probably can, unless it's a truly strange one - some weird schemes
required specially-arranged fonts).

Point b is fine, so long as your client transcodes the original message,
if quoting it, into the encoding scheme that it's going to use. Looking
at the mess some clients make, I wonder if they actually do that, rather
than just bodge the text in as-is, without changing the encoding.
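
As a rough illustration of the difference between transcoding quoted
text and bodging it in as-is, here is a small Python sketch. The
transcode() helper and the sample word are made up for the example.

```python
def transcode(data, source, target):
    """Re-encode bytes from one charset to another, going via Unicode text."""
    return data.decode(source).encode(target)


quoted = "caf\u00e9".encode("iso-8859-1")        # b"caf\xe9" in the original message

good = transcode(quoted, "iso-8859-1", "utf-8")  # b"caf\xc3\xa9"
bad = quoted                                     # pasted unchanged into a UTF-8 reply

print(good.decode("utf-8"))                      # reads correctly
print(bad.decode("utf-8", errors="replace"))     # the accented letter is lost
```

The lone 0xE9 byte is not valid UTF-8, so a reader decoding the reply as
UTF-8 gets a replacement character where the accented letter was.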

If you think all of that is a right headache, it is. That's why there
was a push for Unicode all those years back: one scheme for everyone,
and no transcoding required.
--
[tim@localhost ~]$ uname -r
2.6.25.11-97.fc9.i686

Don't send private replies to my address, the mailbox is ignored. I
read messages from the public lists.




Björn Persson 07-31-2008 11:39 AM

Curious characters in Thunderbird on Linux...
 
Tim wrote:
> On Wed, 2008-07-30 at 20:54 -0500, Kevin Martin wrote:
> > So if messages are sent using an encoding that you are not using, this
> > will happen? Crud, how do you get around /that/?
>
> Your client should automatically display the text correctly, transcoding
> if it has to. Of course, that will only work if:
>
> 1. The message correctly identifies which encoding it used.
> 2. It's an encoding that your client understands.
> 3. You have fonts that can provide the characters needed.
> 4. You haven't forced your client to use a particular encoding.
> 5. The message hasn't been mangled in transit.

Thunderbird is good with character encodings in my experience, so points 1 and
2 shouldn't be a problem on your end, Kevin. Point 3 should give different
symptoms. I don't know if point 4 is possible in Thunderbird. You may want to
check that, but otherwise the problem is probably not with Thunderbird. Then
it's either the other person's email program, or a broken gateway (point 5).

> If you think all of that is a right headache, it is. That's why there
> was a push for unicode all those years back. One scheme for everyone,
> and no transcoding required.

Unicode was invented some 50 years too late. A gazillion different encodings
were created in the meantime, and now we have to cope with the mess.

Björn Persson


Kevin Martin 07-31-2008 02:26 PM

Curious characters in Thunderbird on Linux...
 
Björn Persson wrote:
> Tim wrote:
> > On Wed, 2008-07-30 at 20:54 -0500, Kevin Martin wrote:
> > > So if messages are sent using an encoding that you are not using, this
> > > will happen? Crud, how do you get around /that/?
> >
> > Your client should automatically display the text correctly, transcoding
> > if it has to. Of course, that will only work if:
> >
> > 1. The message correctly identifies which encoding it used.
> > 2. It's an encoding that your client understands.
> > 3. You have fonts that can provide the characters needed.
> > 4. You haven't forced your client to use a particular encoding.
> > 5. The message hasn't been mangled in transit.
>
> Thunderbird is good with character encodings in my experience, so points 1 and
> 2 shouldn't be a problem on your end, Kevin. Point 3 should give different
> symptoms. I don't know if point 4 is possible in Thunderbird. You may want to
> check that, but otherwise the problem is probably not with Thunderbird. Then
> it's either the other person's email program, or a broken gateway (point 5).
>
> > If you think all of that is a right headache, it is. That's why there
> > was a push for unicode all those years back. One scheme for everyone,
> > and no transcoding required.
>
> Unicode was invented some 50 years too late. A gazillion different encodings
> were created in the meantime, and now we have to cope with the mess.
>
> Björn Persson


Well, my encodings /were/ set to force UTF-8 in and out. I've now just
set that as the default but am no longer enforcing it with the
Preferences->Display->Fonts check boxes. It looks /better/, but
non-alphanumeric characters (like the apostrophe) are still hosed. Maybe
I'll change my default from UTF-8 to one of the ISO "standard" encodings
and see how it looks.


Kevin



Tim 08-01-2008 03:03 PM

Curious characters in Thunderbird on Linux...
 
On Thu, 2008-07-31 at 09:26 -0500, Kevin Martin wrote:
> my encodings /were/ set to force UTF-8 in and out. I've now just
> set that as the default but am no longer enforcing it with the
> Preferences->Display->Fonts check boxes. It looks /better/, but
> non-alphanumeric characters (like the apostrophe) are still hosed.
> Maybe I'll change my default from UTF-8 to one of the ISO "standard"
> encodings and see how it looks.

Was it actually an apostrophe, or something similar looking being
misused as one?

ISO-8859-1 is a common standard, but it only covers Western European
text and has no curly quotes or long dashes. UTF-8 should be supported
by just about everything, by now, but it could still come a cropper if
it went through a 7-bit system (they're still around). UTF-7 is supposed
to be a solution to that.

You could always give things a bit of a test. Read man iso-8859-1 in a
console, cut and paste a slab of it into an e-mail, and post it to
yourself. Try out different encoding options, see what happens.
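
If you'd rather poke at the 7-bit question in a script first, something
like the following Python sketch would do. The is_7bit() helper is made
up for the example. Quoted-printable is included as well, since in MIME
mail a content transfer encoding such as quoted-printable or base64 is
the usual way of getting 8-bit text through a 7-bit hop.

```python
import quopri


def is_7bit(data):
    """True if every byte fits in 7 bits."""
    return all(byte < 128 for byte in data)


text = "\u2018onerous\u2019"           # curly quotes again
utf8 = text.encode("utf-8")            # contains bytes above 127
utf7 = text.encode("utf-7")            # ASCII-only by design
qp = quopri.encodestring(utf8)         # UTF-8 wrapped in quoted-printable

print(is_7bit(utf8), is_7bit(utf7), is_7bit(qp))
print(utf7.decode("utf-7") == text)
print(quopri.decodestring(qp).decode("utf-8") == text)
```

Both 7-bit forms decode back to the original text, which is the point:
nothing is lost so long as the receiving end knows which wrapping was
used.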





