

Understanding through Discussion




Author Topic:   UTF-8?
Codegate
Member (Idle past 844 days)
Posts: 84
From: The Great White North
Joined: 03-15-2006


Message 9 of 11 (545121)
02-01-2010 11:19 AM
Reply to: Message 6 by Admin
02-01-2010 9:01 AM


Speaking as a developer who regularly deals with character encoding woes:
The key is to make sure that the encoding is maintained through the entire chain. Your web server, your application and your database tables should all be set to UTF-8, if that is the encoding you want to use.
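As a rough illustration of keeping one encoding through the chain, here is a minimal Python sketch of an application layer that decodes and encodes with a single charset at its boundaries. The names `CHARSET`, `read_request_body` and `build_response` are made up for this example and do not come from any particular framework; the web server and database settings still have to be configured separately to match.

```python
# Hypothetical helper names; a sketch, not any framework's actual API.
CHARSET = "utf-8"  # one setting shared by every boundary of the application


def read_request_body(raw: bytes) -> str:
    """Decode incoming bytes strictly, so non-UTF-8 input fails loudly."""
    return raw.decode(CHARSET)  # raises UnicodeDecodeError on invalid input


def build_response(text: str):
    """Encode outgoing text and declare the same charset in the header."""
    headers = {"Content-Type": "text/html; charset=" + CHARSET}
    return headers, text.encode(CHARSET)


if __name__ == "__main__":
    body = read_request_body("na\u00efve \u20ac".encode("utf-8"))
    headers, payload = build_response(body)
    print(headers["Content-Type"], len(payload))
```

Decoding strictly at the edge means a mismatch shows up as an error right away instead of as silently corrupted text further down the chain.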
Otherwise, the text stream changes encoding as it progresses up and down the chain, and you can end up with some very odd behavior. For example, a user submits a UTF-8 string to your application, which interprets it as Latin-1 (ISO 8859-1) and then places it into a database table that treats it as UTF-8 again. What you end up with in your DB is a UTF-8 encoded version of the Latin-1 interpretation of the original UTF-8 string.
Very bizarre and very annoying to deal with. A single character that ends up taking 12 bytes is a tad wasteful.
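The double-encoding effect is easy to reproduce. The Python sketch below simulates a chain that misreads UTF-8 bytes as Latin-1 and then stores the result as UTF-8; `mangle` and `repair` are names invented for this example. It uses the euro sign, which takes 3 bytes in UTF-8, and assumes the misreading happens on two trips through the chain, which is one way a single character can grow to 12 bytes.

```python
def mangle(text: str) -> str:
    """Simulate one trip where UTF-8 bytes are misread as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Undo one such trip, assuming the text was mangled exactly this way."""
    return text.encode("latin-1").decode("utf-8")


original = "\u20ac"        # euro sign, 3 bytes in UTF-8
once = mangle(original)    # what the DB holds after one bad trip
twice = mangle(once)       # after a second bad trip

print(len(original.encode("utf-8")))      # 3
print(len(once.encode("utf-8")))          # 6
print(len(twice.encode("utf-8")))         # 12
print(repair(repair(twice)) == original)  # True
```

Every byte above 0x7F becomes a two-byte sequence on each bad trip, so the stored size doubles each time. The repair only works if you know exactly how many times, and through which encoding, the text was mangled.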
The moral of the story? Stay consistent with your encodings. Changing how strings are interpreted after the fact can force you to rebuild your DB tables to match the new encoding (or to live with weird database content).
Good luck!
