[baseten-users] Unicode strings in BaseTen

Erik Aderstedt erik at aderstedt.se
Mon May 18 15:37:11 EEST 2009


Hi again!

> I'm having issues with Unicode strings in BaseTen. I created a db  
> according to the BaseTen manual, with en_US.UTF-8 as the default  
> locale. When inserting strings with some non-ASCII characters, like  
> 'Ö', the 'Ö' (Unicode 0x00D6) is translated to two characters: 'O'  
> and then Unicode 0x0308 (diaeresis/umlaut). This appears to be the  
> same string, but results in a different hash value for the string.
>

I have to confess to not knowing very much about Unicode, but after  
some Googling I found that I could solve my problem with the different  
hash values by calling -[NSString precomposedStringWithCanonicalMapping]  
on the value that was stored using -[BXDatabaseContext  
createObjectForEntity:withFieldValues:error:]. That method normalizes  
the string to Unicode Normalization Form C, so 'O' followed by 0x0308  
is recombined into the single character 0x00D6, and the hash value of  
the resulting string is what I expected.
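
For anyone who wants to see the effect outside Cocoa, here is a minimal
sketch in Python rather than Objective-C. It assumes that NFC
normalization via unicodedata.normalize is a fair stand-in for
-[NSString precomposedStringWithCanonicalMapping]; the helper name
to_precomposed is made up for this illustration and is not part of
BaseTen or Foundation.

```python
import unicodedata


def to_precomposed(s: str) -> str:
    """Hypothetical helper: normalize to NFC (canonical composition)."""
    return unicodedata.normalize("NFC", s)


# 'O' followed by U+0308 COMBINING DIAERESIS, i.e. the decomposed form
decomposed = "O\u0308"
# The single precomposed character U+00D6
precomposed = "\u00d6"

# The two strings look the same when rendered, but they are different
# code point sequences, so a plain comparison treats them as unequal.
print(decomposed == precomposed)

# After normalization the decomposed string matches the precomposed one.
print(to_precomposed(decomposed) == precomposed)

# Show the code points before and after normalization.
print([hex(ord(c)) for c in decomposed])
print([hex(ord(c)) for c in to_precomposed(decomposed)])
```

Normalizing both sides before comparing or hashing, rather than only
the value coming back from the database, avoids depending on which
form either side happens to use.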

The above workaround does not directly relate to BaseTen, but I  
thought I'd include it here for completeness.

Regards,
Erik Aderstedt
Aderstedt Software AB




