Unicode and AMD64 under Linux with gcc - long post
Last post 07-19-2008, 10:28 AM by fagiano. 16 replies.
-
07-08-2008, 10:56 AM |
-
James Gregory
-
-
-
Joined on 10-05-2006
-
-
Posts 12
-
-
|
Unicode and AMD64 under Linux with gcc - long post
OK, so people kept telling me my program crashed when compiled as 64 bits under Linux, and then I converted to using unicode, and they told me that it wouldn't even compile. I've had a lot of free time on my hands the past couple of days so I installed Ubuntu using Wubi and decided to take a look myself.
1. Crashing when compiled as 64 bit under Linux
I compiled as 64 bit and then ran the game using valgrind. valgrind had about a million reports of the same invalid write in StringTable::Add, then some other errors, then it crashed. The StringTable::Add errors all looked like this:
==8969== Memcheck, a memory error detector.
==8969== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==8969== Using LibVEX rev 1804, a library for dynamic binary translation.
==8969== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==8969== Using valgrind-3.3.0-Debian, a dynamic binary instrumentation framework.
==8969== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==8969== For more details, rerun with: -v
==8969==
==8969==
==8969== Invalid write of size 4
==8969==    at 0x4AB4B7: StringTable::Add(wchar_t const*, long) (sqstate.cpp:521)
==8969==    by 0x4A7ECA: SQString::Create(SQSharedState*, wchar_t const*, long) (sqobject.cpp:50)
==8969==    by 0x4AD454: SQSharedState::Init() (sqstate.cpp:111)
==8969==    by 0x48F959: sq_open (sqapi.cpp:51)
==8969==    by 0x47FCA5: ScriptManager::setup_vm() (ScriptManager.cpp:80)
==8969==    by 0x48029D: ScriptManager::set_mission_script(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (ScriptManager.cpp:34)
==8969==    by 0x477DDF: World::init() (World.cpp:137)
==8969==    by 0x45D494: RTS::RTS_State::RTS_State() (RTS.cpp:111)
==8969==    by 0x448DF1: game_main() (Main.cpp:214)
==8969==    by 0x4494E9: main (Main.cpp:86)
==8969==  Address 0xba0d7d0 is 0 bytes after a block of size 72 alloc'd
==8969==    at 0x4C22FAB: malloc (vg_replace_malloc.c:207)
==8969==    by 0x4A5B74: sq_vm_malloc(unsigned long) (sqmem.cpp:5)
==8969==    by 0x4AB46D: StringTable::Add(wchar_t const*, long) (sqstate.cpp:518)
==8969==    by 0x4A7ECA: SQString::Create(SQSharedState*, wchar_t const*, long) (sqobject.cpp:50)
==8969==    by 0x4AD454: SQSharedState::Init() (sqstate.cpp:111)
==8969==    by 0x48F959: sq_open (sqapi.cpp:51)
==8969==    by 0x47FCA5: ScriptManager::setup_vm() (ScriptManager.cpp:80)
==8969==    by 0x48029D: ScriptManager::set_mission_script(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (ScriptManager.cpp:34)
==8969==    by 0x477DDF: World::init() (World.cpp:137)
==8969==    by 0x45D494: RTS::RTS_State::RTS_State() (RTS.cpp:111)
==8969==    by 0x448DF1: game_main() (Main.cpp:214)
==8969==    by 0x4494E9: main (Main.cpp:86)
==8969==
==8969== Invalid write of size 4
==8969==    at 0x4AB4B7: StringTable::Add(wchar_t const*, long) (sqstate.cpp:521)
==8969==    by 0x4A7ECA: SQString::Create(SQSharedState*, wchar_t const*, long) (sqobject.cpp:50)
==8969==    by 0x4AD4C1: SQSharedState::Init() (sqstate.cpp:112)
==8969==    by 0x48F959: sq_open (sqapi.cpp:51)
==8969==    by 0x47FCA5: ScriptManager::setup_vm() (ScriptManager.cpp:80)
==8969==    by 0x48029D: ScriptManager::set_mission_script(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (ScriptManager.cpp:34)
==8969==    by 0x477DDF: World::init() (World.cpp:137)
==8969==    by 0x45D494: RTS::RTS_State::RTS_State() (RTS.cpp:111)
==8969==    by 0x448DF1: game_main() (Main.cpp:214)
==8969==    by 0x4494E9: main (Main.cpp:86)
==8969==  Address 0xba0d8bc is 2 bytes after a block of size 74 alloc'd
==8969==    at 0x4C22FAB: malloc (vg_replace_malloc.c:207)
==8969==    by 0x4A5B74: sq_vm_malloc(unsigned long) (sqmem.cpp:5)
==8969==    by 0x4AB46D: StringTable::Add(wchar_t const*, long) (sqstate.cpp:518)
==8969==    by 0x4A7ECA: SQString::Create(SQSharedState*, wchar_t const*, long) (sqobject.cpp:50)
==8969==    by 0x4AD4C1: SQSharedState::Init() (sqstate.cpp:112)
==8969==    by 0x48F959: sq_open (sqapi.cpp:51)
==8969==    by 0x47FCA5: ScriptManager::setup_vm() (ScriptManager.cpp:80)
==8969==    by 0x48029D: ScriptManager::set_mission_script(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (ScriptManager.cpp:34)
==8969==    by 0x477DDF: World::init() (World.cpp:137)
==8969==    by 0x45D494: RTS::RTS_State::RTS_State() (RTS.cpp:111)
==8969==    by 0x448DF1: game_main() (Main.cpp:214)
==8969==    by 0x4494E9: main (Main.cpp:86)
==8969==
==8969== Invalid write of size 4
==8969==    at 0x4AB4B7: StringTable::Add(wchar_t const*, long) (sqstate.cpp:521)
==8969==    by 0x4A7ECA: SQString::Create(SQSharedState*, wchar_t const*, long) (sqobject.cpp:50)
==8969==    by 0x4AD52E: SQSharedState::Init() (sqstate.cpp:113)
==8969==    by 0x48F959: sq_open (sqapi.cpp:51)
==8969==    by 0x47FCA5: ScriptManager::setup_vm() (ScriptManager.cpp:80)
==8969==    by 0x48029D: ScriptManager::set_mission_script(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (ScriptManager.cpp:34)
==8969==    by 0x477DDF: World::init() (World.cpp:137)
==8969==    by 0x45D494: RTS::RTS_State::RTS_State() (RTS.cpp:111)
==8969==    by 0x448DF1: game_main() (Main.cpp:214)
==8969==    by 0x4494E9: main (Main.cpp:86)
==8969==  Address 0xa90263c is 2 bytes after a block of size 74 alloc'd
==8969==    at 0x4C22FAB: malloc (vg_replace_malloc.c:207)
==8969==    by 0x4A5B74: sq_vm_malloc(unsigned long) (sqmem.cpp:5)
==8969==    by 0x4AB46D: StringTable::Add(wchar_t const*, long) (sqstate.cpp:518)
==8969==    by 0x4A7ECA: SQString::Create(SQSharedState*, wchar_t const*, long) (sqobject.cpp:50)
==8969==    by 0x4AD52E: SQSharedState::Init() (sqstate.cpp:113)
==8969==    by 0x48F959: sq_open (sqapi.cpp:51)
==8969==    by 0x47FCA5: ScriptManager::setup_vm() (ScriptManager.cpp:80)
==8969==    by 0x48029D: ScriptManager::set_mission_script(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (ScriptManager.cpp:34)
==8969==    by 0x477DDF: World::init() (World.cpp:137)
==8969==    by 0x45D494: RTS::RTS_State::RTS_State() (RTS.cpp:111)
==8969==    by 0x448DF1: game_main() (Main.cpp:214)
==8969==    by 0x4494E9: main (Main.cpp:86)
==8969==
==8969== Invalid write of size 4
==8969==    at 0x4AB4B7: StringTable::Add(wchar_t const*, long) (sqstate.cpp:521)
==8969==    by 0x4A7ECA: SQString::Create(SQSharedState*, wchar_t const*, long) (sqobject.cpp:50)
==8969==    by 0x4AD59B: SQSharedState::Init() (sqstate.cpp:114)
==8969==    by 0x48F959: sq_open (sqapi.cpp:51)
==8969==    by 0x47FCA5: ScriptManager::setup_vm() (ScriptManager.cpp:80)
==8969==    by 0x48029D: ScriptManager::set_mission_script(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (ScriptManager.cpp:34)
==8969==    by 0x477DDF: World::init() (World.cpp:137)
==8969==    by 0x45D494: RTS::RTS_State::RTS_State() (RTS.cpp:111)
==8969==    by 0x448DF1: game_main() (Main.cpp:214)
==8969==    by 0x4494E9: main (Main.cpp:86)
==8969==  Address 0xa9026c4 is 6 bytes after a block of size 78 alloc'd
==8969==    at 0x4C22FAB: malloc (vg_replace_malloc.c:207)
==8969==    by 0x4A5B74: sq_vm_malloc(unsigned long) (sqmem.cpp:5)
==8969==    by 0x4AB46D: StringTable::Add(wchar_t const*, long) (sqstate.cpp:518)
==8969==    by 0x4A7ECA: SQString::Create(SQSharedState*, wchar_t const*, long) (sqobject.cpp:50)
==8969==    by 0x4AD59B: SQSharedState::Init() (sqstate.cpp:114)
==8969==    by 0x48F959: sq_open (sqapi.cpp:51)
==8969==    by 0x47FCA5: ScriptManager::setup_vm() (ScriptManager.cpp:80)
==8969==    by 0x48029D: ScriptManager::set_mission_script(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (ScriptManager.cpp:34)
==8969==    by 0x477DDF: World::init() (World.cpp:137)
==8969==    by 0x45D494: RTS::RTS_State::RTS_State() (RTS.cpp:111)
==8969==    by 0x448DF1: game_main() (Main.cpp:214)
==8969==    by 0x4494E9: main (Main.cpp:86)
There were about a million more cases of this. I managed to get rid of these errors with the following diff:
--- /host/mydocs/tmp/SQUIRREL2/squirrel/sqstate.cpp 2008-02-09 21:17:47.000000000 +0000
+++ ./squirrel/sqstate.cpp 2008-07-08 17:56:40.000000000 +0100
@@ -511,13 +511,13 @@
SQHash h = ::_hashstr(news,len)&(_numofslots-1);
SQString *s;
for (s = _strings ; s; s = s->_next){
- if(s->_len == len && (!memcmp(news,s->_val,rsl(len))))
+ if(s->_len == len && (!memcmp(news,s->_val,len * sizeof(SQChar))))
return s; //found
}
- SQString *t=(SQString *)SQ_MALLOC(rsl(len)+sizeof(SQString));
+ SQString *t=(SQString *)SQ_MALLOC(len * sizeof(SQChar)+sizeof(SQString));
new (t) SQString;
- memcpy(t->_val,news,rsl(len));
+ memcpy(t->_val,news,len * sizeof(SQChar));
t->_val[len] = _SC('\0');
t->_len = len;
t->_hash = ::_hashstr(news,len);
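A minimal sketch of the two size calculations the diff swaps, assuming that rsl() in a unicode build simply doubles the character count (this is inferred from the symptoms, not checked against the Squirrel source). The function names are made up for illustration, and char16_t and char32_t stand in for a 2 byte and a 4 byte wchar_t:

```cpp
#include <cstddef>

// Hypothetical stand-in for rsl() in a unicode build. ASSUMPTION: it doubles
// the character count, i.e. it hard-codes 2 bytes per character.
constexpr std::size_t bytes_by_shift(std::size_t len) {
    return len << 1;
}

// The calculation the diff switches to: the compiler supplies the real width.
template <typename Char>
constexpr std::size_t bytes_by_sizeof(std::size_t len) {
    return len * sizeof(Char);
}

// Bytes by which a copy of `len` characters would overrun a buffer that was
// sized with the shift-based calculation.
template <typename Char>
constexpr std::size_t overrun(std::size_t len) {
    static_assert(sizeof(Char) >= 2, "sketch only covers wide character types");
    return bytes_by_sizeof<Char>(len) - bytes_by_shift(len);
}
```

Under that assumption, overrun<char32_t>(10) comes out as 20 bytes while overrun<char16_t>(10) is 0, which would fit the invalid writes only showing up where wchar_t is 4 bytes wide.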
After this change valgrind had much less to say (slightly cropped) before the game crashed:
==9145== Memcheck, a memory error detector.
==9145== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==9145== Using LibVEX rev 1804, a library for dynamic binary translation.
==9145== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==9145== Using valgrind-3.3.0-Debian, a dynamic binary instrumentation framework.
==9145== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==9145== For more details, rerun with: -v
==9145==
==9145==
==9145== Invalid read of size 4
==9145==    at 0x4AF354: _hashstr(wchar_t const*, unsigned long) (sqstring.h:10)
==9145==    by 0x4AB3DE: StringTable::Add(wchar_t const*, long) (sqstate.cpp:511)
==9145==    by 0x4A7ECA: SQString::Create(SQSharedState*, wchar_t const*, long) (sqobject.cpp:50)
==9145==    by 0x48E40D: sq_pushstring (sqapi.cpp:198)
==9145==    by 0x4849A3: Scripting::register_function(SQVM*, std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, int (*)(SQVM*)) (SqWrapper.cpp:825)
==9145==    by 0x4858FF: Scripting::register_mission_functions(SQVM*) (SqWrapper.cpp:835)
==9145==    by 0x4802D6: ScriptManager::set_mission_script(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (ScriptManager.cpp:37)
==9145==    by 0x477DDF: World::init() (World.cpp:137)
==9145==    by 0x45D494: RTS::RTS_State::RTS_State() (RTS.cpp:111)
==9145==    by 0x448DF1: game_main() (Main.cpp:214)
==9145==    by 0x4494E9: main (Main.cpp:86)
==9145==  Address 0x964b0c0 is 0 bytes after a block of size 96 alloc'd
==9145==    at 0x4C23809: operator new(unsigned long) (vg_replace_malloc.c:230)
==9145==    by 0x5AA5C44: std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::_Rep::_S_create(unsigned long, unsigned long, std::allocator<wchar_t> const&) (in /usr/lib/libstdc++.so.6.0.9)
==9145==    by 0x5AA6A08: (within /usr/lib/libstdc++.so.6.0.9)
==9145==    by 0x5AA6B21: std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> >::basic_string(wchar_t const*, std::allocator<wchar_t> const&) (in /usr/lib/libstdc++.so.6.0.9)
==9145==    by 0x4858EA: Scripting::register_mission_functions(SQVM*) (SqWrapper.cpp:835)
==9145==    by 0x4802D6: ScriptManager::set_mission_script(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (ScriptManager.cpp:37)
==9145==    by 0x477DDF: World::init() (World.cpp:137)
==9145==    by 0x45D494: RTS::RTS_State::RTS_State() (RTS.cpp:111)
==9145==    by 0x448DF1: game_main() (Main.cpp:214)
==9145==    by 0x4494E9: main (Main.cpp:86)
==9145==
==9145== Source and destination overlap in memcpy(0x38, 0x964B078, -4)
==9145==    at 0x4C2508B: memcpy (mc_replace_strmem.c:402)
==9145==    by 0x4AB4B3: StringTable::Add(wchar_t const*, long) (sqstate.cpp:520)
==9145==    by 0x4A7ECA: SQString::Create(SQSharedState*, wchar_t const*, long) (sqobject.cpp:50)
==9145==    by 0x48E40D: sq_pushstring (sqapi.cpp:198)
==9145==    by 0x4849A3: Scripting::register_function(SQVM*, std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, int (*)(SQVM*)) (SqWrapper.cpp:825)
==9145==    by 0x4858FF: Scripting::register_mission_functions(SQVM*) (SqWrapper.cpp:835)
==9145==    by 0x4802D6: ScriptManager::set_mission_script(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (ScriptManager.cpp:37)
==9145==    by 0x477DDF: World::init() (World.cpp:137)
==9145==    by 0x45D494: RTS::RTS_State::RTS_State() (RTS.cpp:111)
==9145==    by 0x448DF1: game_main() (Main.cpp:214)
==9145==    by 0x4494E9: main (Main.cpp:86)
==9145==
==9145== Invalid write of size 1
==9145==    at 0x4C25127: memcpy (mc_replace_strmem.c:402)
==9145==    by 0x4AB4B3: StringTable::Add(wchar_t const*, long) (sqstate.cpp:520)
==9145==    by 0x4A7ECA: SQString::Create(SQSharedState*, wchar_t const*, long) (sqobject.cpp:50)
==9145==    by 0x48E40D: sq_pushstring (sqapi.cpp:198)
==9145==    by 0x4849A3: Scripting::register_function(SQVM*, std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&, int (*)(SQVM*)) (SqWrapper.cpp:825)
==9145==    by 0x4858FF: Scripting::register_mission_functions(SQVM*) (SqWrapper.cpp:835)
==9145==    by 0x4802D6: ScriptManager::set_mission_script(std::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > const&) (ScriptManager.cpp:37)
==9145==    by 0x477DDF: World::init() (World.cpp:137)
==9145==    by 0x45D494: RTS::RTS_State::RTS_State() (RTS.cpp:111)
==9145==    by 0x448DF1: game_main() (Main.cpp:214)
==9145==    by 0x4494E9: main (Main.cpp:86)
==9145==  Address 0x38 is not stack'd, malloc'd or (recently) free'd
==9145==
==9145== ERROR SUMMARY: 258 errors from 19 contexts (suppressed: 332 from 1)
==9145== malloc/free: in use at exit: 33,198,902 bytes in 10,483 blocks.
==9145== malloc/free: 58,418 allocs, 47,934 frees, 95,257,007 bytes allocated.
==9145== For counts of detected errors, rerun with: -v
==9145== searching for pointers to 10,483 not-freed blocks.
==9145== checked 35,775,448 bytes.
==9145==
So it seems the problem is either this line in _hashstr:
h = h ^ ((h<<5)+(h>>2)+(unsigned short)*(s++));
or otherwise this line in StringTable::Add:
memcpy(t->_val,news,len * sizeof(SQChar));
Given that I removed this typedef in squirrel.h to get it to compile:
typedef unsigned short wchar_t;
I thought changing _hashstr to cast s to a wchar_t* instead of an unsigned short* might help with the first problem, but this didn't make any difference. This crash always occurred even before using unicode, so I don't think it is unicode related, but rather 64 bit related. But I might be wrong. Here are the sizes of a few different data types on my computer, in case they are of some use:
Size of unsigned short: 2
Size of char: 1
Size of wchar_t: 4
Size of int: 4
Size of long: 8
2. Not compiling under Linux with SQUNICODE defined
Most errors - and the easiest to fix - are that the standard versions of swprintf and scstrtok require more arguments than Microsoft's versions. Other problems are that "typedef unsigned short wchar_t;" causes a compile error, and that gcc doesn't seem to have any equivalent of the following functions:
#define scgetenv _wgetenv
#define scsystem _wsystem
#define scasctime _wasctime
#define scremove _wremove
#define screname _wrename
I got it to compile with the following patch, but it is a total bodge and clearly isn't the correct way to go about it in many places:
https://sourceforge.net/tracker/download.php?group_id=175078&atid=871789&file_id=284034&aid=2013747
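For the swprintf and scstrtok errors in point 2, a minimal sketch of the standard signatures, assuming scstrtok maps to wcstok in a unicode build. The standard swprintf takes the buffer capacity (in wide characters) as its second argument, and the standard wcstok takes a third argument that stores the scan position. format_version and split are made-up helpers for illustration, not Squirrel functions:

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Standard swprintf: the second argument is the buffer capacity in wide
// characters, which the older Microsoft variant did not take.
std::wstring format_version(int major, int minor) {
    wchar_t buffer[64];
    const int written = std::swprintf(buffer, sizeof(buffer) / sizeof(buffer[0]),
                                      L"Squirrel %d.%d", major, minor);
    if (written < 0) return std::wstring();  // formatting failed or did not fit
    return std::wstring(buffer, static_cast<std::size_t>(written));
}

// Standard wcstok: the third argument keeps the scan position between calls.
// `text` is taken by value because wcstok writes terminators into it.
std::vector<std::wstring> split(std::wstring text, const wchar_t* delimiters) {
    assert(delimiters != nullptr);
    std::vector<std::wstring> tokens;
    wchar_t* state = nullptr;
    for (wchar_t* token = std::wcstok(&text[0], delimiters, &state);
         token != nullptr;
         token = std::wcstok(nullptr, delimiters, &state)) {
        tokens.push_back(token);
    }
    return tokens;
}
```

Code written against these forms should build with gcc and glibc; whether a wrapper macro is the right way to bridge the two worlds inside Squirrel is a separate design question.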
|
|
-
07-08-2008, 11:33 AM |
-
ats
-
-
-
Joined on 01-17-2007
-
-
Posts 116
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
On Linux I'm using a patched Squirrel that works with UTF8 encoded strings internally. Otherwise strings end up using 4 bytes per char.
It has worked fine for some months, no changes to scripts to accommodate that.
If you're interested I can post the patch.
Maybe there are other 64 bit issues though.
Regards // ATS.
|
|
-
07-08-2008, 12:02 PM |
-
atai
-
-
-
Joined on 08-16-2005
-
-
Posts 90
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
Hopefully your patch can be merged into the official Squirrel releases (2.2 and 3.0) soon... GNU/Linux is an important platform now.
|
|
-
07-08-2008, 1:59 PM |
-
James Gregory
-
-
-
Joined on 10-05-2006
-
-
Posts 12
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
OK, further investigation of valgrind and the crash bug:
1. I think the valgrind "invalid read" in _hashstr is a red herring. I think valgrind gets confused by the fact that the looping condition is checked after "s++" rather than before, so the pointer is invalid but it's not actually read from.
2. Casting s to unsigned short in _hashstr won't make a 64 bit computer crash, but it does mean it's no longer hashing the string correctly, because with the data sizes on my computer it is now iterating through half of each character rather than each whole character.
3. I have sort of discovered the actual cause of the crash, but I don't know the solution. I added the following lines to StringTable::Add
std::wcout << "len: " << len << std::endl;
std::wcout << "news: " << news << std::endl;
std::wcout << "len uncast: " << scstrlen(news) << std::endl;
And these are the last few lines my terminal prints before it crashes:
news: Squirrel 2.2.1 stable
len uncast: 21
len: 10
news: _charsize_
len uncast: 10
len: 9
news: _intsize_
len uncast: 9
len: 4294967295
news: play_script_sound
len uncast: 17
Segmentation fault
|
|
-
07-08-2008, 2:56 PM |
-
James Gregory
-
-
-
Joined on 10-05-2006
-
-
Posts 12
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
OK, so the problem was that on my computer an int has a size of 4, whilst a long has a size of 8. squirrel.h defines SQInteger to be a long, so whenever I passed a literal number (such as -1) to a squirrel function it was getting messed up. So I tried changing squirrel.h to typedef SQInteger as an int. But now I get:
sqtable.h:21: error: cast from ‘SQRefCounted*’ to ‘SQInteger’ loses precision
Because pointers have a size of 8. So yeah:
Size of unsigned short: 2
Size of char: 1
Size of wchar_t: 4
Size of int: 4
Size of size_t: 8
Size of char*: 8
Size of long: 8
These are the results from gcc on a standard AMD64 machine running Ubuntu.
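The "len: 4294967295" printed before the segfault is the value you get when the 32 bit pattern of -1 is read as a 64 bit integer without sign extension. A minimal sketch, assuming the upper 32 bits arrive as zero (what actually ends up there depends on the calling convention and the compiler); both function names are invented for illustration:

```cpp
#include <cstdint>

// What a callee sees if it reads a full 64 bit integer but the caller only
// supplied the 32 bit pattern of `value`. ASSUMPTION: the upper half is zero.
constexpr std::int64_t seen_as_64_bit(std::int32_t value) {
    return static_cast<std::int64_t>(static_cast<std::uint32_t>(value));
}

// What the callee sees when caller and callee agree on a 64 bit type, so the
// value is sign-extended properly.
constexpr std::int64_t properly_widened(std::int32_t value) {
    return static_cast<std::int64_t>(value);
}
```

seen_as_64_bit(-1) gives 4294967295, the same number as in the terminal output, while properly_widened(-1) stays -1. Small positive lengths such as 9 or 10 come through unchanged either way, which would explain why only the -1 call blew up.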
|
|
-
07-08-2008, 3:01 PM |
-
fagiano
-
-
-
Joined on 06-12-2005
-
-
Posts 455
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
Hi, looking at your results I can't see the problem. The only difference from my usual tests is that wchar_t is 4 bytes rather than 2. I always assumed people on Unix platforms use UTF8; Unix UCS2 and UCS4 support seems very confusing to me.
Do you have a simple way for me to reproduce this? I'll run it on my 64 bit box (I got it just for these occasions :) ).
Currently I'm not at home, but as soon as I'm back in Singapore I'll look into this (this weekend).
BTW, are you 100% sure it isn't an external piece of code screwing around with the heap or stack?
ciao
Alberto
PS: don't worry if there's a bug and we find a fix, I'll make sure it makes it into both 2.x and 3.x branches.
|
|
-
07-08-2008, 3:11 PM |
-
fagiano
-
-
-
Joined on 06-12-2005
-
-
Posts 455
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
Ok, one reason comes to mind. Are you 100% sure that both SQUNICODE and _SQ64 are defined everywhere squirrel code is compiled? Both while compiling squirrel and while using the API?
I've seen scary things happen for similar reasons in past projects (not on squirrel), due to having modules compiled with one flag and the main program with a different one. They still compiled and linked fine, but produced some unexplainable data misalignments that, at first look, could pass for a stack corruption or a compiler bug.
Alberto
|
|
-
07-08-2008, 3:17 PM |
-
fagiano
-
-
-
Joined on 06-12-2005
-
-
Posts 455
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
BTW on 64 bit architectures SQInteger should be 8 bytes.
I have the feeling that the problem is indeed that you don't have _SQ64 defined somewhere.
The rule is: if squirrel.h or the libs are included, you must define _SQ64.
Alberto
|
|
-
07-08-2008, 3:52 PM |
-
James Gregory
-
-
-
Joined on 10-05-2006
-
-
Posts 12
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
OK, so it appears the crash bug was entirely my own fault. My Linux Makefile was defining _SQ64 when building squirrel but not when including squirrel.h into my own project.
Someone else suggested that:
#ifdef _LP64
#define _SQ64
#endif
could be added into squirrel.h, maybe that would be a good idea?
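A sketch of how such a check could look, assuming gcc-style predefined macros: gcc defines _LP64 and __LP64__ on targets where long and pointers are 64 bits, and _WIN64 is the usual macro on 64 bit Windows. EXAMPLE_SQ64 and ExampleInteger are hypothetical names, chosen so the sketch cannot be mistaken for what squirrel.h really contains:

```cpp
#include <cstdint>

// Detect a 64 bit target from compiler-provided macros instead of relying on
// every Makefile to pass the same flag.
#if defined(_LP64) || defined(__LP64__) || defined(_WIN64)
#define EXAMPLE_SQ64 1
typedef std::int64_t ExampleInteger;
#else
#define EXAMPLE_SQ64 0
typedef std::int32_t ExampleInteger;
#endif

// Refuse to compile if the integer type cannot hold a pointer, which is the
// situation behind the "loses precision" error in sqtable.h.
static_assert(sizeof(ExampleInteger) >= sizeof(void*),
              "integer type is too small to hold a pointer");

// Lets calling code (or a test) see which branch was taken.
constexpr bool built_for_64_bit() { return EXAMPLE_SQ64 != 0; }
```

Because the decision is made inside the header, the library and the application that includes it cannot end up with different answers, as long as both are built for the same target.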
The issue of not compiling under Linux with SQUNICODE defined still stands, however.
|
|
-
07-08-2008, 4:43 PM |
-
ats
-
-
-
Joined on 01-17-2007
-
-
Posts 116
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
Here is a patch for Squirrel in UTF8 mode: squtf8.patch. It's built with Squirrel 2.5(work3). I've been using it for some months without problems. It does foreach, len() and [] array lookup the way one could expect for a Unicode string. Behind the scenes it uses some UTF8 iterators to keep from looping while accessing strings with [] or querying for length. Performance also with long strings is good. Regards // ATS.
|
|
-
07-08-2008, 4:53 PM |
-
ats
-
-
-
Joined on 01-17-2007
-
-
Posts 116
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
Oh, I forgot, when testing, activate with
#define SQUTF8
And SQUNICODE should not be defined then.
Regards // ATS.
|
|
-
07-09-2008, 1:15 AM |
-
fagiano
-
-
-
Joined on 06-12-2005
-
-
Posts 455
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
If someone can come up with a reliable way to detect a 64 bit Linux I'll put in something like
#if defined(_LINUX64) || defined(_WIN64) || defined(_XBOX360)
#define _SQ64
#endif
I think that would avoid the _SQ64 issue for the most popular platforms.
About Unicode on Linux, my feeling is that it is not really meant to be used; I have never seen any serious app using UCS4. With squirrel in char* mode you can safely embed UTF8, and I always assumed that was the Unix way (I'm not an expert on localization on X platforms; if someone has suggestions I'm listening).
Alberto
|
|
-
07-09-2008, 4:01 AM |
-
ats
-
-
-
Joined on 01-17-2007
-
-
Posts 116
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
Yes, you can safely embed UTF8, but using it from the script is another issue:
local s = "1€5" // 1, Euro sign, 5
foreach( k,p in s ) print( k+" : "+p+"\n" )
will print
0 : 49
1 : 32
2 : 172
3 : 53
Those are the bytes, not the characters, and that differs from what you get in Squirrel wchar_t mode.
With UTF8 patch it prints:
0 : 49
1 : 8364
2 : 53
With the patch you can process a string character by character again. Now you get the same result when running this script on a Windows wchar_t platform and a Linux UTF8 one.
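To make the bytes versus characters difference concrete, here is a minimal UTF8 decoding sketch. It is not the code from the patch; sequence_length, code_point_count and code_point_at are hypothetical helpers, and the input is assumed to be well-formed, NUL-terminated UTF8 (no validation is done):

```cpp
#include <cstddef>
#include <cstdint>

// Number of bytes in the UTF8 sequence that starts with `lead`.
constexpr std::size_t sequence_length(unsigned char lead) {
    return lead < 0x80 ? 1 : (lead >> 5) == 0x6 ? 2 : (lead >> 4) == 0xE ? 3 : 4;
}

// Length of the string in code points rather than bytes.
constexpr std::size_t code_point_count(const char* text) {
    std::size_t count = 0;
    for (std::size_t pos = 0; text[pos] != '\0';
         pos += sequence_length(static_cast<unsigned char>(text[pos]))) {
        ++count;
    }
    return count;
}

// Code point at character position `index`, walking from the start.
constexpr std::uint32_t code_point_at(const char* text, std::size_t index) {
    std::size_t pos = 0;
    for (std::size_t i = 0; i < index; ++i)
        pos += sequence_length(static_cast<unsigned char>(text[pos]));
    const unsigned char lead = static_cast<unsigned char>(text[pos]);
    const std::size_t n = sequence_length(lead);
    // Keep only the payload bits of the lead byte, then append 6 bits per
    // continuation byte.
    std::uint32_t cp = n == 1 ? lead : lead & (0xFF >> (n + 1));
    for (std::size_t i = 1; i < n; ++i)
        cp = (cp << 6) | (static_cast<unsigned char>(text[pos + i]) & 0x3F);
    return cp;
}
```

For the string "1\xE2\x82\xAC" "5" (the three bytes E2 82 AC are the UTF8 encoding of the Euro sign), code_point_count returns 3 and code_point_at yields 49, 8364 and 53, matching the output quoted for the patched build.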
To do this efficiently for array lookups (s[ix]), one has to use some hidden iterators, so that we avoid stepping from the beginning of the string on each access.
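One way to read "hidden iterators" is a cached cursor that remembers the byte offset of the last character looked up. The sketch below is a guess at the idea, not the patch's implementation; Utf8Cursor and seek are hypothetical names and the input is assumed to be well-formed, NUL-terminated UTF8:

```cpp
#include <cstddef>

// Remembers where the last lookup ended up.
struct Utf8Cursor {
    const char* text;        // NUL-terminated, assumed well-formed UTF8
    std::size_t char_index;  // index of the character the cursor is on
    std::size_t byte_pos;    // byte offset of that character
};

// Continuation bytes have the bit pattern 10xxxxxx.
constexpr bool is_continuation(char byte) {
    return (static_cast<unsigned char>(byte) & 0xC0) == 0x80;
}

// Moves the cursor to character `target`. Forward moves continue from the
// cached position; backward moves restart from the beginning of the string.
constexpr Utf8Cursor seek(Utf8Cursor cursor, std::size_t target) {
    if (target < cursor.char_index) {
        cursor.char_index = 0;
        cursor.byte_pos = 0;
    }
    while (cursor.char_index < target && cursor.text[cursor.byte_pos] != '\0') {
        ++cursor.byte_pos;  // step over the lead byte
        while (is_continuation(cursor.text[cursor.byte_pos])) ++cursor.byte_pos;
        ++cursor.char_index;
    }
    return cursor;
}
```

With a cursor like this, a loop over s[0], s[1], s[2], ... advances one character per lookup instead of rescanning the string each time; only a jump backwards pays for a restart.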
Regards // ATS.
|
|
-
07-09-2008, 2:36 PM |
-
atai
-
-
-
Joined on 08-16-2005
-
-
Posts 90
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
What is Squirrel 2.5(work3)?
|
|
-
07-09-2008, 3:13 PM |
-
ats
-
-
-
Joined on 01-17-2007
-
-
Posts 116
-
-
|
Re: Unicode and AMD64 under Linux with gcc - long post
It's one of these test releases made in the forum (2 months back?)
From squirrel.h:
#define SQUIRREL_VERSION _SC("Squirrel 2.5 work3")
Regards // ATS.
|
|