Let me try to give a summary of the features in

- CLISP 2000-03-06
- Eclipse
- ACL 6.0 (not yet released)
- LispWorks 4.0.1

* Character and string types

In Eclipse, ACL, and LispWorks, the type BASE-CHAR includes only Latin-1 characters, whereas the CHARACTER type includes all of Unicode (16 bit).

In CLISP, BASE-CHAR and CHARACTER are equivalent and include all of Unicode (16 bit). The memory representation of read-only strings (e.g. symbol print names and program literals) is optimized to 1 byte/character if possible.

* Supported external formats of streams

- CLISP 2000-03-06: Around 80 external formats, including all of the ones supported by browsers and Linux locales.
- Eclipse: Only :ASCII (1 byte/character), :UCS (2 bytes/character), and :MULTI-BYTE (locale-dependent multibyte representation, works only on OSes for which wchar_t is Unicode).
- ACL 6.0: Lots of external formats, mostly table-driven.
- LispWorks: Around 10 external formats, including Latin-1, Unicode (2 bytes/character), UTF-8, and the most important Japanese encodings (but not ISO-2022-JP).

Different end-of-line conventions are indicated to OPEN through the :external-format argument in CLISP and LispWorks, and through an extra argument to OPEN in Eclipse.

* Additional API

- CLISP: STRING-WIDTH returns the display width of a string, used by FORMAT ~T.
- Eclipse: none.
- ACL: unknown.
- LispWorks: functions for guessing the encoding of a file (important for Japanese environments).

* FFI support

- CLISP: FFI can pass strings only with single-byte encodings.
- Eclipse, ACL: unknown.
- LispWorks: a few specialized macros for passing strings from/to C.
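The BASE-CHAR versus CHARACTER split described under "Character and string types" can be inspected from portable ANSI Common Lisp. The following is only a minimal sketch: the function names DESCRIBE-CHARACTER-TYPES, FITS-IN-BASE-STRING-P and TO-COMPACT-STRING are hypothetical helpers invented for illustration, not part of any of the implementations listed, and the values they return will differ from one implementation and version to the next.

```lisp
;; Portable ANSI Common Lisp only; no implementation-specific packages are used.
;; All three function names are hypothetical, chosen for this illustration.

(defun describe-character-types ()
  "Return a plist describing how this implementation lays out CHARACTER."
  (list :char-code-limit char-code-limit
        ;; True when every CHARACTER is also a BASE-CHAR, i.e. the two
        ;; types are equivalent, as the summary reports for CLISP.
        :base-char-is-character (values (subtypep 'character 'base-char))))

(defun fits-in-base-string-p (string)
  "True when every character of STRING is of type BASE-CHAR."
  (every (lambda (ch) (typep ch 'base-char)) string))

(defun to-compact-string (string)
  "Return a BASE-STRING with the same characters when possible, else STRING."
  (if (fits-in-base-string-p string)
      (coerce string 'base-string)
      string))
```

Evaluating (describe-character-types) in each Lisp of interest is a quick way to compare it against the summary above. TO-COMPACT-STRING is only a user-level analogue of the space optimization mentioned for CLISP, which happens inside the implementation; whether a BASE-STRING actually occupies less memory is implementation-dependent.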
* new feature: a build-time option (controlled by the :SB-UNICODE keyword feature, enabled by default) for building the system with support for the entire 21-bit character space defined by the Unicode Consortium.

There are a few details on the sbcl-internals cliki, and a more detailed explanation from Christophe Rhodes, which he presented at the European Common Lisp Meeting in April 2005, in this PDF.
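Because :SB-UNICODE is a keyword feature, its presence can be tested at run time by looking in *FEATURES*. The sketch below assumes nothing beyond standard Common Lisp; the function names SB-UNICODE-BUILD-P and FULL-UNICODE-RANGE-P are hypothetical, and the second check simply assumes that "the entire 21-bit character space" corresponds to a CHAR-CODE-LIMIT of at least #x110000.

```lisp
;; Hypothetical helper names; only standard Common Lisp is used.

(defun sb-unicode-build-p ()
  "True when the :SB-UNICODE keyword feature is present in *FEATURES*."
  (and (member :sb-unicode *features*) t))

(defun full-unicode-range-p ()
  "True when character codes can cover all of U+0000 through U+10FFFF."
  (>= char-code-limit #x110000))
```

The same test can be made at read time with the #+sb-unicode and #-sb-unicode reader conditionals, which is the usual choice when code has to compile differently on the two kinds of build.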
Since version 2.35, the CLISP FFI can pass strings in multibyte encodings as well.
Here's a link to Bruno Haible's The Unicode HOWTO
Here are the docs on Unicode.