assimp/contrib/ConvertUTF
aramis_acg a251827cb9 Adding support for Unicode input files to most text file loaders (BVH and MD5 missing for now).
IrrXML receives memory-mapped UTF-8 input data now; its own (faulty) conversion is not used anymore.
aiStrings are explicitly UTF-8 now.
Slight refactorings and improvements.
Adding UTF-8/UTF-16 text files for ASE, OBJ, Collada, AC3D. These contain various Japanese/Chinese character sequences.
Changing assimp_view's node view to display UTF-8 multibyte sequences correctly.

git-svn-id: https://assimp.svn.sourceforge.net/svnroot/assimp/trunk@469 67173fc5-114c-0410-ac8e-9d2fd5bffc1f
2009-08-21 22:49:58 +00:00
ConvertUTF.c
ConvertUTF.h
readme.txt

readme.txt

The accompanying C source code file "ConvertUTF.c" and the associated header
file "ConvertUTF.h" provide for conversion between various transformation
formats of Unicode characters.  The following conversions are supported:

	UTF-32 to UTF-16
	UTF-32 to UTF-8
	UTF-16 to UTF-32
	UTF-16 to UTF-8
	UTF-8 to UTF-16
	UTF-8 to UTF-32

In addition, there is a test harness which runs various tests.
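
As an illustration of what one of the conversions listed above involves, the
following is a minimal, self-contained sketch of encoding a single UTF-32 code
point as UTF-8. It is not the code from "ConvertUTF.c"; the function name
sketch_utf32_to_utf8 is hypothetical and was chosen for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of ConvertUTF.h.
   Encodes one UTF-32 code point as UTF-8 into out (room for 4 bytes).
   Returns the number of bytes written, or 0 if cp is a surrogate value
   or lies above U+10FFFF and therefore cannot be encoded. */
size_t sketch_utf32_to_utf8(uint32_t cp, unsigned char out[4])
{
    if (cp >= 0xD800u && cp <= 0xDFFFu) return 0;  /* surrogate range */
    if (cp > 0x10FFFFu) return 0;                  /* beyond Unicode  */

    if (cp < 0x80u) {                              /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800u) {                             /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0u | (cp >> 6));
        out[1] = (unsigned char)(0x80u | (cp & 0x3Fu));
        return 2;
    }
    if (cp < 0x10000u) {                           /* 3 bytes: 1110xxxx + 2 trailing */
        out[0] = (unsigned char)(0xE0u | (cp >> 12));
        out[1] = (unsigned char)(0x80u | ((cp >> 6) & 0x3Fu));
        out[2] = (unsigned char)(0x80u | (cp & 0x3Fu));
        return 3;
    }
    /* 4 bytes: 11110xxx + 3 trailing */
    out[0] = (unsigned char)(0xF0u | (cp >> 18));
    out[1] = (unsigned char)(0x80u | ((cp >> 12) & 0x3Fu));
    out[2] = (unsigned char)(0x80u | ((cp >> 6) & 0x3Fu));
    out[3] = (unsigned char)(0x80u | (cp & 0x3Fu));
    return 4;
}
```

For example, passing U+20AC (the euro sign) writes the three bytes
E2 82 AC and returns 3.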

The files "CVTUTF7.C" and "CVTUTF7.H" are for archival and historical purposes
only. They have not been updated to Unicode 3.0 or later and should be
considered obsolescent. "CVTUTF7.C" contains two functions that can convert
between UCS-2 (i.e., the BMP characters only) and UTF-7. Surrogates are
not supported and the code has not been tested, so it should be considered
unsuitable for general-purpose use.
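
To show what missing surrogate support means in practice, here is a short
sketch of how a code point outside the BMP is represented in UTF-16 as a
surrogate pair. A UCS-2-only converter has no way to produce or consume such
pairs. The function name sketch_utf32_to_utf16 is hypothetical and is not
taken from "ConvertUTF.c" or "CVTUTF7.C".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of ConvertUTF.h.
   Encodes one UTF-32 code point as UTF-16 into out (room for 2 units).
   Returns the number of 16-bit units written, or 0 if cp is itself a
   surrogate value or lies above U+10FFFF. */
size_t sketch_utf32_to_utf16(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800u && cp <= 0xDFFFu) return 0;  /* reserved for surrogates */
    if (cp > 0x10FFFFu) return 0;                  /* beyond Unicode */

    if (cp < 0x10000u) {                           /* BMP: one unit, same value */
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Supplementary planes: split the 20-bit offset into two 10-bit halves. */
    cp -= 0x10000u;
    out[0] = (uint16_t)(0xD800u + (cp >> 10));     /* high (leading) surrogate */
    out[1] = (uint16_t)(0xDC00u + (cp & 0x3FFu));  /* low (trailing) surrogate */
    return 2;
}
```

For example, U+1F600 becomes the pair D83D DE00, whereas any BMP character
such as U+20AC is stored unchanged in a single unit.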

Please submit any bug reports about these programs here:

	http://www.unicode.org/unicode/reporting.html

Version 1.0: initial version.

Version 1.1: corrected some minor problems; added stricter checks.

Version 1.2: corrected switch statements associated with "extraBytesToRead"
	in 4 & 5 byte cases, in functions for conversion from UTF-8.
	Note: formally, the 4 & 5 byte cases are illegal in the latest
	UTF-8, but the table and this code have always catered for those
	cases, since at one time they were legal.

Version 1.3: updated UTF-8 legality check;
	updated to use UNI_MAX_LEGAL_UTF32 in UTF-32 conversions;
	updated UTF-8 legality tests in harness.c.
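
The kind of legality rules referred to in the version notes above can be
sketched as follows. This is an independent illustration based on the current
definition of well-formed UTF-8, not the check implemented in "ConvertUTF.c";
the function name sketch_is_legal_utf8 is hypothetical. It assumes that the
upper limit implied by UNI_MAX_LEGAL_UTF32 is U+10FFFF, and it rejects the
older 5- and 6-byte forms outright.

```c
#include <stddef.h>

/* Hypothetical helper, not part of ConvertUTF.h.
   Returns 1 if the len bytes at s form exactly one well-formed UTF-8
   sequence, and 0 otherwise. */
int sketch_is_legal_utf8(const unsigned char *s, size_t len)
{
    size_t i;

    /* Sequences longer than 4 bytes were once defined but are now illegal. */
    if (len < 1 || len > 4) return 0;

    /* Every byte after the first must be a continuation byte (10xxxxxx). */
    for (i = 1; i < len; ++i)
        if ((s[i] & 0xC0) != 0x80) return 0;

    switch (len) {
    case 1:
        return s[0] < 0x80;
    case 2:
        /* 0xC0 and 0xC1 could only produce overlong encodings. */
        return s[0] >= 0xC2 && s[0] <= 0xDF;
    case 3:
        if (s[0] < 0xE0 || s[0] > 0xEF) return 0;
        if (s[0] == 0xE0 && s[1] < 0xA0) return 0;  /* overlong form */
        if (s[0] == 0xED && s[1] > 0x9F) return 0;  /* U+D800..U+DFFF */
        return 1;
    default: /* len == 4 */
        if (s[0] < 0xF0 || s[0] > 0xF4) return 0;
        if (s[0] == 0xF0 && s[1] < 0x90) return 0;  /* overlong form */
        if (s[0] == 0xF4 && s[1] > 0x8F) return 0;  /* above U+10FFFF */
        return 1;
    }
}
```

With this sketch, E2 82 AC is accepted, while C0 80 (overlong), ED A0 80
(an encoded surrogate), and F4 90 80 80 (above U+10FFFF) are all rejected.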
 

Last update: October 19, 2004