author: tpearson <tpearson@283d02a7-25f6-0310-bc7c-ecb5cbfe19da> 2010-01-20 01:29:50 +0000
committer: tpearson <tpearson@283d02a7-25f6-0310-bc7c-ecb5cbfe19da> 2010-01-20 01:29:50 +0000
commit: 8362bf63dea22bbf6736609b0f49c152f975eb63
tree: 0eea3928e39e50fae91d4e68b21b1e6cbae25604 /kexi/kexiutils/transliteration_table.readme
Added old abandoned KDE3 version of koffice
git-svn-id: svn://anonsvn.kde.org/home/kde/branches/trinity/applications/koffice@1077364 283d02a7-25f6-0310-bc7c-ecb5cbfe19da
Diffstat (limited to 'kexi/kexiutils/transliteration_table.readme')
-rw-r--r-- kexi/kexiutils/transliteration_table.readme 54
1 files changed, 54 insertions, 0 deletions
diff --git a/kexi/kexiutils/transliteration_table.readme b/kexi/kexiutils/transliteration_table.readme
new file mode 100644
index 00000000..ff00d8ab
--- /dev/null
+++ b/kexi/kexiutils/transliteration_table.readme
@@ -0,0 +1,54 @@
+Transliteration Table README
+----------------------------
+
+1. Rationale: Identifiers within the database or programming languages
+accept only latin-1 characters, digits and the '_' character.
+
+Application developers can enter captions (titles) to give
+objects or variables a meaningful name using the full Unicode set.
+
+Transliteration is used to convert Unicode captions to identifiers
+without losing the meaning of the names.
+
+More info:
+ http://en.wikipedia.org/wiki/Transliteration
+ http://en.wikipedia.org/wiki/Romanization
+
+2. We use a special kind of romanization, as we only allow the characters
+described in 1.
+
+3. Implementation: a transliteration table, generated by the
+generate_transliteration_table.sh shell script, is used
+to transliterate any Unicode character (having code < 65535)
+to an identifier, which gives constant time for converting
+a single character.
+
+The resulting generated code is kept in transliteration_table.{h|cpp} files,
+included by identifier.cpp for use in public utility functions.
+
+For each item, the table (basically a table of C-strings) contains:
+- a NULL string if the resulting conversion has to be the "_" string;
+- a C-string of size 1 or more containing a valid transliteration
+ as described in 1;
+- an empty string "" if the transliteration should return an empty string
+ (can be useful e.g. for soft signs in Cyrillic).
+
+4. Fixes: Because the iconv/recode tools are not fully implemented with
+regard to transliteration to latin-1 (e.g. no good support
+for Greek and Cyrillic/Serbian characters),
+the transliteration_table.cpp file is patched with
+transliteration_table.cpp.patch, which provides fixes written by hand.
+
+If you find invalid or missing transliterations:
+ a) edit transliteration_table.cpp (using a UTF-8-compliant text editor!)
+ - if the transliteration_table.cpp file does not exist,
+ extract it from the transliteration_table.bz2 archive
+ b) run the update_transliteration_table_patch.sh shell script,
+ which will update the transliteration_table.cpp.patch file
+ c) send the transliteration_table.cpp.patch file to the Kexi team
+5. Credits
+ Jaroslaw Staniek <js at iidea.pl>
+ Michael Drueing <michael at drueing.de>
+ Chusslove Illich <caslav.ilic at gmx.net>
+ Michal Svec <rebel at atrey.karlin.mff.cuni.cz>
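
As a rough illustration of the lookup described in section 3 of the README, the sketch below shows how a table of C-strings with the three kinds of entries (NULL, non-empty, empty) could be consulted per character. It is not the Kexi code: the names buildSketchTable and captionToIdentifier, the table size of 0x10000 slots, and the few sample entries are assumptions made for this example only.

```cpp
#include <string>
#include <vector>

// Hypothetical miniature of the generated table: one C-string slot per
// 16-bit character code. The real transliteration_table.cpp is generated
// by a script and is far more complete.
static std::vector<const char*> buildSketchTable()
{
    static char ascii[128][2]; // static storage, zero-initialized
    std::vector<const char*> table(0x10000, nullptr); // nullptr means "_"

    // ASCII letters and digits map to themselves.
    for (int c = 0; c < 128; ++c) {
        const bool keep = (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z')
                       || (c >= 'a' && c <= 'z');
        if (keep) {
            ascii[c][0] = static_cast<char>(c);
            table[c] = ascii[c];
        }
    }

    // A few illustrative entries (assumed, not copied from Kexi).
    table[0x00E9] = "e";  // e with acute
    table[0x0142] = "l";  // Polish l with stroke
    table[0x0436] = "zh"; // Cyrillic zhe: more than one output character
    table[0x044C] = "";   // Cyrillic soft sign: contributes nothing
    return table;
}

// Converts a caption to an identifier with one table lookup per character,
// so each character costs constant time.
std::string captionToIdentifier(const std::u16string& caption)
{
    static const std::vector<const char*> table = buildSketchTable();
    std::string id;
    for (char16_t ch : caption) {
        const char* s = table[ch]; // ch is always below 0x10000
        id += s ? s : "_";         // NULL entry stands for "_"
    }
    return id;
}
```

With these sample entries, a caption made of the two characters U+0142 and U+00E9 becomes "le", and a space, which has a NULL entry, becomes "_".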