Localization and Multilingual

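How to handle message localization, language switching, and multilingual search, including issues with CJK characters, fonts, encodings, translation, and word segmentation.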

Translating Text Strings from Developer Manual

Overview by Maurits van Rees PEP 263 Prevent ASCII error for encode/decode in utf8 into converter: no plone.dexterity dependency

from Products.CMFPlone.utils import safe_unicode
def safe_utf8(s):
    if isinstance(s, unicode):
        s = s.encode('utf8')
    return s

collective.solr: handle non-ASCII values in the query

- query = ' '.join(query.values())
+ query = u' '.join([safe_unicode(val) for val in query.values()]).encode('utf-8')

mosky uniout; Java example for detecting Chinese characters; Big5 to Unicode

sort_on = schema.TextLine(
    title=_(u'label_sort_on', default=u'Sort on'),
    description=_(u"Sort the collection on this index"),
    required=False,
)

Translation status statistics; Language Codes

Language Switching

The Portal Object works together with its Default View to switch languages. Setting language-switcher specifies the default language for the Plone root.

Steps to enable: at /mysite/dexterity-types, tick Multilingual Support under Behaviors; then at /mysite/folder_contents, choose Translate and select Edit for English.

To change the label English (USA) to English, edit plone.i18n/locales/languages.py.

without language root folder RelatedItemsWidget bug

Multilingual solutions; tutorials. Querying regardless of language: path='/' Language='all' Language=''

Translating the form tab label Default

Returning a field default value according to the current language

def default_foo():
    return _('msgid')

...
defaultFactory=default_foo,
@default_value(field=IMySchema['foo'])
def default_foo(data):
    ...

TinyMCE JavaScript Language File: 404 Not Found Wrong portal_url

Personal Preference vs UI Language

plone.restapi/src/plone/restapi/serializer/converters.py:11: DeprecationWarning: getSiteEncoding: `getSiteEncoding` is deprecated. Plone only supports UTF-8 currently. This method always returns "utf-8"

Backporting Python3 Open Encoding

Firefox: The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol.

<meta charset="utf-8" />

collective.linguadomains: to signal Plone which languages it should provide under which domain

Show translations from LinguaPlone if Canonical is available: collective.jsonify

Widgets and Misc

Remove the _u compat Function; collective.storedtranslations: stores translations in the Plone Registry

eea.facetednavigation conditionally query by unicode value widget title

Custom Title Viewlet plone.htmlhead.title: switching the translated site title

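An English collection can only search English content. If it has to find Chinese content, the fallback of last resort is to copy the Chinese content into the English version first.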

Related Items Widget Not Working After Installing PAM: the problem is in plone.app.querystring

Overriding System Translation Values

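According to the documentation, first copy a file such as plone/app/locales/locales/plone.pot into the locales directory of your own package and keep only the entries you want to translate. After generating zh_TW/LC_MESSAGES/plone.po, add the following to develop.cfg: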

zcml +=
    my.package

Untranslated Fixup

bin/buildout -c experimental/i18n.cfg; bin/i18n-find-untranslated details

<tal:comment replace="nothing">something</tal:comment>
<metal:file use-macro="context/widgets/image/macros/image_view">download link</metal:file>
<span tal:content="size">size</span>
<span tal:replace="size">size</span>

Run buildout and check whether the change takes effect.

Testing translation results with mr.developer; google translate API

zope.i18n: Products.CMFPlone/settings.py

Unicode

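An illustrated article: when strings are handled purely in memory, usually only Unicode strings are involved. Once I/O is involved, for example output to a console, file, or network socket, they must be converted to byte strings. The conversion between the two is not a one-to-one mapping, so UnicodeDecodeError becomes the most common error. String to Unicode; String to UTF-8. Combining a byte string with a Unicode string raises an error; decode first, then combine. A common error is UnicodeDecodeError - collective.contentleadimage collective.fingerpointing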
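A minimal sketch of the decode-at-the-boundary pattern described above, written for Python 3, where the two types are `bytes` and `str`. The helper name `to_text` is hypothetical and not part of Plone; `safe_unicode` from Products.CMFPlone.utils plays a similar role on a Plone site.

```python
def to_text(value, encoding='utf-8'):
    """Return text, decoding byte strings and passing text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


raw = u'森'.encode('utf-8')   # bytes, as they would arrive from a file or socket
label = u'Tree: '             # text already held in memory

# Decode first, then combine; both operands are now text.
combined = label + to_text(raw)

# Encode again only when the result leaves the program.
wire = combined.encode('utf-8')
```

Keeping text as Unicode inside the program and converting only at the I/O edge means no implicit ASCII decode is ever attempted.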

patching Archetypes' Field.py, tag method of ImageField:

if isinstance(alt, str):
    alt = alt.decode('utf-8', 'replace')
if isinstance(title, str):
    title = title.decode('utf-8', 'replace')

Theming Controlpanel fails silently when saving parameterExpressions with non-ASCII chars

Python 2 supports a concept of "raw" Unicode literals that don't meet the conventional definition of a raw string. Python 3 has no corresponding concept - the compiler performs no preprocessing of the contents of raw string literals. This matches the behaviour of 8-bit raw string literals in Python 2.
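A small illustration of the Python 3 side of that statement, assuming a Python 3 interpreter. In Python 2, the literal `ur'\u68ee'` would instead be a single character, which is the preprocessing the paragraph refers to.

```python
raw = r'\u68ee'     # raw literal: backslash, 'u', and four hex digits stay as written
cooked = '\u68ee'   # ordinary literal: the escape is turned into one character

raw_length = len(raw)
cooked_length = len(cooked)
```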

Debug Practice: collective.upload

plone.i18n Configuration UniDecode Character Mapping

Example: iCalendar Contentline, tutorial.todoapp, eea.facetednavigation, refactor textIndexer method to avoid Unicode Error, collective.z3cform.widgets, collective.geo.mapwidget Products.feedfeeder Products.LoginLockout alm.solrindex

from Products.CMFPlone.utils import safe_unicode
...
"login names: %s" % safe_unicode(','.join(user_ids))

Subject / Category for Event

Enable Wide Unicode Support in Python < 3.3

ZODB assumes that the transaction description is an ASCII string; a reproducer script follows.

# -*- coding: utf-8 -*-
from persistent.mapping import PersistentMapping

import transaction

from ZODB.DB import DB
from ZODB.FileStorage import FileStorage


def get_app_root():
    fs = FileStorage('Data.fs')
    db = DB(fs)
    conn = db.open()
    db_root = conn.root()

    with transaction.manager:
        app_root = PersistentMapping()
        db_root['app_root'] = app_root

    return db_root['app_root']


if __name__ == '__main__':
    app_root = get_app_root()

    with transaction.manager as t:
        t.note('Something non-ascii like 森')
        app_root['item'] = PersistentMapping()

PloneHelpCenter Example

GroupServer Example

DexterityFileSetter

Dexterity vs Archetypes Title Description Index

Archetypes Calendar Popup

example collective.behavior.localdiazo

Improve Unicode Compatibility between Python 2 and 3

DateTime

Multilingual handling of time strings: toLocalizedTime; Products.CMFPlone.i18nl10n.ulocalized_time; date translation issues; experimental.ulocalized_time

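toLocalizedTime() is defined in Products.CMFPlone/browser/ploneview.py. Passing long_format=1 shows the date plus the time; passing time_only=1 shows only the time (?). The actual work is delegated to Products.CMFPlone/interfaces/translationservice.py and depends on plonelocales.po in plone.app.locales. It claims to follow the strftime format, but the -m format turns out not to be supported.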

MailHost with GMT+8 causes a 16-hour time difference: _mungeHeaders() in Products/MailHost/MailHost.py checks whether a Date header exists and, if not, sets it to DateTime().rfc822(). The other code path that displays ModificationDate, plone/app/layout/viewlets/document_byline.pt, works correctly. Products/CMFDefault/DublinCore.py defines times for different purposes, Products/CMFPlone/TranslationServiceTool.py defines ulocalized_time, and the Unicode Aware Localized Time definition can be found in Products/CMFPlone/i18nl10n.py.

Products.CMFPlone/i18nl10n.py override_dateformat toLocalizedTime() Helper Method

Look for strings of cst in the following files:
Plone 3.x: Zope-2.10.11-final-py2.4/lib/python/DateTime/DateTime.py
Plone 4.x: buildout-cache/eggs/zope.datetime-{$VERSION}/zope/datetime/__init__.py

Why the Display Messed Up

Mixed Encoding

http://www.evanjones.ca/python-utf8.html

http://tarekziade.wordpress.com/tag/qa-python-zope-plone

http://www.netsight.co.uk/blog/unicode-stuff-and-lxml lxml turns multiple lines into one

https://www.transifex.com/projects/p/Plone/language/zh_TW

jarn.jsi18n support

wrong value

Keywords (Tags) with non-ascii characters cannot be removed on dexterity objects, because obj.Subject() on dexterity objects returns a "safe_utf8" encoded list while the request values in TagsAction BrowserView are unicode values: wildcard.foldercontents

plone.app.layout/globals/interfaces.py IPortalState

<div tal:attributes="lang context/@@plone_portal_state/language">
from zope.component import getMultiAdapter
...

portal_state = getMultiAdapter((self.context, self.request), name=u'plone_portal_state')
current_language = portal_state.language()

Normalizer

plone.i18n/normalizer/__init__.py provides IDNormalizer and URLNormalizer, which use lower() for conversion by default; Normalizing IDs

from plone.i18n.normalizer.interfaces import IFileNameNormalizer
from zope.component import queryUtility

normalizer = queryUtility(IFileNameNormalizer)

case1 = u'CåpitalName'
normalizer.normalize(case1)
'CapitalName'

case2 = u'中文'
normalizer.normalize(case2)
'4e2d6587'
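The two results above can be reproduced with a short standalone sketch. This is only an approximation for illustration, not plone.i18n's actual implementation, and the function name `sketch_normalize` is hypothetical: accented Latin letters lose their marks, and characters that remain non-ASCII are replaced by their hexadecimal code point.

```python
import unicodedata


def sketch_normalize(text):
    """Approximate the two results shown above; not plone.i18n's real code."""
    # Decompose accented letters (for example 'å' into 'a' plus a combining
    # ring) and drop the combining marks.
    decomposed = unicodedata.normalize('NFKD', text)
    stripped = ''.join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Characters that are still non-ASCII become their hex code point.
    return ''.join(ch if ord(ch) < 128 else '%x' % ord(ch) for ch in stripped)


latin = sketch_normalize(u'CåpitalName')
cjk = sketch_normalize(u'中文')
```

The hex fallback explains why a Chinese title yields an ID such as 4e2d6587: U+4E2D and U+6587 are the code points of the two characters.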

URL Normalization PinYin

Normalize Unicode

unnecessary normalization example

wildcard.foldercontents name normalizer file upload atreal.massloader IFileNameNormalizer CMFPlone.utils.normalizeString

Tools

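The most commonly used tool is i18ndude, a program that assists with gettext translation; it can be installed through buildout.cfg. Another tool, lingua, offers similar functionality.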

parts =
        ...
        i18ndude

[i18ndude]
recipe = zc.recipe.egg
eggs = i18ndude

Batch job example

$ i18ndude rebuild-pot --pot ./my.pkg.pot --merge my.pkg-manual.pot --create my.pkg ../templates || exit 1
$ i18ndude sync --pot ./my.pkg.pot ./*/LC_MESSAGES/my.pkg.po

Since 3.4.5 i18ndude supports chameleon, so there should be no need to "modify your local zope.tal".

slc.linguatools helps with batch processing of translation work.

Keep PO Files Line Length at 80 Characters: PO_MAX_WIDTH Environment Variable

collective.googleanalytics example; Dynamic Content Translation; collective.pwexpiry: if a Message ID contains a variable value, the value is evaluated before zope.i18n translates the message, so the message never gets translated.

IStatusMessage(self.REQUEST).add(
    _(u'account_locked',
       default = u'Your account has been locked due to too many invalid '
        'attempts to login with a wrong password. Your account will '
        'remain blocked for the next ${hrs} hours. You can reset your '
        'password, or contact an administrator to unlock it, using '
        'the Contact form.', mapping={'hrs': user_disabled_time}
        ), type='error'
)
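Why the mapping matters can be sketched without Zope. The catalog and the `translate` function below are hypothetical stand-ins for a compiled .po file and for zope.i18n, used only to show the order of operations: look up a stable message ID first, substitute the ${name} variables afterwards.

```python
from string import Template

# Hypothetical one-entry catalog standing in for a compiled .po file.
CATALOG = {
    'account_locked': u'帳號已鎖定 ${hrs} 小時',
}


def translate(msgid, default, mapping=None):
    """Look up msgid, fall back to default, then substitute ${name} variables."""
    text = CATALOG.get(msgid, default)
    return Template(text).safe_substitute(mapping or {})


# Stable msgid plus mapping: the lookup succeeds, then the value is filled in.
good = translate('account_locked', u'Locked for ${hrs} hours.', {'hrs': 24})

# Value baked into the msgid: no catalog entry matches, so the default is used.
bad = translate('account_locked_%d' % 24, u'Locked for %d hours.' % 24)
```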

Disabling translation of attributes: test code

<img src="${view/image_url}"
     alt="${view/image_caption}"
   title="${view/image_caption}"
 i18n:ignore-attributes="alt;title" />

Plone 3 vs Plone 4 distributing compiled translations override core translation

[instance]
# ...
environment-vars =
    zope_i18n_compile_mo_files true

zest.pocompile

precedence of messages from multiple PO files

Using lingua instead of i18ndude; changing a package's i18n domain; copy-content-to

CSV

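When a UTF-8 CSV file is opened directly in Excel, Excel assumes ANSI encoding by default, which produces garbled text. Forcing a conversion from UTF-8 to ANSI can make Simplified Chinese or Japanese characters disappear.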
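One common workaround, which is not taken from the source, is to write the file with a UTF-8 byte order mark so that Excel can detect the encoding. A minimal Python 3 sketch, with a made-up file name and made-up rows:

```python
import csv
import os
import tempfile

rows = [[u'編號', u'名稱'], [u'1', u'森']]

path = os.path.join(tempfile.mkdtemp(), 'export.csv')

# 'utf-8-sig' prepends the UTF-8 byte order mark (EF BB BF) to the file.
with open(path, 'w', encoding='utf-8-sig', newline='') as f:
    csv.writer(f).writerows(rows)

with open(path, 'rb') as f:
    head = f.read(3)

# Reading with the same codec strips the mark again.
with open(path, encoding='utf-8-sig', newline='') as f:
    round_trip = list(csv.reader(f))
```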

transmogrifier imports big5 encoded CSV.

UnicodeDecodeError: 'utf8' codec can't decode byte 0xa9 in position 0: invalid start byte
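The error above is what happens when Big5 bytes are decoded as UTF-8. A small Python 3 sketch with a made-up two-character cell value; the variable names are illustrative only:

```python
big5_bytes = u'中文'.encode('big5')   # stand-in for one cell read from a Big5 CSV

try:
    big5_bytes.decode('utf-8')
    utf8_failed = False
except UnicodeDecodeError:
    # Big5 lead bytes are not valid UTF-8 start bytes here.
    utf8_failed = True

# Decoding with the codec the file was actually written in succeeds.
text = big5_bytes.decode('big5')
```

In an import pipeline, the decode step belongs right where the rows are read, so that everything downstream sees text only.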

MySQL has a 3 byte limit on utf-8 characters: incorrect string value
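The limit applies to MySQL's legacy `utf8` character set; characters outside the Basic Multilingual Plane, including some rare CJK ideographs, need 4 bytes in UTF-8. A hedged Python 3 sketch for spotting such characters before insertion; the function name `needs_four_bytes` is hypothetical:

```python
def needs_four_bytes(text):
    """Return the characters in text whose UTF-8 encoding is 4 bytes long."""
    return [ch for ch in text if len(ch.encode('utf-8')) == 4]


# U+4E2D fits in 3 bytes; U+20000 (CJK Extension B) needs 4.
sample = u'中\U00020000'
offenders = needs_four_bytes(sample)
```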

Migration

indexes mixing string and unicode.

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 14:
 ordinal not in range(128)

UnicodeDecodeError after Plone 3.3.5 to 4.0.5 migration

http://www.zopyx.de/blog/perhaps-a-useful-plone-4-migration-hint-clear-the-catalog-first

Plone 5 relies on Chameleon's support for putting a full dictionary in tal:attributes to specify a number of different data attributes at once, but the TAL parser used by i18ndude does not support that.

Catalog Indexes

Unicode, Catalog Index, SearchableText, Archetypes Objects

Fuzzy Search

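The challenge of Chinese full-text search lies in effective word segmentation. Chinese grammar offers little formal structure to rely on, so methods that work well in general NLP do not necessarily apply to Chinese. Statistical approaches are prone to precision problems. With PAT-tree, processing time grows as the data volume grows.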

Chinese word segmentation: comparison of Chinese tokenizers; Chinese word segmentation systems; Smart segmentation

Unicode encoding and decoding

http://plone.org/documentation/manual/plone-community-developer-documentation/troubleshooting/unicode zope.schema._field.Choice.fromUnicode

Python 2.x does not clearly distinguish 8-bit strings (byte data) from 16-bit Unicode strings (character data). Errors arise when a program reads content that mixes the two kinds of data:

SyntaxError: Non-ASCII character '\xe6' in file
/home/marr/Plone/zinstance/src/my.package/my/package/config.py
on line 14, but no encoding declared;

The message should include the hint: see http://www.python.org/peps/pep-0263.html for details.

# -*- coding: utf-8 -*-

icalendar

- if values and not isinstance(values, str):
+ if values and not isinstance(values, basestring):
    if hasattr(values, 'to_ical'):
      values = values.to_ical()

Graceful Handling UnicodeEncodeError http://docs.python.org/howto/unicode Is data read through open() of type str or unicode?

This tip should go to "Localization Tips"

Contribute to Core Translations

environment-vars =
    zope_i18n_compile_mo_files true

body_plain transforms.convertTo()

LinguaPlone Instruction short name issue Remove Item From reference_catalog if ID is Gone

http://pypi.python.org/pypi/raptus.multilanguageplone

Mockup Translation

ATFieldProperty Title

http://code.google.com/p/dexterity/issues/detail?id=144

http://dev.plone.org/plone/changeset/46983

Examples: collective.referencedatagridfield collective.searchandreplace configure.zcml

JavaScript

JavaScript jarn.jsi18n Javascript translation strings are defined in method getTranslatedJSLabels and returned as a string. The Content-Type header must be set to application/x-javascript or it defaults to text/plain, and the javascript resource is -- correctly -- rejected by modern security conscious browsers.

JSVariables View Translation UnicodeError

for key in messages:
  template = "{0}{1}: '{2}',\n".format(template, key, msg.encode('utf-8'))

Unicode Error in Page Template collective.recipe.template

Changing Workflow in Multilingual Site Works for Only One Language

dexterity vocabulary

Field content that does not need translation; Google App Engine Upload Filename

autoinclude only plone plugins so zope.i18n doesn't conflict with zope2 ConfigurationConflictError

en_GB Locale Example

Graceful Handling Unicode Encode Error; the date shows as 28 month_aug_abbr 2015 instead of 28 Aug 2015

SQL Encoding

PostgreSQL Encoding

UsersOverviewControlPanel

collective.xmpp.chat uses UsersOverviewControlPanel from plone.app.controlpanel to search accounts. When Plone PAS reads LDAP data that contains mixed encodings, a UnicodeDecodeError occurs.

MonkeyPatch

RhaptosSite handles unicode data through a monkeypatch.py approach.

return result.encode('utf-8')

Schemata

Specifying the schemata as follows raises an error:

schemata = '附表'
ERROR Zope.SiteErrorLog 1297222850.410.655029907557
 http://localhost:8080/mysite/myfolder/
 portal_factory/MyType/mytype.2011-02-09.7742577101/base_edit
Traceback (innermost last):
  Module ZPublisher.Publish, line 119, in publish
  Module ZPublisher.mapply, line 88, in mapply
  Module ZPublisher.Publish, line 42, in call_object
  Module Products.CMFPlone.FactoryTool, line 379, in __call__
  Module ZPublisher.mapply, line 88, in mapply
  Module ZPublisher.Publish, line 42, in call_object
  Module Products.CMFFormController.FSControllerPageTemplate, line 90, in __call__
  Module Products.CMFCore.FSPythonScript, line 196, in _exec
  Module None, line 6, in validate_base
   - <FSControllerValidator at /mysite/validate_base used for
      /mysite/myfolder/portal_factory/MyType/mytype.2011-02-09.7742577101>
   - Line 6
  Module Products.Archetypes.BaseObject, line 510, in validate
  Module Products.Archetypes.Schema, line 562, in validate
  Module UserDict, line 17, in __getitem__
KeyError: '\xe9\x99\x84\xe8\xa1\xa8'

Use a lowercase English name instead, for example appendix, then edit plone.app.locales/i18n/plone-zh-tw.po and manually add the following entry:

#. Default: "Appendix"
#: ./MyType/content/mytype.py
msgid "label_schema_appendix"
msgstr "附表"

Action Title

https://bugs.launchpad.net/zope-cmf/+bug/267356

Edge Condition

https://github.com/collective/collective.js.jqueryui/commit/74ce4cb55e0982c42e60b217be19660eb3ba7769#commitcomment-1734757

Python 3

http://plope.com/Members/chrism/python_2_vs_python_3_str_iter

http://lucumr.pocoo.org/2010/1/7/pros-and-cons-about-python-3

plone.app.multilingual

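Plone 5 ships with it installed, but it is not enabled by default. On Plone 4, adding plone.app.multilingual directly to develop.cfg and running bin/buildout downloads p.a.multilingual 5.0.6 and z3c.relationfield 0.6.3, then fails with this error: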

Version and requirements information containing plone.app.layout:
  [versions] constraint on plone.app.layout: 2.3.17
  Requirement of plone.app.multilingual: plone.app.layout!=2.6.0,!=2.6.1,!=2.6.2,>=2.5.22
While:
  Installing client1.
Error: The requirement ('plone.app.layout!=2.6.0,!=2.6.1,!=2.6.2,>=2.5.22')
 is not allowed by your [versions] constraint (2.3.17)

Originally there is only the plone.app.i18n.locales.languageselector viewlet. Enabling plone.app.multilingual adds the plone.app.multilingual.languageselector viewlet, which fully replaces the former, so hide the former through /@@manage-viewlets.

multilingual framework sprint discussion

plone.dexterity = 2.2.4 might support zope.schema context aware default factories for behaviors with custom factories

Tried plone.multilingual + plone.app.multilingual + plone.multilingualbehavior

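Early plone.app.multilingual 2.0.x releases require plone.app.contenttypes to be enabled first; otherwise the error ValueError: undefined property 'behaviors' appears. Enabling plone.app.contenttypes in turn requires running @@atct_migrator to complete the upgrade, after which plone.app.multilingual can be enabled. The installation may require plone.theme > 2.1.4; Plone 4.3.7 uses plone.theme = 2.1.5 by default.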

p.a.multilingual 3.0.5 (2015-08-20) Move @@multilingual-selector registration from PloneRoot to Navigation root. This allows to hide language folders in nginx and to use different domains.

Nginx redirect configuration for language switching

Plone needs the archetypes.multilingual package only when the [archetypes] option is enabled.

Viewlet configuration: once plone.app.multilingual.languageselector has been added, plone.app.i18n.locales.languageselector can be hidden through @@manage-viewlets.

In Plone 5, a LanguageRootFolder is also an INavigationRoot, so plone.app.querystring already restricts queries to the current INavigationRoot if no other path is given. It is not clear how this is handled in the sitemap, search, and similar views. At least, neither Language='all' nor Language='' has any positive effect on the search; each of them results in an empty result.

Language get/set via a unified adapter

from plone.app.multilingual.interfaces import ILanguage
language_get = ILanguage(context).get_language()
language_set = ILanguage(context).set_language('zh')

Independent Field

Dexterity: Directive, Supermodel, Native

xmlns:lingua="http://namespaces.plone.org/supermodel/lingua"

<field name="myField" type="zope.schema.TextLine" lingua:independent="true">
  <description />
  <title>myField</title>
</field>

Language Independent Field Dexterity LIF-defaultvalueprovider

Use Babel to Translate Your Python Package

jquery + responsive theme

Access Translated Object's Attribute

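# In PAM 1.x the interface lives in plone.multilingual.interfaces instead.
from plone.app.multilingual.interfaces import ITranslationManager

# Copy the creation date from each Chinese item to its English translation.
for i in folder.objectIds():
    zh = folder[i]
    print "zh: ", i
    en = ITranslationManager(zh).get_translation('en')
    en.creation_date = zh.creation_date
    en.reindexObject()
    print "en: ", i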

Local Time

Date display: Localized Date Time DateI18nWidget

from zope.app.form.browser import DateI18nWidget
from zope.i18n.format import DateTimeParseError
from zope.app.form.interfaces import ConversionError

class MyDateI18nWidget(DateI18nWidget):
    displayStyle = None

    def _toFieldValue(self, input):
        if input == self._missing:
            return self.context.missing_value
        else:
            try:
                formatter = self.request.locale.dates.getFormatter(
                    self._category, (self.displayStyle or None))
                return formatter.parse(input.lower())
            except (DateTimeParseError, ValueError) as v:
                raise ConversionError(_("Invalid datetime data"),
                    "%s (%r)" % (v, input))

class SimuladorForm(PageForm):
    ...
    form_fields['start_date'].custom_widget = MyDateI18nWidget

Can't Subtract Offset-Naive and Offset-Aware DateTimes z3c.form plone.app.event

References

http://www.fileformat.info/info/unicode/

tutorial.todoapp

Edge Condition; Discourse Chinese search

for setting local panel managers on INavigationRoot instead of ISiteRoot. Useful eg. with modules for multilingual content: collective.panels

work with LinguaPlone: Products.PloneKeywordManager

Symfony Admin Generator Example

Rails Example: uses the rails-i18n module with YAML format

raptus.multilanguageconstraint provides a field that restricts display to specific languages

web font across platform

ERROR plone.app.viewletmanager rendering of plone.contentviews
 in plone.contentactions fails:
 ('Could not adapt', <PloneSite at /mysite>,
 <InterfaceClass plone.app.multilingual.interfaces.ITranslationManager>

I ran into the problem above; it seemed to be resolved after reinstalling. The message is kept here for reference.

BytesIO vs StringIO in Python 2.7

In Python 2.x, "string" means "bytes", and "unicode" means "string". You should use the StringIO or cStringIO modules. The mode will depend on which kind of data you pass in as the buffer parameter.
http://stackoverflow.com/questions/1279244/bytesio-with-python-v2-5
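For comparison, the `io` module (available since Python 2.6 and unchanged in Python 3) provides stricter buffers that refuse the wrong kind of data. A minimal sketch that runs on both Python 2.7 and Python 3; the variable names are illustrative only:

```python
import io

byte_buf = io.BytesIO()
byte_buf.write(u'森'.encode('utf-8'))   # BytesIO accepts only byte strings

text_buf = io.StringIO()
text_buf.write(u'森')                   # io.StringIO accepts only text

try:
    text_buf.write(b'bytes')            # wrong type for a text buffer
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

Unlike the older StringIO and cStringIO modules, these classes fail immediately on a type mismatch instead of deferring the problem to a later implicit conversion.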

Handling Chinese in CSS

tagcloud-widget-utf8.py

pyramid kotti language selector

folder = site.zh.myfolder
mylist = [
'2004071702'
]

for i in mylist:
    zh = folder[i]
    print "zh: ", i
    en = ITranslationManager(zh).get_translation('en')
    en.creation_date = zh.creation_date
    en.reindexObject()
    print "en: ", i

The following is the old approach; with it, the Chinese content takes effect only after its language is set explicitly (update_language).

Available language (Untranslated languages from the current content)
update_language

reader = unicode_csv_reader(f, dialect)
for row in reader:
    folder = mysite.zh.myfolder
    a_zh = createContentInContainer(folder, 'MyType',
           id=str(row[0]), title=row[1])
    wftool.doActionFor(a_zh, 'publish')
    a_zh.reindexObject()
    a_en = api.translate(a_zh, 'en')
    a_en.id = str(row[0])
    a_en.title = row[3]
    wftool.doActionFor(a_en, 'publish')
    a_en.reindexObject()

ProxyPass: BereqURL /VirtualHostBase/https/mysite.com/Plone/en/VirtualHostRoot/@@overview-controlpanel

Migration Plone4 to Plone5 - AttributeError: use_content_negotiation

Contents for Different Target Groups ConstraintNotSatisfied when upgrading to Plone 5.1b4 with PAM: de vs u'de'

i18ndude removes plone.i18n dependency: only a mapping language code to language name needed, iso-639 pycountry could be a base

Missing Chinese Character Sets

$ i18ndude rebuild-pot --pot locales/my.content.pot --create my.content .
Warning: msgid 'description' in ./profiles/default/types/Data.xml already exists with a different default (bad: Edit, should be: View)

Unicode Error "unicodeescape" codec can't decode bytes... Cannot open text files in Python 3

translation_group tg_ca Get the needed Translation Group accordingly

https://community.plone.org/t/about-raptus-multilanguagefields-plone-5-and-beyond/6425