CJK characters word wrap problem
-
- Posts: 51
- Joined: 17.08.2022 13:42
I remember posting this problem hours ago, but I can't find it now. Very strange. So I am rewriting it here, a little shorter.
The example figure shows that CudaText wraps the line with too many spaces when it encounters mixed English/CJK characters.
My question is whether this problem will be resolved. I think it's hard, so I can accept any result.
Thank you.
-
- Posts: 2265
- Joined: 25.08.2021 18:15
-
- Posts: 2265
- Joined: 25.08.2021 18:15
I added these 2 CJK chars to the option value using Options Editor Lite. user.json:
Code: Select all
"nonword_chars": "-+*=/\\()[]{}<>\"'.,:;~?!@#$%^&|`\u2026\uff0c\u3002\u4e00",
-
- Posts: 51
- Joined: 17.08.2022 13:42
main Alexey wrote: 1. What is the language rule: can CJK characters be split at any place, or do we need to split only at space/tab positions?
Answer:
1. CJK characters can be split at any place, except for the following exceptions.
2. A line must not be split before:
Code: Select all
, 。 ;!?: ” ’,.;!?:
3. A line must not be split after:
Code: Select all
“ ‘ " '
main Alexey wrote: 2. What if you add the CJK "comma" (I see it over the red underline on your pic) and the CJK "small circle" to the option "nonword_chars"?
It's much better than before. I think when the first rule is applied, the result will be perfect.
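To make the three rules above concrete, here is a rough Python sketch of a greedy wrapper that follows them. It is only an illustration, not how CudaText does it. The function names, the use of fullwidth forms for the listed punctuation, the restriction to the basic 4E00-9FFF block, and counting width in characters rather than screen columns are all assumptions of this sketch.

```python
# Assumption: fullwidth forms of the listed punctuation, plus the ASCII forms.
NO_BREAK_BEFORE = set("\uff0c\u3002\uff1b\uff01\uff1f\uff1a\u201d\u2019,.;!?:")
NO_BREAK_AFTER = set("\u201c\u2018\"'")

def is_cjk(ch):
    # Assumption: only the basic CJK Unified Ideographs block, for brevity.
    return "\u4e00" <= ch <= "\u9fff"

def can_break(prev, nxt):
    """True if a line may be broken between prev and nxt."""
    if nxt in NO_BREAK_BEFORE or prev in NO_BREAK_AFTER:
        return False          # rules 2 and 3
    if prev in " \t":
        return True           # ordinary break after whitespace
    return is_cjk(prev) or is_cjk(nxt)  # rule 1

def wrap(text, width):
    """Greedy wrap; width is counted in characters, not screen columns."""
    lines = []
    start = 0
    while len(text) - start > width:
        cut = start + width
        # Walk back to the nearest legal break position.
        while cut > start and not can_break(text[cut - 1], text[cut]):
            cut -= 1
        if cut == start:
            cut = start + width  # no legal break found: hard split
        lines.append(text[start:cut])
        start = cut
    lines.append(text[start:])
    return lines

print(wrap("\u4f60\u597d\uff0c\u4e16\u754c", 2))
```

With a width of 2, the comma stays attached to the character before it instead of starting a new line. A real editor would also have to measure CJK characters as double-width cells, which this sketch ignores.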
-
- Posts: 2265
- Joined: 25.08.2021 18:15
-
- Posts: 51
- Joined: 17.08.2022 13:42
main Alexey wrote: So CJK text can be wrapped at almost any position, so the app needs special handling for CJK word-wrap. No wish to make the code so complex. Maybe later.
The editors Geany and Sublime Text also don't handle this CJK issue well. Tested on the copy/pasted CudaText webpage.
Yes, few text editors handle CJK word-wrap very well. That is why I said I can accept any result.
Splitting at CJK punctuation makes it not so ugly now. In extreme cases, I can turn to VS Code temporarily. Thanks a lot.
-
- Posts: 2265
- Joined: 25.08.2021 18:15
I have now tried to improve it: I added special code for these 3 Unicode ranges:
- CJK Unified Ideographs, 4E00-9FFF: common
- CJK Unified Ideographs Extension A, 3400-4DBF: rare
- CJK Compatibility Ideographs, F900-FAFF: duplicates, unifiable variants, corporate characters
Test the Windows demo from http://uvviewsoft.com/c/ . Is it better?
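For illustration only, a check against these three ranges could look like the Python sketch below. The actual CudaText code is not shown in this thread, so the names CJK_RANGES and is_cjk_ideograph are made up for the example.

```python
# The three ranges listed above, as inclusive (first, last) code point pairs.
CJK_RANGES = (
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
)

def is_cjk_ideograph(ch):
    """True if the single character ch falls into one of the three ranges."""
    cp = ord(ch)
    return any(first <= cp <= last for first, last in CJK_RANGES)

print(is_cjk_ideograph("\u4e2d"), is_cjk_ideograph("A"), is_cjk_ideograph("\u3002"))
```

Note that CJK punctuation such as the full stop U+3002 lies outside these ranges, so it still needs separate handling, for example through "nonword_chars".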
-
- Posts: 2265
- Joined: 25.08.2021 18:15
-
- Posts: 51
- Joined: 17.08.2022 13:42
main Alexey wrote: I tried now to improve it - added special code for these 3 unicode ranges
- CJK Unified Ideographs, 4E00-9FFF: common
- CJK Unified Ideographs Extension A, 3400-4DBF: rare
- CJK Compatibility Ideographs, F900-FAFF: duplicates, unifiable variants, corporate characters
test the windows demo from http://uvviewsoft.com/c/ , better?
I can't believe it! You did it! It works! You work so hard, so quickly, and so productively! Thank you very, very much!
I will tell a very old CudaText user about it. He is a powerful technical writer in China. Recently he saw my posts about CudaText on zhihu.com and posted a comment on them. He said he has been using CudaText as a Total Commander plugin for many years, although not very frequently. He pointed out that CudaText could not handle CJK word wrap well and hoped you could improve it, too. Now I will ask him to write a promotional article, since you have so generously provided this convenience for us.