No. : 30 of 73
From : Michiel van der Vlist 2:280/5555 09 Mar 25 11:42:16
To : Sergey Dorofeev 09 Mar 25 14:15:02
Subj : UTF-8 nodelist report
----------------------------------------------------------------------------------
@MSGID: 2:280/5555 67cd7088
@REPLY: 2:5020/12000 4f5391fe
@TID: FMail-W32 2.3.0.1-B20240319
@TZUTC: 0100
@CHRS: UTF-8 4
Hello Sergey,
On Friday March 07 2025 15:01, you wrote to me:
MV>> He insists on entering the 'a' and 'o' with umlaut in Säve and
MV>> Björn in 202/208 in Latin-1 in the normal ASCII nodelist. So in
MV>> the ASCII list they are replaced by question marks by MakeNl. In
MV>> the UTF list which in his case is just a copy of the ASCII
MV>> segment submitted, they appear "as submitted" and the line is
MV>> flagged as in error by my program.
SD> I think it is not very contradictory. If he succeeds in entering
SD> non-ASCII chars in nodelist (making it full 8-bit), encoding must be
SD> defined.
The encoding for the regular nodelist IS defined: ASCII and ASCII
only. For backward compatibility it must stay that way. There still may
be nodelist processing software around that breaks when the highest bit
is not zero. That is why MakeNl (without the ALLOW8BIT setting)
substitutes a question mark for characters with the highest bit set.
The encoding for the UTF nodelist is also defined: UTF-8.
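To make the substitution concrete, here is a minimal sketch in Python. It is an illustration only, not MakeNl's actual code; the function name `to_ascii_line` is made up for this example.

```python
def to_ascii_line(raw: bytes) -> bytes:
    """Replace every byte with the highest bit set (>= 0x80) by b'?'.

    Hypothetical helper: it mimics the behaviour described above,
    it is not taken from MakeNl.
    """
    return bytes(b if b < 0x80 else ord("?") for b in raw)


# "Säve" in Latin-1: the a-umlaut is the single byte 0xE4.
latin1_name = b"S\xe4ve"
print(to_ascii_line(latin1_name))  # b'S?ve'
```

Pure ASCII input passes through unchanged, so software that expects a 7-bit nodelist never sees a byte with the highest bit set.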
SD> Ok, if it will be latin-1, but let it be only for European
SD> segments. That is, let's define encoding on per-region or even
SD> per-network basis.
Very bad idea. Having more than one encoding within the same file
is a bad idea anyway, not just for the nodelist but for ANY text
file.
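A small Python sketch of why this goes wrong: the same name gives different bytes in Latin-1 and UTF-8, and a reader that assumes the wrong encoding either rejects the line or shows garbage. The helper name `is_valid_utf8` is made up for this example, and I do not claim this is how any existing nodelist checker works.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


latin1 = b"Bj\xf6rn"      # "Björn" encoded in Latin-1
utf8 = b"Bj\xc3\xb6rn"    # "Björn" encoded in UTF-8

print(is_valid_utf8(latin1))  # False: 0xF6 cannot start a UTF-8 sequence
print(is_valid_utf8(utf8))    # True

# The same UTF-8 bytes read by software that assumes CP866
# come out as two box-drawing characters instead of one letter:
misread = utf8.decode("cp866")  # 'Bj├╢rn'
```

With one encoding per file a check like this is enough. With an encoding per zone, region or net, every reader would first have to work out which decoder applies to which line.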
SD> So when importing nodelist, it must be split back on segments and
SD> correctly transcoded. E.g. default encoding is ASCII, so Zone records
SD> must be ASCII. But zone may specify own encoding, so regions in it may
SD> use it in own record, and define encoding for underlying regions.
SD> Further, region record use zone encoding, and may define encoding for
SD> networks. Network record use region encoding and may define encoding
SD> for node records.
Are you serious? You really still want every back alley in Fidonet
to have its own 8 bit encoding? With all the forward and backward
re-encoding and other limitations? C'mon... That's chaos! Unicode was
invented for the very purpose of getting rid of all this codepage shit.
Why do you think Microsoft went full Unicode internally? Three
decades ago. Why do you think 99% of what is on the web is UTF-8? To
get rid of the mess of all the hundreds of 8 bit encodings that
floated around!
Nah, as far as the nodelist goes, it is either just ASCII or
UTF-8. No more codepage shit.
Cheers, Michiel
--- GoldED+/W32-MSVC 1.1.5-b20170303
* Origin: Nieuw Schnøørd (2:280/5555)
SEEN-BY: 50/109 154/10 203/0 221/6 240/5832 280/464 5555 292/789 301/1 310/31
SEEN-BY: 341/66 460/58 5015/46 5019/40 5020/715 830 848 1042 4441 12000
SEEN-BY: 5030/49 1081 1474 5058/104 5061/133 5075/35
@PATH: 280/5555 5020/1042 4441