Thursday, March 22, 2012

Newbie: sending German characters on TCP link

Hi,
I have a .NET application that sends an XML string as bytes over a TCP
connection. It has used ASCII encoding so far and everything was fine. Now,
on a German OS, German characters arrive as question marks in the received
string. I guess that's because ASCII encoding was used and a Unicode encoding
should be used instead. But would byte ordering (little endian or big
endian) then be a problem, since the TCP stream could be consumed by
both Java and .NET clients? What can be done to ensure that the
Unicode-encoded XML stream can be decoded by both .NET and Java clients?
And which .NET encoding class should be used: UTF-8, UTF-32, ...? Is there a
sample anywhere?
Thanks in advance and regards
Navin

On Feb 27, 7:44 pm, "Navin Mishra" <navin.mis...@.siemens.com> wrote:
> Hi,
> I've a .NET application that sends XML string as bytes on a TCP
> connection. It had used ASCII encoding so far and everything was fine. Now
> on a German OS, German characters were sent as question mark in the received
> string. I guess that's because ASCII encoding was used and unicode encoding
> should be used instead. But then would byte ordering (little endian or big
> endian) be a problem because the received TCP stream could be consumed by
> both a Java and .NET clients ? What could be done to ensure that the
> received unicode encoded XML stream be decoded by both .NET and Java clients
> ? And, which encoding .NET class be used utf8, utf32,.. ? Is there a sample
> anywhere ?
>
This group is for ASP.NET, but regarding your problem:
First of all, make sure the XML declares its encoding:
sXML = "<?xml version=""1.0"" encoding=""windows-1252""?>.........."
Then pass the same encoding to whatever writes the data, so that the bytes
on the wire match the declaration. For example:
' Write the string with the same encoding the XML declaration names
myWriter = New System.IO.StreamWriter(xmlRequest.GetRequestStream(),
System.Text.Encoding.GetEncoding(1252))
myWriter.Write(sXML)
I used code page 1252 here (Windows Western European, which covers the
German characters), but you could consider using UTF-8 instead.

Thus wrote Navin,

> Hi,
> I've a .NET application that sends XML string as bytes on a TCP
> connection. It had used ASCII encoding so far and everything was fine.
> Now on a German OS, German characters were sent as question mark in
> the received string. I guess that's because ASCII encoding was used
> and unicode encoding should be used instead.
Absolutely. If not Unicode, then at least some 8-bit encoding like ISO
Latin-1 (ISO-8859-1).
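
To illustrate where the question marks come from, here is a minimal Java sketch (Java being one of the client platforms mentioned in the question). The class name AsciiLossDemo and the helper roundTrip are made up for this example. It encodes a string to bytes and decodes it again, as a sender and receiver would. Java's String.getBytes substitutes '?' for characters the charset cannot represent, which matches the symptom described for the .NET sender.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class AsciiLossDemo {
    // Hypothetical helper: encode the text to bytes with the given charset,
    // then decode it again, as happens between sender and receiver.
    public static String roundTrip(String text, Charset charset) {
        byte[] wire = text.getBytes(charset);
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        String text = "Gr\u00fc\u00dfe"; // German "Gruesse" with u-umlaut and sharp s

        // ASCII has no code for u-umlaut or sharp s, so each becomes '?'.
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII)); // Gr??e

        // Latin-1 and UTF-8 can both represent these characters.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1).equals(text)); // true
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text));      // true
    }
}
```

The loss happens at encoding time on the sender, so the receiver cannot recover the original characters no matter which encoding it decodes with.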

> But then would byte
> ordering (little endian or big endian) be a problem because the
> received TCP stream could be consumed by both a Java and .NET clients
> ?
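No. UTF-8 is a sequence of single bytes, so it is the same on both Big
Endian and Little Endian systems. UTF-16 and UTF-32 do depend on byte order,
but a stream in those encodings can start with a byte order mark (BOM) that
lets the receiver pick up the appropriate byte order, as long as the sender
actually writes it.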
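
The byte order point can be checked with a small Java sketch. The class name ByteOrderDemo and the helper hex are invented for this example. It prints the bytes for the single character u-umlaut (U+00FC) in several encodings, then decodes a little endian UTF-16 sequence that starts with a BOM.

```java
import java.nio.charset.StandardCharsets;

public class ByteOrderDemo {
    // Hypothetical helper: format bytes as space-separated uppercase hex.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String u = "\u00fc"; // u-umlaut

        // UTF-8 has only one byte sequence for a character.
        System.out.println(hex(u.getBytes(StandardCharsets.UTF_8)));    // C3 BC

        // The explicit BE/LE charsets write no BOM and differ in byte order.
        System.out.println(hex(u.getBytes(StandardCharsets.UTF_16BE))); // 00 FC
        System.out.println(hex(u.getBytes(StandardCharsets.UTF_16LE))); // FC 00

        // Java's plain "UTF-16" encoder writes a big endian BOM first.
        System.out.println(hex(u.getBytes(StandardCharsets.UTF_16)));   // FE FF 00 FC

        // Decoding with "UTF-16" honors a BOM, here a little endian one (FF FE).
        byte[] littleEndianWithBom = {(byte) 0xFF, (byte) 0xFE, (byte) 0xFC, 0x00};
        String decoded = new String(littleEndianWithBom, StandardCharsets.UTF_16);
        System.out.println(decoded.equals(u)); // true
    }
}
```

Without a BOM, Java's UTF-16 decoder assumes big endian, so a sender that omits the BOM has to agree on the byte order with its receivers in some other way. With UTF-8 the issue does not arise.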

> What could be done to ensure that the received unicode encoded XML
> stream be decoded by both .NET and Java clients ? And, which encoding
> .NET class be used utf8, utf32,.. ? Is there a sample anywhere ?
If your XML contains mostly Latin characters, use UTF-8. That'll require
the least bandwidth. You can write an XML document to any stream using either
an XmlWriter or XmlDocument.Save().
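
For the Java side, a rough sketch of the same idea might look like this. The class Utf8XmlDemo, its methods send and receiveRootText, and the sample city element are all made up for illustration. An in-memory buffer stands in for the TCP connection, so the sketch does not show how message boundaries would be handled on a real socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class Utf8XmlDemo {
    // Sender side (hypothetical): write the XML text as UTF-8 bytes to any
    // stream, for example the output stream of a socket.
    public static void send(String xml, OutputStream out) {
        try {
            Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
            writer.write(xml);
            writer.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Receiver side (hypothetical): hand the raw bytes to the XML parser,
    // which decodes them according to the encoding in the XML declaration.
    // Note that parse() reads until the end of the stream.
    public static String receiveRootText(InputStream in) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(in);
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The declared encoding matches the encoding used in send().
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><city>M\u00fcnchen</city>";

        ByteArrayOutputStream wire = new ByteArrayOutputStream(); // stands in for the socket
        send(xml, wire);

        String text = receiveRootText(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(text.equals("M\u00fcnchen")); // true
    }
}
```

The key design point is that the encoding named in the XML declaration and the encoding used to produce the bytes are the same, so any conforming parser, in .NET or Java, can decode the document without extra configuration.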
Cheers,
--
Joerg Jooss
news-reply@.joergjooss.de
