German umlaut in schema is converted to question mark (?) in Lenses

Via the Schema Registry I edited an AVRO schema so that the “doc” section for a field contains a German umlaut (“ü”).

After saving/evolving the schema I opened it again in the browser, and the umlaut is replaced by a question mark (“?”).

Shouldn’t schemas be stored in UTF-8 format?

We use Lenses version 4.3.5 and Schema Registry version 5.4.3.

I tried to replicate your issue in Lenses Box (a quick way to validate issues), but could not reproduce it.

If I had to guess, I would bet that the server, or the container image if you run things in containers, is not set to use UTF-8.

A little while ago, we saw a similar issue in Kafka Connect with a user on Confluent’s Kafka Connect Docker images. Those images do not set a UTF-8 locale (they default to the C locale, which is ASCII), so some connectors had issues processing UTF-8 strings unless we explicitly forced UTF-8 in the connector’s source code.
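To illustrate the symptom, here is a minimal sketch in Python. It is not what Lenses or Schema Registry do internally; it only shows, in general terms, how a non-ASCII character can end up as "?" when text passes through an ASCII encoder that replaces what it cannot represent. The sample value `doc` is made up for the illustration.

```python
# Illustration only: a made-up "doc" value containing the umlaut U+00FC.
doc = "Pr\u00fcfung"

# An ASCII encoder cannot represent the umlaut; with errors="replace"
# it substitutes "?" instead of failing.
ascii_roundtrip = doc.encode("ascii", errors="replace").decode("ascii")

# A UTF-8 round trip keeps the text intact.
utf8_roundtrip = doc.encode("utf-8").decode("utf-8")

print(ascii_roundtrip)           # Pr?fung
print(utf8_roundtrip == doc)     # True
```

If the text you see in the browser matches the first result, an ASCII default somewhere in the chain is a plausible cause.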

In short, I suggest you check that the servers or containers that run Lenses and Schema Registry are both set to use a UTF-8 enabled locale.
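As a rough aid for that check, the sketch below evaluates the usual POSIX locale variables (`LC_ALL`, then `LC_CTYPE`, then `LANG`) and reports whether the effective value names UTF-8. The helper names `effective_ctype` and `looks_utf8` are hypothetical, and the sample values are assumptions for illustration. Note that Python may itself coerce a plain C locale to C.UTF-8 at startup, so for a real check it is safer to read the variables in the shell of the server or container (for example with `locale`) and feed them in as shown.

```python
def effective_ctype(env):
    """Return the locale that governs character handling.

    POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    Unset or empty variables are skipped.
    """
    return env.get("LC_ALL") or env.get("LC_CTYPE") or env.get("LANG") or None


def looks_utf8(value):
    """True if a locale string such as 'en_US.UTF-8' or 'C.utf8' names UTF-8."""
    return value is not None and "utf8" in value.lower().replace("-", "")


# Sample values, as they might be copied from a container (assumed, not real data).
no_locale_set = {}
utf8_locale = {"LANG": "en_US.UTF-8"}

for name, env in [("no locale set", no_locale_set), ("UTF-8 locale", utf8_locale)]:
    ctype = effective_ctype(env)
    print(name, "->", ctype, "UTF-8" if looks_utf8(ctype) else "not UTF-8")
```

When no variable is set, the process falls back to the C locale, which is the situation described above for the Confluent images.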

Hi Marios, thanks for your quick reply! I will check with our admins.