Failed to configure Lenses S3 source connector with specific_record

Hi,

I have a producer that produces messages serialised with the AWS Glue Schema Registry, and the message class is generated from an Avro schema as a specific record:

public class DemoFormatModel extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord 

I deployed the S3 sink connector using the lenses.io S3 connector JAR (version 6.3.0), and when I specify

value.converter.avroRecordType=SPECIFIC_RECORD

it throws

Converting byte[] to Kafka Connect data failed due to serialization error: (org.apache.kafka.connect.runtime.WorkerSinkTask:547)

However, when I specify value.converter.avroRecordType=GENERIC_RECORD it works well. All other configs are identical; changing SPECIFIC_RECORD to GENERIC_RECORD alone resolves the issue.
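For reference, the relevant part of my converter configuration looks roughly like this (the region and registry name here are placeholders, not my actual values):

```properties
# Glue Avro converter for the record value
value.converter=com.amazonaws.services.schemaregistry.kafkaconnect.AWSKafkaAvroConverter
value.converter.region=us-east-1
value.converter.registry.name=my-registry

# This setting works:
value.converter.avroRecordType=GENERIC_RECORD

# This setting fails with the serialization error above:
# value.converter.avroRecordType=SPECIFIC_RECORD
```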

I wonder why this would be the case?

My next question, which may be related: after the S3 sink connector writes the Avro file to S3, I fail to run the S3 source connector to restore the messages, getting the error Not an AVRO data file. Could it be that specifying value.converter.avroRecordType=GENERIC_RECORD when it should really be SPECIFIC_RECORD produces an invalid file?

Any help would be great! Thank you

Hi Jkchoi,

Without the full error stack, I think the exception is coming from the Glue converter, whose code we don't own. Here is the source for the Glue Avro converter: aws-glue-schema-registry/avro-kafkaconnect-converter/src/main/java/com/amazonaws/services/schemaregistry/kafkaconnect/AWSKafkaAvroConverter.java at dd145df137338c22c5548a7aad9b08de772cb7d1 · awslabs/aws-glue-schema-registry · GitHub

What’s the sink and source KCQL configuration?
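For reference, a typical sink/source KCQL pair for round-tripping Avro through S3 looks roughly like this (bucket, prefix, and topic names here are placeholders):

```
-- Sink: write topic demo-topic to S3 as Avro data files
INSERT INTO my-bucket:my-prefix SELECT * FROM demo-topic STOREAS `AVRO`

-- Source: restore from the same bucket/prefix back into a topic
INSERT INTO demo-topic SELECT * FROM my-bucket:my-prefix STOREAS `AVRO`
```

If the sink's STOREAS format and the source's expected format don't match, the source can fail to read the files, so it would help to see both statements.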