I’m trying to use the Lenses Stream Reactor S3 Source connector to restore data from S3 to MSK. My topic data is stored as bucket/prefix/topicname/partition=0/topicname+0+0000000000.json.
I have configured the following regex:
connect.s3.source.partition.extractor.type=regex
connect.s3.source.partition.extractor.regex=(?i)^(?:.*)\/(partition=[0-9]*)\/(?:[0-9]*)[.](?:Json|Avro|Parquet|Text|Csv|Bytes)$
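To check the regex outside the connector, here is a minimal sketch in plain Java. It assumes the connector applies the pattern to the full object key, which I have not confirmed from the source. The class name, the helper `extract`, and the digits-only key are hypothetical and exist only for this test.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PartitionRegexCheck {
    // Regex copied from connect.s3.source.partition.extractor.regex above.
    static final Pattern CONFIGURED = Pattern.compile(
        "(?i)^(?:.*)\\/(partition=[0-9]*)\\/(?:[0-9]*)[.](?:Json|Avro|Parquet|Text|Csv|Bytes)$");

    // Returns capture group 1 if the pattern matches the key, otherwise null.
    static String extract(Pattern pattern, String key) {
        Matcher m = pattern.matcher(key);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Key layout from my bucket.
        String actualKey = "bucket/prefix/topicname/partition=0/topicname+0+0000000000.json";
        // Hypothetical key whose file name is digits only, for comparison.
        String digitsOnlyKey = "bucket/prefix/topicname/partition=0/0000000000.json";

        System.out.println("actual key      -> " + extract(CONFIGURED, actualKey));     // null
        System.out.println("digits-only key -> " + extract(CONFIGURED, digitsOnlyKey)); // partition=0
    }
}
```

With this sketch the pattern does not match my key layout: the file-name part of the regex, (?:[0-9]*), only allows digits before the extension, while my file names start with topicname+0+.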
I can see from the connector logs that it lists the files from S3 in the correct format, but it then fails with a java.lang.IllegalStateException thrown from scala.util.matching.Regex$MatchIterator.
Below is the stack trace:
[Worker-0acc4ac8eced061c4] java.lang.IllegalStateException
[Worker-0acc4ac8eced061c4] at scala.util.matching.Regex$MatchIterator.ensure(Regex.scala:848)
[Worker-0acc4ac8eced061c4] at scala.util.matching.Regex$MatchIterator.start(Regex.scala:858)
[Worker-0acc4ac8eced061c4] at scala.util.matching.Regex$MatchData.group(Regex.scala:660)
[Worker-0acc4ac8eced061c4] at scala.util.matching.Regex$MatchData.group$(Regex.scala:659)
[Worker-0acc4ac8eced061c4] at scala.util.matching.Regex$MatchIterator.group(Regex.scala:805)
[Worker-0acc4ac8eced061c4] at io.lenses.streamreactor.connect.aws.s3.model.RegexPartitionExtractor.extract(PartitionExtractor.scala:33)
[Worker-0acc4ac8eced061c4] at io.lenses.streamreactor.connect.aws.s3.source.config.SourceBucketOptions.$anonfun$getPartitionExtractorFn$4(S3SourceConfig.scala:65)
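The trace looks consistent with a capture group being read when no match is available. scala.util.matching.Regex is built on java.util.regex, where reading a group after a failed match throws IllegalStateException. Below is a minimal Java sketch of that behavior. It is not the connector's code: the class name, the helper `readGroupUnchecked`, and the simplified regex are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnmatchedGroupDemo {
    // Reads group 1 without checking whether the match succeeded.
    // Returns the exception class name if that read throws, otherwise the group text.
    static String readGroupUnchecked(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.find(); // result deliberately ignored to mimic an unchecked read
        try {
            return m.group(1);
        } catch (IllegalStateException e) {
            return e.getClass().getName();
        }
    }

    public static void main(String[] args) {
        // Simplified, hypothetical pattern that only accepts digit-only file names.
        String regex = "/(partition=[0-9]+)/[0-9]+[.]json$";

        // No match, so reading the group throws.
        System.out.println(readGroupUnchecked(regex, "topicname/partition=0/topicname+0+0000000000.json"));
        // Match, so the group is returned.
        System.out.println(readGroupUnchecked(regex, "topicname/partition=0/0000000000.json"));
    }
}
```

The first call prints java.lang.IllegalStateException and the second prints partition=0.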