We’re having some problems setting up our Kafka Connect sink connectors using the ADLS plugin. We want to use time-based partitioning, but when we add the InsertRecordTimestampHeaders SMT and a
PARTITIONBY _header.year, _header.month, _header.day
clause to our KCQL, we get errors because the destination directories don’t exist.
The errors are:
ERROR Fatal error encountered while processing CopyOperation
"code":"RenameDestinationParentPathNotFound","message":"The parent directory of the destination path does not exist"
They appear to occur when the connector tries to copy files from the .temp-upload directory to the target partitioned directory. Is this expected behaviour, and do we need to create all of our target directories ahead of time? That would be quite difficult if we’re partitioning to /year=YY/month=MM/day=DD/ directories, as we’d need a script that creates the new directories every day.
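To show what that daily script would have to produce, here is a minimal sketch that only builds the partition directory names for a run of days. The function names and the `root` parameter are hypothetical, and the four-digit year and zero-padded month and day are assumptions that would need to match whatever formats the SMT writes into the headers. It does not create anything in ADLS; that step would need the Azure SDK or CLI and is left out.

```python
from datetime import date, timedelta


def partition_dir(day: date, root: str = "") -> str:
    """Build the year=/month=/day= directory path for a single day."""
    # Assumption: four-digit year, zero-padded month and day.
    leaf = f"year={day:%Y}/month={day:%m}/day={day:%d}"
    # Prefix with the (hypothetical) root path if one was given.
    return f"{root.rstrip('/')}/{leaf}" if root else leaf


def upcoming_partition_dirs(start: date, days: int, root: str = "") -> list[str]:
    """List the partition directories for `days` consecutive days from `start`."""
    return [partition_dir(start + timedelta(days=i), root) for i in range(days)]


if __name__ == "__main__":
    # Print the directories that would have to exist for the next week.
    for path in upcoming_partition_dirs(date.today(), 7):
        print(path)
```

Even in this simple form it would have to run on a schedule, ahead of the first record of each day, which is the maintenance burden we’d like to avoid.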
Has anyone else come across this problem or found a solution, or does anyone have an idea of what I might be doing wrong?