Feels like everyone is wrong.
A. Deploy small Kafka clusters in your data centers to buffer events.
- Silly in a GCP cloud-native context, and they already have messaging infrastructure anyway.
B. Have the data acquisition devices publish data to Cloud Pub/Sub.
- They already have messaging infrastructure, so why? Unless they want to replace it, but even then that doesn't change the underlying issue.
C. Establish a Cloud Interconnect between all remote data centers and Google.
- Wrong, because an Interconnect is essentially a leased line. There must be some telecoms issue we can assume is unresolvable, e.g. long-distance remote locations with occasional water ingress, where the telco can't justify fixing it yet or is slow to. Leased lines don't usually come with awful internet connectivity, so this sounds like a physical connectivity problem. Sure, an Interconnect is more direct, but a leased line should already be bulletproof.
D. Write a Cloud Dataflow pipeline that aggregates all data in session windows.
- The only option that actually addresses flaky/delayed data delivery: session windows group each device's events on event time, and watermarks plus allowed lateness let late-arriving data still land in the right window. See the sketch below.
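For what it's worth, a minimal Beam sketch of what D looks like, assuming the devices' events land in Pub/Sub as JSON with `device_id` and `value` fields, stamped with an `event_ts` attribute; the subscription path, payload shape, and attribute name are all my assumptions, not from the question:

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.transforms import trigger, window


def parse_event(raw: bytes):
    # Assumed payload shape: {"device_id": "...", "value": 1.0}
    return json.loads(raw.decode("utf-8"))


def run():
    opts = PipelineOptions(streaming=True)
    with beam.Pipeline(options=opts) as p:
        (
            p
            # Subscription path is a placeholder. timestamp_attribute
            # assumes devices stamp each message with event time;
            # without it, Pub/Sub publish time is used and the very
            # delays we care about would be masked.
            | "Read" >> beam.io.ReadFromPubSub(
                subscription="projects/my-project/subscriptions/device-events",
                timestamp_attribute="event_ts")
            | "Parse" >> beam.Map(parse_event)
            | "KeyByDevice" >> beam.Map(lambda e: (e["device_id"], e["value"]))
            # A session closes after a 10-minute gap in a device's stream;
            # allowed_lateness tolerates events delayed by the flaky links.
            | "SessionWindows" >> beam.WindowInto(
                window.Sessions(gap_size=10 * 60),
                trigger=trigger.AfterWatermark(),
                accumulation_mode=trigger.AccumulationMode.DISCARDING,
                allowed_lateness=60 * 60)
            | "SumPerDevice" >> beam.CombinePerKey(sum)
        )


if __name__ == "__main__":
    run()
```

The gap size and lateness bound are arbitrary here; the point is just that session windowing with allowed lateness is the mechanism that copes with bursty, delayed delivery, which none of A–C touch.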