Apache Spark Structured Streaming offers a significantly more flexible window-processing model. Because streams are treated as continuously appended tables, and every row in such a table carries a timestamp, window operations can be specified in the query itself, and each query can define different windows. Window operations can even be defined on static data, as long as a timestamp column is present, which makes for a very flexible stream-processing model.
In other words, windowing in Apache Spark is simply a special kind of grouping on the timestamp column. This also makes late-arriving data easy to handle: Apache Spark can assign a late data item to the appropriate window and rerun the computation for that window when the item arrives. This behavior is highly configurable.
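The windowing-as-grouping idea can be sketched in plain Python (this is an illustration of the concept, not Spark's API; the helper names and one-minute window size are chosen for the example):

```python
from collections import defaultdict
from datetime import datetime, timedelta

def window_start(event_time: datetime) -> datetime:
    """Floor an event time to the start of its one-minute tumbling window."""
    return event_time - timedelta(seconds=event_time.second,
                                  microseconds=event_time.microsecond)

def count_per_window(events):
    """Windowing as grouping: bucket events by window start, then count."""
    counts = defaultdict(int)
    for ts in events:
        counts[window_start(ts)] += 1
    return dict(counts)

events = [
    datetime(2024, 1, 1, 12, 0, 10),
    datetime(2024, 1, 1, 12, 0, 45),
    datetime(2024, 1, 1, 12, 1, 5),
]
print(count_per_window(events))

# A late item belonging to the 12:00 window arrives after the 12:01 window
# has already opened; regrouping simply updates the 12:00 count from 2 to 3.
events.append(datetime(2024, 1, 1, 12, 0, 55))
print(count_per_window(events))
```

Because a window is just a derived grouping key, a late item is no special case: it falls into its window like any other row, and the aggregate for that window is recomputed.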
The concept of late data matters when tuples are stamped with event time rather than processing time. Event time is the point at which a particular measurement actually took place, whereas processing time is the point at which the tuple reaches the streaming engine. Apache Spark Structured Streaming transparently copes with subsets of data arriving at a later point in time.
The watermark is the threshold defining how old a late-arriving data point may be and still be included in its respective window. Consider again the HTTP server log file processed over one-minute windows. If, for whatever reason, a data tuple arrives that is more than four hours old, it may not make sense to include it in a window. For example, if the application builds an hourly time-series forecast to provision or de-provision HTTP servers in a cluster, a four-hour-old data point is no longer worth processing: even if it could have changed the decision, that decision has already been made.
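The admission rule behind a watermark can be sketched in a few lines of Python (a simplified model under stated assumptions: the class name is invented, and Spark's real implementation advances the watermark per trigger rather than per event, but the "max event time seen minus threshold" idea is the same):

```python
from datetime import datetime, timedelta

class WatermarkFilter:
    """Admit an event only if it is no older than the watermark, i.e.
    (maximum event time observed so far) minus the lateness threshold."""

    def __init__(self, threshold: timedelta):
        self.threshold = threshold
        self.max_event_time = None  # highest event time seen so far

    def admit(self, event_time: datetime) -> bool:
        # Advance the high-water mark if this event is the newest so far.
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        # Keep the event only if it is within the allowed lateness.
        return event_time >= self.max_event_time - self.threshold

wm = WatermarkFilter(threshold=timedelta(hours=4))
print(wm.admit(datetime(2024, 1, 1, 12, 0)))  # first event: admitted
print(wm.admit(datetime(2024, 1, 1, 9, 0)))   # 3 h late: within threshold
print(wm.admit(datetime(2024, 1, 1, 7, 0)))   # 5 h late: dropped
```

With a four-hour threshold, the stale tuple from the example above is simply discarded instead of triggering a pointless recomputation of a window whose result has already been acted upon.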