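This type of sampling allows a query to randomly pick n rows of data, n percent of the data size, or n bytes of data. The sampling granularity is the HDFS block size, so Hive may read more data than the amount requested. With row-based sampling, the row count is applied to each input split, so the total number of rows returned can vary with the number of splits. Refer to the following examples: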
-- Sample by number of rows
> SELECT name
> FROM employee TABLESAMPLE(1 ROWS) a;
+----------+
| name     |
+----------+
| Michael  |
+----------+
1 rows selected (0.075 seconds)
-- Sample by percentage of data size
> SELECT name
> FROM employee TABLESAMPLE(50 PERCENT) a;
+----------+
| name     |
+----------+
| Michael  |
| Will     |
+----------+
2 rows selected (0.041 seconds)
-- Sample by data size
-- Supported unit suffixes: b/B, k/K, m/M, g/G
> SELECT name FROM employee TABLESAMPLE(1B) a;
+----------+
| name     |
+----------+
| Michael  |
+----------+
1 rows selected (0.075 seconds)
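Repeated runs of the same sampling query tend to read the same blocks. As a minimal sketch of how to vary the sample, Hive provides the hive.sample.seednumber setting for block sampling. The seed value 7 below is an arbitrary choice for illustration, and the rows returned depend on how the table's data is laid out in HDFS, so no output is shown:
-- Change the seed so that different blocks are sampled
-- (7 is an arbitrary example value)
> SET hive.sample.seednumber=7;
> SELECT name
> FROM employee TABLESAMPLE(50 PERCENT) a;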