It's important to understand the columnar layout and internal storage of the import mode datasets. Power BI creates individual segments of approximately one million rows and stores separate memory structures for column data, the dictionary of unique values for columns, relationships, and hierarchies.
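To make the two ideas above concrete, here is a minimal Python sketch of how a column might be split into segments and reduced to a dictionary of unique values plus one integer index per row. It is an illustration only, not the engine's implementation: the function names, the sample values, and the exact segment size of 1,000,000 rows (the text says only "approximately one million") are assumptions.

```python
from math import ceil

# Assumption: the text says "approximately one million rows" per segment.
SEGMENT_ROWS = 1_000_000

def segment_count(total_rows, segment_rows=SEGMENT_ROWS):
    """Number of segments needed to hold total_rows rows."""
    return ceil(total_rows / segment_rows)

def dictionary_encode(values):
    """Return (dictionary, indexes): the unique values in first-seen order,
    plus one integer index into that dictionary per row."""
    dictionary = []
    positions = {}
    indexes = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indexes.append(positions[value])
    return dictionary, indexes

# Hypothetical sample column
colors = ["Red", "Blue", "Red", "Red", "Green", "Blue"]
dictionary, indexes = dictionary_encode(colors)
print(dictionary)                 # ['Red', 'Blue', 'Green']
print(indexes)                    # [0, 1, 0, 0, 2, 1]
print(segment_count(2_800_000))   # 3
```

The column data is stored as the small integer indexes, while the dictionary holds each unique value once, which is why the two are described as separate memory structures.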
In the following diagram, three segments are used to store a fact table of 2.8 million rows:
Because only the columns required for a query are scanned during query execution, a column that is relatively expensive in terms of memory consumption (due to many unique values), such as Order #, can be stored in the dataset without negatively impacting queries that only access other columns. Nonetheless, removing fact table columns that are not used in queries or relationships, or reducing their cardinality, reduces the storage size of the dataset and the resources required to refresh it. Fewer fact table columns may also enable Power BI to find a better sort order for compression and thus improve query performance.
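A rough way to see why cardinality drives memory cost is to model a column as a dictionary plus one bit-packed index per row. The sketch below is a simplified model under that assumption, not a description of the engine's actual encoding. The function names and the cardinalities for every column (including Order #, assumed unique per row) are hypothetical.

```python
def index_bits(unique_values):
    """Bits needed to give each unique value a distinct integer index."""
    return max(1, (unique_values - 1).bit_length())

def index_bytes(rows, unique_values):
    """Approximate size of the bit-packed index column,
    ignoring the dictionary itself."""
    return rows * index_bits(unique_values) // 8

ROWS = 2_800_000
# Hypothetical cardinalities: Order # is assumed unique per row.
columns = {"Order #": 2_800_000, "Product Key": 500, "Year": 3}

for name, unique in columns.items():
    print(name, index_bits(unique), "bits per row,",
          index_bytes(ROWS, unique), "bytes")
```

In this model the high-cardinality column needs many more bits per row and also a far larger dictionary, yet a query that never references it pays none of that cost at scan time.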
During query execution over tables with more than one segment, one CPU thread is assigned to each segment. This parallelization is limited by the number of CPU threads available to the dataset (for example, Power BI Premium P1 with four backend v-cores) and by the number of segments required to resolve the query. Ideally, therefore, the rows of fact tables are ordered such that only a portion of the segments is required to resolve queries. In the example of the 2.8M-row fact table, a query filtered on the year 2017 would require only one CPU thread and would scan only the required column segments within Segment 3.
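The relationship between row ordering, segments, and threads can be sketched in a few lines of Python. This is an illustrative model only: the `Segment` class, the per-segment year ranges for Segments 1 and 2, and the min/max check used to decide which segments are needed are all assumptions, since the text states only that the 2017 rows sit in Segment 3.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    name: str
    rows: int
    min_year: int
    max_year: int

def segments_for_year(segments, year):
    """Segments whose (assumed) year range could contain the filtered year."""
    return [s for s in segments if s.min_year <= year <= s.max_year]

def threads_used(segments_needed, available_threads):
    """One thread per required segment, capped by the threads available."""
    return min(len(segments_needed), available_threads)

# Hypothetical layout: the fact table is ordered by date,
# so each segment covers a single year.
fact_table = [
    Segment("Segment 1", 1_000_000, 2015, 2015),
    Segment("Segment 2", 1_000_000, 2016, 2016),
    Segment("Segment 3", 800_000, 2017, 2017),
]
AVAILABLE_THREADS = 4  # assumption, loosely based on the four v-cores example

needed = segments_for_year(fact_table, 2017)
print([s.name for s in needed])                    # ['Segment 3']
print(threads_used(needed, AVAILABLE_THREADS))     # 1
print(threads_used(fact_table, AVAILABLE_THREADS)) # 3 (unfiltered query)
```

If the same rows were stored in an arbitrary order, every segment's year range would span 2015 to 2017 and the filtered query would have to touch all three segments.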