Let's start with EAV model collections. We can instantiate the collection either through the entity factory
class like follows:
$collection = $this->employeeFactory->create() ->getCollection();
Or we can use object manager to instantiate the collection as shown next:
$collection = $this->_objectManager->create( 'FoggylineOfficeModelResourceModelEmployeeCollection's );
There is also a third way, which might be the preferred one, but it requires us to define APIs so we will skip that one for the moment.
Once we instantiate the collection object, we can loop through it and do some variable dumps to see the content on individual $employee
entities, like shown next:
foreach ($collection as $employee) { end_Debug::dump($employee->toArray(), '$employee'); }
The preceding would yield results like the following:
$employee array(5) { ["entity_id"] => string(2) "24" ["department_id"] => string(2) "27" ["email"] => string(14) "[email protected]" ["first_name"] => string(5) "Goran" ["last_name"] => string(6) "Gorvat" }
Notice how the individual $employee
only has fields on it, not the attributes. Let's see what happens when we want to extend our collection by using addAttributeToSelect
to specify the individual attributes to add to it, like shown next:
$collection->addAttributeToSelect('salary') ->addAttributeToSelect('vat_number');
The preceding would yield results like the following:
$employee array(7) { ["entity_id"] => string(2) "24" ["department_id"] => string(2) "27" ["email"] => string(14) "[email protected]" ["first_name"] => string(5) "Goran" ["last_name"] => string(6) "Gorvat" ["salary"] => string(9) "3800.0000" ["vat_number"] => string(11) "GB123451234" }
Though we are making progress, imagine if we had tens of attributes, and we want each and every one to be included into collection. Using addAttributeToSelect
numerous times would make for cluttered code. What we can do is pass '*'
as a parameter to addAttributeToSelect
and have collection pick up every attribute, as shown next:
$collection->addAttributeToSelect('*');
This would yield results like the following:
$employee array(10) { ["entity_id"] => string(2) "24" ["department_id"] => string(2) "27" ["email"] => string(14) "[email protected]" ["first_name"] => string(5) "Goran" ["last_name"] => string(6) "Gorvat" ["dob"] => string(19) "1984-04-18 00:00:00" ["note"] => string(7) "Note #1" ["salary"] => string(9) "3800.0000" ["service_years"] => string(1) "3" ["vat_number"] => string(11) "GB123451234" }
Though the PHP part of the code looks seemingly simple, what's happening in the background on the SQL layer is relatively complex. Though Magento executes several SQL queries prior to fetching the final collection result, let's focus on the last three queries as shown next:
SELECT COUNT(*) FROM 'foggyline_office_employee_entity' AS 'e' SELECT 'e'.* FROM 'foggyline_office_employee_entity' AS 'e' SELECT 'foggyline_office_employee_entity_datetime'.'entity_id', 'foggyline_office_employee_entity_datetime'.'attribute_id', 'foggyline_office_employee_entity_datetime'.'value' FROM 'foggyline_office_employee_entity_datetime' WHERE (entity_id IN (24, 25, 26)) AND (attribute_id IN ('349')) UNION ALL SELECT 'foggyline_office_employee_entity_text'.'entity_id', 'foggyline_office_employee_entity_text'.' attribute_id', 'foggyline_office_employee_entity_text'.'value' FROM 'foggyline_office_employee_entity_text' WHERE (entity_id IN (24, 25, 26)) AND (attribute_id IN ('352')) UNION ALL SELECT 'foggyline_office_employee_entity_decimal'.' entity_id', 'foggyline_office_employee_entity_decimal'.' attribute_id', 'foggyline_office_employee_entity_decimal'.'value' FROM 'foggyline_office_employee_entity_decimal' WHERE (entity_id IN (24, 25, 26)) AND (attribute_id IN ('350')) UNION ALL SELECT 'foggyline_office_employee_entity_int'.'entity_id', 'foggyline_office_employee_entity_int'.'attribute_id', 'foggyline_office_employee_entity_int'.'value' FROM 'foggyline_office_employee_entity_int' WHERE (entity_id IN (24, 25, 26)) AND (attribute_id IN ('348')) UNION ALL SELECT 'foggyline_office_employee_entity_varchar'.' entity_id', 'foggyline_office_employee_entity_varchar'.' attribute_id', 'foggyline_office_employee_entity_varchar'.'value' FROM 'foggyline_office_employee_entity_varchar' WHERE (entity_id IN (24, 25, 26)) AND (attribute_id IN ('351'))
Before we proceed any further, it is important to know that these queries are not copy and paste applicable. The reason is that the attribute_id
values will for sure differ from installation to installation. Queries given here are for us to gain a high-level understanding of what is happening in the backend on the SQL layer when we use Magento collections on the PHP application level.
The first query select simply counts the number of entries in the entity table, and then passes that info to the application layer. The second select fetches all entries from foggyline_office_employee_entity
, then passes that info to the application layer to use it to pass entity IDs in the third query as part of entity_id IN (24, 25, 26)
. Second and third queries here can be pretty resource intense if we have a large amount of entries in our entity and EAV tables. To prevent possible performance bottlenecks, we should always use the setPageSize
and setCurPage
methods on collection, like shown next:
$collection->addAttributeToSelect('*') ->setPageSize(25) ->setCurPage(5);
This would result in the first COUNT
query still being the same, but the second query would now look like the following:
SELECT 'e'.* FROM 'foggyline_office_employee_entity' AS 'e' LIMIT 25 OFFSET 4
This makes for a much smaller, thus performance-lighter dataset if we have thousands or tens of thousands of entries. The point here is to always use setPageSize
and setCurPage
. If we need to work with a really large set, then we need to page through it, or walk through it.
Now we know how to limit the size of the result set and fetch the proper page, let's see how we can further filter the set to avoid overusing PHP loops for the same purpose. Thus effectively passing the filtering to the database and not the application layer. To filter the EAV collection, we use its addAttributeToFilter
method.
Let's instantiate a clean new collection like shown next:
$collection = $this->_objectManager->create( 'FoggylineOfficeModelResourceModelEmployeeCollection' ); $collection->addAttributeToSelect('*') ->setPageSize(25) ->setCurPage(1); $collection->addAttributeToFilter('email', array('like'=>'%mail.loc%')) ->addAttributeToFilter('vat_number', array('like'=>'GB%')) ->addAttributeToFilter('salary', array('gt'=>2400)) ->addAttributeToFilter('service_years', array('lt'=>10));
Notice that we are now using the addAttributeToSelect
and addAttributeToFilter
methods on collection. We have already seen the database impact of addAttributeToSelect
on a SQL query. What addAttributeToFilter
does is something completely different.
With the addAttributeToFilter
method, the count query now gets transformed into the following SQL query:
SELECT COUNT(*) FROM 'foggyline_office_employee_entity' AS 'e' INNER JOIN 'foggyline_office_employee_entity_varchar' AS 'at_vat_number' ON ('at_vat_number'.'entity_id' = 'e'.'entity_id') AND ('at_vat_number'.'attribute_id' = '351') INNER JOIN 'foggyline_office_employee_entity_decimal' AS 'at_salary' ON ('at_salary'.'entity_id' = 'e'.'entity_id') AND ('at_salary'.'attribute_id' = '350') INNER JOIN 'foggyline_office_employee_entity_int' AS 'at_service_years' ON ('at_service_years'.'entity_id' = 'e'.'entity_id') AND ('at_service_years'.'attribute_id' = '348') WHERE ('e'.'email' LIKE '%mail.loc%') AND (at_vat_number.value LIKE 'GB%') AND (at_salary.value > 2400) AND (at_service_years.value < 10)
We can see that this is much more complex than the previous count query, now we have INNER JOIN
stepping in. Notice how we have four addAttributeToFilter
method calls but only three INNER JOIN
. This is because one of those four calls is for e-mail, which is not an attribute but a field within the foggyline_office_employee_entity
table. That is why there is no need for INNER JOIN
as the field is already there. The three INNER JOIN
then simply merge the required info into the query in order to get the select.
The second query also becomes more robust, as shown next:
SELECT 'e'.*, 'at_vat_number'.'value' AS 'vat_number', 'at_salary'.'value' AS 'salary', 'at_service_years'.'value' AS 'service_years' FROM 'foggyline_office_employee_entity' AS 'e' INNER JOIN 'foggyline_office_employee_entity_varchar' AS 'at_vat_number' ON ('at_vat_number'.'entity_id' = 'e'.'entity_id') AND ('at_vat_number'.'attribute_id' = '351') INNER JOIN 'foggyline_office_employee_entity_decimal' AS 'at_salary' ON ('at_salary'.'entity_id' = 'e'.'entity_id') AND ('at_salary'.'attribute_id' = '350') INNER JOIN 'foggyline_office_employee_entity_int' AS 'at_service_years' ON ('at_service_years'.'entity_id' = 'e'.'entity_id') AND ('at_service_years'.'attribute_id' = '348') WHERE ('e'.'email' LIKE '%mail.loc%') AND (at_vat_number.value LIKE 'GB%') AND (at_salary.value > 2400) AND (at_service_years.value < 10) LIMIT 25
Here, we also see the usage of INNER JOIN
. We also have three and not four INNER JOIN
, because one of the conditions is done against email
, which is a field. The result of the query is a flattened piece of rows where the attributes vat_number
, salary
, and service_years
are present. We can imagine the performance impact if we haven't used setPageSize
to limit the result set.
Finally, the third query is also affected and now looks similar to the following:
SELECT 'foggyline_office_employee_entity_datetime'.'entity_id', 'foggyline_office_employee_entity_datetime'.'attribute_id', 'foggyline_office_employee_entity_datetime'.'value' FROM 'foggyline_office_employee_entity_datetime' WHERE (entity_id IN (24, 25)) AND (attribute_id IN ('349')) UNION ALL SELECT 'foggyline_office_employee_entity_text'.'entity_id', 'foggyline_office_employee_entity_text'.' attribute_id', 'foggyline_office_employee_entity_text'.'value' FROM 'foggyline_office_employee_entity_text' WHERE (entity_id IN (24, 25)) AND (attribute_id IN ('352'))
Notice here how UNION ALL
has been reduced to a single occurrence now, thus effectively making for two selects. This is because we have a total of five attributes (service_years
, dob
, salary
, vat_number
, note
), and three of them have been pulled in through second query. Out of the preceding three queries demonstrated, Magento basically pulls the collection data from second and third query. This seems like a pretty optimized and scalable solution, though we should really give it some thought on the proper use of setPageSize
, addAttributeToSelect
, and addAttributeToFilter
methods when creating collection.
During development, if working with collections that have lot of attributes, filters, and possibly a future large dataset, we might want to use SQL logging to record actual SQL queries hitting the database server. This might help us spot possible performance bottlenecks and react on time, either by adding more limiting values to setPageSize
or addAttributeToSelect
, or both.
In the preceding examples, the use of addAttributeToSelect
results in AND
conditions on the SQL layer. What if we want to filter collection using OR
conditions? addAttributeToSelect
can also result in SQL OR
conditions if the $attribute
parameter is used in the following way:
$collection->addAttributeToFilter([ ['attribute'=>'salary', 'gt'=>2400], ['attribute'=>'vat_number', 'like'=>'GB%'] ]);
Without going into the details of actual SQL queries this time, it is suffice to say that they are near identical to the previous example with the AND
condition use of addAttributeToFilter
.
Using collection methods like addExpressionAttributeToSelect
, groupByAttribute
, and addAttributeToSort
, collections offer further gradient filtering and even shift some calculations from the PHP application layer to the SQL layer. Getting into the ins and outs of those and other collection methods is beyond the scope of this chapter, and would probably require a book on its own.
Looking back at the preceding addAttributeToFilter
method call examples, questions pop out as to where can we see the list of all available collection filters. If we take a quick look inside the vendor/magento/framework/DB/Adapter/Pdo/Mysql.php
file, we can see the method called prepareSqlCondition
(partially) defined as follows:
public function prepareSqlCondition($fieldName, $condition) { $conditionKeyMap = [ 'eq' => "{{fieldName}} = ?", 'neq' => "{{fieldName}} != ?", 'like' => "{{fieldName}} LIKE ?", 'nlike' => "{{fieldName}} NOT LIKE ?", 'in' => "{{fieldName}} IN(?)", 'nin' => "{{fieldName}} NOT IN(?)", 'is' => "{{fieldName}} IS ?", 'notnull' => "{{fieldName}} IS NOT NULL", 'null' => "{{fieldName}} IS NULL", 'gt' => "{{fieldName}} > ?", 'lt' => "{{fieldName}} /* AJZELE */ < ?", 'gteq' => "{{fieldName}} >= ?", 'lteq' => "{{fieldName}} <= ?", 'finset' => "FIND_IN_SET(?, {{fieldName}})", 'regexp' => "{{fieldName}} REGEXP ?", 'from' => "{{fieldName}} >= ?", 'to' => "{{fieldName}} <= ?", 'seq' => null, 'sneq' => null, 'ntoa' => "INET_NTOA({{fieldName}}) LIKE ?", ]; $query = ''; if (is_array($condition)) { $key = key(array_intersect_key($condition, $conditionKeyMap)); ... }
This method is what eventually gets called at some point during SQL query construction. The $condition
parameter is expected to have one of the following (partially listed) forms:
array("from" => $fromValue, "to" => $toValue)
array("eq" => $equalValue)
array("neq" => $notEqualValue)
array("like" => $likeValue)
array("in" => array($inValues))
array("nin" => array($notInValues))
array("notnull" => $valueIsNotNull)
array("null" => $valueIsNull)
array("gt" => $greaterValue)
array("lt" => $lessValue)
array("gteq" => $greaterOrEqualValue)
array("lteq" => $lessOrEqualValue)
array("finset" => $valueInSet)
array("regexp" => $regularExpression)
array("seq" => $stringValue)
array("sneq" => $stringValue)
If $condition
is passed as an integer or string, then the exact value will be filtered ('eq'
condition). If none of the conditions is matched, then a sequential array is expected as a parameter and OR
conditions will be built using the preceding structure.
The preceding examples covered EAV model collections, as they are slightly more complex. Though the approach to filtering more or less applies to simple model collections as well, the most notable difference is that there are no addAttributeToFilter
, addAttributeToSelect
, and addExpressionAttributeToSelect
methods. The simple model collections make use of addFieldToFilter
, addFieldToSelect
, and addExpressionFieldToSelect
, among other subtle differences.
3.145.93.136