Managing collections

Let's start with EAV model collections. We can instantiate the collection either through the entity factory class like follows:

$collection = $this->employeeFactory->create()
                   ->getCollection();

Or we can use object manager to instantiate the collection as shown next:

$collection = $this->_objectManager->create(
    'FoggylineOfficeModelResourceModelEmployeeCollection's
);

There is also a third way, which might be the preferred one, but it requires us to define APIs so we will skip that one for the moment.

Once we instantiate the collection object, we can loop through it and do some variable dumps to see the content on individual $employee entities, like shown next:

foreach ($collection as $employee) {
    end_Debug::dump($employee->toArray(), '$employee');
}

The preceding would yield results like the following:

$employee array(5) {
  ["entity_id"] => string(2) "24"
  ["department_id"] => string(2) "27"
  ["email"] => string(14) "[email protected]"
  ["first_name"] => string(5) "Goran"
  ["last_name"] => string(6) "Gorvat"
}

Notice how the individual $employee only has fields on it, not the attributes. Let's see what happens when we want to extend our collection by using addAttributeToSelect to specify the individual attributes to add to it, like shown next:

$collection->addAttributeToSelect('salary')
           ->addAttributeToSelect('vat_number');

The preceding would yield results like the following:

$employee array(7) {
  ["entity_id"] => string(2) "24"
  ["department_id"] => string(2) "27"
  ["email"] => string(14) "[email protected]"
  ["first_name"] => string(5) "Goran"
  ["last_name"] => string(6) "Gorvat"
  ["salary"] => string(9) "3800.0000"
  ["vat_number"] => string(11) "GB123451234"
}

Though we are making progress, imagine if we had tens of attributes, and we want each and every one to be included into collection. Using addAttributeToSelect numerous times would make for cluttered code. What we can do is pass '*' as a parameter to addAttributeToSelect and have collection pick up every attribute, as shown next:

$collection->addAttributeToSelect('*');

This would yield results like the following:

$employee array(10) {
    ["entity_id"] => string(2) "24"
    ["department_id"] => string(2) "27"
    ["email"] => string(14) "[email protected]"
    ["first_name"] => string(5) "Goran"
    ["last_name"] => string(6) "Gorvat"
    ["dob"] => string(19) "1984-04-18 00:00:00"
    ["note"] => string(7) "Note #1"
    ["salary"] => string(9) "3800.0000"
    ["service_years"] => string(1) "3"
    ["vat_number"] => string(11) "GB123451234"
}

Though the PHP part of the code looks seemingly simple, what's happening in the background on the SQL layer is relatively complex. Though Magento executes several SQL queries prior to fetching the final collection result, let's focus on the last three queries as shown next:

SELECT COUNT(*) FROM 'foggyline_office_employee_entity' AS 'e'

SELECT 'e'.* FROM 'foggyline_office_employee_entity' AS 'e'

SELECT
  'foggyline_office_employee_entity_datetime'.'entity_id',
  'foggyline_office_employee_entity_datetime'.'attribute_id',
  'foggyline_office_employee_entity_datetime'.'value'
FROM 'foggyline_office_employee_entity_datetime'
WHERE (entity_id IN (24, 25, 26)) AND (attribute_id IN ('349'))
UNION ALL SELECT
            'foggyline_office_employee_entity_text'.'entity_id',
            'foggyline_office_employee_entity_text'.' attribute_id',
            'foggyline_office_employee_entity_text'.'value'
          FROM 'foggyline_office_employee_entity_text'
          WHERE (entity_id IN (24, 25, 26)) AND (attribute_id IN ('352'))
UNION ALL SELECT
            'foggyline_office_employee_entity_decimal'.' entity_id',
            'foggyline_office_employee_entity_decimal'.' attribute_id',
            'foggyline_office_employee_entity_decimal'.'value'
          FROM 'foggyline_office_employee_entity_decimal'
          WHERE (entity_id IN (24, 25, 26)) AND (attribute_id IN ('350'))
UNION ALL SELECT
            'foggyline_office_employee_entity_int'.'entity_id',
            'foggyline_office_employee_entity_int'.'attribute_id',
            'foggyline_office_employee_entity_int'.'value'
          FROM 'foggyline_office_employee_entity_int'
          WHERE (entity_id IN (24, 25, 26)) AND (attribute_id IN ('348'))
UNION ALL SELECT
            'foggyline_office_employee_entity_varchar'.' entity_id',
            'foggyline_office_employee_entity_varchar'.' attribute_id',
            'foggyline_office_employee_entity_varchar'.'value'
          FROM 'foggyline_office_employee_entity_varchar'
          WHERE (entity_id IN (24, 25, 26)) AND (attribute_id IN ('351'))

Note

Before we proceed any further, it is important to know that these queries are not copy and paste applicable. The reason is that the attribute_id values will for sure differ from installation to installation. Queries given here are for us to gain a high-level understanding of what is happening in the backend on the SQL layer when we use Magento collections on the PHP application level.

The first query select simply counts the number of entries in the entity table, and then passes that info to the application layer. The second select fetches all entries from foggyline_office_employee_entity, then passes that info to the application layer to use it to pass entity IDs in the third query as part of entity_id IN (24, 25, 26). Second and third queries here can be pretty resource intense if we have a large amount of entries in our entity and EAV tables. To prevent possible performance bottlenecks, we should always use the setPageSize and setCurPage methods on collection, like shown next:

$collection->addAttributeToSelect('*')
           ->setPageSize(25)
           ->setCurPage(5);

This would result in the first COUNT query still being the same, but the second query would now look like the following:

SELECT 'e'.* FROM 'foggyline_office_employee_entity' AS 'e' LIMIT 25 OFFSET 4

This makes for a much smaller, thus performance-lighter dataset if we have thousands or tens of thousands of entries. The point here is to always use setPageSize and setCurPage. If we need to work with a really large set, then we need to page through it, or walk through it.

Now we know how to limit the size of the result set and fetch the proper page, let's see how we can further filter the set to avoid overusing PHP loops for the same purpose. Thus effectively passing the filtering to the database and not the application layer. To filter the EAV collection, we use its addAttributeToFilter method.

Let's instantiate a clean new collection like shown next:

$collection = $this->_objectManager->create(
    'FoggylineOfficeModelResourceModelEmployeeCollection'
);

$collection->addAttributeToSelect('*')
           ->setPageSize(25)
           ->setCurPage(1);

$collection->addAttributeToFilter('email', array('like'=>'%mail.loc%'))
           ->addAttributeToFilter('vat_number', array('like'=>'GB%'))
           ->addAttributeToFilter('salary', array('gt'=>2400))
           ->addAttributeToFilter('service_years', array('lt'=>10));

Notice that we are now using the addAttributeToSelect and addAttributeToFilter methods on collection. We have already seen the database impact of addAttributeToSelect on a SQL query. What addAttributeToFilter does is something completely different.

With the addAttributeToFilter method, the count query now gets transformed into the following SQL query:

SELECT COUNT(*)
FROM 'foggyline_office_employee_entity' AS 'e'
  INNER JOIN 'foggyline_office_employee_entity_varchar' AS 'at_vat_number'
    ON ('at_vat_number'.'entity_id' = 'e'.'entity_id') AND ('at_vat_number'.'attribute_id' = '351')
  INNER JOIN 'foggyline_office_employee_entity_decimal' AS 'at_salary'
    ON ('at_salary'.'entity_id' = 'e'.'entity_id') AND ('at_salary'.'attribute_id' = '350')
  INNER JOIN 'foggyline_office_employee_entity_int' AS 'at_service_years'
    ON ('at_service_years'.'entity_id' = 'e'.'entity_id') AND ('at_service_years'.'attribute_id' = '348')
WHERE ('e'.'email' LIKE '%mail.loc%') AND (at_vat_number.value LIKE 'GB%') AND (at_salary.value > 2400) AND
      (at_service_years.value < 10)

We can see that this is much more complex than the previous count query, now we have INNER JOIN stepping in. Notice how we have four addAttributeToFilter method calls but only three INNER JOIN. This is because one of those four calls is for e-mail, which is not an attribute but a field within the foggyline_office_employee_entity table. That is why there is no need for INNER JOIN as the field is already there. The three INNER JOIN then simply merge the required info into the query in order to get the select.

The second query also becomes more robust, as shown next:

SELECT
  'e'.*,
  'at_vat_number'.'value'    AS 'vat_number',
  'at_salary'.'value'        AS 'salary',
  'at_service_years'.'value' AS 'service_years'
FROM 'foggyline_office_employee_entity' AS 'e'
  INNER JOIN 'foggyline_office_employee_entity_varchar' AS 'at_vat_number'
    ON ('at_vat_number'.'entity_id' = 'e'.'entity_id') AND ('at_vat_number'.'attribute_id' = '351')
  INNER JOIN 'foggyline_office_employee_entity_decimal' AS 'at_salary'
    ON ('at_salary'.'entity_id' = 'e'.'entity_id') AND ('at_salary'.'attribute_id' = '350')
  INNER JOIN 'foggyline_office_employee_entity_int' AS 'at_service_years'
    ON ('at_service_years'.'entity_id' = 'e'.'entity_id') AND ('at_service_years'.'attribute_id' = '348')
WHERE ('e'.'email' LIKE '%mail.loc%') AND (at_vat_number.value LIKE 'GB%') AND (at_salary.value > 2400) AND
      (at_service_years.value < 10)
LIMIT 25

Here, we also see the usage of INNER JOIN. We also have three and not four INNER JOIN, because one of the conditions is done against email, which is a field. The result of the query is a flattened piece of rows where the attributes vat_number, salary, and service_years are present. We can imagine the performance impact if we haven't used setPageSize to limit the result set.

Finally, the third query is also affected and now looks similar to the following:

SELECT
  'foggyline_office_employee_entity_datetime'.'entity_id',
  'foggyline_office_employee_entity_datetime'.'attribute_id',
  'foggyline_office_employee_entity_datetime'.'value'
FROM 'foggyline_office_employee_entity_datetime'
WHERE (entity_id IN (24, 25)) AND (attribute_id IN ('349'))
UNION ALL SELECT
            'foggyline_office_employee_entity_text'.'entity_id',
            'foggyline_office_employee_entity_text'.' attribute_id',
            'foggyline_office_employee_entity_text'.'value'
          FROM 'foggyline_office_employee_entity_text'
          WHERE (entity_id IN (24, 25)) AND (attribute_id IN ('352'))

Notice here how UNION ALL has been reduced to a single occurrence now, thus effectively making for two selects. This is because we have a total of five attributes (service_years, dob, salary, vat_number, note), and three of them have been pulled in through second query. Out of the preceding three queries demonstrated, Magento basically pulls the collection data from second and third query. This seems like a pretty optimized and scalable solution, though we should really give it some thought on the proper use of setPageSize, addAttributeToSelect, and addAttributeToFilter methods when creating collection.

During development, if working with collections that have lot of attributes, filters, and possibly a future large dataset, we might want to use SQL logging to record actual SQL queries hitting the database server. This might help us spot possible performance bottlenecks and react on time, either by adding more limiting values to setPageSize or addAttributeToSelect, or both.

In the preceding examples, the use of addAttributeToSelect results in AND conditions on the SQL layer. What if we want to filter collection using OR conditions? addAttributeToSelect can also result in SQL OR conditions if the $attribute parameter is used in the following way:

$collection->addAttributeToFilter([
    ['attribute'=>'salary', 'gt'=>2400],
    ['attribute'=>'vat_number', 'like'=>'GB%']
]);

Without going into the details of actual SQL queries this time, it is suffice to say that they are near identical to the previous example with the AND condition use of addAttributeToFilter.

Using collection methods like addExpressionAttributeToSelect, groupByAttribute, and addAttributeToSort, collections offer further gradient filtering and even shift some calculations from the PHP application layer to the SQL layer. Getting into the ins and outs of those and other collection methods is beyond the scope of this chapter, and would probably require a book on its own.

Collection filters

Looking back at the preceding addAttributeToFilter method call examples, questions pop out as to where can we see the list of all available collection filters. If we take a quick look inside the vendor/magento/framework/DB/Adapter/Pdo/Mysql.php file, we can see the method called prepareSqlCondition (partially) defined as follows:

public function prepareSqlCondition($fieldName, $condition)
{
    $conditionKeyMap = [
        'eq'            => "{{fieldName}} = ?",
        'neq'           => "{{fieldName}} != ?",
        'like'          => "{{fieldName}} LIKE ?",
        'nlike'         => "{{fieldName}} NOT LIKE ?",
        'in'            => "{{fieldName}} IN(?)",
        'nin'           => "{{fieldName}} NOT IN(?)",
        'is'            => "{{fieldName}} IS ?",
        'notnull'       => "{{fieldName}} IS NOT NULL",
        'null'          => "{{fieldName}} IS NULL",
        'gt'            => "{{fieldName}} > ?",
        'lt'            => "{{fieldName}} /* AJZELE */ < ?",
        'gteq'          => "{{fieldName}} >= ?",
        'lteq'          => "{{fieldName}} <= ?",
        'finset'        => "FIND_IN_SET(?, {{fieldName}})",
        'regexp'        => "{{fieldName}} REGEXP ?",
        'from'          => "{{fieldName}} >= ?",
        'to'            => "{{fieldName}} <= ?",
        'seq'           => null,
        'sneq'          => null,
        'ntoa'          => "INET_NTOA({{fieldName}}) LIKE ?",
    ];

    $query = '';
    if (is_array($condition)) {
        $key = key(array_intersect_key($condition, $conditionKeyMap));

    ...
}

This method is what eventually gets called at some point during SQL query construction. The $condition parameter is expected to have one of the following (partially listed) forms:

  • array("from" => $fromValue, "to" => $toValue)
  • array("eq" => $equalValue)
  • array("neq" => $notEqualValue)
  • array("like" => $likeValue)
  • array("in" => array($inValues))
  • array("nin" => array($notInValues))
  • array("notnull" => $valueIsNotNull)
  • array("null" => $valueIsNull)
  • array("gt" => $greaterValue)
  • array("lt" => $lessValue)
  • array("gteq" => $greaterOrEqualValue)
  • array("lteq" => $lessOrEqualValue)
  • array("finset" => $valueInSet)
  • array("regexp" => $regularExpression)
  • array("seq" => $stringValue)
  • array("sneq" => $stringValue)

If $condition is passed as an integer or string, then the exact value will be filtered ('eq' condition). If none of the conditions is matched, then a sequential array is expected as a parameter and OR conditions will be built using the preceding structure.

The preceding examples covered EAV model collections, as they are slightly more complex. Though the approach to filtering more or less applies to simple model collections as well, the most notable difference is that there are no addAttributeToFilter, addAttributeToSelect, and addExpressionAttributeToSelect methods. The simple model collections make use of addFieldToFilter, addFieldToSelect, and addExpressionFieldToSelect, among other subtle differences.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.93.136