Skip to content
This repository was archived by the owner on Apr 29, 2019. It is now read-only.

Memory error on customer import #97

Conversation

dmanners
Copy link
Contributor

Description

With the following settings a customer import would crash with an out of memory problem.

  1. Current database of 250k customers (can be created using bin/magento setup:perf:generate-fixtures setup/performance-toolkit/profiles/ce/small.xml and editing the small.xml to setup 250000 customers)
  2. Memory limit set via .htaccess to 268M

Fixed Issues (if relevant)

  1. Failure to allocate memory error when importing customers in admin #43: Failure to allocate memory error when importing customers in admin

Manual testing scenarios

  1. After setting up the database and memory limit as described visit the admin section and try to import at least 1 customer,

Contribution checklist

  • Pull request has a meaningful description of its purpose
  • All commits are accompanied by meaningful commit messages
  • All new or changed code is covered with unit/integration tests (if applicable)
  • All automated tests passed successfully (all builds on Travis CI are green)

…ustomer data in the database on low memory systems

 - replace the usage of the CollectionByPagesIteratorFactory with a fetchAll when loading current customers from the database,
   this was causing issues because of the extra overhead that the the iterator caused by using objects to store each row's data
   in this way we can use a similar approach to how the product import loads old skus. See: Magento\Catalog\Model\ResourceModel\Product::getProductEntitiesInfo
…t use the DataObject

 - replace current usage,
 - keep BC by using the toArray method of the DataObject in the current public method
@dmanners
Copy link
Contributor Author

I have some open questions on this solution I would like to get people's thoughts on.

There are two approaches here either replace the usage of CollectionByPagesIteratorFactory inside this class or try to get the iterator to work without making objects. I picked to replace the usage here as the iterator is used in not just the customer import but other import processes, though that does mean this problem can happen in multiple places. Should we fix or replace?

The current Storage class allows you to init it with a page size. My current changes ignores this and just loads all the customers. I am not too sure in what cases page size is setup but it maybe that we find some cases that only load a limited set of customers during import or export.

* @param array $customer
* @return $this
*/
public function addCustomerByArray(array $customer)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not such a big fan of doing things via array. Since we are only using id, email and website id would people suggest passing these in specifically?

@dmanners
Copy link
Contributor Author

Also I am still working on the tests but I thought I would get people's thoughts first.

 - since we are now working with the connection in the load method we need to correct mocks in place
 - uses getMockBuilder
 - only has items that are needed
 - make sure that we are testing the new method for adding customer by array
… object is used

 - in the addCustomer covert $customerData['id'] to $customerData['entity_id'] so that older code that uses customer objects will still work as expected
if (!isset($this->_customerIds[$email])) {
$this->_customerIds[$email] = [];
$customerData = $customer->toArray();
if (!isset($customerData['entity_id']) && isset($customer['id'])) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added this in here as there could be the case that we get an object that only has the getId but we converted to array that means we have id as the key not the entity_id as we expect. To me this leads me down the route that the new add method should be more like addCustomerByData(int $customerId, int $websiteId, string $email) this way we would not need such an if here.

…port/CustomerComposite.php

 - make sure that the data format used when importing data matches the expected format
 - fix up the lines of over 120 characters
 - import classes that will be used to shorten lines
@dmanners dmanners requested a review from vrann January 24, 2018 10:34
Copy link
Contributor

@vrann vrann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The resulting code looks like equivalent to how it worked before with the replacement of collection with the direct select requests.

I noticed only unit tests are added. Can you add integration as well?

@dmanners dmanners self-assigned this Jan 26, 2018
@dmanners dmanners added this to the January 2018 milestone Jan 26, 2018
@magento-engcom-team
Copy link
Contributor

@dmanners thank you for contributing. Please accept Community Contributors team invitation here to gain extended permissions for this repository.

@magento-engcom-team magento-engcom-team merged commit 330634b into magento-engcom:2.3-develop Feb 1, 2018
magento-engcom-team pushed a commit that referenced this pull request Apr 13, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants