Background information
IMPORTANT: If you choose to ignore this issue report template, your issue will be closed as we cannot help without the requested information.
Please make sure you tick (add an x between the square brackets with no spaces) the following check boxes:
Installation information
- OSPOS version is: 3.2.3
- OSPOS git commit hash is: 4f5ad57
- PHP version is: 7.3.1
- MySQL or MariaDB version is: 10.1.37-MariaDB
- OS and version is: Ubuntu 18.10
- WebServer is: Apache 2.4
- Selected language is: en_US
- (If applicable) Docker installation:
- (If applicable) Installation package for the LAMP/LEMP stack is: xampp
Issue / Bug / Question / New Feature
Severe performance issue with large number of items (60,000) and inventory history 1m+
Main Menu - Items takes upwards of 2 minutes to "load". Page load timing was 117 seconds for some instances, but never below 90 seconds.
The ladies (operators) don't have that kind of patience, and neither do I. I've been running an OLD version and have improved some slower areas, now upgrading this has to be fixed!
Someone would have to search for my body if I installed right now to production 😁
A few days has made some huge improvements, but I don't want to break others' code either.
A couple of BUGS/design deficiencies have been found also that are being corrected while in this section. I'll note them here also.
This version of changes is "draft - for review only", and is not for release/integration.
As I was going through this a number of questions / comments are in my notes. I'll summarize here and multiple "bug" reports, but full is available should someone want to see it.
I'm picky about some areas. Speed is one huge one - a 1/10 of a second improvement is huge on response times. With "staff costs" at a $22.50/hour ($15 pay plus overhead) that is $0.375 per minute! Do you want to pay them for waiting "several seconds" every time they interact with the computer? Especially when they will do some of those queries a hundred times a day!
First on the "hit-list" is Items - list query speed.
From debug logs - initial query times for this list were:
Execution Time:50.268499851227 PLUS Execution Time:31.159582138062
Rendering time is ignored.
Why?
First - the actual listing "query" (models/Item.php search) is done twice. Once to do the query and return data, the second time to find how many records could have been returned. There MUST be a way to eliminate the second one at least! (There is on most DB software)
A search query for a particular item_number is much faster, at 0.79 + 0.33 - but that is still MUCH slower than it should be. Then remember - your operator wanted to find ONE item to find the stock level - they had to live through the first query, then do the search. That one "request" cost you as an employer at least a Dollar! Not to mention the frustration.
"explain" shows temporary tables being created (28,000 records), and filesort being done.
Where being used to match/eliminate from several tables, and joins done sub-optimally.
The inventory table is being hit on every record, data converted for comparison, then where. Ugly!
BUG FOUND!! Because of the dependency on the inventory table, an item that has no transactions can never be found! RESULTS ERROR. IF this query code is used elsewhere, a new product could never be located to be received, sold, or edited. I’m sure this must show as a bug somewhere! I’m sure this isn’t the intent either. Still can’t figure why there is a selection of inventory transactions on the list anyway!
Aggregates are a big slow-down, and costing about 10 seconds on this query. I understand why they were added (to cover-up for the possible duplication fault caused by the join), but there is a better way. Fix the joins!
Inventory table has no index on the field for the matching where clause (BETWEEN). Costly.
*** create index trans_timestamp_index on ospos_inventory (trans_date);
Cleanup part 1 - Eliminate the inventory table from the join. It is causing bad results, and is certainly an ugly join. I'll re-integrate that later once the use is found.
Execution Time:0.86312890052795 PLUS Execution Time:0.23467803001404
Lots better, but still slower.
The second query (count) is .6 of a second faster - most due to the GROUP BY which shouldn't be needed. Each item has ONE supplier and ONE quantities record that is being considered!
Quantities join matches item_id only, then duplicates eliminated using a WHERE. Hmm
PROBABLE FUTURE BUG? "Stock Location" feature breakage. Where item is created "for me" in location 1, a stock quantities record is created. Someone who only sees location 2 cannot query or see this product. IS THIS INTENTIONAL? If so, HOW do they add a product with the same SKU for the second location? Just another duplicate? OR is each "Item" only intended to every be only in one location? If so, why is this a second table instead of remaining as a "property" field and keeping the quantity in the main table?
For now - I'll do this as an OUTER join. and properly put the LOCATION into the JOIN syntax. Faster anyway!
Removing the MAX(xx) from each select helped a small bit (a couple seconds off the large time), but every second counts.
QUESTION: Is "Field Order" used anywhere in the application? Or are all table references "associative" to the name? A * SELECT shows slightly slower for some reason, but is more flexible for customizing things. I'll leave the field names. There are only 2 "tiny" fields not returned by the query. There'll be a lot more with "MY" needed additions. There are no BLOBS.
Eliminate the "redundant" GROUP BY clause. The joined tables are (now) a 1:1 relationship via KEY, and LEFT further enforces that.
Execution Time:0.64102482795715 PLUS Execution Time:0.27455306053162
Better!
MOST SQL engines have a way to return the rows returned ignoring the LIMIT clause. For mysql, mysqli, Maria, Oracle and SQL server it is SQL_CALC_FOUND_ROWS as part of the initial query, followed by a second query "SELECT FOUND_ROWS()". That can eliminate the second query time.
AFTER all these changes:
List With no sort:
Execution Time:0.35885500907898 PLUS Execution Time:0.00020813941955566
List With a sort:
Execution Time:0.6206910610199 PLUS Execution Time:0.00011992454528809
for a QUERY (with a sort):
Execution Time:0.26358985900879 PLUS Execution Time:0.0001060962677002
without sort:
Execution Time:0.25322484970093
NEXT will be to re-integrate the "Transactions within Date Range" functionality.
Dan
Background information
IMPORTANT: If you choose to ignore this issue report template, your issue will be closed as we cannot help without the requested information.
Please make sure you tick (add an x between the square brackets with no spaces) the following check boxes:
Installation information
Issue / Bug / Question / New Feature
Severe performance issue with large number of items (60,000) and inventory history 1m+
Main Menu - Items takes upwards of 2 minutes to "load". Page load timing was 117 seconds for some instances, but never below 90 seconds.
The ladies (operators) don't have that kind of patience, and neither do I. I've been running an OLD version and have improved some slower areas, now upgrading this has to be fixed!
Someone would have to search for my body if I installed right now to production 😁
A few days has made some huge improvements, but I don't want to break others' code either.
A couple of BUGS/design deficiencies have been found also that are being corrected while in this section. I'll note them here also.
This version of changes is "draft - for review only", and is not for release/integration.
As I was going through this a number of questions / comments are in my notes. I'll summarize here and multiple "bug" reports, but full is available should someone want to see it.
I'm picky about some areas. Speed is one huge one - a 1/10 of a second improvement is huge on response times. With "staff costs" at a $22.50/hour ($15 pay plus overhead) that is $0.375 per minute! Do you want to pay them for waiting "several seconds" every time they interact with the computer? Especially when they will do some of those queries a hundred times a day!
First on the "hit-list" is Items - list query speed.
From debug logs - initial query times for this list were:
Execution Time:50.268499851227 PLUS Execution Time:31.159582138062
Rendering time is ignored.
Why?
First - the actual listing "query" (models/Item.php search) is done twice. Once to do the query and return data, the second time to find how many records could have been returned. There MUST be a way to eliminate the second one at least! (There is on most DB software)
A search query for a particular item_number is much faster, at 0.79 + 0.33 - but that is still MUCH slower than it should be. Then remember - your operator wanted to find ONE item to find the stock level - they had to live through the first query, then do the search. That one "request" cost you as an employer at least a Dollar! Not to mention the frustration.
"explain" shows temporary tables being created (28,000 records), and filesort being done.
Where being used to match/eliminate from several tables, and joins done sub-optimally.
The inventory table is being hit on every record, data converted for comparison, then where. Ugly!
BUG FOUND!! Because of the dependency on the inventory table, an item that has no transactions can never be found! RESULTS ERROR. IF this query code is used elsewhere, a new product could never be located to be received, sold, or edited. I’m sure this must show as a bug somewhere! I’m sure this isn’t the intent either. Still can’t figure why there is a selection of inventory transactions on the list anyway!
Aggregates are a big slow-down, and costing about 10 seconds on this query. I understand why they were added (to cover-up for the possible duplication fault caused by the join), but there is a better way. Fix the joins!
Inventory table has no index on the field for the matching where clause (BETWEEN). Costly.
*** create index trans_timestamp_index on ospos_inventory (trans_date);
Cleanup part 1 - Eliminate the inventory table from the join. It is causing bad results, and is certainly an ugly join. I'll re-integrate that later once the use is found.
Execution Time:0.86312890052795 PLUS Execution Time:0.23467803001404
Lots better, but still slower.
The second query (count) is .6 of a second faster - most due to the GROUP BY which shouldn't be needed. Each item has ONE supplier and ONE quantities record that is being considered!
Quantities join matches item_id only, then duplicates eliminated using a WHERE. Hmm
PROBABLE FUTURE BUG? "Stock Location" feature breakage. Where item is created "for me" in location 1, a stock quantities record is created. Someone who only sees location 2 cannot query or see this product. IS THIS INTENTIONAL? If so, HOW do they add a product with the same SKU for the second location? Just another duplicate? OR is each "Item" only intended to every be only in one location? If so, why is this a second table instead of remaining as a "property" field and keeping the quantity in the main table?
For now - I'll do this as an OUTER join. and properly put the LOCATION into the JOIN syntax. Faster anyway!
Removing the MAX(xx) from each select helped a small bit (a couple seconds off the large time), but every second counts.
QUESTION: Is "Field Order" used anywhere in the application? Or are all table references "associative" to the name? A * SELECT shows slightly slower for some reason, but is more flexible for customizing things. I'll leave the field names. There are only 2 "tiny" fields not returned by the query. There'll be a lot more with "MY" needed additions. There are no BLOBS.
Eliminate the "redundant" GROUP BY clause. The joined tables are (now) a 1:1 relationship via KEY, and LEFT further enforces that.
Execution Time:0.64102482795715 PLUS Execution Time:0.27455306053162
Better!
MOST SQL engines have a way to return the rows returned ignoring the LIMIT clause. For mysql, mysqli, Maria, Oracle and SQL server it is SQL_CALC_FOUND_ROWS as part of the initial query, followed by a second query "SELECT FOUND_ROWS()". That can eliminate the second query time.
AFTER all these changes:
List With no sort:
Execution Time:0.35885500907898 PLUS Execution Time:0.00020813941955566
List With a sort:
Execution Time:0.6206910610199 PLUS Execution Time:0.00011992454528809
for a QUERY (with a sort):
Execution Time:0.26358985900879 PLUS Execution Time:0.0001060962677002
without sort:
Execution Time:0.25322484970093
NEXT will be to re-integrate the "Transactions within Date Range" functionality.
Dan