Watson Supply Chain Ideas

Use this form to submit an idea for a new product feature. The product team will review your input and provide status updates as decisions are made regarding the request.

Before you submit a new idea, please view requests that have already been submitted. If your idea has already been submitted, you can add comments or vote on the existing idea, thereby indicating your agreement with the idea. We may use this information to help prioritize development of new features.

 

Submit ideas for Watson Marketing and Watson Commerce products

Fix poorly written SQL Query relating to TS000903142 - (~3,750,550% improvement)

Edit: On 5th September this query was found to run for 2 hours 24 minutes with 3 BILLION reads.

The following query is poorly written.  It scans EVERY record in the table, to find a subset of those records.  This leads to incredibly high I/O operations and this query has been observed to run on production systems for in excess of 20 minutes.

SELECT DISTINCT A.ROOT_WF_ID, SUM( CASE WHEN B.WF_ID IS NOT NULL THEN 1 ELSE 0 END ) FROM WORKFLOW_LINKAGE A (nolock)  LEFT OUTER JOIN ARCHIVE_INFO B (nolock) ON (( A.ROOT_WF_ID = B.WF_ID OR A.C_WF_ID = B.WF_ID) AND GROUP_ID = 1) GROUP BY A.ROOT_WF_ID HAVING SUM( CASE WHEN B.WF_ID IS NOT NULL THEN 1 ELSE 0 END ) = 0

An optimised version is as follows:

SELECT DISTINCT A.ROOT_WF_ID FROM WORKFLOW_LINKAGE A (nolock) WHERE A.ROOT_WF_ID NOT IN (SELECT DISTINCT WF_ID FROM ARCHIVE_INFO (nolock) WHERE GROUP_ID=1 AND WF_ID IS NOT NULL)   AND A.ROOT_WF_ID NOT IN (SELECT DISTINCT ROOT_WF_ID FROM ARCHIVE_INFO B (nolock) INNER JOIN WORKFLOW_LINKAGE C (nolock) ON (B.WF_ID=C.C_WF_ID) WHERE GROUP_ID=1 AND WF_ID IS NOT NULL)

 

A query to verify that the two above give the same results is:

 

-- Original query - put results into a temp table.

SELECT DISTINCT A.ROOT_WF_ID, SUM( CASE WHEN B.WF_ID IS NOT NULL THEN 1 ELSE 0 END ) As nbr

INTO #temp

FROM WORKFLOW_LINKAGE A (nolock) 

   LEFT OUTER JOIN ARCHIVE_INFO B (nolock) ON (( A.ROOT_WF_ID = B.WF_ID OR A.C_WF_ID = B.WF_ID) AND GROUP_ID = 1)

   GROUP BY A.ROOT_WF_ID

   HAVING SUM( CASE WHEN B.WF_ID IS NOT NULL THEN 1 ELSE 0 END ) = 0

 -- Display a count of the original records, then display a count of an inner join between the original records and the results of the new query.

SELECT 'Records from original Query' As description, COUNT(*) As Number FROM #temp

UNION ALL

SELECT 'Records from new Query', COUNT(*) FROM (

/*** Start of new query ***/

SELECT DISTINCT A.ROOT_WF_ID

FROM WORKFLOW_LINKAGE A (nolock)

WHERE A.ROOT_WF_ID NOT IN (SELECT DISTINCT WF_ID FROM ARCHIVE_INFO (nolock) WHERE GROUP_ID=1 AND WF_ID IS NOT NULL)

  AND A.ROOT_WF_ID NOT IN (SELECT DISTINCT ROOT_WF_ID FROM ARCHIVE_INFO B (nolock) INNER JOIN WORKFLOW_LINKAGE C (nolock) ON (B.WF_ID=C.C_WF_ID) WHERE GROUP_ID=1 AND WF_ID IS NOT NULL)

/*** End of new Query ***/

) src

INNER JOIN #temp ON (src.ROOT_WF_ID=#temp.ROOT_WF_ID)

DROP TABLE #temp

  • Avatar32.5fb70cce7410889e661286fd7f1897de Guest
  • Aug 28 2018
  • Under Consideration
How will this idea be used?

Improvements like this will assist with the overall performance of the product.

What is your industry? Government
What is the idea priority? High
DeveloperWorks ID
RTC ID
Link to original RFE
  • Attach files
  • Avatar40.8f183f721a2c86cd98fddbbe6dc46ec9
    Guest commented
    August 28, 2018 02:06

    IBM Labs or Level 3 Engineering can use this new SQL query which gives the same results with much better performance.

  • Avatar40.8f183f721a2c86cd98fddbbe6dc46ec9
    Guest commented
    August 28, 2018 02:25

    This is an excellent improvement and should be implemented soon.

  • Avatar40.8f183f721a2c86cd98fddbbe6dc46ec9
    Guest commented
    August 28, 2018 02:27

    We are in dire need of fixes like this

  • Avatar40.8f183f721a2c86cd98fddbbe6dc46ec9
    Guest commented
    August 28, 2018 02:30

    That is such a big improvement! With an improvement like that you'd be silly not to use it. It should be implemented ASAP.

  • Avatar40.8f183f721a2c86cd98fddbbe6dc46ec9
    Guest commented
    August 28, 2018 03:51

    How come that standard queries are so poorly written? Shouldn't those specialists that write those queries know what they're doing.

  • Avatar40.8f183f721a2c86cd98fddbbe6dc46ec9
    Guest commented
    August 28, 2018 23:04

    Such a performance increase seems like a good idea!

  • Avatar40.8f183f721a2c86cd98fddbbe6dc46ec9
    Guest commented
    August 31, 2018 06:27

    this is essentially a bug fix and a must-have.

  • Avatar40.8f183f721a2c86cd98fddbbe6dc46ec9
    Guest commented
    September 2, 2018 23:13

    I would like this idea implemented

  • Avatar40.8f183f721a2c86cd98fddbbe6dc46ec9
    Guest commented
    September 5, 2018 00:00

    innovative idea , should be implemented

  • Avatar40.8f183f721a2c86cd98fddbbe6dc46ec9
    Guest commented
    September 6, 2018 01:28

    This query took 2 hour and 24 minutes to execute in a production database by the application Sterling product, with more than 3.2 BILLION reads and more than 8400 seconds CPU time. The query needs to be changed to perform in large TB databases.

  • Avatar40.8f183f721a2c86cd98fddbbe6dc46ec9
    Guest commented
    September 6, 2018 08:23

    This is causing huge issues for our implementation. The query works fine for a small test database but not for a real enterprise database. Given Sterling Integrator is intended for enterprise use, Sterling customers like us need much improved quality in database code. Please fix, along with other database code quality issues.