Tutorial: Search performance improvement for Apache Derby

Tutorials > Catalog search > Search Performance Improvement

Tutorial: Search performance improvement for Apache Derby

In this tutorial we will create a unique table to enable a single WebSphere Commerce type of search. The example cannot be directly reused in every WebSphere Commerce solution. However, the basic design elements can be followed for customizing for any solution. The tutorial is provided in an 11-step program format. This format provides shows you a how to approach search implementations and customization.

Procedure
Acknowledge search performance improvement areas.
In this example, Apache Derby tables are created that match common queries employed by the search bean on the catalog subsystem.

Acknowledge search limitations.
The new table will not have LONGVARCHAR data types. Any existing LONGVARCHAR fields are d Examine the catalog search features to see how they can fit the requirements divided into substring pieces of summary table fields. This part of the design permits a performance-friendly solution.

In this example, minimize the features to a controlled set of parameters to search on. The Boolean search imposes an OR expression across all fields since there is only one summary table to search against.

Perform a search comparison against all features provided in WebSphere Commerce.
In this example, the simple search method is used since to illustrate a basic example of improving search performance.

Consider and decide which parameters to use for a specific search.
This example uses the following parameters:

resultCatEntryType (String)

For this example, only products are to be returned in the search result - sending a value of 2 through form values. The code is customized such that this parameter is not needed.

manufacturer (String)

The value of this variable is used in a search on a manufacturer's name (table columns searched: Catentry.Mfname). This parameter is only passed from search to locate products within a manufacturer.

beginIndex (String)

This variable is used for paging the result set. The value must be the index of the first result row in a page. This parameter has not been changed.

pageSize (String)

The value of this variable specifies the number of search result rows to be displayed per page. The value was changed to represent 18 (that is, 18 displayed items per page).

resultType (String)

Merchant can specify if they want to show Products or Items or both Products and Items in a search result. The value must be 1 (Items only), 2 (Products only) or 3 (both products and items). The value was restricted to 3 (both products and items). The code is customized such that this parameter is not needed.

searchTerm (String)

The value of this variable is used in a search on a word (table columns searched: Catentdesc.name, Catentdesc.Shortdescription, Catentdesc.Longdescription). This parameter has not been changed. The code is customized such that this parameter searches the summary table fields instead.

sku (String)

The value of this variable is used in a search on SKU (table columns searched: Catentry.Sku). This parameter has not been changed. The code is customized such that this parameter searches the summary table fields instead.

orderBy1 (String)

Result set is sorted by this attribute. The value was changed to point to a new external RuleQuery class. Base code must be checked to find the extended RuleQuery to point to the summary table field declared in the class reference: 'ENTRYSHORTDESCAttributeInfo'

coSearchSkuEnabled (Boolean)

Search input terms (one contigious term only) for descriptions are also co-search with SKU column. This parameter has not been changed. Test using two or more search term value inputs. The default implementation does not include SKU search if more than one search term is entered in the search term parameter.

isProduct (Boolean)

Returns results of 'product data bean' catentry type, which is false by default. This parameter has not been changed. The code is customized such that this parameter is not needed.

isItem (Boolean)

Returns results of 'item data bean' catentry type, which is false by default. This parameter has not been changed. The code is customized such that this parameter is not needed.

narrowBy (Boolean)

Returns results of a narrowed search for Categories and Vendors. This is a new parameter for customized search.

categoryList (String)

Passes the user selected category to filter search results. This is a new parameter for customized search.

vendorList (String)

Passes the user selected vendor to filter search results. This is a new parameter for customized search.

pageIndex (String)

Passes an index for page size in order to structure the 10-pages listing per view. When pages exceed 10, only a given 10 pages are shown while the other remaining pages are only provided with a forwarding hypertext link. This is a new parameter for customized search.
Run tests and output the generated SQL. Dissect this SQL to design the summary table. Run an initial test of the search bean:

Enable the WC_SEARCH component. Information is written to the appropriate log files.

In a starter store search for the term component.

There is a search text input available on the Advanced Search page.
The log file produced as a result of the submission of the search term 'component' shows the following output:
XX:XX:XX.XXX *
com.ibm.commerce.search.rulequery.RuleQuery.execute(Cursor aCursor)
Servlet.Engine.Transports : 0 com.ibm.websphere.commerce.WC_SEARCH
  Query = select distinct CATENTRY.CATENTRY_ID, CATENTDESC.SHORTDESCRIPTION
  from CATENTRY, CATENTDESC, CATGPENREL
  where ((((((upper(CATENTRY.PARTNUMBER) like '%COMPONENT%'
                or  upper(CATENTDESC.SHORTDESCRIPTION) like
'%COMPONENT%'
                or  upper(CATENTDESC.LONGDESCRIPTION) like
'%COMPONENT%'
                or  upper(CATENTDESC.NAME) like '%COMPONENT%')
        and  CATGPENREL.CATALOG_ID = 10001))
        and  (((CATENTDESC.PUBLISHED = 1
        and  CATENTDESC.LANGUAGE_ID = -1)))))
        and  (CATENTRY.CATENTRY_ID = (CATGPENREL.CATENTRY_ID)
        and  CATENTRY.CATENTRY_ID = (CATENTDESC.CATENTRY_ID)))
        order by CATENTDESC.SHORTDESCRIPTION asc
This SQL statement shows a small customization to remove the buyable constraint.
The preceding SQL shows that expected fields to be searched are refreshed into a summary table. This SQL above is one example of a number of possible fields. Other fields to include would be the CATGRPDESC table as part of a requirement to search parents of products for a search term match (particular to a business requirement).
Create the summary tables based on the expected fields to be searched against. From the SQL that was extracted from the previous step, a summary table can be created based on a prediction of catalog fields that would be searched. The summary table, ITEMSFORPERF, follows:
 CREATE TABLE ITEMSFORPERF
AS (STOREENT_ID INT NOT NULL, CATENTRY_ID INT NOT NULL,         ENTRYSHORTDESC VARCHAR(244),         PARTNUMBER VARCHAR(244),         MFPARTNUMBER VARCHAR(244),         MFNAME VARCHAR(244),,         ENTRYNAME VARCHAR(244),,         KEYWORD VARCHAR(244),,         GROUPNAME VARCHAR(244),,         GROUPSHORTDESC VARCHAR(244));


INSERT INTO ITEMSFORPERF (STOREENT_ID,CATENTRY_ID,E
NTRYSHORTDESC,PARTNUMBER,MFPARTNUMBER,MFNAME,ENTRYNAME, KEYWORD,GROUPNAME,GROUPSHORTDESC)
        (SELECT STORECENT.STOREENT_ID,         CATENTRY.CATENTRY_ID,         UPPER(CATENTDESC.SHORTDESCRIPTION) AS ENTRYSHORTDESC,         UPPER(CATENTRY.PARTNUMBER) AS PARTNUMBER,         UPPER(CATENTRY.MFPARTNUMBER) AS MFPARTNUMBER,         UPPER(CATENTRY.MFNAME) AS MFNAME,         UPPER(CATENTDESC.NAME) AS ENTRYNAME,         UPPER(CATENTDESC.KEYWORD) AS KEYWORD,         UPPER(CATGRPDESC.NAME) AS GROUPNAME,         UPPER(CATGRPDESC.SHORTDESCRIPTION) AS GROUPSHORTDESC
FROM CONSUMERDIRECT.CATGPENREL CATGPENREL
LEFT OUTER JOIN CONSUMERDIRECT.CATENTRY CATENTRY
ON CATGPENREL.CATENTRY_ID = CATENTRY.CATENTRY_ID
LEFT OUTER JOIN CONSUMERDIRECT.CATENTDESC CATENTDESC
ON CATENTRY.CATENTRY_ID = CATENTDESC.CATENTRY_ID
LEFT OUTER JOIN CONSUMERDIRECT.CATGRPDESC CATGRPDESC
ON CATGPENREL.CATGROUP_ID = CATGRPDESC.CATGROUP_ID
RIGHT OUTER JOIN CONSUMERDIRECT.STORECENT STORECENT
ON CATENTRY.CATENTRY_ID = STORECENT.CATENTRY_ID
WHERE CATENTDESC.PUBLISHED = 1
AND CATENTDESC.LANGUAGE_ID = -1
AND CATGRPDESC.LANGUAGE_ID = -1
AND CATENTRY.CATENTTYPE_ID = 'ProductBean'
AND CATGPENREL.CATALOG_ID =
(SELECT CATALOG_ID FROM CATALOG WHERE IDENTIFIER =
'ConsumerDirect'));
Develop and implement the customization of the search bean to work against each summary table.
Since we have refined the catalog search fields into one single table, then only one TableInfo class is created for ITEMSFORPERF. There are several new fields searchable within the ITEMSFORPERF table, and therefore, multiple AttributeInfo classes are created for each field with the following naming convention: CATENTRY_IDAttributeInfo, ENTRYSHORTDESCAttributeInfo, PARTNUMBERAttributeInfo, MFPARTNUMBERAttributeInfo, MFNAMEAttributeInfo, ENTRYNAMEAttributeInfo, KEYWORDAttributeInfo, GROUPNAMEAttributeInfo, GROUPSHORTDESCAttributeInfo. The RuleQuery class must be extended in order to map these new attributeInfo class references. Also, some extra methods must override in order to control the orderBy clause. An entry is not required in search.xml file since we are only searching within one table for optimal performance improvement.
The class CatEntrySearchListDataBean must be extended in order to revamp the SQL in order to map to the new summary table. The execute method will override in order to remove references to product type constraints. The summary table will already have the product types embedded within.
This tutorial must override the method setPredefinedAttributes. We will only construct the summary table SQL reference for this search bean for the Apache Derby database. Therefore, if the database type is DB2J (Apache Derby), then the method setPredefinedAttributes of the class CatEntrySearchListDataBean is used instead to call the out-of-the-box tables. Otherwise, the summary table ITEMSFORPERF is referenced and the setPredefinedAttributes needs to override all table and field references.
To do this, all references to q.addFilterAttribute method will include the new AttributeInfo classes instead of the out-of-the-box AttributeInfo class references.
Run tests and output the generated SQL. An initial test of the search bean is run. The WC_SEARCH component must be enabled to write the appropriate logs. A test search term that is used for this particular tutorial is 'component'. There is a search text input available on the Advanced Search page. From the logs produced as a result of the submission of the search term 'component', you should see output...
XX:XX:XX.XXX *
com.ibm.commerce.search.rulequery.RuleQuery.execute(Cursor aCursor)
Servlet.Engine.Transports : 0 com.ibm.websphere.commerce.WC_SEARCH
Query =
        select distinct ITEMSFORPERF.Catentry_id, ITEMSFORPERF.EntryShortDesc
        from ITEMSFORPERF
        where ((((((
                ITEMSFORPERF.PartNumber like '%COMPONENT%'
                or ITEMSFORPERF.EntryShortDesc like '%COMPONENT%'
                or  ITEMSFORPERF.EntryName like '%COMPONENT%')
                or  ITEMSFORPERF.Keyword like '%COMPONENT%')
                or (ITEMSFORPERF.GroupShortDesc like '%COMPONENT%'
                or  ITEMSFORPERF.GroupName like '%COMPONENT%')))))
        order by ITEMSFORPERF.EntryShortDesc asc
The SQL above does have some performance improvements over the earlier out-of-the-box catalog search bean SQL. There is only one table being searched for this customized instance.
Measure performance improvements. The performance improvements can be measured by executing the SQL statements directly on the database apart from the application. This would provide true performance results at the database level without confounding variables of application constraints.
Some tools are available that could stress test the server and could provide feedback on application response to multiple users. With stress feedback in place, the searching could be verified against the customer performance requirements.

Refresh the summary tables. Since Apache Derby does not support IMMEDIATE REFRESH for summary tables, the table must be dropped and recreated in order to do an INSERT SELECT.
Related tasks

Tutorial: Search Performance Improvement
Tutorial: Search performance improvement

+
Search Tips | Advanced Search