How to work with candidate selection in Match Transform
Candidate Selection process helps when new data set from source system is to be compared and matched to an existing data collection in data warehouse. Without candidate selection, the new set of matched data would simply get appended and there would not be any logical grouping between new data set and the existing data in database.
To achieve grouping between new source data set and existing DB data, we have option in Match Transform – Candidate Section
Open Match Transform: For example FirmAddress_MatchBatch
Mapping Input:
- Map default fields: FIRM, ADDRESS_PRIMARY_NAME, ADDRESS_PRIMARY_NUMBER, ADDRESS_SECONDARY_NUMBER and POSTCODE
- Apart from default fields, map all columns that is required to be mapped with DB columns
Options:
Right Click on “Group Forming” options and select Candidate Selection. In the candidate section editor, provide below details
Datastore: Select the valid DW connection where existing data collection is available
- Suppose secondary source containing existing data collection is static, persistent data store can also be used as data store. This is also advantageous in case secondary source is not an RDBMS.
Cache Type: Pre-Load Cache or No-Cache
Auto Generate SQL or Custom SQL: Either use auto generated SQL or custom SQL for fetching columns from database table that is to be mapped with input data.
Break Key: Break Key on the input data must match with break key used in database table.
- Use Break Key column from Database
- Or Match the break key fields with DB columns
Use Constant Source Value: Suppose in case of MDM, you would be matching data from multiple sources. In case you want to keep track of input source data set, provide constant value (Physical Source Value) and map the same to the DB column (Physical Source Field).
Column Mapping: Map fields from input data set with your DB Table columns. In this case for break key fields, the Break Key column gets automatically set to “YES”.
Output: Map all required fields to the output
Group Prioritization: Priority can be set using Group Forming > Group Prioritization operation option. This ensures that high priority records are the master or driver record. Without this prioritization, records from the original source are always considered as driver.
Update DB with new data set: In case data from new data set source has to be matched and merged again in the DB table, use Insert Else Update option in target table update control property.
Hope you find this information useful.
Thanks,
Sukanya Krishnan