A strict naming schema for all DS objects (projects, jobs, workflows, dataflows, datastores, file formats, custom functions) is essential when working in a multi-user environment. The central repository has no folder concept or hierarchy or grouping functionality. The only way to distinguish between objects from one grouping and another one is by name. Most effective approach for naming objects is based on prefixing.
Note: In order to display the full names of DS objects in the Designer workspace, increase the icon name length. You do this by selecting Tools --> Options from the Designer menu, then expand Designer and select General. In this window, specify the value of 64 in the “Number of characters in workspace icon name” box.
General note: Versioning should not be handled by naming conventions. So, never include a version number in an object name. Use the central repository concept for maintaining successive versions of any object.
1. Reusable objects
Object | Naming Convention | Example |
---|---|---|
Project | <project_name> | BI4B |
Job | <project_name>_<job_name> | BI4B_D |
Workflow contained in one job only | <project_name>_<job_name>_[XT|TF|LD|AG…] <project_name>_<job_name>_[XT|TF|LD|AG…]_<workflow_name> <project_name>_<job_name>_<workflow_name> | BI4B_D_XT BI4B_D_LD_Facts BI4B_D_Init |
Workflow that is reused | <project_name>_COMMN_[XT|TF|LD|AG…]_<workflow_name> COMMN_[XT|TF|LD|AG…]_<workflow_name> | BI4B_COMMN_Init COMMN_ErrorHandling |
Dataflow contained in one job only | <project_name>_<job_name>_[XT|TF|LD|AG…]_<dataflow_name> | BI4B_D_LD_Opportunities |
Dataflow that is reused | <project_name>_COMMN_[XT|TF|LD|AG…]_<dataflow_name> | BI4B_COMMN_LD_JobCycles COMMN_LD_Jobs |
Embedded Dataflow | <project_name>_<job_name>_[XT|TF|LD|AG…]_<dataflow_name>_EMB | BI4B_D_LD_Employees_EMB |
ABAP Dataflow | <project_name>_<job_name>_XT_<dataflow_name>_ABAP | BI4B_D_XT_KNA1_ABAP |
Custom Function contained in one job only | <project_name>_<function_name> | BI4B_getDate |
Custom Function that is reused | COMMN_<function_name> | COMMN_dateKey |
1.1. Projects: <project_name>
Give every DS project a 5-character short name. The name has to be short, because it will be used as a prefix for the name of all reusable objects defined within the project.
E.g.: P2345
1.2. Jobs: <project_name>_<job_name>
Give every job a 5-character short name. Use < project name>_ as a prefix for the job name. The name has to be short, because it will be used as a prefix for the name of all workflows and dataflows defined within that job.
E.g.: P2345_J2345
1.3. Workflows: <project_name>_<job_name>_[XT|TF|LD|AG…][_<workflow_name>]
Name every workflow with <project_name>_<job name>_ as a prefix. Use COMMN_ as prefix for shared workflows, used across projects, <project_name>_COMMN_ when used in multiple jobs within a given project.
Workflows are often used to group dataflows for serial or parallel execution. In a typical ETL job, dataflows are executed in “stages”: a first set of dataflows have be executed (in parallel) before a next set can be started; and so on. A data warehouse loading job may extract data from the sources, load them into staging, optionally transform from staging-in to staging-out before loading into the core EDW and aggregating into the semantic layer.
Distinguish between job stages by extending the prefix with a 2 character code:
- XT: extract from source to staging
- TF: transform from staging-in to staging-out
- LD: load from staging into the core EDW layer
- AG: load (physically aggregate) from core to semantic layer
- …
The workflow name will be used as a prefix for the name of all embedded workflows and dataflows.
E.g.: P2345_J2345_XT
Within a workflow, objects (scripts, sub-workflows, dataflows) must either all be defined in parallel or all sequentially, and will be executed as such. There is no limit to the number of objects within a workflow. When the number of objects is higher than the number of processors available, DS will internally control the execution order of embedded parallel objects. Only when there are fewer objects than the number of processors available, they will really be executed in parallel.
Complex hierarchical structures can be defined by nesting workflows. There is no limit to the number of nesting levels, either. With nested workflows, use a name (_Facts for facts extraction or load, _Dims for dimension processing…) combined with an outline numbering scheme (1, 11, 111, 112, 12, 2…).
E.g.: P2345_J2345_LD_Dims21
Some workflows may not contain dataflows at all; they only contain not reusable objects. In that case, just name the workflow according to its function.
E.g. for a workflow embedding an initialization script: P2345_J2345_Initialise
1.4. Dataflows
DS supports three types of dataflows. The dataflow names must be unique across the different types. To distinguish the embedded and ABAP dataflows from the regular ones, use a suffix in their name.
- Regular dataflows: <project_name>_<job_name>_[XT|TF|LD|AG…]_<dataflow_name>
According to design and development best practices there should only be a single target table in a dataflow. Name a dataflow according to that target table.
Use <project_name>_<job name>_ as a prefix. Use COMMN_ as prefix for shared dataflows, used across projects, <project_name>_COMMN_ when used in multiple jobs within a given project. Distinguish between dataflow locations (extract, transform, load, aggregate…) by extending the prefix with a 3 character code (XT_, TF_, LD_, AG_...) as from the embedding workflow.
E.g.: P2345_J2345_XT_S_TABLE1, P2345_J2345_LD_TargetTable
- Embedded dataflows: <project_name>_<job_name>_[XT|TF|LD|AG…]_<dataflow_name>_EMB
Name every embedded dataflow with <project_name>_<job name>_ as a prefix; use _EMB as a suffix for the dataflow name. Distinguish between dataflow locations (extract, transform, load and aggregate) by extending the prefix with a 3 character code (XT_, TF_, LD_ and AG_).
E.g.: P2345_J2345_LD_TargetTable_EMB
- ABAP dataflows: <project_name>_<job_name>_XT_<dataflow_name>_ABAP
An ABAP dataflow is always used as a source in a regular dataflow. Reuse that name for the ABAP dataflow and add _ABAP as a suffix to make it unique.
E.g.: P2345_J2345_XT_S_TABLE1_ABAP
1.5. Custom Functions: <project_name>_<function_name>
Give every Custom Function a descriptive name. Use <project_name>_<job name>_ as a prefix. Use COMMN_ as prefix for shared custom functions, used across projects.
E.g.: P2345_TrimBlanksExt
2. Datastores: [SAP|BWS|BWT|HANA…]_<datastore_name>
As datastores are often used in multiple projects, they do not follow the same naming conventions as for other reusable projects.
Name a datastore in line with its physical name, and make the prefix depend on the object’s type:
Datastore Type | Database Type | Naming Convention | Example |
---|---|---|---|
Database | SQL Server | SQL_ | SQL_OC4A1 |
Database | Oracle | ORA_ | ORA_ITSD |
Database | Teradata | TD_ | TD_Staging |
Database | DB2 | DB2_ | DB2_MDM |
Database | MySQL | MySQL_ | MySQL_GEB |
Database | Sybase ASE | ASE_ | ASE_CRN |
Database | Sybase IQ | IQ_ | IQ_CMDWH |
SAP Applications | SAP_ | SAP_MDR | |
SAP BW as a source | BWS_ | BWS_Achme | |
BW as a target | BWT_ | BWT_Hana | |
SAP Hana | HANA_ | HANA_DB | |
Adapter | AS_ | AS_Nexus | |
Web Services | WS_ | WS_Weather |
Note 1: Pay attention when choosing datastore names. A datastore name cannot be changed anymore once the object has been created.
Note 2: Landscape-related information should not be handled with datastore names. So, never include a physical system indicator (DEV, T, QA…) in a datastore name. Landscape information should be configured using datastore configurations. Create one datastore, then create a datastore configuration for every tier (development, test, QA, production…) in the landscape.
3. File formats: <project_name>_<file_format_name>
Reuse the file name for the format name of a project-specific file. Use < project name>_ as a prefix.
E.g.: P2345_ISO_Cntr_Codes
Note: Pay attention when choosing a file format names. A file format name cannot be changed anymore once the object has been created.
4. Not reusable objects
Because not reusable objects are only defined within the context of a workflow or a dataflow, no strict naming standards are necessary. Names will only serve documentation purposes.
Use meaningful names for workflow objects (Script, Condition, While Loop, Try, Catch).
Do not change the transform names unless you want to stress the specific purpose of a Query transform, e.g. Join, Order, OuterJoin…