
Publish & PG-Elastic Sync Issues & Resolutions

This page helps address issues in the following two categories:

  1. Publishing Customer, Contact, Concern, Customer Facility entities to downstream systems
  2. Sync from main database (Postgres) to Elastic database

Publish issues and resolution

The table below lists the publishing issues reported so far in the PROD environment. Each issue is assigned an Error Category, and the Resolution column gives the corresponding fix.

Issues with a "Code fix" resolution must be fixed by the technical team to reduce future occurrences.

Issues that need manual intervention must be fixed by our team. We should also look at how to automate these fixes to reduce the dependency on manual intervention.

Any new error category that appears in the table producer_sync_track must be documented here for further analysis.

| Sl. No | Count | Error Category | Resolution | Sample Error Message |
|---|---|---|---|---|
| 1 | 212 | Publish failed for this record | Manually reprocess parent | Publish failed for this record as parent Customer has not been published yet. It will be picked by scheduler for re-publishing after some time |
| 2 | 4923 | Publish to EMP topic failed | Manually check for code existence | Publish to EMP topic failed, could not find the customer with customer code: IN03804724 |
| 3 | 8954 | geoCityCode | Manually check geocode | geoCityCode : Exception. Customer published, but without geoCityCode |
| 4 | 4 | Exception occurred while publishing | Code fix | Exception occurred while publishing customer GB02564328 :: 500 Internal Server Error from POST https://cmdprodpublishcustomerapi.trafficmanager.net/global-mdm/customer/publish?customerCode=GB02564328&eventDetails=CREATE |
| 5 | 0 | Index 0 out of bounds for length 0 | Code fix | Index 0 out of bounds for length 0 |
| 6 | 118 | com.maersk.smds.cmd.exceptions | Manual verification | com.maersk.smds.cmd.exceptions.BadRequestException: Customer with code IN03805825 not found |
| 7 | 1001 | Not able to call the (customer/contact/concern) retrieve endpoint | Code fix | Not able to call the customer retrieve endpoint An unexpected error occurred while trying to retrieve customer with code CN06492766 |
| 8 | 0 | 500 Internal Server Error | Code fix | 500 Internal Server Error from POST https://cmdsitproduceconcernapi.trafficmanager.net/global-mdm/concerns/bulk/publish |
| 9 | 96 | Error serializing Avro message | Code fix | Error serializing Avro message |
| 10 | 12 | Failed to construct kafka producer | Code fix | Failed to construct kafka producer |

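The queries below can be used to inspect the issues tracked in producer_sync_track and categorise them as described in the table above. The first returns unprocessed records that match none of the known categories; the second counts the unprocessed records in a single category.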

select *
from mdm_smdsmd.producer_sync_track
where processed='N'
  and error_msg not like 'Publish to EMP topic failed%'
  and error_msg not like 'Publish failed for this record%'
  and error_msg not like 'Not able to call the %'
  and error_msg not like 'geoCityCode%'
  and error_msg not like '%Exception occurred while publishing %'
  and error_msg not like 'Index 0 out of bounds for length 0%'
  and error_msg not like 'com.maersk.smds.cmd.exceptions%'
  and error_msg not like '500 Internal Server Error%'
  and error_msg not like 'Error serializing Avro message%'
  and error_msg not like 'Failed to construct kafka producer%';

select count(*) from mdm_smdsmd.producer_sync_track where processed='N' and error_msg like 'Publish failed for this record%' ;
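
The same categorisation can also be done outside the database, for example on an exported list of error_msg values. The following is a minimal sketch, not part of the existing tooling: the function names and the UNCATEGORISED label are hypothetical, and the matching rules are assumed to mirror the LIKE patterns in the first query (prefix match, except "Exception occurred while publishing", which that query matches anywhere in the message).

```python
from collections import Counter

# Prefixes copied from the LIKE patterns of the query above.
PREFIX_CATEGORIES = [
    "Publish to EMP topic failed",
    "Publish failed for this record",
    "Not able to call the ",
    "geoCityCode",
    "Index 0 out of bounds for length 0",
    "com.maersk.smds.cmd.exceptions",
    "500 Internal Server Error",
    "Error serializing Avro message",
    "Failed to construct kafka producer",
]

# The query uses a leading % for this pattern, so it may appear anywhere.
CONTAINS_CATEGORIES = ["Exception occurred while publishing "]


def categorise_publish_error(error_msg):
    """Return the known category for error_msg, or None for a new category.

    Hypothetical helper; matching is case-sensitive, like LIKE in Postgres.
    """
    if not error_msg:
        return None
    for prefix in PREFIX_CATEGORIES:
        if error_msg.startswith(prefix):
            return prefix.strip()
    for fragment in CONTAINS_CATEGORIES:
        if fragment in error_msg:
            return fragment.strip()
    return None


def count_by_category(error_msgs):
    """Count messages per category; unknown messages are grouped together."""
    return Counter(
        categorise_publish_error(msg) or "UNCATEGORISED" for msg in error_msgs
    )
```

Any message counted under UNCATEGORISED corresponds to a row that the first query would return, so it is a candidate for a new entry in the table.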

Steps to follow:

  1. Run the query below to check whether any new issues have been tracked in the last 10 days:

select * from mdm_smdsmd.producer_sync_track where processed='N' and create_time > CURRENT_DATE-10

  2. Categorise the errors using the categories in the table above
  3. If an error does not fit any existing category, create a new category with a resolution
  4. If the issue needs a code fix, assign it to the Dev team
  5. If the issue needs a manual fix, perform the manual activity and look at how it can be automated
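
The decision made in the steps after categorisation can be sketched as a simple lookup. This is only an illustration under stated assumptions: the RESOLUTIONS mapping is copied from the Resolution column of the publish table, while the function name and the returned action strings are hypothetical.

```python
# Category -> resolution, copied from the publish issues table.
RESOLUTIONS = {
    "Publish failed for this record": "Manually reprocess parent",
    "Publish to EMP topic failed": "Manually check for code existence",
    "geoCityCode": "Manually check geocode",
    "Exception occurred while publishing": "Code fix",
    "Index 0 out of bounds for length 0": "Code fix",
    "com.maersk.smds.cmd.exceptions": "Manual verification",
    "Not able to call the": "Code fix",
    "500 Internal Server Error": "Code fix",
    "Error serializing Avro message": "Code fix",
    "Failed to construct kafka producer": "Code fix",
}


def next_action(category):
    """Map an error category to the follow-up described in the steps.

    Hypothetical helper; the action strings are illustrative only.
    """
    resolution = RESOLUTIONS.get(category)
    if resolution is None:
        # Unknown category: it has to be documented with a resolution first.
        return "Document as a new category with a resolution"
    if resolution == "Code fix":
        return "Assign to Dev team"
    # Everything else in the table is a manual activity done by our team.
    return "Manual: " + resolution
```

For example, next_action("geoCityCode") yields a manual action, while a category that is missing from the mapping is flagged for documentation.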

PG-Elastic sync issues and resolution

The table below lists the issues reported so far in the PROD environment for the automatic sync from PG to the Elastic DB. Each issue is assigned an Error Category, and the Resolution column gives the corresponding fix.

Issues with a "Code fix" resolution must be fixed by the technical team to reduce future occurrences.

Issues that need manual intervention must be fixed by our team. We should also look at how to automate these fixes to reduce the dependency on manual intervention.

Any new error category that appears in the table pg_elk_sync_trck must be documented here for further analysis.

| Sl. No | Count | Error Category | Entity | Resolution | Sample Error Message |
|---|---|---|---|---|---|
| 1 | 1 | timeout on connection | FCLTY | Code fix | 10,000 milliseconds timeout on connection http-outgoing-40 [ACTIVE] |
| 2 | 2 | Facility information is not available | FCLTY | Manual Verification | Facility information is not available in write db |
| 3 | 3542 | Customer information is not available | CUST | Manual Verification | Customer information is not available in write database |
| 4 | 2 | timeout on connection | CUST | Code fix | 10,000 milliseconds timeout on connection http-outgoing-602 [ACTIVE] |
| 5 | 2 | Contact information is not available | CONT | Manual Verification | Contact information is not available in write db |
| 6 | 1 | Elasticsearch exception | CONT | Code fix | Elasticsearch exception [type=mapper_parsing_exception, reason=The number of nested documents has exceeded the allowed limit of [10000]. This limit can be set by changing the [index.mapping.nested_objects.limit] index level setting.] |
| 7 | 1 | timeout on connection | CONT | Code fix | 10,000 milliseconds timeout on connection http-outgoing-1551 [ACTIVE] |
| 8 | 507 | Concern information is not available | CNCRN | Manual Verification | Concern information is not available in write database |

The query below can be used to inspect the issues tracked in pg_elk_sync_trck and categorise them as described in the table above. It counts the unprocessed records per error message and entity type.

select error_msg,entity_type, count(error_msg) from mdm_smdsmd.pg_elk_sync_trck where processed='N' group by error_msg,entity_type
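
Because the query groups by the full error_msg, messages that differ only in a detail such as the connection name (http-outgoing-40, http-outgoing-602) come back as separate rows. The sketch below shows one way to fold an exported result into the categories of the table. It is an assumption-laden illustration: the function names and the UNCATEGORISED label are hypothetical, and categories are matched as substrings of the message, which fits the sample messages in the table but is not confirmed behaviour of any existing tool.

```python
from collections import Counter

# Error categories copied from the PG-Elastic table above.
PG_ELASTIC_CATEGORIES = [
    "timeout on connection",
    "Facility information is not available",
    "Customer information is not available",
    "Contact information is not available",
    "Concern information is not available",
    "Elasticsearch exception",
]


def categorise_sync_error(error_msg):
    """Return the first known category contained in error_msg, else None."""
    if not error_msg:
        return None
    for category in PG_ELASTIC_CATEGORIES:
        if category in error_msg:
            return category
    return None


def summarise_sync_errors(rows):
    """Count rows per (category, entity_type).

    rows is an iterable of (error_msg, entity_type) tuples, for example
    the unaggregated columns exported from pg_elk_sync_trck.
    """
    counts = Counter()
    for error_msg, entity_type in rows:
        category = categorise_sync_error(error_msg) or "UNCATEGORISED"
        counts[(category, entity_type)] += 1
    return counts
```

Rows that end up under UNCATEGORISED point to an error category that is not yet documented in the table.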

Steps to follow:

  1. Run the query below to check whether any new issues have been tracked in the last 10 days:

select * from mdm_smdsmd.pg_elk_sync_trck where processed='N' and create_time > CURRENT_DATE-10

  2. Categorise the errors using the categories in the table above
  3. If an error does not fit any existing category, create a new category with a resolution
  4. If the issue needs a code fix, assign it to the Dev team
  5. If the issue needs a manual fix, perform the manual activity and look at how it can be automated