Incoming Web Services and API Query : a lesson learned

After several months after upgrading to version 23 in December, one of the API that used to run silently overnight to copy data to Azure Data Lake failed returning HTTP 408 – one of my favorites -.

Telemetry on the rescue.  

As per documentation, OData / API timeout in SaaS is set to 8 minutes (480.000 ms). This is an operational limit and cannot be changed.

Operation Limits in Dynamics 365 Business Central – Business Central | Microsoft Learn

And that is exactly what was reported analyzing Incoming Web Services calls (RT0008) signal. Smashed to the roof top (480k = 8 minutes).

Then I was told that it was always that bad since the upgrade happened in December.

REALLY???

Oh yes. Again, telemetry on the rescue. Luckily, we have 90-days retention policy with this customer and found out it was true.

Considering all APIs, THAT VERY SPECIFIC ONE was skyrocketing since 4th December (exactly when the SaaS environment was updated from 22 to 23).

traces 
| where timestamp between (datetime(2023-02-01T08:58:07.9294929Z) .. datetime(2024-02-11T08:58:07.9294929Z))
| where customDimensions.eventId == "RT0008"
| where customDimensions.environmentName == "<REDACTED>"
| where customDimensions.endpoint contains "<REDACTED>"
| extend _endpoint = strcat("WS/", customDimensions.alObjectType,'/', customDimensions.alObjectId, '(',tostring(customDimensions.endpoint),')')
| extend _processingTimeMS = round(toreal(totimespan(customDimensions.serverExecutionTime))/10000,0)
| project timestamp,
["Endpoint"] = _endpoint,
["Object Type"] = tostring(customDimensions.alObjectType),
["Object No."] = tostring(customDimensions.alObjectId),
_processingTimeMS,
["Environment Type"] = tostring(customDimensions.environmentType)

Looking into the Repo, nothing changed in the API object since a long while.

Sometimes repo might not be aligned – I know, sometimes I am too paranoid -. I have also compared, then, queries (signal RT0005) from November 2023 with a recent one and the only difference I could spot out was just a replacement of SELECT TOP (1) with a placeholder like

Old : OUTER APPLY (SELECT  TOP (1)

New : OUTER APPLY (SELECT  TOP (@13)

I smell, then, that it could have been some of that damn FlowFields and the data pool, or something strange happened to the object itself, right after conversion.

FlowFields?

Yes, if these are added in the API object, they are calculated (OUTER APPLY). This is duly written in the official documentation:

Query Objects and Performance – Business Central | Microsoft Learn

Obviously there have been some:

Within SaaS, there are no query execution statistics available, unfortunately, to catch exactly if and which of these could have been the pesky one(s).

The first question was: can we remove the ones that are unused? Of course, there were some.  More precisely, we removed some of the FlowField with EXISTS (and few others unused normal fields)

  • Comment
  • Cancelled
  • Paid

We were working while the clock was ticking, and next BI refresh was scheduled to be asap. Like in “The hunt for Red October” we were waiting for torpedo collision.

45 seconds… 30 seconds… Mark. Mark +8 seconds…

We were all alive and data beautifully flown into the BI, like it was pre-update.

Execution time dropped from timeout (8 minutes) to approx. 6 seconds.

CONCLUSION

There have been few takeaways from this experience.

  1. Be careful on which field you are adding to an API Query. Remember that FlowFields, if added to the query definition, are calculated.
  2. Do not be lazy. Just add the fields that are really needed to your object.
  3. Telemetry is always useful to spot regressions. But sometimes you have to keep data for a little longer.

Leave a comment

Blog at WordPress.com.

Up ↑