Dayalan Punniyamoorthy Blog

Wednesday, May 13, 2026

Oracle Cloud EPM: Two Long-Awaited Pipeline Features That Change Everything!

 The Oracle Cloud EPM May 2026 (26.05) update is packed with meaningful enhancements — from the mandatory Groovy engine upgrade to the new Environment Backup capability. But for those of us who live and breathe Data Integration Pipelines, two features stand out as absolute game-changers that the EPM community has been requesting for a long time:

  1. Ability to Restart Pipelines from Failed Stages
  2. Ability to Control Timing within Pipeline Jobs (Wait Job Type)

 

These aren't just incremental improvements — they fundamentally change how we design, execute, and recover from pipeline failures in Oracle Cloud EPM. In this post, I'll break down both features in detail, explain why they matter, and show you exactly how to use them.

Applies to: Planning, Financial Consolidation and Close (FCCS), FreeForm, Tax Reporting, Account Reconciliation, Enterprise Profitability and Cost Management (EPCM), and Profitability and Cost Management (PCM).

 

 Feature 1: Restart Pipelines from Failed Stages

Oracle Doc Reference: Ability to Restart Pipelines from Failed Stages

The Problem We've All Faced

If you've ever managed a complex data integration pipeline in Oracle Cloud EPM — especially during month-end close — you know the pain. A pipeline with 15+ stages runs for 45 minutes, and then fails at Stage 12 due to a transient network timeout or a substitution variable mismatch. What happened before this update?

 

You had to rerun the entire pipeline from Stage 1.

That means repeating 11 perfectly successful stages, wasting 30+ minutes of server processing time, and delaying the close cycle. During peak close windows with concurrent users, this wasn't just inconvenient — it was a serious operational risk.

 

What's New in 26.05

Oracle now allows you to restart a pipeline directly from the point of failure. When a pipeline fails during execution, you can resume processing from the exact failed stage, completely eliminating the need to repeat previously completed steps.

 

How It Works — Step by Step

  1. Navigate to Data Integration → Pipeline in your EPM application
  2. Open the pipeline that has failed — you'll see the execution status showing which stages succeeded and which failed
  3. Click on the failed pipeline execution to open the details
  4. Instead of clicking "Run" (which would restart from the beginning), you'll now see a "Start from Failure" option
  5. Select "Start from Failure" and click Execute
  6. The pipeline resumes execution from the exact stage where it failed, skipping all previously completed stages
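
To make the behavior concrete, here is a minimal Groovy sketch of the idea, purely conceptual with made-up stage names and statuses (this is not Oracle's internal API): the engine behaves as if each stage's status were persisted, and execution resumes at the first failed stage while earlier results are preserved.

// Conceptual sketch only (hypothetical names, not Oracle's internal API):
// "Start from Failure" behaves as if each stage's status were persisted and
// execution resumed at the first FAILED stage, preserving earlier results.
def stages = [
    [name: 'ERP Cloud Extraction', status: 'SUCCESS'],
    [name: 'FX Rate Load',         status: 'SUCCESS'],
    [name: 'Consolidation',        status: 'FAILED'],   // e.g. transient timeout
    [name: 'Translation',          status: 'PENDING']
]

int resumeAt = stages.findIndexOf { it.status == 'FAILED' }
stages.take(resumeAt).each { println "Skipping (already completed): ${it.name}" }
stages.drop(resumeAt).each { stage ->
    println "Executing: ${stage.name}"
    stage.status = 'SUCCESS'   // assumes the root cause was fixed before retry
}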


Why This Is a Game-Changer

| Aspect | Before (Pre-26.05) | After (26.05) |
|---|---|---|
| Failed Pipeline Recovery | Rerun entire pipeline from Stage 1 | Resume from exact failure point |
| Time Wasted on Reruns | 30–60+ minutes for complex pipelines | Only the failed stage + remaining |
| Server Resource Usage | Duplicate processing of completed stages | Zero redundant processing |
| Close Cycle Impact | Significant delays during month-end | Minimal disruption, faster recovery |
| Workaround Needed | Manual stage-by-stage execution or Groovy scripts | Built-in, one-click recovery |

 

Real-World Scenario

Consider a typical FCCS month-end close pipeline:

  • Stage 1: ERP Cloud data extraction (5 min)
  • Stage 2: FX Rate load via FDMEE (3 min)
  • Stage 3: Intercompany data load (8 min)
  • Stage 4: Data validation business rule (2 min)
  • Stage 5: Consolidation (10 min)
  • Stage 6: Translation (5 min)
  • Stage 7: Data extract + SFTP upload (4 min)

If Stage 6 (Translation) fails because a substitution variable was incorrectly set, you previously had to rerun Stages 1–5 (28 minutes of work) before even getting back to the fix. Now? Fix the substitution variable, click "Start from Failure", and Stage 6 picks up right where it left off.

Key Considerations

  • The "Start from Failure" option is available only for pipelines that have completed some stages successfully before failing
  • Previously completed stages are not re-executed — their results are preserved
  • You can still choose to rerun the entire pipeline from the beginning if needed
  • Runtime variables can be edited before restarting from the failed stage — this is critical for fixing the root cause before retry

 

⏱️ Feature 2: Ability to Control Timing within Pipeline Jobs (Wait Job Type)

Oracle Doc Reference: Ability to Control Timing within Pipeline Jobs

The Problem: Processing Bottlenecks and Overlapping Jobs

Anyone who has built complex data integration pipelines knows this scenario: you have multiple data load rules running in sequence, and when they execute back-to-back without any breathing room, the Essbase post-processing (aggregation, formula calculations) from one job overlaps with the next job's data load. The result? System bottlenecks, timeouts, and degraded performance — especially during the close window when everyone is running jobs concurrently.

Until now, the only workaround was to write custom Groovy-based delay scripts — essentially a Thread.sleep() wrapped in a business rule — which was hacky, hard to maintain, and not visible in the pipeline execution logs.
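
For anyone who never had to write one, that workaround was typically little more than the following Groovy snippet; the 120-second duration is illustrative:

// The pre-26.05 workaround: a Groovy "business rule" whose only purpose
// was to pause the pipeline between resource-intensive jobs.
// Hacky, easy to forget about, and invisible in pipeline execution logs.
int waitSeconds = 120
println "Pausing for ${waitSeconds} seconds before the next job..."
Thread.sleep(waitSeconds * 1000L)   // blocks the executing thread
println "Resuming pipeline."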

 

What's New in 26.05

Oracle has introduced a new "Wait" job type that you can insert within a pipeline to pause execution for a specified duration between steps. This allows you to control the timing of sequential processes, spacing out job executions to prevent system bottlenecks.

 

How It Works — Step by Step

  1. Open your pipeline in Data Integration → Pipeline Editor
  2. Within a Stage, click "Add Job" or the "+" icon to add a new job
  3. In the Job Type dropdown, select "Wait"
  4. Configure the Wait job:
    • Job Name: Give it a descriptive name (e.g., "Wait – Post-Processing Buffer")
    • Duration: Enter the wait time in seconds (maximum: 300 seconds / 5 minutes)
  5. Position the Wait job between the resource-intensive jobs where you need the delay
  6. Save the pipeline


Here's a recommended pipeline architecture pattern using the Wait job type:

Stage 1: Data Loads
├── Job 1: DLR_ERP_US                  (Data Load)
├── Job 2: DLR_ERP_India               (Data Load)
├── Job 3: Wait – Aggregation Buffer   (Wait: 120 sec)
├── Job 4: DLR_ERP_Japan               (Data Load)
├── Job 5: DLR_ERP_ANZ                 (Data Load)
├── Job 6: Wait – Processing Cooldown  (Wait: 90 sec)
└── Job 7: DLR_ERP_Korea               (Data Load)

Stage 2: Post-Processing
├── Job 1: Wait – Pre-Consolidation    (Wait: 60 sec)
├── Job 2: Consolidation Rule          (Business Rule)
├── Job 3: Wait – Post-Consolidation   (Wait: 60 sec)
└── Job 4: Translation Rule            (Business Rule)

 

Why This Is a Game-Changer

| Aspect | Before (Pre-26.05) | After (26.05) |
|---|---|---|
| Controlling Job Timing | Custom Groovy Thread.sleep() scripts | Native Wait job type in Pipeline |
| Visibility | Groovy delays invisible in pipeline logs | Wait steps visible with duration tracking |
| Maintenance | Code changes required to adjust timing | Simple seconds field, no code needed |
| Maximum Delay | Unlimited (risky) | Capped at 300 seconds (safe) |
| Applicability | Required Groovy knowledge | Any Pipeline user can configure |
| Post-Processing Buffer | Manual workarounds | Intentional, controlled delays |

 

Practical Use Cases

1. Preventing Essbase Aggregation Overlap: When loading data into FCCS or Planning, Essbase performs post-processing aggregation after each data load. If the next data load starts before aggregation completes, you get lock contention and timeouts. A 60–120 second Wait between loads gives Essbase time to complete.

2. SFTP Rate Limiting: When uploading multiple files to an SFTP server, some servers enforce rate limits. A 30-second Wait between uploads prevents connection throttling.

3. REST API Throttling: When making sequential REST API calls to external systems, a short Wait prevents hitting API rate limits and receiving HTTP 429 errors (see the sketch after this list).

4. Concurrent User Load Management: During the close window, spacing out resource-intensive jobs with Wait steps reduces the overall server load and improves performance for all concurrent users.
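
To make use case 3 concrete, here is a small Groovy sketch of the pattern a Wait step now replaces: spacing out sequential REST calls so the target system's rate limit is never exceeded. The endpoint URLs are placeholders, not real services:

// Hypothetical endpoints; the 30-second gap plays the role of a Wait job
// between sequential calls, keeping us under the target's rate limit
// and avoiding HTTP 429 (Too Many Requests) responses.
def endpoints = [
    'https://example.com/api/trigger-job-1',
    'https://example.com/api/trigger-job-2'
]
endpoints.eachWithIndex { url, i ->
    def conn = new URL(url).openConnection() as HttpURLConnection
    conn.requestMethod = 'GET'
    println "Call ${i + 1}: HTTP ${conn.responseCode}"
    conn.disconnect()
    if (i < endpoints.size() - 1) {
        Thread.sleep(30_000L)   // 30-second buffer, like a Wait step
    }
}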

 

 How These Two Features Work Together

The real power emerges when you combine both features in a single pipeline:


Scenario: You have a 7-stage close pipeline with Wait buffers between data loads. Stage 5 fails due to a timeout.

  1. Wait jobs ensured Stages 1–4 ran cleanly without bottlenecks
  2. You identify and fix the root cause of the Stage 5 failure
  3. You use "Start from Failure" to resume from Stage 5 — Stages 1–4 are NOT rerun
  4. The remaining Wait jobs in Stages 5–7 continue to protect against bottlenecks

This combination delivers a resilient, self-healing pipeline architecture that was simply not possible before 26.05.
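
As a closing sketch, here is that combined behavior modeled in a few lines of Groovy, again purely conceptual with made-up job names rather than Oracle's actual engine: resume at the failed job, skip everything already completed, and let the remaining Wait steps keep protecting the retry.

// Conceptual model (hypothetical names, not Oracle's API): resume from the
// failed job and honor the remaining Wait steps so the retry stays
// bottleneck-free.
def jobs = [
    [name: 'DLR_ERP_US',         type: 'LOAD', status: 'SUCCESS'],
    [name: 'Aggregation Buffer', type: 'WAIT', status: 'SUCCESS', seconds: 120],
    [name: 'DLR_ERP_Japan',      type: 'LOAD', status: 'FAILED'],   // timeout
    [name: 'Cooldown',           type: 'WAIT', status: 'PENDING', seconds: 90],
    [name: 'DLR_ERP_Korea',      type: 'LOAD', status: 'PENDING']
]

int resumeAt = jobs.findIndexOf { it.status == 'FAILED' }
jobs.drop(resumeAt).each { job ->
    if (job.type == 'WAIT') {
        println "Waiting ${job.seconds}s to protect downstream processing"
        Thread.sleep(job.seconds * 1000L)
    } else {
        println "Running: ${job.name}"
    }
    job.status = 'SUCCESS'
}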

 

 Happy days on the cloud!
