8.10. Estimating the Amount of Time Left in a Process

Problem

You are running a program that takes a long time to execute, and you need to present the user with an estimated time until completion.

Solution

Use Commons Math’s SimpleRegression and Commons Lang’s StopWatch to create a ProcessEstimator class that predicts when a program will finish. Suppose your program needs to process a large number of records and could take a few hours to finish. You would like to provide some feedback, and, if you are confident that each record will take roughly the same amount of time, you can fit a line with SimpleRegression and use its intercept to estimate when all records will be processed. Example 8-1 defines the ProcessEstimator class, which combines StopWatch and SimpleRegression to estimate the time remaining in a process.

Example 8-1. ProcessEstimator to estimate time of program execution

package com.discursive.jccook.math.timeestimate;

import org.apache.commons.lang.time.StopWatch;
import org.apache.commons.math.stat.multivariate.SimpleRegression;

public class ProcessEstimator {

    private SimpleRegression regression = new SimpleRegression( );
    private StopWatch stopWatch = new StopWatch( );

    // Total number of units
    private int units = 0;
    
    // Number of units completed
    private int completed = 0;

    // Sample rate for regression
    private int sampleRate = 1;
    
    public ProcessEstimator( int numUnits, int sampleRate ) {
        this.units = numUnits;
        this.sampleRate = sampleRate;
    }
    
    public void start( ) {
        stopWatch.start( );
    }
    
    public void stop( ) {
        stopWatch.stop( );
    }

    public void unitCompleted( ) {
        completed++;
        
        if( completed % sampleRate == 0 ) {
            // x = units remaining, y = milliseconds elapsed so far
            regression.addData( units - completed, stopWatch.getTime( ));
        }
    }
    
    public long projectedFinish( ) {
        return (long) regression.getIntercept( );
    }
    
    public long getTimeSpent( ) {
        return stopWatch.getTime( );
    }

    public long projectedTimeRemaining( ) {
        long timeRemaining = projectedFinish( ) - getTimeSpent( );        
        return timeRemaining;
    }
    
    public int getUnits( ) {
        return units;
    }

    public int getCompleted( ) {
        return completed;
    }

}

ProcessEstimator has a constructor that takes the number of records to process and the sample rate at which to measure progress. With 10,000 records to process and a sample rate of 100, ProcessEstimator adds a data point of units remaining versus time elapsed to the SimpleRegression after every 100 records. As the program continues to execute, projectedTimeRemaining( ) returns an updated estimate of the time remaining by retrieving the y-intercept from SimpleRegression and subtracting the time already spent in execution. In this model, x is the number of records remaining and y is the time elapsed in milliseconds, so y increases as x decreases. The y-intercept is the value of y when x equals zero; that is, the projected total time elapsed once no records remain.
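To see why the intercept can be read as a projected total time, the fit can be reproduced by hand. The following is a minimal sketch rather than the Commons Math implementation: the class name InterceptSketch and its intercept( ) helper are hypothetical, and the three data points are made up to lie exactly on a line (10,000 records at 5 ms each).

```java
// Hypothetical, dependency-free illustration of an ordinary
// least-squares intercept, the quantity ProcessEstimator reads
// from SimpleRegression. Not the library's own code.
class InterceptSketch {

    // Fits y = intercept + slope * x and returns the intercept
    static double intercept( double[] x, double[] y ) {
        int n = x.length;
        double sumX = 0, sumY = 0;
        for( int i = 0; i < n; i++ ) {
            sumX += x[i];
            sumY += y[i];
        }
        double meanX = sumX / n;
        double meanY = sumY / n;

        double sxy = 0, sxx = 0;
        for( int i = 0; i < n; i++ ) {
            sxy += ( x[i] - meanX ) * ( y[i] - meanY );
            sxx += ( x[i] - meanX ) * ( x[i] - meanX );
        }
        double slope = sxy / sxx;
        return meanY - slope * meanX;
    }

    public static void main( String[] args ) {
        // x = records remaining, y = milliseconds elapsed (made-up data)
        double[] remaining = { 9000, 8000, 7000 };
        double[] elapsed   = { 5000, 10000, 15000 };

        double projectedTotal = intercept( remaining, elapsed );
        double timeSpent = 15000;

        System.out.println( "Projected total: " +
                            Math.round( projectedTotal ) + " ms" );
        System.out.println( "Projected remaining: " +
                            Math.round( projectedTotal - timeSpent ) + " ms" );
    }
}
```

With these points the fitted slope is -5 ms per remaining record, so the intercept is 50,000 ms and the projected time remaining is 35,000 ms, which matches 7,000 remaining records at 5 ms each.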

The ProcessEstimationExample in Example 8-2 uses the ProcessEstimator to estimate the time remaining while calling the performLengthyProcess( ) method 10,000 times.

Example 8-2. An example using the ProcessEstimator

package com.discursive.jccook.math.timeestimate;

import org.apache.commons.lang.math.RandomUtils;

public class ProcessEstimationExample {

    private ProcessEstimator estimate;

    public static void main(String[] args) {
        ProcessEstimationExample example = new ProcessEstimationExample( );
        example.begin( );
    }

    public void begin( ) {
        estimate = new ProcessEstimator( 10000, 100 );
        estimate.start( );
        
        for( int i = 0; i < 10000; i++ ) {
            // Print status every 1000 items
            printStatus(i);
            performLengthyProcess( );
            estimate.unitCompleted( );
        }
        
        estimate.stop( );
       
        // Divide by 1000.0 so Math.round( ) sees the fractional seconds
        System.out.println( "Completed " + estimate.getUnits( ) + " in " + 
                  Math.round( estimate.getTimeSpent( ) / 1000.0 ) + " seconds." );
    }
    
    private void printStatus(int i) {
        if( i % 1000 == 0 ) {
            System.out.println( "Completed: " + estimate.getCompleted( ) +
                                " of " + estimate.getUnits( ) );
            
            System.out.println( "\tTime Spent: " +
                                 Math.round( estimate.getTimeSpent( ) / 1000.0 ) +
                                 " sec" + ", Time Remaining: " +
                       Math.round( estimate.projectedTimeRemaining( ) / 1000.0 ) +
                                " sec" );
        }
    }

    private void performLengthyProcess( ) {
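        try {
            // Sleeps 0 to 9 ms; nextInt(10) excludes the upper bound
            Thread.sleep(RandomUtils.nextInt(10));
        } catch( InterruptedException e ) {
            // Restore the interrupt flag rather than swallowing it
            Thread.currentThread( ).interrupt( );
        }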
    }
}

After each call to performLengthyProcess( ), the unitCompleted( ) method on ProcessEstimator is invoked. Every 100th call to unitCompleted( ) causes ProcessEstimator to update SimpleRegression with the number of records remaining and the amount of time spent so far. At the start of every 1,000th iteration, before performLengthyProcess( ) is called, a status message is printed to the console as follows:

Completed: 0 of 10000
    Time Spent: 0 sec, Time Remaining: 0 sec
Completed: 1000 of 10000
    Time Spent: 4 sec, Time Remaining: 42 sec
Completed: 2000 of 10000
    Time Spent: 9 sec, Time Remaining: 38 sec
Completed: 3000 of 10000
    Time Spent: 14 sec, Time Remaining: 33 sec
Completed: 4000 of 10000
    Time Spent: 18 sec, Time Remaining: 28 sec
Completed: 5000 of 10000
    Time Spent: 24 sec, Time Remaining: 23 sec
Completed: 6000 of 10000
    Time Spent: 28 sec, Time Remaining: 19 sec
Completed: 7000 of 10000
    Time Spent: 33 sec, Time Remaining: 14 sec
Completed: 8000 of 10000
    Time Spent: 38 sec, Time Remaining: 9 sec
Completed: 9000 of 10000
    Time Spent: 43 sec, Time Remaining: 4 sec
Completed 10000 in 47 seconds.

As shown above, the output periodically displays the amount of time you can expect the program to continue executing. Initially, there is no data to make a prediction with, so the ProcessEstimator reports zero seconds remaining. Once the regression has accumulated a few data points, the estimate becomes meaningful and tracks the actual time remaining closely for the rest of the 10,000 calls to performLengthyProcess( ).
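The initial zero is a side effect of a cast rather than a real estimate. SimpleRegression’s getIntercept( ) is documented to return Double.NaN until at least two observations with different x values have been added, and Java converts NaN to 0 when it is cast to long. The following sketch shows the cast and one possible guard; the class EstimateGuard and its describe( ) method are hypothetical names and are not part of ProcessEstimator.

```java
// Hypothetical helper showing how a "no estimate yet" state could be
// distinguished from a genuine estimate of zero.
class EstimateGuard {

    // Returns a display string, treating NaN as "no estimate yet"
    static String describe( double interceptMillis, long timeSpentMillis ) {
        if( Double.isNaN( interceptMillis ) ) {
            return "estimating...";
        }
        // Clamp at zero so a slightly low projection never prints
        // a negative time remaining
        long remaining =
            Math.max( 0L, (long) interceptMillis - timeSpentMillis );
        return ( remaining / 1000 ) + " sec";
    }

    public static void main( String[] args ) {
        // A narrowing cast turns NaN into 0, which is why the first
        // status line reports zero seconds remaining
        System.out.println( (long) Double.NaN );

        System.out.println( describe( Double.NaN, 0L ) );
        System.out.println( describe( 50000.0, 15000L ) );
    }
}
```

Whether to print a placeholder or simply omit the estimate until data is available is a presentation choice; the point is only that a zero produced by the cast should not be mistaken for a prediction.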

Discussion

The previous example used a method that sleeps for a random number of milliseconds between 0 and 9, because RandomUtils.nextInt(10) excludes its upper bound; the RandomUtils class is described in Recipe 8.4. It is easy to predict how long this process is going to take because, on average, each method call sleeps for about 4.5 milliseconds. The ProcessEstimator is inaccurate when the time to process each record steadily increases or decreases, or when a block of records takes substantially more or less time to process than the rest. If the time to process each record does not remain roughly constant, the relationship between records processed and time elapsed is not linear. Because the ProcessEstimator uses a linear model, SimpleRegression, a nonconstant execution time will produce inaccurate predictions for time remaining. If you are using the ProcessEstimator, make sure that it takes roughly the same amount of time to process each record.
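The effect of a nonconstant processing time can be seen with made-up numbers. In the sketch below, which uses a hand-written least-squares fit instead of Commons Math, record i is assumed to cost i milliseconds, so elapsed time grows quadratically. The class LinearityCheck and its methods are hypothetical and exist only for this illustration.

```java
// Hypothetical demonstration that a straight-line fit underestimates
// total time when each record costs more than the one before it.
class LinearityCheck {

    // Ordinary least-squares intercept for y = intercept + slope * x
    static double intercept( double[] x, double[] y ) {
        int n = x.length;
        double sumX = 0, sumY = 0;
        for( int i = 0; i < n; i++ ) {
            sumX += x[i];
            sumY += y[i];
        }
        double meanX = sumX / n;
        double meanY = sumY / n;
        double sxy = 0, sxx = 0;
        for( int i = 0; i < n; i++ ) {
            sxy += ( x[i] - meanX ) * ( y[i] - meanY );
            sxx += ( x[i] - meanX ) * ( x[i] - meanX );
        }
        return meanY - ( sxy / sxx ) * meanX;
    }

    // Projected total time from one sample after each of the first
    // `sampled` records, when record k costs k milliseconds
    static double projectedTotal( int units, int sampled ) {
        double[] remaining = new double[sampled];
        double[] elapsed = new double[sampled];
        double clock = 0;
        for( int k = 1; k <= sampled; k++ ) {
            clock += k;                     // simulated cost of record k
            remaining[k - 1] = units - k;   // x = records remaining
            elapsed[k - 1] = clock;         // y = milliseconds elapsed
        }
        return intercept( remaining, elapsed );
    }

    public static void main( String[] args ) {
        int units = 10;
        double actualTotal = units * ( units + 1 ) / 2.0;
        System.out.println( "Projected: " +
                            projectedTotal( units, 5 ) + " ms" );
        System.out.println( "Actual:    " + actualTotal + " ms" );
    }
}
```

Halfway through these 10 simulated records, the fitted line projects a total of 31.5 ms against an actual total of 55 ms. The shortfall comes from the linear model, not from the amount of data collected.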

See Also

This recipe refers to the StopWatch class from Commons Lang. For more information about the StopWatch class, see Recipe 1.19.
