+1 vote

Hi,

I am starting to abandon datasync, since I get a definitely better result with scheduled tasks.

I've made a tool that lets you log an estimated remaining time for the tasks you have to run. Fun fact: this would probably work in a datasync as well^^.

Here is an example of how you can use it:

var dataToWorkOn = new Array();
//-----(Fill your array with Sql, file reading, inputStream, ...)

var chrono = new NL_chrono();

for (var i=0; i<dataToWorkOn.length; i++){
    //-----(do your job here)

    chrono.lap();    //-----(add a new time flag in your chrono)
    log(format("$0/$1 done. Estimated lasting time: $2", i+1, dataToWorkOn.length, chrono.forecast(dataToWorkOn.length)));
}

chrono.reset(); //-----(if you want to reuse chrono)

The log output will look something like this:

-> 45 / 12598 done. Estimated lasting time: 1d 13h 12min 23s
-> 46 / 12598 done. Estimated lasting time: 1d 13h 11min 19s
-> 47 / 12598 done. Estimated lasting time: 1d 13h 10min 13s

The forecast function automatically calculates the remaining time from the total number of laps and the number of laps already completed.
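
As a rough sketch of the arithmetic behind forecast: assuming every remaining lap takes about as long as the average lap so far, the remaining time is the elapsed time multiplied by (total laps / completed laps - 1). The helper name estimateRemainingMs is hypothetical and only here for illustration; it is not part of NL_chrono.

```javascript
// Hypothetical helper, for illustration only: mirrors the arithmetic used by forecast().
// elapsedMs: time spent since the chrono was started
// doneLaps:  number of laps completed so far
// totalLaps: number of laps the whole job will need
function estimateRemainingMs(elapsedMs, doneLaps, totalLaps){
    // No lap recorded yet, or inconsistent input: no estimate is possible.
    if (doneLaps <= 0 || totalLaps < doneLaps) return undefined;
    return (totalLaps / doneLaps - 1) * elapsedMs;
}

// Example: 45 laps done out of 12598 after 8 minutes of work.
var remaining = estimateRemainingMs(8 * 60 * 1000, 45, 12598);
```

The estimate is only as good as the assumption behind it: if later laps are much slower or faster than the first ones, the forecast will drift until enough laps have been recorded.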

You can of course use chrono for other purposes.
Here is the code you need to use it:

function NL_chrono(){
    this.start = new Date();
    this.laps = new Array();

    NL_chrono.prototype.stringify = function(){
        return "not yet";
    }

    NL_chrono.prototype.reset = function(){
        this.start = new Date();
        this.laps = new Array();
    }

    NL_chrono.prototype.lap = function(){
        var laptime = new Date();
        var lasttime = this.start;
        if (this.laps.length > 0) lasttime = this.laps[this.laps.length -1].date;

        this.laps.push({
            date: laptime,
            laptime: laptime.getTime() - lasttime.getTime(),
            time: laptime.getTime() - this.start.getTime()
        });

        return this.laps[this.laps.length -1];
    }

    NL_chrono.prototype.forecast = function(_lap, _unit){
        if(this.laps.length === 0 || _lap < this.laps.length) return;

        var result = (_lap / this.laps.length -1) * this.laps[this.laps.length -1].time;

        switch (_unit){
            case "d": result = result / (1000 * 3600 * 24); break;
            case "h": result = result / (1000 * 3600); break;
            case "m": result = result / (1000 * 60); break;
            case "s": result = result / 1000; break;
            case "ms": break;
            default: 
                result = getTimeText(result);
                break;
        }

        return result;
    }

    NL_chrono.prototype.showStats = function(_options){
        _options = _options || {};

        var result = new Array();
        _options.maxwidth = _options["maxwidth"] || 60;

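        // Nothing to draw until at least one lap has been recorded.
        if (this.laps.length === 0) return result;

        // Find the longest lap; it sets the scale of the bars.
        var maxLapTime = this.laps[0].laptime;
        for (var j=1; j<this.laps.length; j++){
            if (this.laps[j].laptime > maxLapTime) maxLapTime = this.laps[j].laptime;
        }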

        var temp = "";
        for (var i=0; i<_options.maxwidth +28; i++) temp += "_";
        result.push(temp);

        result.push(format("Lap time evolution | max = $0 | total = $1 |", getTimeText(maxLapTime), getTimeText(this.laps[this.laps.length -1].time)));

        result.push(temp);
        result.push(" ");
        forEach(this.laps, function(_lap, _index){
            var size = Math.round(_options.maxwidth * _lap.laptime / maxLapTime);
            var bar = "";
            for (var i=0; i<size; i++) bar += "=";
            bar += format("(Lap $0: $1)", _index +1, getTimeText(_lap.laptime));
            result.push(bar);
        });
        return result;
    }

    NL_chrono.prototype.getTimeText = getTimeText;

    // Format a duration in milliseconds as text, e.g. "1d 13h 12min 23s".
    // Units with a zero value are skipped; anything under one second gives "0s".
    function getTimeText(_ms){
        var units = [
            {label: "d ", ms: 1000 * 3600 * 24},
            {label: "h ", ms: 1000 * 3600},
            {label: "min ", ms: 1000 * 60},
            {label: "s", ms: 1000}
        ];

        var val = _ms;
        var result = "";
        for (var i=0; i<units.length; i++){
            if (val >= units[i].ms) result += Math.floor(val / units[i].ms) + units[i].label;
            val = val % units[i].ms;
        }

        return result || "0s";
    }
}

EDIT: I added a showStats function that draws a graphical representation of your job's lap times in the console.

NB: showStats returns an array of strings, which you have to log yourself with something like:

forEach(chrono.showStats(), function(_stat){log(_stat);});

Cheers

asked in WorkFlow / Serverscript by (989 points)

1 Answer

0 votes

In my experience as an Efficy architect, running imports through the Scheduler Admin is only acceptable for API integrations or for importing files that are exported automatically and are of good quality.

As soon as a user wants to upload manually created files and have them processed immediately, the Scheduler service should not be used. It needs to poll your DB to find new files to process, and when there are exceptions, it's hard to trace the issue. You also have to write code like you did in order to know when something is finished. In terms of performance, it is also single-threaded and not the fastest option.

For cloud-based installations or for manually created files, the best solution is DataSynchroRemote or batched server scripts, as demonstrated here with the ProcessRunner.

For fast performance, up to Efficy 11.0 you can run multiple instances of a DataSynchro(Remote) to import multiple isolated sources in parallel. Starting from 11.1, the DataSynchroRemote will support multiple threads, which increases the speed a lot!

Regards,
Kristof

answered by (7.4k points)
Hi Kristof, thanks a lot for your feedback.

Of course I am using the scheduler for automatically generated files, or good quality files as you call them^^.

Anyway, I will read about the ProcessRunner with great interest, but can you be more specific about the DataSynchroRemote improvement in 11.1?
Is the regular DataSynchro improved as well?
How do you launch the web interface of the ProcessRunner? I can't find it anywhere...
Does Efficy 11.0 support it?
It's a custom development that you first have to download and then integrate into your custom. It can be started from anywhere you want. Two samples have been provided: one from the entity list and one from the Document edit. The latter is the one you are most likely to use, because it concerns an import.
So far, what I have read is absolutely brilliant. I will check the server-side code because I don't yet get how this frees the user session... but thank you.
I really hate this part of my job, where most of my work seems stupid because someone has already done something better before. But this is my problem.
There is no magic involved. The paragraphs below from the manual should describe the design, at least I hoped they would ;-)

The browser is the orchestrator of the whole process. It defines the batch size, manages the state, updates user interface elements such as the progress bar, displays logs and exceptions, and can abort the process.

At launch, a JavaScript data object (see code snippet below) is JSON stringified and sent to the serverscript with an ajax request. The serverscript parses the data object, loads the data source (query, file...) and loops over the data source until the count of processed records exceeds the batch size or EOF is reached. The data object, including the actual progress state, is JSON stringified and returned as a string in the response to the browser.

The browser scans for exceptions, then aborts or completes the process based on the state. When there are no fatal exceptions and there is still data to process, the client recursively executes the process.start() method.
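
The loop described above could be sketched roughly as follows. This is not the ProcessRunner source: the names processBatch and runProcess and the fields of the state object (offset, batchSize, done, exceptions) are assumptions made for illustration, and the server call is simulated by a local function instead of a real ajax request.

```javascript
// Simulated server side (stand-in for the serverscript): parses the state,
// processes at most batchSize records, and returns the updated state as a JSON string.
var dataSource = ["a", "b", "c", "d", "e", "f", "g"]; // stand-in for a query or file

function processBatch(requestJson){
    var state = JSON.parse(requestJson);
    var processed = 0;
    while (state.offset < dataSource.length && processed < state.batchSize){
        // ... handle dataSource[state.offset] here ...
        state.offset++;
        processed++;
    }
    state.done = state.offset >= dataSource.length; // EOF reached
    return JSON.stringify(state);
}

// Client side (the orchestrator): keeps calling the server until the work is
// finished or an exception is reported, and reports progress after each batch.
function runProcess(state, onProgress){
    var response = JSON.parse(processBatch(JSON.stringify(state)));
    onProgress(response);
    if (response.exceptions.length > 0 || response.done) return response;
    return runProcess(response, onProgress); // still data to process: next batch
}

var batches = 0;
var finalState = runProcess(
    {offset: 0, batchSize: 3, done: false, exceptions: []},
    function(){ batches++; } // a real page would update a progress bar here
);
```

In a real browser the next call would be made from the ajax callback rather than synchronously, but the shape is the same: each request does a bounded amount of work, and the state travels back and forth as JSON.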