Move autoremove builds logic to a scheduled job #2468
base: master
Changes from all commits: 6268e96, 5ed8a83, deea53c, 19cd37f, 77630eb, 118373d, 376c467, 542dad0, 173e373
Changes to the shipped example environment settings:

```diff
@@ -25,6 +25,15 @@ DB_PASSWORD=secret
 # Should CDash automatically remove old builds?
 #AUTOREMOVE_BUILDS=true

+# How many builds should CDash try to remove per iteration?
+# Consider tweaking this value if you notice the autoremove queries
+# are slow to execute.
+#AUTOREMOVE_BUILDS_BATCH_SIZE=10
+
+# Skip the autoremove builds step if more than this number of submissions
+# are waiting to be parsed.
+#AUTOREMOVE_BUILDS_SKIP_THRESHOLD=100
+
 # How long should CDash store parsed input files (in hours?)
 # Set to 0 if you do not wish to backup parsed submission files.
 #BACKUP_TIMEFRAME=48
```

Review discussion on `AUTOREMOVE_BUILDS_SKIP_THRESHOLD`:

> Instead of a new configuration option, what do you think about using queue priorities to automatically prioritize submissions if such jobs exist? That way, the queuing system will automatically handle cleanup tasks whenever submission activity is low.

> Sounds like a good idea to me!
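The `AUTOREMOVE_BUILDS_BATCH_SIZE` knob bounds how much work each autoremove pass does. The looping pattern it implies — delete at most N matching rows, repeat until nothing is left — can be sketched in Python against an in-memory SQLite table. The `build` table layout and the `expired` flag here are illustrative stand-ins, not CDash's real schema:

```python
import sqlite3

def autoremove_expired(conn: sqlite3.Connection, batch_size: int = 10) -> list:
    """Delete expired builds in fixed-size batches; return rows deleted per pass."""
    deleted_per_pass = []
    while True:
        # The LIMIT lives in a subquery so the pattern also works on engines
        # (such as Postgres) that reject DELETE ... LIMIT directly.
        cur = conn.execute(
            "DELETE FROM build WHERE id IN "
            "(SELECT id FROM build WHERE expired = 1 LIMIT ?)",
            (batch_size,),
        )
        if cur.rowcount == 0:
            break
        deleted_per_pass.append(cur.rowcount)
    return deleted_per_pass

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE build (id INTEGER PRIMARY KEY, expired INTEGER)")
conn.executemany("INSERT INTO build (expired) VALUES (?)",
                 [(1 if i < 25 else 0,) for i in range(30)])
print(autoremove_expired(conn, batch_size=10))  # → [10, 10, 5]
```

Keeping each batch small bounds the duration of any single `DELETE`, which is the property the review thread below is chiefly concerned with.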
A new console command (`App\Console\Commands\CleanupDatabase`, 52 lines) that deletes orphaned records:

```php
<?php

namespace App\Console\Commands;

use Illuminate\Console\Command;
use Illuminate\Support\Facades\DB;

class CleanupDatabase extends Command
{
    /**
     * The name and signature of the console command.
     */
    protected $signature = 'db:cleanup';

    /**
     * The console command description.
     */
    protected $description = 'Delete orphaned records from the CDash database';

    /**
     * Execute the console command.
     */
    public function handle(): void
    {
        self::delete_unused_rows('banner', 'projectid', 'project');
        self::delete_unused_rows('build', 'projectid', 'project');
        self::delete_unused_rows('dailyupdate', 'projectid', 'project');

        self::delete_unused_rows('buildfailuredetails', 'id', 'buildfailure', 'detailsid');
        self::delete_unused_rows('configure', 'id', 'build2configure', 'configureid');
        self::delete_unused_rows('configureerror', 'configureid', 'configure');
        self::delete_unused_rows('dailyupdatefile', 'dailyupdateid', 'dailyupdate');
        self::delete_unused_rows('note', 'id', 'build2note', 'noteid');
        self::delete_unused_rows('testoutput', 'id', 'build2test', 'outputid');
        self::delete_unused_rows('updatefile', 'updateid', 'buildupdate');
        self::delete_unused_rows('uploadfile', 'id', 'build2uploadfile', 'fileid');

        self::delete_unused_rows('subproject2subproject', 'subprojectid', 'subproject');

        self::delete_unused_rows('coveragefile', 'id', 'coverage', 'fileid');
        self::delete_unused_rows('coveragefile2user', 'fileid', 'coveragefile');

        self::delete_unused_rows('test2image', 'outputid', 'testoutput');
        self::delete_unused_rows('image', 'id', 'test2image', 'imgid');
    }

    /** Delete unused rows */
    private static function delete_unused_rows(string $table, string $field, string $targettable, string $selectfield = 'id'): void
    {
        DB::delete("DELETE FROM $table WHERE $field NOT IN (SELECT $selectfield AS $field FROM $targettable)");
    }
}
```

Review discussion on the `delete_unused_rows` query (inline code spans were lost from the page capture):

> Should this delete in batches too? You can add a […]

> It turns out postgres does not support […]

> You could try […]

> Wouldn't that result in data loss? The inner […]

> Oops, I mistyped the pseudo-query. See edit.
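The exchange above turns on the fact that Postgres rejects `DELETE ... LIMIT`, and that a carelessly limited rewrite can delete live rows. The usual workaround pushes the limit into a subquery, and the data-loss concern is answered by repeating the orphan filter inside that inner `SELECT`, so only genuinely unreferenced rows are ever targeted. A sketch in Python/SQLite (illustrative harness; table and column names borrowed from the diff):

```python
import sqlite3

def delete_unused_rows_batched(conn, table, field, targettable,
                               selectfield="id", batch_size=100) -> int:
    """Delete rows whose `field` no longer references `targettable`,
    at most batch_size reference values per pass; return rows deleted."""
    total = 0
    while True:
        # The inner SELECT repeats the NOT IN orphan filter, so the LIMIT can
        # only ever pick unreferenced rows; putting LIMIT in a subquery is
        # what makes this portable to Postgres.
        cur = conn.execute(
            f"DELETE FROM {table} WHERE {field} IN ("
            f"  SELECT {field} FROM {table}"
            f"  WHERE {field} NOT IN (SELECT {selectfield} FROM {targettable})"
            f"  LIMIT {batch_size})"
        )
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE build (id INTEGER PRIMARY KEY, projectid INTEGER)")
conn.execute("INSERT INTO project (id) VALUES (1)")
conn.executemany("INSERT INTO build (projectid) VALUES (?)", [(1,), (1,), (2,), (3,)])
print(delete_unused_rows_batched(conn, "build", "projectid", "project", batch_size=1))  # → 2
```

Dropping the inner filter would make the subquery select arbitrary rows from the table being cleaned, which is the mistyped pseudo-query the thread corrects itself on.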
A new queued job (`App\Jobs\CleanupDatabase`, 38 lines) that gates and triggers the cleanup:

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Illuminate\Support\Facades\Artisan;
use Illuminate\Support\Facades\DB;

/**
 * Removes builds that have expired according to per-project and
 * per-buildgroup settings.
 */
class CleanupDatabase implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    /**
     * Execute the job.
     */
    public function handle(): void
    {
        if (!(bool) config('cdash.autoremove_builds')) {
            return;
        }

        $skip_threshold = (int) config('cdash.autoremove_builds_skip_threshold');
        if (DB::table('jobs')->count() > $skip_threshold) {
            return;
        }

        Artisan::call('build:remove all');
        Artisan::call('db:cleanup');
    }
}
```
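The job's gating logic (feature flag first, then the backlog check) reduces to a small predicate. This Python restatement is purely illustrative — the function name is hypothetical, and the default threshold is taken from the example configuration above:

```python
def should_run_cleanup(pending_jobs: int,
                       skip_threshold: int = 100,
                       autoremove_enabled: bool = True) -> bool:
    """Mirror the job's guards: honor the feature flag, then skip when the
    submission backlog exceeds the configured threshold."""
    if not autoremove_enabled:
        return False
    return pending_jobs <= skip_threshold

print(should_run_cleanup(150))  # → False: backlog above the threshold, cleanup waits
print(should_run_cleanup(5))    # → True
```

Note the comparison is strict in the job (`count > threshold` skips), so a backlog exactly at the threshold still allows cleanup to run.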
Review discussion on batch sizing:

> Instead of adding another configuration option, what do you think about adding some logic to automatically adjust the batch size to keep the running time for a given batch under a set time limit, while maximizing the deletion rate under that limit?

> I like the idea of figuring out the batch size automatically rather than making the CDash admin figure it out via trial and error.

> After thinking about this further, I wonder if we should even bother with batching in the first place. If we simply delete one at a time, the deletion rate will probably be suboptimal, but we are practically guaranteed that the deletion process won't hang. If each delete is guaranteed to be fast, we can delete for a maximum period of 5 minutes and schedule the job to run every 10 minutes. This 5-minutes-on, 5-minutes-off approach gets rid of the large monolithic job and keeps cleanup tasks out of the way of submission processing.

> One at a time is fine with me! Practically speaking, that's the patch I've resorted to applying on large production instances that hang when attempting to delete 100 builds in a single pass.
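The 5-minutes-on, 5-minutes-off idea amounts to a deadline-bounded loop: delete one row at a time and stop when either the work or the time budget runs out. An illustrative Python/SQLite version with an injectable clock so the cutoff is observable (the schema and function name are assumptions, not CDash code):

```python
import sqlite3
import time

def timeboxed_cleanup(conn, budget_seconds=300, clock=time.monotonic) -> int:
    """Delete expired builds one at a time until done or the budget is spent."""
    deadline = clock() + budget_seconds
    deleted = 0
    while clock() < deadline:
        # A scalar subquery picks a single victim per statement; this form
        # also works on Postgres, which rejects DELETE ... LIMIT directly.
        cur = conn.execute(
            "DELETE FROM build WHERE id = "
            "(SELECT id FROM build WHERE expired = 1 LIMIT 1)"
        )
        if cur.rowcount == 0:
            break  # nothing left to remove; yield until the next scheduled run
        deleted += 1
    return deleted

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE build (id INTEGER PRIMARY KEY, expired INTEGER)")
conn.executemany("INSERT INTO build (expired) VALUES (?)", [(1,)] * 10)
ticks = iter(range(0, 10000, 100))  # fake clock: each call advances 100 "seconds"
print(timeboxed_cleanup(conn, budget_seconds=300, clock=lambda: next(ticks)))  # → 2
```

Each single-row delete is short, so the loop can stop almost exactly at the deadline; any builds still expired are simply picked up by the next scheduled run.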