Prologue/epilogue per clusters #18

Open
jgaida opened this issue Jul 10, 2015 · 3 comments

Comments

jgaida commented Jul 10, 2015

On Grid5000, it might be useful to run the prologue/epilogue per cluster rather than per site (e.g., to compile code with options optimized for the CPU type).


bzizou commented Jul 10, 2015

For CiGri, the prologue/epilogue is already a per-cluster process. The problem with G5K is that you have configured a "CiGri cluster" as a "G5K site". I think you should create as many "CiGri clusters" as there are "G5K clusters" (not sites). Even in an OAR multi-cluster context, CiGri can manage the different clusters (with the side effect that it multiplies the number of queries on the OAR REST API...).
You have to use the "properties" field of the clusters table to narrow the context to one cluster, for example with properties="cluster='genepi'".
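As a minimal sketch of what "one CiGri cluster per G5K cluster" would imply for that field, the snippet below builds one filter string per cluster in the `cluster='genepi'` form quoted above. The helper name `cluster_properties` and the second cluster name are hypothetical, and how the resulting strings get written into the clusters table is not covered here.

```python
def cluster_properties(cluster_names):
    """Map each cluster name to a filter string of the form cluster='<name>'."""
    filters = {}
    for name in cluster_names:
        # Refuse names that would break the quoting of the filter expression.
        if "'" in name or '"' in name:
            raise ValueError(f"unexpected quote in cluster name: {name!r}")
        filters[name] = f"cluster='{name}'"
    return filters


if __name__ == "__main__":
    # "genepi" comes from the comment above; "example_cluster" is a placeholder.
    for name, prop in cluster_properties(["genepi", "example_cluster"]).items():
        print(f"{name}: properties={prop}")
```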


jgaida commented Jul 10, 2015

Sure I can create as many CiGri clusters as there are clusters on Grid5000. But then:

  • I cannot have prologue/epilogue per site anymore (which is also useful, for data management for instance, as file systems are shared within a site).
  • CiGri is already stressing OAR quite a lot as is.
  • Users will have to know the names of all the clusters on every site. I believe it is easier to use site names in the JDL.
  • Most of the time, when a cluster is unavailable it is because the entire site is unreachable. IMHO, the "site" granularity is what works best for G5K.
  • Events such as blacklists are "per CiGri cluster". If one CiGri cluster = one G5K cluster, the number of events will be crazy.

I was aware of the properties field. I documented it here: https://www.grid5000.fr/mediawiki/index.php/CiGri. But I do not see how it helps with running a prologue/epilogue per cluster on Grid5000.

Anyway, this issue is more "for the record" than something I want to work on. For the prologue, a workaround is to put the prologue action at the beginning of the task and use mutual exclusion on the file system to run it only once. The epilogue of a site can run the epilogue task for each cluster easily.
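The prologue workaround described above could look roughly like the sketch below: each task calls a run-once helper before its real work, and exclusive file creation on the shared file system decides which task actually executes the prologue. This is only an illustration under stated assumptions. The names `run_once`, `prologue.lock` and `prologue.done` are hypothetical, the state directory is assumed to live on a file system shared by all nodes of the cluster (one directory per cluster), and the file system is assumed to honour `O_EXCL` atomically.

```python
import os
import time
from pathlib import Path


def run_once(state_dir, action, poll_seconds=1.0, timeout=600.0):
    """Run `action` at most once among all tasks sharing `state_dir`.

    Returns True if this caller ran the action, False if another task did.
    """
    state = Path(state_dir)
    state.mkdir(parents=True, exist_ok=True)
    lock = state / "prologue.lock"   # hypothetical file names
    done = state / "prologue.done"

    if done.exists():
        return False
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one
        # task can create the lock file.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        # Another task owns the prologue: wait until it reports completion.
        deadline = time.monotonic() + timeout
        while not done.exists():
            if time.monotonic() > deadline:
                raise TimeoutError(f"prologue not finished in {state}")
            time.sleep(poll_seconds)
        return False

    os.close(fd)
    try:
        action()
        done.touch()
    except BaseException:
        # Release the lock so that a later task can retry the prologue.
        os.unlink(lock)
        raise
    return True
```

A task would call something like `run_once(shared_dir_for_this_cluster, compile_for_this_cpu)` as its first step, where both arguments are placeholders. The lock file is deliberately left in place after success, so late tasks fall through to the `done` check instead of running the prologue again.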


bzizou commented Jul 10, 2015

I think that we have to discuss the notion of a "site", because it has no meaning in the current CiGri code.
By "properties" I meant the field of the clusters table in the CiGri database, not the JDL option of the same name.
