
Enable parallel builds #49

Open
Ecco opened this issue Nov 26, 2018 · 19 comments

Comments

@Ecco

Ecco commented Nov 26, 2018

Builds can take a while. CPUs are getting more and more cores. Let's use them.

Steps to reproduce

  1. nanoc compile
  2. wait
  3. wait
  4. wait some more

Expected behavior

Just like make -j N, it would be great if nanoc could build in parallel and use many cores.

Actual behavior

Nanoc processes items sequentially.

@denisdefreyne
Member

denisdefreyne commented Nov 26, 2018

Hi Ecco,

Parallel compilation is on my wish list too, but it’s not trivial because of two reasons:

  • The standard Ruby implementation (usually referred to as CRuby or MRI) has a global interpreter/VM lock (abbreviated GIL or GVL, respectively), which effectively limits the interpreter to a single CPU core. The lock does bring benefits for IO-heavy workloads (the Ruby process can do CPU work while waiting for an IO operation to complete), but the research I’ve done on Nanoc sites shows that their builds are CPU-heavy, so they wouldn’t benefit much from thread-based parallelization (see the sketch below).

  • JRuby is an alternative Ruby implementation that does not have a GIL/GVL, but it comes with a significant startup time, which matters for relatively short-running processes such as a Nanoc build.

Therefore, while it’s certainly possible to parallelize Nanoc, I don’t expect it to yield a significant, measurable benefit.
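
To illustrate that first point, here is a minimal benchmark sketch: on MRI, spreading pure-Ruby CPU work across threads barely changes the wall-clock time, because the GVL serializes the actual computation.

```ruby
require "benchmark"

def cpu_work
  # Pure-Ruby, CPU-bound work; no IO, so the GVL is never released.
  (1..3_000_000).reduce(:+)
end

Benchmark.bm(12) do |b|
  b.report("sequential") { 4.times { cpu_work } }
  b.report("4 threads") do
    4.times.map { Thread.new { cpu_work } }.each(&:join)
  end
end
# On MRI, both reports show roughly the same wall-clock time.
```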


One thing that you can do to speed up the compilation of your site is to figure out where the slowness is coming from. Run nanoc compile with the -VV option (which is the same as --verbose --verbose). This will print detailed results of where Nanoc spends its time. While it’s not 100% accurate (because measuring time is hard), it will show you what is taking up the most time.

In particular, the list of filters is interesting to look at; it might be worth using different filters, or optimising slow ones. For example, in one site that I worked on, I swapped out a filter that uses pygmentize in favour of one that uses pygments.rb, yielding a speedup of more than 10x.
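
If the site uses the built-in colorize_syntax filter, that swap might look roughly like this in Rules (a sketch; it assumes the pygments.rb gem is in the Gemfile):

```ruby
# Rules (sketch)
compile '/articles/**/*.md' do
  filter :kramdown
  # Before: shells out to the pygmentize executable for every code block.
  # filter :colorize_syntax, default_colorizer: :pygmentize
  # After: uses the in-process pygments.rb library instead.
  filter :colorize_syntax, default_colorizer: :pygmentsrb
  layout '/default.*'
end
```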

I’ve also run Nanoc within Docker for Mac for a while, but the slow filesystem led to a 10x-20x slowdown. (This might not be relevant for you, but I found it worth mentioning.)

I’ve also had some success with Bootsnap, which can speed up nanoc invocations quite a bit.
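
Bootsnap needs very little setup; the important part is loading it before anything else. A sketch, with a hypothetical cache location:

```ruby
# Load as early as possible (for example from a small file pulled in
# via RUBYOPT, before Nanoc itself is loaded).
require "bootsnap"
Bootsnap.setup(cache_dir: "tmp/cache") # hypothetical cache directory
```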


I’ve spent a lot of effort optimising Nanoc, and you can see that (I hope) in the fact that repeated builds are significantly faster than clean builds. On some sites that I work on, I can have an editor and a live-reloading browser window open next to each other and see my changes show up with sub-second latency. That’s the experience I aim for even on large Nanoc sites, although it’s not always achievable.

At this point, however, I believe I might be hitting the limits of what is achievable with Ruby. I’ve experimented with partially reimplementing Nanoc’s core in pre-compiled languages, and that’s certainly a direction that I want to explore further, as I believe there’s a lot of untapped potential.

What are your thoughts?

@Ecco
Author

Ecco commented Nov 27, 2018

Hi @ddfreyne! First of all, thank you very much for such a nice and comprehensive answer!

Our use case might be a bit special: 99% of the time is spent generating PDFs with PDFKit via a custom Nanoc::Filter (roughly like the sketch below). Generating a single PDF item takes over 10 seconds, while the other items (HTML, CSS) usually take well under 0.1 s.

Those items are all independent of one another, and I'm pretty sure the filter could be run in parallel. Anyway, I'm going to look at how I could make the filter faster and will keep you posted 😄
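
For context, the filter is essentially a thin wrapper around PDFKit; a rough sketch of what such a filter typically looks like (the identifier and paths are illustrative):

```ruby
# lib/filters/pdf.rb (sketch)
require "pdfkit"

class PDFFilter < Nanoc::Filter
  identifier :pdfkit
  type :text => :binary

  def run(content, params = {})
    # PDFKit drives wkhtmltopdf under the hood, which is where
    # nearly all of the ~10 seconds per item is spent.
    PDFKit.new(content, params).to_file(output_filename)
  end
end
```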

@denisdefreyne
Member

@Ecco That’s a good point! PDF generation is notoriously slow, and this case might indeed be something that can be parallelized. I’ll need to think a bit more about how to make this work, though.

denisdefreyne transferred this issue from nanoc/nanoc Nov 29, 2018
@denisdefreyne
Member

I’ve started an experiment to make Nanoc compile items in parallel. You can find it at nanoc/nanoc/pull/1385, but be warned that this is highly experimental and very much a work in progress.

@denisdefreyne
Member

@Ecco Is there a chance that I can get hold of the source for the web site that you’re talking about? It’d help me in building a properly-parallelized Nanoc.

@Ecco
Author

Ecco commented Dec 2, 2018

> Is there a chance that I can get hold of the source for the web site that you're talking about?

Unfortunately, not as is. But I guess I could make a minimal example that reproduces the exact issue we're running into. 99% of that website's sources aren't relevant to this anyway :)

@denisdefreyne
Member

@Ecco A minimal example would be quite useful!

@Ecco
Author

Ecco commented Dec 2, 2018

Here goes :)

nanoc-demo-mt.zip

@denisdefreyne
Member

Look at that:

% bundle exec nanoc
Loading site… done
Compiling site…
      create  [0.01s]  output/index.html
      create  [7.11s]  output/index/index.pdf
      create  [0.00s]  output/two/index.html
      create  [6.96s]  output/two/index.pdf
      create  [0.00s]  output/one/index.html
      create  [0.00s]  output/stylesheet.css
      create  [6.91s]  output/one/index.pdf
      create  [0.00s]  output/three/index.html
      create  [6.94s]  output/three/index.pdf

Site compiled in 27.95s.
% bundle exec nanoc
Loading site… done
Compiling site…
      create  [0.02s]  output/stylesheet.css
      create  [0.02s]  output/index.html
      create  [0.02s]  output/two/index.html
      create  [0.03s]  output/one/index.html
      create  [0.04s]  output/three/index.html
      create  [8.49s]  output/three/index.pdf
      create  [8.50s]  output/index/index.pdf
      create  [8.50s]  output/one/index.pdf
      create  [8.50s]  output/two/index.pdf

Site compiled in 8.53s.

@denisdefreyne
Member

There’s quite a bit more work to do, but the basic stuff is there.

I suppose I also need to start thinking about how to report all the recorded durations, since they don’t add up anymore.

@Ecco
Author

Ecco commented Dec 2, 2018

Oh, wow 😮 Color me impressed! I definitely need to take a look at that branch then 😄

As for the way durations are reported, I think the current output is actually great. It's rather clear what each output corresponds to, and they don't really need to add up.

@denisdefreyne
Member

Unfortunately, for sites that don’t run external processes, the parallel version of Nanoc is 5% to 10% slower. I’ll need to investigate, but it’s probable that the threading/locking/context-switching overhead is causing it.

@Ecco
Author

Ecco commented Dec 3, 2018

Hmm, that's a bummer. Maybe nanoc could use a flag like make -jN?
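
Conceptually, something like a hypothetical worker pool capped at N threads, using only the stdlib:

```ruby
# Hypothetical sketch of a make-style "-j N" cap on concurrency.
n    = Integer(ENV.fetch("JOBS", "4"))  # stand-in for a -j/--jobs flag
jobs = Queue.new
20.times { |i| jobs << -> { sleep 0.1; puts "compiled item #{i}" } }
n.times { jobs << nil }                 # one stop marker per worker

workers = Array.new(n) do
  Thread.new do
    while (job = jobs.pop)
      job.call
    end
  end
end
workers.each(&:join)
```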

@denisdefreyne
Member

The slowdown is also noticeable when running with a single thread (using the new implementation).

I’ve measured lock contention, but it’s small (< 0.5%; I don’t have more detailed results).

@denisdefreyne
Member

While still experimental, I think the PR is in pretty good shape by now:

  • All existing tests pass, and it works correctly on the handful of Nanoc sites that I have here (both small and large ones).

  • After some refactoring (outside of the PR), the slowdown is not as large anymore. There will always be some slowdown due to the extra overhead, but it doesn’t seem to exceed 5%, so I believe that’s fine.

The PR introduces quite a bit of code that isn’t yet thoroughly covered by unit and integration tests. I suppose now is a good time to start working on that.

I think you can try out this branch on your own project, but do let me know if you run into unexpected behavior!

@denisdefreyne
Member

Ruby 3.0 opens up new possibilities here, via Ractors. I’d love to make use of this, though it would mean dropping support for Ruby 2.x. I think it’s too early for this, as Ruby 3.0 is quite new and Ruby 2.6 and 2.7 are still supported.
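
A minimal sketch of what that could look like: each ractor has its own lock, so pure-Ruby CPU work genuinely runs in parallel (Ruby 3.x only, and it prints an “experimental feature” warning):

```ruby
items = %w[one two three four]

ractors = items.map do |name|
  Ractor.new(name) do |n|
    # CPU-bound work; runs in parallel because each ractor
    # has its own interpreter lock.
    [n, (1..2_000_000).reduce(:+)]
  end
end

pp ractors.map(&:take)
```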

@Ecco
Author

Ecco commented Jun 28, 2021

I don't know if I'm biased, but I always use a Ruby version manager, so installing any given version of Ruby is really not a concern for me. I would assume most Ruby devs do the same, but I don't know 😄

@dseomn

dseomn commented Jun 8, 2024

If people are still interested in this, would it make sense to reconsider it now? From https://www.ruby-lang.org/en/downloads/branches/ it looks like 2.7 is end-of-life. (I'd really like to switch to nanoc, but I've got about 2500 images to resize to multiple sizes each, and that seems like a good use of parallelism as long as the GIL/GVL is released when executing an external program.)

@denisdefreyne
Member

Unfortunately, even in the most recent Ruby version (3.3), ractors are still experimental and not usable in production-like settings. I wrote up some more detail on the lack of parallelism in Nanoc.

I am not sure where ractors are headed in the future, but I am keeping my eyes peeled.

There would be some benefit to using threads when using external processes (e.g. for resizing images), but I’d much prefer to use ractors, because that’d be far more impactful.
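
For the image-resizing case specifically, plain threads would already help today, because MRI releases the GVL while it waits on a child process. A sketch (the convert invocation is illustrative):

```ruby
# MRI releases the GVL while a child process runs, so these external
# conversions execute in parallel even without ractors.
files = Dir.glob("content/images/*.jpg")
per   = [(files.size / 4.0).ceil, 1].max  # roughly four worker threads

threads = files.each_slice(per).map do |batch|
  Thread.new do
    batch.each do |path|
      # Illustrative ImageMagick call; adjust to your own pipeline.
      system("convert", path, "-resize", "800x800>",
             "output/images/#{File.basename(path)}")
    end
  end
end
threads.each(&:join)
```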
