
update to dask 0.18.0 #66

Merged
merged 11 commits into from
Nov 1, 2018

Conversation

kain88-de
Member

@kain88-de kain88-de commented Sep 20, 2018

Fix #48 and #17

Changes made in this Pull Request:

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Member

@orbeckst orbeckst left a comment

Assuming tests pass eventually, this looks awesome to me.

EDIT: Also, update docs please! (I only realized after reviewing...)

@@ -95,7 +95,7 @@ def scheduler(request, client):
     if request.param == 'distributed':
         return client
     else:
-        return multiprocessing
+        return request.param
Member

This is cool: so now we will be able to expand to all schedulers that dask supports by just adding strings to the fixture parametrization.
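A sketch of what that expansion could look like (hypothetical code, not from this PR; the dispatch is pulled into a plain helper so the logic can also be exercised outside of pytest):

```python
import pytest


def resolve_scheduler(param, client):
    """Map a fixture parameter to what run() expects.

    Only 'distributed' is special: it needs the actual Client object.
    Every other dask scheduler is selected by its name string.
    """
    if param == 'distributed':
        return client
    return param


# Adding more scheduler names to params is all that is needed to test them.
@pytest.fixture(params=['distributed', 'multiprocessing',
                        'threads', 'single-threaded'])
def scheduler(request, client):
    return resolve_scheduler(request.param, client)
```

The fixture body stays a one-liner; only the `params` tuple grows.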

@orbeckst
Member

@kain88-de I might have approved a bit prematurely but in principle I think this is very good and I trust that you know best what else needs to be done.

@orbeckst
Member

The docs do not mention get anywhere because a while back we switched to scheduler for the user API.

@kain88-de kain88-de mentioned this pull request Sep 21, 2018
@kain88-de
Member Author

kain88-de commented Sep 21, 2018 via email

@kain88-de
Member Author

Dask introduced some more changes that require larger changes within pmda. It should make the code clearer in the end. @VOD555 can you look into this? The relevant documentation is linked below.

https://dask.pydata.org/en/latest/configuration.html

@kain88-de
Member Author

The main issue is that dask now uses a global value to look up the appropriate scheduler. For the tests we now have to remove the session scope from fixtures to properly set the global values. I haven't experimented with how to handle two clusters in the same session.

@orbeckst orbeckst mentioned this pull request Oct 17, 2018
@orbeckst orbeckst mentioned this pull request Oct 29, 2018
…ring

- modified tests so that they use default scheduler
- supplying n_jobs
- NOTE: test_leaflets() fails for n_jobs=2; this NEEDS TO BE FIXED in a
        separate PR; right now this is marked as XFAIL
@@ -39,24 +38,29 @@ def correct_values(self):
     def correct_values_single_frame(self):
         return [np.arange(1, 2150, 12), np.arange(2521, 4670, 12)]

     def test_leaflet(self, universe, correct_values):
+    # XFAIL for 2 jobs needs to be fixed!
+    @pytest.mark.parametrize('n_jobs', (1, pytest.mark.xfail(2)))
Member

@iparask the test_leaflets test failed for me with n_jobs=2

E   AssertionError:
E   Arrays are not almost equal to 7 decimals
E   error: leaflets should match test values
E   (shapes (1,), (6,) mismatch)
E    x: array([36634])
E    y: array([36507, 36761, 37523, 37650, 38031, 38285])

My expectation was that this should give the same answer, just run faster... Can you please look into this?

Member

See also #76

@orbeckst
Member

With these changes, all tests pass locally (with dask 0.20)

(pmda) yngvi:pmda oliver$ pytest -n 4 --disable-warnings --pep8 pmda
======================================================= test session starts ========================================================
platform darwin -- Python 3.6.5, pytest-3.6.3, py-1.5.4, pluggy-0.6.0
rootdir: ~/MDAnalysis/pmda, inifile: setup.cfg
plugins: xdist-1.22.2, pep8-1.0.6, forked-0.2, cov-2.5.1, hypothesis-3.66.1
gw0 [1000] / gw1 [1000] / gw2 [1000] / gw3 [1000]
scheduling tests via LoadScheduling
.........................................x.................................................................................. [ 12%]
............................................................................................................................ [ 24%]
............................................................................................................................ [ 37%]
............................................................................................................................. [ 49%]
............................................................................................................................ [ 62%]
............................................................................................................................. [ 74%]
............................................................................................................................ [ 87%]
............................................................................................................................ [ 99%]
......                                                                                                                       [100%]
Future exception was never retrieved
future: <Future finished exception=TimeoutError('Timeout',)>
tornado.util.TimeoutError: Timeout

======================================= 999 passed, 1 xfailed, 24 warnings in 44.79 seconds ========================================

@orbeckst
Member

@kain88-de in #66 (comment) you commented on a new global state in dask. How would that manifest itself as a problem for the tests?

My understanding of

@pytest.fixture(scope="session", params=(1, 2))
def client(tmpdir_factory, request):
    with tmpdir_factory.mktemp("dask_cluster").as_cwd():
        lc = distributed.LocalCluster(n_workers=request.param, processes=True)
        client = distributed.Client(lc)

        yield client

        client.close()
        lc.close()

is that we set up a single distributed cluster for all tests (actually, two clusters, one with 1 and one with 2 workers) and tests that use it get scheduled as workers become available.

Can you explain how the session scope is a problem for new dask?

- passes 'multiprocessing' as the scheduler instead of multiprocessing
  (which does not work with dask >= 0.20 anymore)
- actually passes whatever we define as parameter; only distributed
  is currently an exception
- removed superfluous import of distributed.multiprocessing
@codecov

codecov bot commented Oct 29, 2018

Codecov Report

Merging #66 into master will decrease coverage by 3.67%.
The diff coverage is 54.34%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #66      +/-   ##
==========================================
- Coverage   98.09%   94.41%   -3.68%     
==========================================
  Files           8        8              
  Lines         419      448      +29     
  Branches       58       61       +3     
==========================================
+ Hits          411      423      +12     
- Misses          4       18      +14     
- Partials        4        7       +3
Impacted Files Coverage Δ
pmda/leaflet.py 86.5% <34.78%> (-8.14%) ⬇️
pmda/parallel.py 95.12% <73.91%> (-4.88%) ⬇️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6aa6c61...dd88867. Read the comment docs.

@orbeckst
Member

@kain88-de @richardjgowers @VOD555 The tests for upgrade to dask 0.20 pass now; the coverage dropped for reasons that I do not understand.

I had reviewed and approved this PR before it was passing Travis but now that I edited it, I'd appreciate some additional eyes, please.

@orbeckst
Member

I get locally

 pytest --pep8 -n 4 --cov pmda

all passing

============== 1001 passed, 1 xfailed, 24 warnings in 96.34 seconds ==============

but different coverage changes:

---------- coverage: platform darwin, python 3.6.5-final-0 -----------
Name               Stmts   Miss Branch BrPart  Cover
----------------------------------------------------
pmda/__init__.py       5      0      0      0   100%
pmda/contacts.py      53      1     14      1    97%
pmda/custom.py        34      0      6      0   100%
pmda/leaflet.py      112     37     44      3    60%
pmda/parallel.py     108      0     34      0   100%
pmda/rdf.py           53      4      6      1    92%
pmda/rms.py           17      1      0      0    94%
pmda/util.py          37      0     14      0   100%
----------------------------------------------------
TOTAL                419     43    118      5    87%

Using

coverage html
open htmlcov/pmda_leaflet_py.html

shows that

  • njobs == -1 is not tested
  • the whole _find_connected_components() code is not run (?!?!?!), which is used in

    pmda/pmda/leaflet.py, lines 203 to 204 (at 6aa6c61):

        parAtomsMap = parAtoms.map_partitions(self._find_connected_components,
                                              cutoff=cutoff)

    where parAtoms is a dask bag.

Perhaps coverage has a hard time seeing what's been covered when something is run under dask, under certain circumstances?

This test is needed to get coverage of leaflet back up but: TEST or CODE needs to be fixed.
@orbeckst orbeckst mentioned this pull request Oct 30, 2018
@orbeckst
Member

I wanted to add tests for leafletfinder with distributed (by using the new parametrized scheduler fixture) but as described in #76 this opened a whole can of worms so this has to wait.

@VOD555
Collaborator

VOD555 commented Oct 30, 2018

In my local tests, the things under _single_frame() in rdf.py and rms.py are not covered. That's weird...

@kain88-de
Member Author

I haven't looked at the code changes yet! But I did stop working on this because dask 0.18 changed the idiomatic way to select the scheduler (see the dask configuration docs). The new idiom is to set the scheduler in a global config value

dask.config.set(scheduler='threads')

or with a context manager

with dask.config.set(scheduler='threads'):
    x.compute()

The distributed scheduler overwrites these defaults now on creation

from dask.distributed import Client
client = Client(...)  # Connect to distributed cluster and override default
df.x.sum().compute()  # This now runs on the distributed system

The correct solution seems rather to be that we remove the scheduler and get keyword arguments completely. In the tests I guess we can work with something like

@pytest.fixture(params=['multiprocessing', ClientIP])
def scheduler(params):
    with dask.config.set(params):
        yield

I assume I have the API wrong but the general idea is to start a context manager in the fixture and yield to release it at the end. How well this works I don't know.
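For what it's worth, a syntactically complete version of that idea might look like the following (a sketch assuming `dask.config.set(scheduler=...)` works as shown in the dask docs; the fixture name and parameters are illustrative only):

```python
import pytest
import dask


@pytest.fixture(params=['multiprocessing', 'threads'])
def scheduler(request):
    # Enter dask's config context manager, hand control to the test,
    # and restore the previous global default on fixture teardown.
    with dask.config.set(scheduler=request.param):
        yield request.param


# The same pattern outside pytest: dask.config.set works as a context
# manager and restores the previous setting on exit.
with dask.config.set(scheduler='threads'):
    active = dask.config.get('scheduler')
```

The yield inside the `with` block is what ties the lifetime of the global setting to the lifetime of the fixture.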

@orbeckst
Member

From my reading, setting the scheduler on compute()

x.compute(scheduler='threads')

is still supported. I think as long as all our compute() calls also contain the scheduler, we should be ok, even though

client = Client(...)  # Connect to distributed cluster and override default

will set the global defaults.

Or do I misunderstand how this is working now?

The correct solution seems rather to be that we remove the scheduler and get keyword arguments completely.

I think you're right that this is the medium term correct solution so that using PMDA conforms to how people use Dask. In the short term (i.e., for this PR at least!) I'd like to move ahead with our current scheduler argument because it is still correct.

Or do you see a problem?
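For reference, the claim that a per-call scheduler wins over the global default can be checked with a small snippet (not from this PR; requires dask):

```python
import dask


@dask.delayed
def double(x):
    return 2 * x


# Even with a global default configured, the scheduler passed to
# compute() takes precedence for that one call.
with dask.config.set(scheduler='multiprocessing'):
    result = double(21).compute(scheduler='single-threaded')
```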

@orbeckst
Member

(Alternatively, if someone manages to get the new Dask paradigm working I am also happy... I just only have limited time for this right now.)

@kain88-de
Member Author

The problem is that code suddenly behaves unexpectedly. Take the following example

client = Client()  # yeah lets use dask.distributed.
pdma.contacts.Contacts.run()  # This uses multiprocessing! 

I would be surprised to see here that the distributed workers don't receive any jobs. Without knowledge of how dask used to work this is also hard to debug.

# job. Therefore we run this on the single threaded scheduler for
# debugging.
if scheduler is None and n_jobs == 1:
    scheduler = 'single-threaded'
Member Author

fixes #17

Member

nice

- someone should check with dask. It seems a bit brittle
- fix tests maybe
- update documentation

fixes #17
@orbeckst
Copy link
Member

I would be surprised to see here that the distributed workers don't receive any jobs.

... because PMDA defaults to 'multiprocessing'? Yes, I agree with you.

Thanks for working on it!

- fix #48
- updated boiler-plate code in ParallelAnalysisBase.run and copied and pasted into
  leaflet.LeafletFinder.run() (TODO: make this more DRY)
- dask.distributed added as dependency (it is recommended by dask for a single node anyway, and
  it avoids imports inside if statements... much cleaner code in PMDA)
- removed scheduler kwarg: use dask.config.set(scheduler=...)
- 'multiprocessing' and n_jobs=-1 are now only selected if nothing is set by dask;
  if one wants n_jobs=-1 to always grab all cores then you must set the multiprocessing
  scheduler
- default for n_jobs=1 (instead of -1), i.e., the single threaded scheduler
- updated tests
- removed unnecessary broken(?) test for "no deprecations" in parallel.ParallelAnalysisBase
- updated CHANGELOG
- install conda package of MDA on travis
- require MDA and MDATests >= 0.19.0
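The scheduler-selection rules in the bullet points above can be sketched as follows (a simplified, hypothetical rendering for illustration, not the actual pmda.parallel code):

```python
import multiprocessing


def choose_scheduler(n_jobs=1, dask_default=None):
    """Return a (scheduler_name, n_jobs) pair.

    dask_default stands in for whatever the user configured globally via
    dask.config.set (None means nothing was configured).
    """
    if dask_default is not None:
        # respect an explicit global dask configuration
        return dask_default, n_jobs
    if n_jobs == 1:
        # new default: single-threaded scheduler, easiest to debug
        return 'single-threaded', 1
    n = multiprocessing.cpu_count() if n_jobs == -1 else n_jobs
    # only fall back to multiprocessing when dask has nothing set
    return 'multiprocessing', n
```

In particular, `n_jobs=-1` grabs all cores only when no scheduler has been set through dask.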
@orbeckst
Member

@kain88-de are you sure you wanted to push 114a2b0? It looks as if it undoes some of the changes that I pushed and breaks the tests again. If it was intentional and you're working on it then just ignore me ;-).

@kain88-de
Member Author

I don't want to depend on distributed for such simple checks. The code now does a trial import when necessary. We can still check the distributed scheduler in our tests; it is not needed for pmda itself, though.

@orbeckst
Member

Fine with me, although I am pretty sure that anyone using dask will also install distributed or not mind having it installed, especially as http://docs.dask.org/en/latest/scheduling.html says

we currently recommend the distributed scheduler on a local machine

i.e., pretty much in all cases.

@orbeckst
Member

@kain88-de please check – I'd like to get this merged so that we can move forward, and I'd like to release 0.2.0 asap.

Member

@orbeckst orbeckst left a comment

Minor comments.

scheduler (Issue #48)
* removed the 'scheduler' keyword from the run() method; use
dask.config.set(scheduler=...) as recommended in the dask docs
* uses single-threaaded scheduler if n_jobs=1 (Issue #17)
Member

typo, needs fixing

dask.config.set(scheduler=...) as recommended in the dask docs
* uses single-threaaded scheduler if n_jobs=1 (Issue #17)
* n_jobs=1 is now the default for run() (used to be n_jobs=-1)
* dask.distributed is now a dependency
Member

setup.py has it as full dep; it could be moved into test dependencies if you really want to keep it optional. If you make it fully optional, please remove this line from CHANGELOG

@@ -8,7 +8,8 @@
 #
 # Released under the GNU Public Licence, v2 or any higher version

-from dask import distributed, multiprocessing
+from dask import distributed
Member

tests require distributed

@@ -17,7 +17,7 @@ are provided as keyword arguments:

 set up the parallel analysis

-.. method:: run(n_jobs=-1, scheduler=None)
+.. method:: run(n_jobs=-1)
Member

I think the default is now n_jobs=1, isn't it?

@@ -91,6 +86,10 @@ def test_no_frames(analysis, n_jobs):
     assert analysis.timing.universe == 0


+def test_scheduler(analysis, scheduler):
+    analysis.run()
Member

No assert here – either ERROR or pass?

@kain88-de
Member Author

I removed my changes again. Let's go with the easier version.

@kain88-de
Member Author

I don't know where the reduced coverage comes from right now.

@orbeckst
Member

orbeckst commented Nov 1, 2018

I'll merge it regardless and then we need to dig a bit more into how coverage works with the different schedulers.

Thanks for pushing forward here!!!


@orbeckst
Member

orbeckst commented Nov 1, 2018

Stupid GitHub web interface does not work on my slightly outdated mobile. Can you please do a squash merge? Thanks!

This will allow @VOD555 to continue.

@VOD555 VOD555 merged commit aaa478c into master Nov 1, 2018
@VOD555
Collaborator

VOD555 commented Nov 1, 2018

@orbeckst I've merged this PR.

@orbeckst
Member

orbeckst commented Nov 1, 2018

Thanks.

The drop in coverage is due to leaflet.py because we do not currently test all schedulers. This should be addressed as part of #76
