
RFC: XZ compress HBRT image #66

Open
wants to merge 1 commit into base: master-p8

Conversation


@ghost ghost commented Sep 15, 2016

The HBRT image is 3.4MB, which takes around 2 seconds to load from PNOR (using the optimized read in skiboot). Hostboot loads the HBRT image in istep 21.1 (along with loading the payload and getting OCC going). If we xz compress the HBRT image, it shrinks to 921KB, which is loaded (by the same fast skiboot code) in ~0.6 seconds.

Question: does Hostboot need to load the HBRT image? Is there anything stopping us loading it in skiboot? Is there any initial setup done?

before (run 1):
172.00640|ISTEP 21. 1
200.40760|htmgt|OCCs are now running in ACTIVE state
(total: 28.4s)

before (run 2):
168.26122|ISTEP 21. 1
195.30265|htmgt|OCCs are now running in ACTIVE state
(total: 27.0s)

after (run 1):
169.37556|ISTEP 21. 1
195.06400|htmgt|OCCs are now running in ACTIVE state
(total: 25.68s)

after(run 2):
169.90019|ISTEP 21. 1
193.87845|htmgt|OCCs are now running in ACTIVE state
(total: 23.98s)

Saving: 1.3 to 4.4 seconds of boot time (average: 2.87s),

and that number seems to be in the right ballpark considering the time pflash takes to read it.

More significantly, it frees up 2.4MB of PNOR.
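The compression ratio and load-time arithmetic above can be sketched with Python's stdlib `lzma` module (which produces the same .xz container format). This is purely illustrative: the blob below is synthetic filler, not the real HBRT image, and the 1.7MB/s read rate is an assumption derived from "3.4MB in ~2 seconds" above.

```python
import lzma

# Illustrative sketch, not the actual hostboot patch: xz-compress a
# synthetic "image" and estimate PNOR load time at an assumed read rate.
image = (b"HBRT" * 1024 + bytes(range(256))) * 256   # ~1.1MB synthetic blob

compressed = lzma.compress(image, preset=9)          # .xz container, like `xz -9`

PNOR_MB_PER_SEC = 1.7    # assumed: 3.4MB in ~2s, per the numbers above
def load_seconds(nbytes):
    return nbytes / (PNOR_MB_PER_SEC * 1024 * 1024)

print(f"uncompressed: {len(image)} bytes, ~{load_seconds(len(image)):.2f}s to read")
print(f"xz-compressed: {len(compressed)} bytes, ~{load_seconds(len(compressed)):.2f}s to read")
assert lzma.decompress(compressed) == image          # round-trips losslessly
```

The decompression cost on the host side is small relative to the PNOR read that is saved, which is why the istep 21.1 totals shrink.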



Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
@dcrowell77
Collaborator

Question: does Hostboot need to load the HBRT image? Is there anything stopping us loading it in skiboot? Is there any initial setup done?

We load it and point at it in the devtree because that was what we were told to do originally by the OPAL team. Remember that we do this for both OpenPOWER and the enterprise systems. In the latter case, OPAL needs to request the data from the FSP as a lid since you don't have access to the PNOR. It would not be difficult to have divergent behavior on our side if OPAL can handle it though.

However, one more wrinkle in this story is Secureboot. Hostboot has to load HBRT during IPL because we are the only ones who can verify the signature. I think that forces us to keep the current design where HBRT is pushed into mainstore as part of the Hostboot IPL.

@dcrowell77
Collaborator

@e-liner - Can you take a look and pull this into gerrit if it looks okay?

@williamspatrick

@bofferdn, @e-liner - The secureboot code added an extra layer to the PNOR VFS. If we add additional sections to be XZ compressed, I wonder if we should try to integrate that feature directly into that new secureboot VFS layer. With this proposed patch from Stewart we end up with effectively duplicated code.

On the performance aspect, I thought we were going to attempt to overlap PAYLOAD and HBRT loading with OCC start. Did that ever happen?

@dcrowell77
Collaborator

No performance tweaks have been done to load pnor partitions in the background.

@dcrowell77
Collaborator

RE: Pushing decompression code into the VFS/PNOR code - I'm in total agreement with that idea as a better design. The question is always about resources and schedule. Putting this in as-is produces an immediate benefit (a few seconds) for very little work, while finding time to put the right solution in could take awhile.

@e-liner
Contributor

e-liner commented Sep 15, 2016

This looks okay, but I'm going to want to test a few images on it. Plus we'll need an additional op-build change to compress the image if we're going this route. I'm running short on time this week, but I can try to look at it tomorrow.

@ghost
Author

ghost commented Sep 15, 2016

Patrick Williams notifications@github.com writes:

@bofferdn, @e-liner - The secureboot code added an extra layer to the
PNOR VFS. If we add additional sections to be XZ compressed, I wonder
if we should try to integrate that feature directly into that new
secureboot VFS layer. With this proposed patch from Stewart we end up
with effectively duplicated code.

Hence the RFC, because that's totally what should happen rather than
have inconsistent XZ calls scattered throughout hostboot.

This patch should totally not go in as-is, even though the error
handling (which I totally removed) is eerily similar to the error
handling already loading the HBRT image.

Ultimately, it looks like we don't even have to load the HBRT image
until opal_prd starts (I can't think of a way to do this on P8
without breaking ABI, but we could go that way for P9).

All of the partitions read in one go could be XZ compressed pretty
easily and save a bunch of boot time.

On the performance aspect, I thought we were going to attempt to
overlap PAYLOAD and HBRT loading with OCC start. Did that ever
happen?

Tackling OCC start was the next thing to look at. ISTEP 21.1 was really
constrained by PNOR access and OCC start time when I delved into it,
with absolutely nothing else going on there except that, so everything
there (except loading the PAYLOAD itself) is a candidate for the PAYLOAD
to do in parallel with other tasks.

Stewart Smith
OPAL Architect, IBM.
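The overlap idea discussed here (hiding the PNOR load behind OCC start in istep 21.1) can be sketched with a small concurrency model. All names and timings below are hypothetical stand-ins, not hostboot code; the point is only that two independent slow steps run in the wall time of the longer one.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Illustrative sketch only: overlap a slow PNOR read with OCC start,
# the way istep 21.1 work might be parallelized. Function names and
# timings are hypothetical.
def load_hbrt_from_pnor():
    time.sleep(0.2)            # stands in for a multi-second PNOR read
    return "hbrt-image"

def start_occ():
    time.sleep(0.2)            # stands in for bringing the OCCs to ACTIVE
    return "occ-active"

t0 = time.monotonic()
with ThreadPoolExecutor(max_workers=2) as pool:
    hbrt = pool.submit(load_hbrt_from_pnor)
    occ = pool.submit(start_occ)
    results = (hbrt.result(), occ.result())
elapsed = time.monotonic() - t0
print(results, f"overlapped in ~{elapsed:.2f}s")
```

With both steps at 0.2s, the overlapped total stays near 0.2s rather than 0.4s; the same reasoning is why compressing (or backgrounding) the HBRT read shortens the istep rather than just moving the cost.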

@ghost
Author

ghost commented Sep 15, 2016

Dan notifications@github.com writes:

RE: Pushing decompression code into the VFS/PNOR code - I'm in total
agreement with that idea as a better design. The question is always
about resources and schedule. Putting this in as-is produces an
immediate benefit (a few seconds) for very little work, while finding
time to put the right solution in could take awhile.

I'm happy to do that work, may need some assistance with a few pointers
here and there.

After previous experiments, it seems like the best time to load the
PAYLOAD is while bringing up DRAM, as that's easily the longest time
spent doing anything (especially with P8 and DDR4). I know there's memory
constraints on larger systems (Brazos), but we probably have a bit more
leeway with the OpenPOWER kind of size... even if we only loaded a
subset of the PAYLOAD that early.

I know that if I implemented that it'd make all the IPL flow documents
out of date, but technical documentation that's up to date and current
just seems.... not quite right :)

Stewart Smith
OPAL Architect, IBM.

@ghost
Author

ghost commented Sep 15, 2016

e-liner notifications@github.com writes:

This looks okay, but I'm going to want to test a few images on
it. Plus we'll need an additional op-build change to compress the
image if we're going this route. I'm running short on time this week,
but I can try to look at it tomorrow.

I think there's a few more partitions we could compress.

WINK and OCC come to mind, and they compress fairly well.

Stewart Smith
OPAL Architect, IBM.

@ghost
Author

ghost commented Sep 16, 2016

Dan notifications@github.com writes:

Question: does Hostboot need to load the HBRT image? Is there anything stopping us loading it in skiboot? Is there any initial setup done?

We load it and point at it in the devtree because that was what we
were told to do originally by the OPAL team. Remember that we do this
for both OpenPOWER and the enterprise systems. In the latter case,
OPAL needs to request the data from the FSP as a lid since you don't
have access to the PNOR. It would not be difficult to have divergent
behavior on our side if OPAL can handle it though.

It was certainly the easy way to get things going, no problem to revisit
though :)

However, one more wrinkle in this story is Secureboot. Hostboot has
to load HBRT during IPL because we are the only ones who can verify
the signature. I think that forces us to keep the current design
where HBRT is pushed into mainstore as part of the Hostboot IPL.

We have to verify signatures in skiboot, so this shouldn't be too much
of an issue.

Stewart Smith
OPAL Architect, IBM.

@williamspatrick

The "other payload" requires Hostboot to load HBRT. Their secureboot support will be part of the HBRT image, so they end up with a chicken-and-egg problem otherwise.

@ghost
Author

ghost commented Sep 16, 2016

Patrick Williams notifications@github.com writes:

The "other payload" requires Hostboot to load HBRT. Their secureboot
support will be part of the HBRT image, so they end up with a
chicken-and-egg problem otherwise.

Okay, I thought that was a possibility.

Seeing as there's existing logic based on the payload type, I guess that
could be used.

Architecturally though, it does seem neater to have less knowledge of
the payload implementation and fewer if (payload=blah)....

Stewart Smith
OPAL Architect, IBM.

@dcrowell77
Collaborator

I know there's memory constraints on larger systems (Brazos), but we probably have a bit more leeway with the OpenPOWER kind of size

Actually, we are significantly more memory constrained in the OP systems because we have a lot more code loaded. We are very close to the edge during the memory initialization steps; that is typically where we have seen OOM issues crop up. If you make any changes, be sure to test on a Garrison box with full memory, as that has been our worst case scenario in the past.

@bofferdn
Collaborator

The secureboot code added an extra layer to the PNOR VFS. If we add additional sections to be XZ compressed, I wonder if we should try to integrate that feature directly into that new secureboot VFS layer. With this proposed patch from Stewart we end up with effectively duplicated code.

The secure provider already verifies the payload which is compressed; it just doesn't do the decompression after the fact. I think what you are perhaps saying is: after the verification, take the additional step of doing the decompression and expose the final, uncompressed image to the caller. In that case getSectionInfo would have to be smart enough to give the uncompressed final size vs. the normal size. Also, right now the VA spaces for PNOR, temp, and secure all mirror each other, but the secure space would have to be bigger to accommodate the larger section, so perhaps we'd have to migrate to a unique VA space per partition to allow for the expanded size. It could probably be done with a new shim VA layer in between temp and secure, so something like:

  1. compressed image in pnor
  2. temp space reads in protected payload, does verification if secure boot enabled
  3. if xz compressed, decompress into matching uncompressed VA space + free up all temp space pages
  4. user access to the secure space VA range pulls in the corresponding page of the uncompressed image and frees the uncompressed page; now only the secure space has that verified, uncompressed, pinned page

It would work great if the partition is purely a read-only, protected payload. If we need to support unprotected payloads or hash-page-table verification of anything, that introduces more complexity. We're already memory constrained loading OCC early in the boot, and we are probably going to have to switch to hash page table verification for that, assuming we can't otherwise drive the memory footprint down (and it is pretty inefficient how it's being done).

There are also other modifications like making sure getSectionInfo for secure space is reporting the fully uncompressed size, etc.

And as mentioned, the plan was to load HBRT for P9 from Hostboot due to the chicken-and-egg issue of PHyp not having the verification code.
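The verify-then-decompress read path sketched in the numbered steps above could be modeled roughly as follows. This is a toy Python model with hypothetical names, not the hostboot secure provider: a plain SHA-256 digest stands in for real signature verification, and the xz magic bytes stand in for whatever flag would mark a section as compressed.

```python
import hashlib
import lzma

# Rough model of the proposed VFS read path (hypothetical names): verify
# the protected payload in a temp buffer, then decompress and expose the
# uncompressed image to the caller -- who also sees the uncompressed size.
def read_section(pnor_bytes, expected_sha256, secure_boot=True):
    if secure_boot:
        # step 2: temp space reads in the protected payload and verifies it
        digest = hashlib.sha256(pnor_bytes).hexdigest()
        if digest != expected_sha256:
            raise ValueError("verification failed")
    # step 3: if xz-compressed, decompress into the uncompressed space
    if pnor_bytes[:6] == b"\xfd7zXZ\x00":      # .xz stream magic bytes
        return lzma.decompress(pnor_bytes)
    return pnor_bytes

payload = b"runtime services blob " * 500
flashed = lzma.compress(payload)               # step 1: compressed image in pnor
image = read_section(flashed, hashlib.sha256(flashed).hexdigest())
assert image == payload
print("verified + decompressed:", len(flashed), "->", len(image), "bytes")
```

Note that verification happens over the compressed bytes as stored in flash, which matches "the secure provider already verifies the payload which is compressed"; the getSectionInfo concern is visible here as the difference between `len(flashed)` and `len(image)`.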

@ghost
Author

ghost commented Sep 16, 2016

Dan notifications@github.com writes:

I know there's memory constraints on larger systems (Brazos), but we probably have a bit more leeway with the OpenPOWER kind of size

Actually, we are significantly more memory constrained in the OP
systems because we have a lot more code loaded. We are very close to
the edge during the memory initialization steps, that is typically
where we have seen OOM issues crop up. If you make any changes, be
sure to test on a Garrison box with full memory, that has been our
worst case scenario in the past.

Fascinating.

I think the Garrisons in our lab max out at 512GB, so I'll see how I
go and ask for testing help/access if needed.

Stewart Smith
OPAL Architect, IBM.

@dcrowell77
Collaborator

Would work great if partition is purely a read-only, protected payload.

Even for this case, it only works to hide the decompression in the VMM if the partition is a 'load once and done' sort of thing, or we only access it after we have mainstore up. We likely don't have the cache memory available to keep everything resident, and we can't swap out single pages without a way to decompress an arbitrary piece out of the middle.
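The random-access constraint described here can be demonstrated concretely: a single xz stream is decoded front-to-back, so pulling one 4KB page out of the middle means decompressing everything before it. One workaround (illustrative only, not something the patch does) is compressing each page independently, which allows page-granular fetch at the cost of a worse overall ratio.

```python
import lzma

# Illustrative sketch of the page-swap problem: one monolithic xz stream
# vs. independently compressed 4KB pages that can be fetched one at a time.
PAGE = 4096
data = b"some fairly repetitive runtime code/data " * 2000
pages = [data[i:i + PAGE] for i in range(0, len(data), PAGE)]

whole_stream = lzma.compress(data)             # great ratio, no random access
per_page = [lzma.compress(p) for p in pages]   # random access, worse ratio

def fetch_page(n):
    # decompress only page n -- impossible with whole_stream without
    # first decoding every byte that precedes it
    return lzma.decompress(per_page[n])

assert fetch_page(3) == pages[3]
overhead = sum(len(c) for c in per_page) / len(whole_stream)
print(f"per-page compression is {overhead:.1f}x the single-stream size")
```

This size overhead is one reason the "decompress once into pinned pages" design above fits a load-once partition better than anything that needs demand paging out of the cache.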

@ghost
Author

ghost commented Sep 19, 2016

Dan notifications@github.com writes:

Would work great if partition is purely a read-only, protected payload.

Even for this case, it only works to hide the decompression in the VMM
if the partition is a 'load once and done' sort of thing, or we only
access it after we have mainstore up. We likely don't have the cache
memory available to keep everything resident, and we can't swap out
single pages without a way to decompress an arbitrary piece out of the
middle.

Is there a HOWTO anywhere on looking at memory usage/maps while in cache?

Stewart Smith
OPAL Architect, IBM.

@dcrowell77
Collaborator

We don't have a huge amount of tooling, but what we do have uses the standard Hostboot debug framework, see src/build/debug/, the specific tools of interest are Hostboot/MemStats.pm and Hostboot/PageMgr.pm. Is that what you wanted or are you just curious about Hostboot memory usage in general?

@ghost ghost mentioned this pull request Oct 25, 2016
wghoffa pushed a commit to wghoffa/hostboot that referenced this pull request Sep 28, 2017