Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

meta_pe: fix rich header length check for hash calculation #50

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

knowmalware
Copy link
Contributor

The original Rich Signature write-up:
http://www.ntcore.com/files/richsign.htm
searches 400 bytes for the "Rich" string. The pefile module
searches 128 bytes for the string. I have found that 128 is
sometimes not enough, and 400 feels rather large, so I have
choosen a round (hex) value in between.

I also take a lesson from the original write-up and search for
NULL values, but added a search for the PE header as well.

Note that because we rely on pefile module for the Rich Header
Values, that array will be incomplete, and thus differ from that
used for hash calculation, until pefile itself is fixed.

The original Rich Signature write-up:
  http://www.ntcore.com/files/richsign.htm
searches 400 bytes for the "Rich" string.  The pefile module
searches 128 bytes for the string.  I have found that 128 is
sometimes not enough, and 400 feels rather large, so I have
choosen a round (hex) value in between.

I also take a lesson from the original write-up and search for
NULL values, but added a search for the PE header as well.

Note that because we rely on pefile module for the Rich Header
Values, that array will be incomplete, and thus differ from that
used for hash calculation, until pefile itself is fixed.
@marnao
Copy link
Contributor

marnao commented Jan 18, 2017

@knowmalware thanks for the pull request.. we definitely need to increase the search area for the rich header ending, although I'm not sure what the optimal value is. Your guess is as good as any.

I'm not sure I follow the other part of your modification, specifically around looking for the null values and PE header. When would this be useful? Do you have any samples you could point to?

@knowmalware
Copy link
Contributor Author

The other part of my modification only matters when the Rich header has been tampered with or replaced. The search for the PE header is the first attempt to find the end, as the Rich header should be right before the PE header. The search for NULL values is the fall-back, as the Rich header should not contain any NULL dwords. In practice, I usually see a set of NULL bytes before the PE header, so it made sense to me.

If you're uncomfortable with this, I can change the PR to just not produce a hash if the Rich string doesn't exist, so that the code doesn't cause an exception when analyzing a PE file produced by a non-Microsoft compiler. But I'd prefer to leave it as-is for malware analysis purposes, as any changes to the Rich header could still be interesting from a similarity perspective.

@marnao
Copy link
Contributor

marnao commented Feb 3, 2017

@wxsBSD would you mind reviewing this change? You're probably more familiar with this stuff than I am given your work on yara.

@wxsBSD
Copy link

wxsBSD commented Feb 5, 2017

Shouldn't it be possible to go from the Rich header start to (at most) the NT header start? IE: It should be from 0x80 to uint32(0x3c). You can also be extra careful and ensure it ends with DanS.

@wxsBSD
Copy link

wxsBSD commented Feb 5, 2017

Also, starting at 0x80 works because nobody ever changes the size of the DOS stub. The right thing to do is calculate the starting offset and ensure it is Rich.

@wxsBSD
Copy link

wxsBSD commented Feb 6, 2017

I haven't looked at Frank's code, but does erocarrera/pefile@a3e5d09 not look like it makes rich header parsing more robust, in the manner I'm describing? It seems to search for the ending up to the NT header.

@knowmalware
Copy link
Contributor Author

I half agree with Wes. I'll change my PR to search from 0x80 to pe.NT_HEADERS.get_file_offset(), which is what pefile should be doing instead of searching through to the OPTIONAL_HEADER.

If there's non-null bytes between 0x80 and the start of the PE header, I'd still be interested in that result from a malware analysis perspective, but perhaps it should be called something other than the rich sig in that case. I'll updated this PR accordingly.

Copy link

@agrajag9 agrajag9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mods at 247 look good, but mods at 257 can be better accounted for by code in issue 55 (#55)

The latest version of pefile provides easy access to the
deobfuscated rich header by accessing the 'clear_data' key
of the parsed rich header.
@knowmalware
Copy link
Contributor Author

Latest release of pefile makes this much easier. Updated the code to use the clear data exposed by pefile.

@knowmalware
Copy link
Contributor Author

should also fix #55

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants