More on endianness #720

Open. Wants to merge 5 commits into base: develop
4 changes: 4 additions & 0 deletions README.md
@@ -95,6 +95,10 @@ building the library or including the function definitions:
#define BOOST_JSON_STACK_BUFFER_SIZE 1024
#include <boost/json/src.hpp>
```
+### Endianness
+
+Boost.JSON uses [Boost.Endian](https://www.boost.org/doc/libs/release/libs/endian/doc/html/endian.html)
+in order to support both little endian and big endian platforms.

### Supported Compilers

6 changes: 6 additions & 0 deletions doc/qbk/overview.qbk
@@ -120,6 +120,12 @@ building the library or including the function definitions:
#include <boost/json/src.hpp>
```

+[heading Endianness]
+
+Boost.JSON uses
+[@https://www.boost.org/doc/libs/release/libs/endian/doc/html/endian.html
+Boost.Endian] in order to support both little endian and big endian platforms.

[heading Supported Compilers]

Boost.JSON has been tested with the following compilers:
4 changes: 2 additions & 2 deletions include/boost/json/detail/sse2.hpp
@@ -136,7 +136,7 @@ count_valid<false>(
uint8_t len = first & 0xFF;
if(BOOST_JSON_UNLIKELY(end - p < len))
break;
-if(BOOST_JSON_UNLIKELY(! is_valid_utf8(p, first)))
+if(BOOST_JSON_UNLIKELY(! is_valid_utf8_no_inline(p, first)))
break;
p += len;
}
@@ -185,7 +185,7 @@ count_valid<false>(
uint8_t len = first & 0xFF;
if(BOOST_JSON_UNLIKELY(end - p < len))
break;
-if(BOOST_JSON_UNLIKELY(! is_valid_utf8(p, first)))
+if(BOOST_JSON_UNLIKELY(! is_valid_utf8_no_inline(p, first)))
break;
p += len;
}
40 changes: 23 additions & 17 deletions include/boost/json/detail/utf8.hpp
@@ -25,10 +25,8 @@ template<int N>
std::uint32_t
load_little_endian(void const* p)
{
-std::uint32_t v = 0;
-std::memcpy(&v, p, N);
-endian::little_to_native_inplace(v);
-return v;
+auto const up = reinterpret_cast<unsigned char const*>(p);
+return endian::endian_load<std::uint32_t, N, endian::order::little>(up);
}
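
For readers who have not used `endian_load`, here is a minimal sketch of the value a little-endian load of `N` bytes is expected to produce, written without Boost.Endian. The names `load_le_sketch` and `load_le_sketch_demo` are hypothetical and not part of this PR. The sketch assumes `N` is between 1 and 4 and that at least `N` bytes are readable at `p`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Boost.JSON): combine the first N bytes
// at p into a 32-bit value, treating p[0] as the least significant byte.
// The result does not depend on the byte order of the host.
template<std::size_t N>
std::uint32_t load_le_sketch(void const* p)
{
    static_assert(N >= 1 && N <= 4, "N must be between 1 and 4");
    auto const up = static_cast<unsigned char const*>(p);
    std::uint32_t v = 0;
    for(std::size_t i = 0; i < N; ++i)
        v |= static_cast<std::uint32_t>(up[i]) << (8 * i);
    return v;
}

// Small usage example: the three UTF-8 bytes of U+20AC (E2 82 AC).
inline void load_le_sketch_demo()
{
    unsigned char const euro[] = { 0xE2, 0x82, 0xAC };
    assert(load_le_sketch<3>(euro) == 0x00AC82E2u);
}
```

Because the lead byte always ends up in bits 0 to 7, the masks used in `is_valid_utf8` can be written once and applied on any platform.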

inline
@@ -70,7 +68,7 @@ inline
bool
is_valid_utf8(const char* p, uint16_t first)
{
-uint32_t v;
+std::uint32_t v;
switch(first >> 8)
{
default:
@@ -81,38 +79,46 @@ is_valid_utf8(const char* p, uint16_t first)
v = load_little_endian<2>(p);
return (v & 0xC000) == 0x8000;

-// 3 bytes, second byte [A0, BF]
-case 2:
+// 3 bytes, second byte [A0, BF]
+case 2:
v = load_little_endian<3>(p);
return (v & 0xC0E000) == 0x80A000;

-// 3 bytes, second byte [80, BF]
-case 3:
+// 3 bytes, second byte [80, BF]
+case 3:
v = load_little_endian<3>(p);
return (v & 0xC0C000) == 0x808000;

-// 3 bytes, second byte [80, 9F]
-case 4:
+// 3 bytes, second byte [80, 9F]
+case 4:
v = load_little_endian<3>(p);
return (v & 0xC0E000) == 0x808000;

-// 4 bytes, second byte [90, BF]
-case 5:
+// 4 bytes, second byte [90, BF]
+case 5:
v = load_little_endian<4>(p);
return (v & 0xC0C0FF00) + 0x7F7F7000 <= 0x2F00;

-// 4 bytes, second byte [80, BF]
-case 6:
+// 4 bytes, second byte [80, BF]
+case 6:
v = load_little_endian<4>(p);
return (v & 0xC0C0C000) == 0x80808000;

-// 4 bytes, second byte [80, 8F]
-case 7:
+// 4 bytes, second byte [80, 8F]
+case 7:
v = load_little_endian<4>(p);
return (v & 0xC0C0F000) == 0x80808000;
}
}
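
As a reading aid for the masks above, here is a sketch that compares the single-mask test from `case 2` with an equivalent byte-by-byte test. The function names are hypothetical and not part of the PR. It assumes `v` was produced by a little-endian load, so the lead byte sits in bits 0 to 7, the second byte in bits 8 to 15, and the third byte in bits 16 to 23.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a UTF-8 continuation byte has the bit pattern 10xxxxxx.
inline bool is_continuation_sketch(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

// Mask form, as in case 2: second byte in [A0, BF], third byte a continuation.
// 0xE000 selects the top three bits of the second byte (must be 101),
// 0xC00000 selects the top two bits of the third byte (must be 10).
inline bool check_case2_masked(std::uint32_t v)
{
    return (v & 0xC0E000) == 0x80A000;
}

// Byte-by-byte form of the same condition, for comparison.
inline bool check_case2_bytewise(std::uint32_t v)
{
    auto const b1 = static_cast<unsigned char>((v >> 8) & 0xFF);
    auto const b2 = static_cast<unsigned char>((v >> 16) & 0xFF);
    return b1 >= 0xA0 && b1 <= 0xBF && is_continuation_sketch(b2);
}

// Small usage example with lead byte E0.
inline void check_case2_demo()
{
    assert(check_case2_masked(0x80A0E0u));   // E0 A0 80 is accepted
    assert(!check_case2_masked(0x8080E0u));  // E0 80 80 is rejected
}
```

The other `==` cases follow the same pattern with different range bits for the second byte; `case 5` uses an add-and-compare form that this sketch does not cover.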

+BOOST_NOINLINE
+inline
+bool
+is_valid_utf8_no_inline(const char* p, uint16_t first)
+{
+return is_valid_utf8(p, first);
+}

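The wrapper pattern above can be sketched without Boost as follows. `SKETCH_NOINLINE`, `is_ascii_sketch`, and `is_ascii_sketch_no_inline` are hypothetical names used only for illustration; the macro is a simplified stand-in for `BOOST_NOINLINE`, not its actual definition. The likely motivation, which the diff itself does not state, is to keep a rarely taken validation path out of the body of a hot loop.

```cpp
#include <cassert>

// Simplified stand-in for BOOST_NOINLINE (assumption, not the Boost definition).
#if defined(_MSC_VER)
# define SKETCH_NOINLINE __declspec(noinline)
#elif defined(__GNUC__)
# define SKETCH_NOINLINE __attribute__((noinline))
#else
# define SKETCH_NOINLINE
#endif

// Ordinary inline function: callers on the fast path may inline it freely.
inline bool is_ascii_sketch(unsigned char c)
{
    return c < 0x80;
}

// Thin wrapper that asks the compiler not to inline the call. The result is
// identical; only code placement at the call site is expected to differ.
SKETCH_NOINLINE
inline bool is_ascii_sketch_no_inline(unsigned char c)
{
    return is_ascii_sketch(c);
}

// Small usage example.
inline void noinline_wrapper_demo()
{
    assert(is_ascii_sketch_no_inline('a'));
    assert(!is_ascii_sketch_no_inline(0xC3));
}
```
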
class utf8_sequence
{
char seq_[4];