IEP: | 0007 |
---|---|
Title: | ISCC-UNIT Content-Code Mixed |
Author: | Titusz Pan tp@iscc.foundation |
Comments: | #12 |
Status: | DRAFT |
Type: | Core |
License: | CC-BY-4.0 |
Created: | {{ git_creation_date_localized }} |
Updated: | {{ git_revision_date_localized }} |
!!! note
This document is a **DRAFT** contributed as input to
[ISO TC 46/SC 9/WG 18](https://www.iso.org/committee/48836.html). The final version is
developed at the International Organization for Standardization as
[ISO/DIS 24138](https://www.iso.org/standard/77899.html).
- The Content-Code Subtype Mixed (Mixed-Code) shall be a similarity preserving hash of a collection of assets of the same or different media types combined into a single multimedia file.
- An ISCC processor that supports the creation of Mixed-Codes shall publicly document the supported file formats and the rules by which it divides the different parts of a multimedia file.
- The Mixed-Code shall be robust against format conversions, scaling, compression, and minor edits of the individual parts of the multimedia file.
The Mixed-Code shall have the data format illustrated in Figure 9:
![Figure 9 - Data format of the Mixed-Code](images/iscc-iep-0007-f09-mixed-code.png)
Figure 9 - Data format of the Mixed-Code

!!! example "EXAMPLE 1: 64-bit Mixed-Code in its canonical form:"
ISCC:EQASD57JXX7U73P7
!!! example "EXAMPLE 2: 256-bit Mixed-Code in its canonical form:"
ISCC:EQDSD57JXX7U73P7HPPH2P3U5OXZM7PL65T3HZ5JZ76H577P77NO5ZY
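
The two examples above can be taken apart with a few lines of Python to see how header and body are laid out. This is a minimal sketch, not part of the reference implementation: the helper name `split_iscc` is hypothetical, and it assumes that the canonical form is unpadded Base32 and that the ISCC-HEADER occupies exactly two bytes made of four 4-bit fields (main type, subtype, version, length), which is defined outside this document and holds only for small field values such as the ones used here.

```python
import base64


def split_iscc(code):
    """Split a canonical ISCC string into header fields and body bytes.

    Assumption: 2-byte header with four 4-bit fields
    (main type, subtype, version, length).
    """
    text = code[5:] if code.startswith("ISCC:") else code
    # The canonical form omits Base32 padding; restore it for the decoder.
    text += "=" * (-len(text) % 8)
    raw = base64.b32decode(text)
    header, body = raw[:2], raw[2:]
    fields = (
        header[0] >> 4,    # main type
        header[0] & 0x0F,  # subtype
        header[1] >> 4,    # version
        header[1] & 0x0F,  # length field
    )
    return fields, body


for example in (
    "ISCC:EQASD57JXX7U73P7",
    "ISCC:EQDSD57JXX7U73P7HPPH2P3U5OXZM7PL65T3HZ5JZ76H577P77NO5ZY",
):
    (maintype, subtype, version, length), body = split_iscc(example)
    print(maintype, subtype, version, length, len(body) * 8)
```

Both examples share the same first header byte and differ only in the length field, while the body grows from 8 bytes (64 bits) to 32 bytes (256 bits).
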
- The input for calculating the Mixed-Code shall be the Content-Codes of the individual parts of the multimedia file.
- At least two Content-Codes shall be required as input to calculate a Mixed-Code.
Mixed-Code processing shall generate the following ISCC metadata output elements:
- iscc: the Mixed-Code in its canonical form (required);
- parts: the list of Content-Codes used for calculating the Mixed-Code (recommended);
- Additional metadata extracted from the multimedia file (optional).
An ISCC processor shall pre-process the multimedia file as follows:
- Generate individual Content-Codes for each part of the multimedia file according to the specifications in IEP-0003, IEP-0004, IEP-0005 and IEP-0006.
An ISCC processor shall calculate the Mixed-Code as follows:
1. Create a byte sequence from each Content-Code by concatenating the first byte of its ISCC-HEADER with the bytes of its ISCC-BODY.
2. Apply the similarity hash to the list of byte sequences from step 1 to calculate the ISCC-BODY of the Mixed-Code.
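
The two calculation steps can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the normative algorithm: the function names `simhash` and `mixed_code_body` are hypothetical, the inputs are assumed to be raw Content-Code bytes with a 2-byte ISCC-HEADER, each body is assumed to be truncated so that all byte sequences have the target length, and a bit is assumed to be set when at least half of the inputs have it set. The modules of the reference implementation remain authoritative for all of these details.

```python
def simhash(digests):
    """Bitwise majority vote over equal-length byte strings.

    Assumption: a bit is set in the result when at least half of the
    inputs have that bit set.
    """
    n_bytes = len(digests[0])
    n_bits = n_bytes * 8
    counts = [0] * n_bits
    for digest in digests:
        value = int.from_bytes(digest, "big")
        for i in range(n_bits):
            counts[i] += (value >> (n_bits - 1 - i)) & 1
    threshold = len(digests) / 2
    result = 0
    for i, count in enumerate(counts):
        if count >= threshold:
            result |= 1 << (n_bits - 1 - i)
    return result.to_bytes(n_bytes, "big")


def mixed_code_body(content_codes, bits=64):
    """Compute a Mixed-Code body from raw Content-Code bytes.

    Assumptions: each input is header + body with a 2-byte header, and
    each body is cut to (bits // 8 - 1) bytes so that, together with the
    first header byte, every sequence is bits // 8 bytes long.
    """
    if len(content_codes) < 2:
        raise ValueError("at least two Content-Codes are required")
    n_bytes = bits // 8
    sequences = []
    for raw in content_codes:
        first_header_byte, body = raw[:1], raw[2:]
        if len(body) < n_bytes - 1:
            raise ValueError("Content-Code body is too short")
        # Step 1: first header byte followed by the (truncated) body.
        sequences.append(first_header_byte + body[: n_bytes - 1])
    # Step 2: similarity hash over the list of byte sequences.
    return simhash(sequences)
```

Keeping the first header byte in each sequence means that the media type of every part contributes to the resulting hash, so two collections with similar bodies but different part types do not collapse to the same Mixed-Code body.
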
The normative behaviour of an ISCC processor in generating a Mixed-Code is specified only for Content-Code inputs. An implementation of the Mixed-Code algorithm shall be regarded as conforming to the standard if it creates the same Mixed-Code as the reference implementation for the same Content-Code inputs.
!!! note "NOTE"
For further technical details see the source code in the modules
[code_content_mixed.py](https://github.com/iscc/iscc-core/blob/main/iscc_core/code_content_mixed.py)
and [simhash.py](https://github.com/iscc/iscc-core/blob/main/iscc_core/simhash.py) of the
[reference implementation](https://github.com/iscc/iscc-core).