Aggregation Key bits order #1253

Open
Natalia-user opened this issue Apr 18, 2024 · 6 comments
Labels: aggregate, developer-input (Question/feedback raised by a developer and posted here on their behalf for public discussion)

Comments

@Natalia-user

Am I correct in assuming the following?
The source-side key piece occupies the 64 most significant bits of the 128-bit aggregation key,
and this integer is encoded as big-endian, so in the byte string these bits come first.
For example, if we have:
source part encoded in hex: 0123456789ABCDEF
trigger part encoded in hex: 0000000000000000
then the 128-bit integer representing the aggregation key will be:
0x0123456789ABCDEF0000000000000000
and the first bits of the resulting bytes of the aggregation key will be:
0000 0001 0010 0011 ....

@apasel422 added the aggregate and developer-input labels on Apr 18, 2024
@johnivdel
Contributor

Thanks for filing.

The aggregation key pieces all support a range of 128 bits. You can think of the source-side and trigger-side key pieces as 128-bit masks (which are parsed from hex-encoded strings).

In your example, if the registration looks like:

Source registration:

{
  ...
  "aggregation_keys": {
    "key1": "0x0123456789ABCDEF"
  }
}

Trigger registration:

{
  ... // existing fields, such as `event_trigger_data`

  "aggregatable_trigger_data": [
    {
      "key_piece": "0x0000000000000000",
      // Apply this key piece to:
      "source_keys": ["key1"]
    }
  ]
}

Both of these keys are interpreted as 128-bit keys and then OR'd together, so the resulting aggregation-key (in hex) would be:

0x00000000000000000123456789ABCDEF

To get the behavior of concatenating 64 bits from each side, you would need to shift your source-side key into the upper 64 bits:

0x0123456789ABCDEF0000000000000000

Then, let's say your trigger-side key is 0x0000000000000001; you would now get:

0x0123456789ABCDEF0000000000000001

The hex strings are written with the most significant bits first, reading left to right. Hex strings are only used for registration in the API. The aggregatable reports themselves encode this as a 128-bit integer in CBOR: https://wicg.github.io/attribution-reporting-api/#aggregatable-contribution-key
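The OR behavior described above can be sketched in Python. This is only an illustration under stated assumptions: `parse_key_piece` and `combine` are hypothetical helper names, not part of the API, and the sketch models only the "parse as 128-bit, then OR" step.

```python
def parse_key_piece(hex_str: str) -> int:
    """Parse a hex-encoded key piece such as "0x0123456789ABCDEF" (hypothetical helper)."""
    value = int(hex_str, 16)  # int() accepts the "0x" prefix when the base is 16
    if not 0 <= value < (1 << 128):
        raise ValueError("key piece does not fit in 128 bits")
    return value


def combine(source_piece: str, trigger_piece: str) -> int:
    """OR the two key pieces together as 128-bit values (hypothetical helper)."""
    return parse_key_piece(source_piece) | parse_key_piece(trigger_piece)


# Unshifted source piece: its bits land in the lower 64 bits of the key.
low = combine("0x0123456789ABCDEF", "0x0000000000000000")
print(f"0x{low:032X}")   # 0x00000000000000000123456789ABCDEF

# Source piece shifted into the upper 64 bits, trigger piece in the lower 64 bits.
shifted_source = f"0x{parse_key_piece('0x0123456789ABCDEF') << 64:X}"
high = combine(shifted_source, "0x0000000000000001")
print(f"0x{high:032X}")  # 0x0123456789ABCDEF0000000000000001
```

The shift has to be applied by whoever writes the registration; nothing in the OR step moves the source-side bits into the upper half.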

Let me know if any further clarification is needed (perhaps we can improve the explainer / specs to make this more clear)

@DRVTiny

DRVTiny commented May 1, 2024

@johnivdel, it seems to me that the first part of the explanation contradicts the second.

So... finally, which result is correct for source=0x0123456789ABCDEF and trigger=0x000000000000001:
0x00000000000000001123456789ABCDEF
or
0x0123456789ABCDEF0000000000000001
?

@linnan-github
Collaborator

Hi @DRVTiny, the key_piece in both the source and trigger registrations is a 128-bit value written in hex.

So source=0x0123456789ABCDEF is equivalent to 0x00000000000000000123456789ABCDEF, with 0s padded on the left.
Likewise, trigger=0x000000000000001 is equivalent to 0x00000000000000000000000000000001, also with 0s padded on the left.

Therefore the resulting key, which is the OR of those two, is 0x00000000000000000123456789ABCDEF. Effectively, the trigger-side key piece doesn't contribute, because the lowest bit of the source key piece is already set.
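To check the padding and the "doesn't contribute" observation by hand, here is a short Python sketch. `pad128` is a hypothetical helper name used only for this illustration.

```python
def pad128(hex_str: str) -> str:
    """Return a key piece as "0x" followed by exactly 32 hex digits (hypothetical helper)."""
    return f"0x{int(hex_str, 16):032X}"


source = pad128("0x0123456789ABCDEF")
trigger = pad128("0x000000000000001")
combined = int(source, 16) | int(trigger, 16)

print(source)                # 0x00000000000000000123456789ABCDEF
print(trigger)               # 0x00000000000000000000000000000001
print(f"0x{combined:032X}")  # 0x00000000000000000123456789ABCDEF

# The source piece ends in F (binary 1111), so its lowest bit is already set
# and OR-ing in a trigger piece of 1 changes nothing.
print(combined == int(source, 16))  # True
```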

@DRVTiny

DRVTiny commented May 1, 2024

@johnivdel, OK, so we have, say, the UInt128 number 0x4A000000000000000123456789ABCDEF

Which bits of this number will be written at byte 0 of the "bucket" field in domain.avro?
I assume it will be 0100 1010, because this UInt128 is written as big-endian?

In my real code I'm trying to generate domain.avro from domain.json, where "bucket" is a string consisting of generally non-printable bytes. So my problem is creating this "string" properly, because I don't know which bits must be placed at lower addresses and which at higher ones.

@DRVTiny

DRVTiny commented May 2, 2024

In short, the main question is:
if we have aggregation_key == "0x4A000000000000000123456789ABCDEF" (a hex string representing a uint128)

  • how do I encode it properly in the bucket field of domain.avro?
    Is it correct to place the following byte sequence there?
    adr: 00 01 ... 08 09
    val: 4A 00 ... 01 23
    bits: 01001010...
    What should be done with the nibbles within each byte?
    0x4A in big-endian is not 1010 0100; the nibbles must be swapped: 0100 1010
    So I am unsure how to encode this uint128 number and place it in domain.avro. Maybe you have a code example for this?

@linnan-github
Collaborator

@hostirosti Could you please confirm and comment more about domain.avro?

As for the encoding, the 128-bit integer is encoded as big endian. So in the example, the first byte is 4A = 01001010, the second byte is 00 = 00000000, and so on.
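As a rough illustration of the big-endian layout, the 16 bytes can be produced in Python with the built-in `int.to_bytes`. This sketch only shows the byte order; how the bytes are then written into the bucket field of domain.avro is not covered here and is still awaiting confirmation in this thread.

```python
key = 0x4A000000000000000123456789ABCDEF

# Big-endian: the most significant byte is placed at the lowest address (index 0).
bucket = key.to_bytes(16, byteorder="big")

print(bucket.hex(" ").upper())  # 4A 00 00 00 00 00 00 00 01 23 45 67 89 AB CD EF
print(f"{bucket[0]:08b}")       # 01001010
print(f"{bucket[8]:02X} {bucket[9]:02X}")  # 01 23

# Round trip: decoding the bytes as big-endian gives back the same integer.
assert int.from_bytes(bucket, byteorder="big") == key
```

Within a byte there is nothing to swap: 0x4A is simply the bit pattern 0100 1010, and byte order only determines the order of whole bytes.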

5 participants