1. WO2020112153 - OPTIMIZING LARGE SCALE DATA ANALYSIS

Note: Text based on automatic Optical Character Recognition processes. Please use the PDF version for legal matters

What is claimed is:

1. A computer-implemented method, comprising:

obtaining, by an object grouping system, data for a plurality of sketches, wherein each sketch is stored using a set of registers and is a sampling of objects in a dataset, each object in the dataset being a target object for at least one digital campaign;

for each sketch of the plurality of sketches:

generating, using an identifier for a first object in the dataset, a hashed parameter for the first object, wherein the hashed parameter has a binary representation;

determining, based on the binary representation of the hashed parameter, whether the hashed parameter for the first object contributes to describing demographic attributes of the sampling of objects in the sketch; and

in response to determining that the hashed parameter contributes to describing the demographic attributes, storing, at the object grouping system, demographic attributes of the first object at a respective register of a set of registers, wherein each register in the set of registers stores data for a respective object in the sketch; and

generating, by the object grouping system, a reporting output that indicates:

a number of objects in the dataset that were reached by the digital campaign; and

demographic attributes about the number of objects in the dataset that were reached by the digital campaign.

2. The method of claim 1, wherein each object represents a user and generating the reporting output comprises:

generating a reporting output that describes a number of unique users that were reached by a particular digital campaign and a distribution of one or more unique users, that were reached by the particular digital campaign, across respective demographic categories that are each defined by at least two demographic attributes.

3. The method of claim 2, wherein a respective demographic category is defined at least by:

a male gender or female gender of a unique user; and

an age range of the unique user.
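
As an illustration of the reporting output in claims 2 and 3, the following is a minimal sketch, not the claimed implementation. It assumes the sketch's registers already yield a list of sampled attribute records and that an estimate of the number of unique users is available from elsewhere. The claims do not state how the distribution is computed; scaling the sample shares by the estimated count, the `age_range` bucket boundaries, and the function names are all assumptions made for this example.

```python
from collections import Counter


def age_range(age: int) -> str:
    """Map an age to a bucket. The boundaries are hypothetical."""
    if age < 18:
        return "under-18"
    if age < 35:
        return "18-34"
    if age < 55:
        return "35-54"
    return "55+"


def demographic_distribution(sampled_attributes, estimated_unique_users):
    """Spread an estimated unique-user count across (gender, age range) categories.

    Each category is defined by two demographic attributes, as in claim 3.
    Assumption: the sampled records are representative, so each category
    receives a share of the estimate proportional to its share of the sample.
    """
    counts = Counter(
        (record["gender"], age_range(record["age"]))
        for record in sampled_attributes
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        category: estimated_unique_users * n / total
        for category, n in counts.items()
    }


# Four sampled users and an assumed estimate of 1000 unique users reached.
samples = [
    {"gender": "female", "age": 29},
    {"gender": "female", "age": 31},
    {"gender": "male", "age": 40},
    {"gender": "male", "age": 60},
]
distribution = demographic_distribution(samples, 1000)
```

With these four samples, half of the estimate falls in the ("female", "18-34") category and a quarter in each of the two male categories.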

4. The method of claim 1, wherein determining that the hashed parameter contributes to describing the demographic attributes comprises:

identifying a number of leading zeros of the hashed parameter, the number of leading zeros being identified from the binary representation of the hashed parameter; and

determining, based on the number of leading zeros of the hashed parameter, that the hashed parameter impacts an existing data value stored at the respective register of the set of registers.

5. The method of claim 4, wherein determining that the hashed parameter impacts an existing data value stored at the respective register comprises:

comparing the number of leading zeros in the hashed parameter for the first object to a number of leading zeros in the existing data value stored at the respective register; and

based on the comparing, determining that the existing data value stored at the respective register has fewer leading zeros than the number of leading zeros in the hashed parameter.
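
The comparison recited in claims 4 and 5 can be sketched as follows. This is an illustrative sketch only: the 16-bit width, the function names, and the treatment of an empty register are assumptions that do not appear in the claims.

```python
from typing import Optional

WIDTH = 16  # assumed bit width of the hashed parameter, kept small for readability


def count_leading_zeros(value: int, width: int = WIDTH) -> int:
    """Count the zeros before the first 1 in the fixed-width binary representation."""
    bits = format(value, f"0{width}b")
    return len(bits) - len(bits.lstrip("0"))


def impacts_register(new_hash: int, existing_value: Optional[int]) -> bool:
    """Return True when the stored value has fewer leading zeros than the new hash."""
    if existing_value is None:
        # Assumption: an empty register is always filled by the first hash seen.
        return True
    return count_leading_zeros(existing_value) < count_leading_zeros(new_hash)


new_hash = 0b0000010100000000   # five leading zeros
existing = 0b0010000000000000   # two leading zeros
should_replace = impacts_register(new_hash, existing)
```

Here the existing value has two leading zeros and the new hash has five, so the new hash would impact the register. Reversing the arguments gives the opposite result.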

6. The method of claim 1, wherein the hashed parameter comprises at least one of:

a hash of the identifier for the first object; or

a byte hash for the first object that is based on the identifier for the first object.

7. The method of claim 6, wherein the hashed parameter for the first object contributes to describing the demographic attributes of the sampling of objects in the sketch when:

a number of leading zeros in the binary representation of the hashed parameter exceeds a number of leading zeros in an existing data value stored at the respective register of the set of registers.

8. The method of claim 6, wherein the hashed parameter contributes to describing the demographic attributes of the sampling of objects in the sketch when:

a value of the byte hash for the first object is larger than a value of an existing byte hash stored at the respective register of the set of registers.
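
The byte-hash rule of claim 8 can be illustrated in the same spirit. The claims do not define how a byte hash is derived, so this sketch assumes it is a single byte (0 to 255) taken from a SHA-256 digest of the identifier. That derivation, the tuple layout of the register, and the function names are hypothetical.

```python
import hashlib


def byte_hash(identifier: str) -> int:
    """Assumed definition: the last byte of the SHA-256 digest of the identifier."""
    return hashlib.sha256(identifier.encode("utf-8")).digest()[-1]


def update_register(register, identifier, attributes):
    """Return the register contents after offering one object to it.

    `register` is either None (empty) or a (byte_hash, attributes) tuple.
    The stored entry is overwritten only when the new byte hash is strictly
    larger than the stored one; otherwise the existing entry is kept.
    """
    candidate = byte_hash(identifier)
    if register is None or candidate > register[0]:
        return (candidate, dict(attributes))
    return register


first = update_register(None, "user-a", {"age": 30, "gender": "female"})
# Offering the same identifier again yields an equal byte hash, not a larger one.
second = update_register(first, "user-a", {"age": 99, "gender": "male"})
```

Because the rule uses a strict comparison, re-offering an identifier that is already stored leaves the register unchanged.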

9. The method of claim 6, wherein storing the demographic attributes for the first object at the respective register comprises one or more of:

overwriting existing data stored at the respective register of the set of registers;

storing the hash of the identifier for the first object; and

storing the byte hash for the first object.

10. The method of claim 9, wherein the demographic attributes for the first object comprise one or more of:

age of a user represented by the first object;

gender of a user represented by the first object;

geographic location of a user represented by the first object; or

a real-valued quantity associated with a user represented by the first object.

11. The method of claim 1, wherein generating the hashed parameter comprises at least one of:

generating, using a hashing and demographics module of the object grouping system, a hash of the identifier for the object; or

generating, using a hashing and demographics module of the object grouping system, a byte hash based on the identifier for the object.

12. The method of claim 1, further comprising:

generating, using a hashing and demographics module of the object grouping system, a notification that includes the reporting output, wherein the notification is generated in real-time and indicates demographic attributes about a number of objects that were reached by at least two distinct digital campaigns.

13. A system comprising:

one or more processing devices; and

one or more non-transitory machine-readable storage devices storing instructions that are executable by the one or more processing devices to cause performance of operations comprising:

obtaining, by an object grouping system, data for a plurality of sketches, wherein each sketch is stored using a set of registers and is a sampling of objects in a dataset, each object in the dataset being a target object for at least one digital campaign;

for each sketch of the plurality of sketches:

generating, using an identifier for a first object in the dataset, a hashed parameter for the first object, wherein the hashed parameter has a binary representation;

determining, based on the binary representation of the hashed parameter, whether the hashed parameter for the first object contributes to describing demographic attributes of the sampling of objects in the sketch; and

in response to determining that the hashed parameter contributes to describing the demographic attributes, storing, at the object grouping system, demographic attributes of the first object at a respective register of a set of registers, wherein each register in the set of registers stores data for a respective object in the sketch; and

generating, by the object grouping system, a reporting output that indicates:

a number of objects in the dataset that were reached by the digital campaign; and

demographic attributes about the number of objects in the dataset that were reached by the digital campaign.

14. The system of claim 13, wherein each object represents a user and generating the reporting output comprises:

generating a reporting output that describes a number of unique users that were reached by a particular digital campaign and a distribution of one or more unique users, that were reached by the particular digital campaign, across respective demographic categories that are each defined by at least two demographic attributes.

15. The system of claim 13, wherein determining that the hashed parameter contributes to describing the demographic attributes comprises:

identifying a number of leading zeros of the hashed parameter, the number of leading zeros being identified from the binary representation of the hashed parameter; and

determining, based on the number of leading zeros of the hashed parameter, that the hashed parameter impacts an existing data value stored at the respective register of the set of registers.

16. The system of claim 15, wherein determining that the hashed parameter impacts an existing data value stored at the respective register comprises:

comparing the number of leading zeros in the hashed parameter for the first object to a number of leading zeros in the existing data value stored at the respective register; and

based on the comparing, determining that the existing data value stored at the respective register has fewer leading zeros than the number of leading zeros in the hashed parameter.

17. The system of claim 13, wherein the hashed parameter comprises at least one of:

a hash of the identifier for the first object; or

a byte hash for the first object that is based on the identifier for the first object.

18. The system of claim 17, wherein the hashed parameter for the first object contributes to describing the demographic attributes of the sampling of objects in the sketch when:

a number of leading zeros in the binary representation of the hashed parameter exceeds a number of leading zeros in an existing data value stored at the respective register of the set of registers.

19. The system of claim 17, wherein the hashed parameter contributes to describing the demographic attributes of the sampling of objects in the sketch when:

a value of the byte hash for the first object is larger than a value of an existing byte hash stored at the respective register of the set of registers.

20. The system of claim 17, wherein storing the demographic attributes for the first object at the respective register comprises one or more of:

overwriting existing data stored at the respective register of the set of registers;

storing the hash of the identifier for the first object; and

storing the byte hash for the first object.

21. The system of claim 20, wherein the demographic attributes for the first object comprise one or more of:

age of a user represented by the first object;

gender of a user represented by the first object;

geographic location of a user represented by the first object; or

a real-valued quantity associated with a user represented by the first object.

22. The system of claim 13, wherein the operations further comprise:

generating, using a hashing and demographics module of the object grouping system, a notification that includes the reporting output, wherein the notification is generated in real-time and indicates demographic attributes about a number of objects that were reached by at least two distinct digital campaigns.

23. One or more non-transitory machine-readable storage devices storing instructions that are executable by one or more processing devices to cause performance of operations comprising:

obtaining, by an object grouping system, data for a plurality of sketches, wherein each sketch is stored using a set of registers and is a sampling of objects in a dataset, each object in the dataset being a target object for at least one digital campaign;

for each sketch of the plurality of sketches:

generating, using an identifier for a first object in the dataset, a hashed parameter for the first object, wherein the hashed parameter has a binary representation;

determining, based on the binary representation of the hashed parameter, whether the hashed parameter for the first object contributes to describing demographic attributes of the sampling of objects in the sketch; and

in response to determining that the hashed parameter contributes to describing the demographic attributes, storing, at the object grouping system, demographic attributes of the first object at a respective register of a set of registers, wherein each register in the set of registers stores data for a respective object in the sketch; and

generating, by the object grouping system, a reporting output that indicates:

a number of objects in the dataset that were reached by the digital campaign; and

demographic attributes about the number of objects in the dataset that were reached by the digital campaign.