(WO2017040209) DATA PREPARATION FOR DATA MINING

Pub. No.: WO/2017/040209
International Application No.: PCT/US2016/048721
Publication Date: Fri Mar 10 00:59:59 CET 2017
International Filing Date: Fri Aug 26 01:59:59 CEST 2016
IPC: G06F 7/16; G06F 17/30; G06F 3/06
Applicants: BLOOMREACH, INC.
Inventors: PAN, Rong; YU, Yue
Title: DATA PREPARATION FOR DATA MINING
Abstract:
A system for preparing data for data mining can be utilized to automate translation of raw data to denormalized high-dimensional data in a format of vectors by processing the raw data in a computer cluster processing system. In embodiments, a system for preparing data for data mining includes a data assemble definition interface, a data assemble plan generator, a data assemble plan compiler, a cluster execution module, and a data warehouse module. A user may input a data schema that specifies the raw data input, feature extraction or data translate method, output attributes, and output layer attributes. Embodiments of the present disclosure can interpret the data schema, plan a large data processing work flow for a computer cluster, execute the computer cluster process, and output the data in the format specified by the user in the data schema.
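
Illustrative sketch (not part of the patent record): the flow the abstract describes, from a user-supplied schema to a processing plan to denormalized vectors, might look roughly like the Python below. Everything in it is an assumption made for illustration: the schema keys, the names SCHEMA, METHODS, generate_plan and execute_plan, the sample tables, and the choice to fill missing values with 0.0. It runs in a single process as a stand-in for a computer cluster job and does not represent the claimed implementation.

```python
from collections import defaultdict

# Hypothetical data schema. All key names are assumptions for illustration;
# the abstract only says a schema names the raw input, the extraction
# method, and the output attributes.
SCHEMA = {
    "key": "user_id",
    "features": [
        {"name": "age", "source": "users", "column": "age", "method": "last"},
        {"name": "order_count", "source": "orders", "column": "amount", "method": "count"},
        {"name": "total_spend", "source": "orders", "column": "amount", "method": "sum"},
    ],
    "output": ["age", "order_count", "total_spend"],
}

# Feature extraction methods: each reduces a list of raw values to one number.
METHODS = {
    "last": lambda values: float(values[-1]),
    "count": lambda values: float(len(values)),
    "sum": lambda values: float(sum(values)),
}


def generate_plan(schema):
    """Turn the schema into ordered (feature, source, column, method) steps."""
    by_name = {f["name"]: f for f in schema["features"]}
    return [
        (name, by_name[name]["source"], by_name[name]["column"], by_name[name]["method"])
        for name in schema["output"]
    ]


def execute_plan(plan, key, raw_tables):
    """Run the plan in-process (a stand-in for a cluster job).

    Returns a dict mapping each key value to one denormalized vector whose
    positions follow the order of the plan steps.
    """
    keys = sorted({row[key] for rows in raw_tables.values() for row in rows})
    vectors = {k: [] for k in keys}
    for _name, source, column, method in plan:
        # Group the raw column values of this source table by key.
        grouped = defaultdict(list)
        for row in raw_tables[source]:
            grouped[row[key]].append(row[column])
        for k in keys:
            values = grouped.get(k, [])
            # Assumption: a key with no rows in this source gets 0.0.
            vectors[k].append(METHODS[method](values) if values else 0.0)
    return vectors


# Made-up normalized raw data: one row per user, many rows per order.
RAW = {
    "users": [{"user_id": "u1", "age": 34}, {"user_id": "u2", "age": 27}],
    "orders": [
        {"user_id": "u1", "amount": 20.0},
        {"user_id": "u1", "amount": 5.5},
    ],
}

vectors = execute_plan(generate_plan(SCHEMA), SCHEMA["key"], RAW)
print(vectors)  # {'u1': [34.0, 2.0, 25.5], 'u2': [27.0, 0.0, 0.0]}
```

Keeping the plan as plain data, separate from the code that executes it, mirrors the abstract's split between planning a workflow and running it; in this sketch only execute_plan would need to change to target a real cluster framework.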