2.2. KDD techniques supported
- Document Summarization – Generates summaries of documents or large bodies of text. It condenses the data and returns the most relevant abstract that best represents it. You can configure the output length (minimum and maximum number of words) and the summarization mode: one summary per page or one summary for the entire document.
- Entity Recognition and Relation – Entity Recognition uses deep learning to detect and classify named entities. Entity Relation identification uses machine learning algorithms to find the relationship between any two entities.
- Document Clustering – Document Clustering extracts the data from text documents and looks for intent and information hidden in the text. The extracted content is represented in a spatial dimension, and each document is clustered according to that knowledge. You can view word-level deep clustering graphs and analyze a large data repository with ease.
- Batch Process – Batch Process is a pipeline with built-in plugins for the various KDD functions: Document Summarization, Named Entity Recognition, Entity Relationship Identification, Optical Character Recognition, and Document Clustering. It integrates all of these services in one pipeline, where you can configure and apply KDD functions to your data.
- OCR – The Optical Character Recognition pipeline lets you extract data and convert it into various formats. It includes functions for recognizing and extracting tabular and text data from an image or PDF document.
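
To make the configuration ideas above concrete, here is a minimal sketch of how summarization settings (minimum and maximum words, page or document mode) and a batch pipeline of named steps could be modeled. Every name in it (`SummarizationConfig`, `BatchPipeline`, `placeholder_summarizer`) is hypothetical and is not taken from the product's API, and the placeholder step only truncates text; it does not perform real summarization.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names here are hypothetical illustrations, not the product's API.
VALID_MODES = {"page", "document"}


@dataclass
class SummarizationConfig:
    min_words: int = 50
    max_words: int = 200
    mode: str = "document"  # "page": one summary per page; "document": one overall

    def __post_init__(self) -> None:
        # Reject impossible word ranges and unknown modes up front.
        if self.min_words < 1 or self.min_words > self.max_words:
            raise ValueError("require 1 <= min_words <= max_words")
        if self.mode not in VALID_MODES:
            raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")


Step = Callable[[str], object]


@dataclass
class BatchPipeline:
    steps: Dict[str, Step] = field(default_factory=dict)

    def add_step(self, name: str, step: Step) -> "BatchPipeline":
        self.steps[name] = step
        return self  # allow chained configuration

    def run(self, documents: List[str]) -> List[Dict[str, object]]:
        # Apply every configured step to every document, keyed by step name.
        return [
            {name: step(doc) for name, step in self.steps.items()}
            for doc in documents
        ]


def placeholder_summarizer(config: SummarizationConfig) -> Step:
    # Stand-in only: keeps the first max_words words instead of summarizing.
    def summarize(text: str) -> str:
        return " ".join(text.split()[: config.max_words])

    return summarize
```

A pipeline built this way is configured once and then applied to a batch of documents, for example `BatchPipeline().add_step("summary", placeholder_summarizer(SummarizationConfig(max_words=100))).run(docs)`, where `docs` is a list of strings. Validating the word range when the configuration is created means a bad setting fails before any document is processed.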