
Hadoop Connector

How to import data into the Hadoop Distributed File System?


Because the Hadoop Distributed File System (HDFS) is designed to store very large data sets and stream them at high speed to user applications, it is well suited to big data processing and is steadily gaining popularity. Importing data, content, and documents from any source into Hadoop has never been easier. With Xillio's import connector, data is imported from our Unified Data Model into HDFS in a uniform manner and without loss of quality.

Unified Data Model
The basis of our Hadoop import connector is the Unified Data Model (UDM), in which we store data between extraction and import. Besides the UDM and its supporting database (MongoDB), this connector uses the REST interface of the HDFS server.
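
As a rough illustration of what the REST-based import looks like at the file level, the sketch below pushes a single staged document into HDFS over the standard WebHDFS CREATE call. This is not the connector's own code: the host, port, and paths are placeholders, and in practice the import is driven by the UDM and Xill robots.

```python
# Minimal sketch (not the actual connector code): uploading a local document
# to HDFS over the WebHDFS REST interface using the standard two-step CREATE.
# Host, port, and paths are placeholders.
import requests

NAMENODE = "http://namenode.example.com:9870"  # assumed WebHDFS endpoint
LOCAL_FILE = "/tmp/export/document.pdf"        # document staged on the local file system
HDFS_PATH = "/data/import/document.pdf"        # target location in HDFS

# Step 1: ask the NameNode where to write; it answers with a 307 redirect
# that points at a DataNode.
create_url = f"{NAMENODE}/webhdfs/v1{HDFS_PATH}?op=CREATE&overwrite=true"
resp = requests.put(create_url, allow_redirects=False)
datanode_url = resp.headers["Location"]

# Step 2: stream the file content to the DataNode location.
with open(LOCAL_FILE, "rb") as fh:
    requests.put(datanode_url, data=fh)
```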

The metadata of the documents to import resides in the UDM, while the documents themselves are stored locally on the file system. We first convert the metadata of the extracted documents to Hadoop's format, after which the import runs automatically and flawlessly.
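
The metadata step could look roughly like the sketch below: a document record is read from the UDM's MongoDB store and a few of its fields are attached to the imported file as HDFS extended attributes via WebHDFS. The collection name, field names, and mapping are illustrative assumptions, not the connector's actual schema.

```python
# Illustrative sketch of the metadata step: read a document record from the
# UDM (MongoDB) and attach selected fields to the imported HDFS file as
# user-defined extended attributes. Collection and field names are assumptions.
from pymongo import MongoClient
import requests

NAMENODE = "http://namenode.example.com:9870"   # assumed WebHDFS endpoint
HDFS_PATH = "/data/import/document.pdf"         # file created in the previous step

client = MongoClient("mongodb://localhost:27017")
udm = client["udm"]["documents"]                          # hypothetical UDM collection
record = udm.find_one({"source.name": "document.pdf"})    # hypothetical field layout

# Map a few UDM metadata fields onto HDFS extended attributes.
for key in ("title", "author", "created"):
    value = record["source"].get(key)
    if value is None:
        continue
    requests.put(
        f"{NAMENODE}/webhdfs/v1{HDFS_PATH}",
        params={
            "op": "SETXATTR",
            "xattr.name": f"user.{key}",
            "xattr.value": f'"{value}"',  # WebHDFS expects a quoted (or hex/base64) value
            "flag": "CREATE",
        },
    )
```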

Video
Watch the demo video to see an example of an import into HDFS using our Content ETL Platform and Xill robots.

Free demo

Fill out the form and we will contact you as soon as possible.