Difference between revisions of "HDFSRemoteManager"

From GreenVulcano Wiki

Revision as of 21:45, 1 March 2015

Description

Use this element to encapsulate the parameters required to connect to a Hadoop Distributed File System (HDFS).

HDFS is a Java-based distributed file system that provides scalable, reliable data storage and is designed to span large clusters of commodity servers. It allows big data to be stored and processed in a distributed environment, scaling from a single server to thousands of machines, each offering local computation and storage.

HDFS follows a master-slave architecture and has the following elements:

- Namenode: acts as the "master" server. It manages the file system namespace and regulates clients' access to files stored in the datanodes.
- Datanode: stores the HDFS data, which is replicated across the nodes of the cluster. Datanodes perform read-write operations on the file system and carry out block creation, deletion, and replication according to the namenode's instructions.
- Secondary Namenode: a separate service that keeps a copy of both the edit log and the filesystem image, merging them periodically to keep the edit log at a reasonable size.

VulCon / GV Console Configuration

If the target directory is on an HDFS file system, you can define the connection parameters.

The HDFSRemoteManager element is used by RemoteFileSystemMonitor and remotemanager-call. It has the following attributes:

Attribute | Type | Description
type | fixed | This attribute must have the value remote-manager.
class | fixed | This attribute must have the value it.greenvulcano.util.remotefs.hdfs.HDFSRemoteManager.
connectionURL | required | Server host name and port used to contact HDFS. The entire URI is passed to the FileSystem instance's initialize method. Example: hdfs://[IP_NAME_NODE]:[PORT_NAME_NODE]. [1]
username | required | User name used to access the HDFS file system.
password | required | User password. #Encrypted
autoConnect | optional | If true, the instance connects and disconnects automatically at each method invocation. Default: false.
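
As a rough illustration of how the attributes above fit together, the fragment below sketches one possible HDFSRemoteManager element. The element name is assumed to match the page title, and the host name, port, user name, and password are placeholders rather than values from this page. The parent element is omitted because it depends on whether the manager is used by RemoteFileSystemMonitor or by remotemanager-call.

```xml
<!-- Hypothetical example: replace the host, port, and credentials with your own. -->
<!-- type and class are fixed values; autoConnect is optional and defaults to false. -->
<HDFSRemoteManager type="remote-manager"
                   class="it.greenvulcano.util.remotefs.hdfs.HDFSRemoteManager"
                   connectionURL="hdfs://namenode.example.com:8020"
                   username="hdfsuser"
                   password="changeit"
                   autoConnect="true"/>
```

With autoConnect set to true, as in this sketch, the instance would connect and disconnect at each method invocation. Leaving the attribute out keeps the default of false.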