Difference between revisions of "HDFSRemoteManager"

From GreenVulcano Wiki

Latest revision as of 22:00, 1 March 2015

Description

Use this element to encapsulate the parameters required to connect to a Hadoop Distributed File System (HDFS) [http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html].

HDFS is a Java-based distributed file system that provides scalable, reliable data storage and is designed to span large clusters of commodity servers. It allows big data to be stored and processed in a distributed environment across clusters of computers, and it scales from a single server to thousands of machines, each offering local computation and storage.

HDFS follows a master-slave architecture with the following elements:

  • Namenode: acts as the "master" server. It manages the file system namespace and regulates client access to the files stored on the datanodes.
  • Datanode: stores the HDFS data, which is replicated across the nodes of the cluster. Datanodes serve read and write requests and perform block creation, deletion, and replication as instructed by the namenode.
  • Secondary Namenode: a separate service that keeps a copy of both the edit log and the file system image and merges them periodically to keep the edit log at a reasonable size.
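
The division of labour between namenode and datanodes can be illustrated with a toy model. This is only a sketch for intuition and not the HDFS API: the class names, the round-robin placement, the 4-byte block size and the replication factor of 2 are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Stores block contents only; it knows nothing about file names."""
    name: str
    blocks: dict = field(default_factory=dict)      # block_id -> bytes


@dataclass
class NameNode:
    """Holds metadata only: which blocks form a file and where replicas live."""
    datanodes: list
    replication: int = 2                            # illustrative value
    namespace: dict = field(default_factory=dict)   # path -> [block_id, ...]
    locations: dict = field(default_factory=dict)   # block_id -> [DataNode, ...]

    def allocate_block(self, path):
        """Register a new block for `path` and choose the datanodes for it."""
        block_id = f"{path}#{len(self.namespace.setdefault(path, []))}"
        # Round-robin placement (an assumption; real placement is rack-aware).
        start = len(self.locations) % len(self.datanodes)
        targets = [self.datanodes[(start + i) % len(self.datanodes)]
                   for i in range(self.replication)]
        self.namespace[path].append(block_id)
        self.locations[block_id] = targets
        return block_id, targets


def put(namenode, path, data, block_size=4):
    """Client write: ask the namenode where each block goes, then store it."""
    for offset in range(0, len(data), block_size):
        block_id, targets = namenode.allocate_block(path)
        for node in targets:
            node.blocks[block_id] = data[offset:offset + block_size]


def get(namenode, path):
    """Client read: look up the block list, then fetch one replica of each."""
    return b"".join(namenode.locations[block_id][0].blocks[block_id]
                    for block_id in namenode.namespace[path])
```

With three `DataNode` instances and `put(nn, "/tmp/a.txt", b"hello hdfs")`, the namenode records three blocks for the path, each datanode ends up holding two of the six replicas, and `get` reassembles the original bytes from metadata held by the namenode alone.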

HDFSRemoteManager makes it possible to execute the remotemanager-call on an HDFS file system.

VulCon / GV Console Configuration

If the target directory is on an HDFS file system, use this element to define the connection parameters.

The HDFSRemoteManager element is used by RemoteFileSystemMonitor and remotemanager-call. It has the following attributes:

Attribute | Type | Description
type | fixed | Must have the value remote-manager.
class | fixed | Must have the value it.greenvulcano.util.remotefs.hdfs.HDFSRemoteManager.
connectionURL | required | Host name and port of the server used to contact HDFS. The entire URI is passed to the initialize method of the FileSystem instance. Example: hdfs://[IP_NAME_NODE]:[PORT_NAME_NODE] [http://wiki.apache.org/hadoop/NameNode]
username | required | User name used to access the HDFS file system.
password | required | User password. #Encrypted
autoConnect | optional | If true, the instance connects and disconnects automatically at each method invocation. Default: false.
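
As an illustration of the attributes above, the following sketch shows what such an element might look like and checks it against the rules in the table. The host name, port 8020, user name and password are placeholders, and the helper `check_hdfs_manager` is hypothetical; it is not part of GreenVulcano. Where the element sits inside the surrounding configuration is not shown here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical fragment built from the attribute table; all values that are
# not marked "fixed" in the table are placeholders.
CONFIG = """
<HDFSRemoteManager type="remote-manager"
                   class="it.greenvulcano.util.remotefs.hdfs.HDFSRemoteManager"
                   connectionURL="hdfs://namenode.example.com:8020"
                   username="gvuser"
                   password="changeit"
                   autoConnect="true"/>
"""

FIXED = {
    "type": "remote-manager",
    "class": "it.greenvulcano.util.remotefs.hdfs.HDFSRemoteManager",
}
REQUIRED = ("connectionURL", "username", "password")


def _looks_like_hdfs_url(value):
    """True if value has the form hdfs://host:port."""
    try:
        url = urlparse(value)
        return url.scheme == "hdfs" and bool(url.hostname) and url.port is not None
    except ValueError:  # raised for a malformed host or port
        return False


def check_hdfs_manager(xml_text):
    """Return a list of problems found in an HDFSRemoteManager element."""
    elem = ET.fromstring(xml_text.strip())
    problems = []
    for name, expected in FIXED.items():
        if elem.get(name) != expected:
            problems.append(f"{name} must be '{expected}'")
    for name in REQUIRED:
        if not elem.get(name):
            problems.append(f"missing required attribute {name}")
    url = elem.get("connectionURL")
    if url and not _looks_like_hdfs_url(url):
        problems.append("connectionURL should look like hdfs://host:port")
    # autoConnect is optional and defaults to false.
    if elem.get("autoConnect", "false") not in ("true", "false"):
        problems.append("autoConnect must be true or false")
    return problems
```

Calling `check_hdfs_manager(CONFIG)` returns an empty list for the fragment above; removing a required attribute or using a non-hdfs URL adds a message to the list.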