HDFSRemoteManager
Description
Use this element to encapsulate the parameters required to connect to a Hadoop Distributed File System (HDFS, see http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html). HDFS is a Java-based distributed file system that provides scalable and reliable data storage, designed to span large clusters of commodity servers. It allows big data to be stored and processed in a distributed environment across clusters of computers, scaling from a single server to thousands of machines, each offering local computation and storage. HDFS follows a master-slave architecture with the following elements:
- Namenode: acts as the "master" server. It manages the file system namespace and regulates clients' access to the files stored in the datanodes.
- Datanode: stores the actual data, replicated across the nodes of the cluster. Datanodes serve read and write requests on the file system and perform operations such as block creation, deletion, and replication according to the instructions of the namenode.
- Secondary Namenode: a separate service that keeps a copy of both the edit logs and the filesystem image, merging them periodically to keep their size reasonable.
HDFSRemoteManager allows the remotemanager-call to be executed on the HDFS file system.
VulCon / GV Console Configuration
If the target directory is on an HDFS file system, you can define the connection parameters here.
The HDFSRemoteManager element is used by RemoteFileSystemMonitor and remotemanager-call. It has the following attributes:
| Attribute | Type | Description |
|---|---|---|
| type | fixed | This attribute must assume the value remote-manager. |
| class | fixed | This attribute must assume the value it.greenvulcano.util.remotefs.hdfs.HDFSRemoteManager. |
| connectionURL | required | Server host name and port used to contact HDFS. The entire URI is passed to the FileSystem instance's initialize method. Example: hdfs://[IP_NAME_NODE]:[PORT_NAME_NODE]. |
| username | required | User name used to access the HDFS file system. |
| password | required | User password. #Encrypted |
| autoConnect | optional | If true, the instance automatically connects and disconnects at each method invocation. Default: false. |
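The following is a minimal configuration sketch of the HDFSRemoteManager element, using only the attributes documented above. The Namenode host namenode.example.com, the port 8020, and the credentials are placeholders assumed for illustration, and the parent element that contains it (a RemoteFileSystemMonitor or a remotemanager-call) is omitted; adapt the values to your environment.

```xml
<!-- Illustrative sketch only: host, port and credentials are placeholders. -->
<HDFSRemoteManager type="remote-manager"
                   class="it.greenvulcano.util.remotefs.hdfs.HDFSRemoteManager"
                   connectionURL="hdfs://namenode.example.com:8020"
                   username="gvhdfs"
                   password="XXXXXXXX"
                   autoConnect="false"/>
```

With autoConnect="false" (the default), the connection is not opened and closed at every method invocation; it is presumably managed by the component that owns the HDFSRemoteManager instance.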