This is a native Go client for HDFS. It connects directly to the namenode using
the protocol buffers API.
It tries to be idiomatic by aping the stdlib os package, where possible, and
implements the interfaces from it, including os.FileInfo and os.PathError.
Here's what it looks like in action:
client, _ := hdfs.New("namenode:8020")
file, _ := client.Open("/mobydick.txt")
buf := make([]byte, 59)
file.ReadAt(buf, 48847)
fmt.Println(string(buf))
// => Abominable are the tumblers into which he pours his poison.
Along with the library, this repo contains a commandline client for HDFS. Like
the library, its primary aim is to be idiomatic, by enabling your favorite unix
verbs:
$ hdfs --help
Usage: hdfs COMMAND
The flags available are a subset of the POSIX ones, but should behave similarly.
Valid commands:
ls [-lah] [FILE]...
rm [-rf] FILE...
mv [-fT] SOURCE... DEST
mkdir [-p] FILE...
touch [-amc] FILE...
chmod [-R] OCTAL-MODE FILE...
chown [-R] OWNER[:GROUP] FILE...
cat SOURCE...
head [-n LINES | -c BYTES] SOURCE...
tail [-n LINES | -c BYTES] SOURCE...
du [-sh] FILE...
checksum FILE...
get SOURCE [DEST]
getmerge SOURCE DEST
put SOURCE DEST
Since it doesn't have to wait for the JVM to start up, it's also a lot faster
than hadoop fs:
$ time hadoop fs -ls / > /dev/null
real 0m2.218s
user 0m2.500s
sys 0m0.376s
$ time hdfs ls / > /dev/null
real 0m0.015s
user 0m0.004s
sys 0m0.004s
Best of all, it comes with bash tab completion for paths!
Installing the commandline client
Grab a tarball from the releases page
and extract it wherever you like.
To configure the client, make sure HADOOP_HOME or HADOOP_CONF_DIR (or both)
points to your Hadoop configuration (core-site.xml and hdfs-site.xml). On
systems with Hadoop installed, they should already be set.
By default on non-kerberized clusters, the HDFS user is set to the
currently-logged-in user. You can override this with another environment
variable:
$ export HADOOP_USER_NAME=username
Using the commandline client with Kerberos authentication
Like hadoop fs, the commandline client expects a ccache file in the default
location: /tmp/krb5cc_<uid>. That means running kinit beforehand should 'just work'.