public class ICMPTrainRecordReader
extends java.lang.Object
This reader parses and prints out the contents of a binary icmptrain output file, which may be compressed. These files record data from Internet censuses and surveys.
The following fields are available:
| Name | Length | Comment |
|---|---|---|
| type | 1 byte | ICMP type field |
| code | 1 byte | ICMP code field |
| typeandcode | 2 bytes | ICMP type and code concatenated |
| typecode | 1 byte | ICMP type and code fields packed together (legacy) |
| ttl | 1 byte | Remaining time to live of the ICMP response |
| time | 4 bytes | Time the probe was sent, in seconds since the Epoch |
| rtt | 4 bytes | Round-trip time in microseconds (zero if there was no reply) |
| flags | 1 byte | Format flags |
| probeaddr | 4 bytes | Probed IP address (or zero if it did not match) |
| replyaddr | 4 bytes | IP address of the responder (or zero if there was none) |
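
The table gives field widths but not byte order or encoding, so the following is only a minimal sketch of how some of these values could be interpreted. It assumes that typeandcode places the type in the high byte, that addresses are held in network byte order, and that rtt is handled as a plain microsecond count. The class `IcmpFieldSketch` and its methods are hypothetical and are not part of this package.

```java
// Hypothetical helpers; not part of the icmptrain package.
final class IcmpFieldSketch {
    private IcmpFieldSketch() {}

    /** Concatenates the 1-byte type and code into one 2-byte value (type in the high byte: an assumption). */
    static int typeAndCode(int type, int code) {
        return ((type & 0xFF) << 8) | (code & 0xFF);
    }

    /** Formats a 4-byte IPv4 address stored in an int (network byte order assumed) as a dotted quad. */
    static String dottedQuad(int addr) {
        return ((addr >>> 24) & 0xFF) + "."
             + ((addr >>> 16) & 0xFF) + "."
             + ((addr >>> 8) & 0xFF) + "."
             + (addr & 0xFF);
    }

    /** Converts a round-trip time in microseconds to milliseconds. */
    static double rttMillis(long rttMicros) {
        return rttMicros / 1000.0;
    }
}
```

For example, under these assumptions an ICMP type of 3 with code 1 yields a typeandcode of 0x0301, and the value 0xC0A80001 formats as 192.168.0.1.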
Examples of selecting key and value fields:
-D stream.recordreader.icmptrain.keys=probeaddr
-D stream.recordreader.icmptrain.vals=type_code_typecode
-D stream.recordreader.icmptrain.vals=type,code,typeandcode
As mentioned earlier, the input to this reader can be a compressed stream.
A few codecs can split the input file into chunks, while most process the whole file as one unit.
If a codec can handle a split file, splitting happens transparently. This behavior
can be overridden by setting the option mapred.input.codec.nosplitting to true,
as follows:
-D mapred.input.codec.nosplitting=true
As noted earlier, ICMP data is binary, so whenever an ICMP file is split, this
reader must align itself with an ICMP record boundary. It uses a
few heuristics to do so. The strength of these heuristics can be controlled with
the option stream.recordreader.icmptrain.lookaheadcount, as follows:
-D stream.recordreader.icmptrain.lookaheadcount=100
In this case the reader uses 100 ICMP records to verify that its alignment
choice is correct.
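
The heuristics themselves are not described here. As a rough illustration of lookahead-based alignment in general, the sketch below tries each candidate starting offset and accepts the first one from which a given number of consecutive records all pass a plausibility check. The fixed record length, the `looksValid` predicate, and the class `AlignmentSketch` are assumptions made for illustration and do not reflect this reader's actual implementation.

```java
import java.util.Arrays;
import java.util.function.Predicate;

// Hypothetical sketch; not the reader's actual alignment code.
final class AlignmentSketch {
    private AlignmentSketch() {}

    /**
     * Returns the first offset in [0, recordLen) from which `lookahead`
     * consecutive fixed-length records all pass `looksValid`, or -1 if
     * no such offset exists.
     */
    static int findAlignment(byte[] data, int recordLen, int lookahead,
                             Predicate<byte[]> looksValid) {
        for (int start = 0; start < recordLen; start++) {
            if (verify(data, start, recordLen, lookahead, looksValid)) {
                return start;
            }
        }
        return -1;
    }

    private static boolean verify(byte[] data, int start, int recordLen,
                                  int lookahead, Predicate<byte[]> looksValid) {
        for (int i = 0; i < lookahead; i++) {
            int off = start + i * recordLen;
            if (off + recordLen > data.length) {
                // Ran out of data: accept only if at least one record was verified.
                return i > 0;
            }
            byte[] record = Arrays.copyOfRange(data, off, off + recordLen);
            if (!looksValid.test(record)) {
                return false;
            }
        }
        return true;
    }
}
```

A larger lookahead count makes a false alignment less likely, at the cost of reading more records before the first one is emitted.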
HADOOP=$HADOOP_HOME/bin/hadoop
STREAMING_JAR=$HADOOP_HOME/build/hadoop-streaming.jar
INPUTREADER_JAR=$HADOOP_HOME/build/hadoop-icmptrain.jar
INPUTREADER_CLASS=edu.isi.hadoop.icmptrain.ICMPTrainRecordReader
INPUTFORMAT_CLASS=edu.isi.hadoop.icmptrain.ICMPTrainInputFormat
...
export HADOOP_CLASSPATH=${INPUTREADER_JAR}
...
$HADOOP jar $STREAMING_JAR -file $INPUTREADER_JAR -inputreader $INPUTREADER_CLASS ... ...
or
$HADOOP jar $STREAMING_JAR -file $INPUTREADER_JAR -inputformat $INPUTFORMAT_CLASS ... ...

| Constructor and Description |
|---|
| ICMPTrainRecordReader(Configuration job, FileSplit split) |
| ICMPTrainRecordReader(FSDataInputStream in, FileSplit split, Reporter reporter, JobConf job, FileSystem fs) |

| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| Text | createKey() |
| Text | createValue() |
| long | getPos() |
| float | getProgress() |
| boolean | next(Text key, Text value) |

public ICMPTrainRecordReader(FSDataInputStream in,
FileSplit split,
Reporter reporter,
JobConf job,
FileSystem fs)
throws java.io.IOException,
java.lang.NumberFormatException

public ICMPTrainRecordReader(Configuration job,
FileSplit split)
throws java.io.IOException,
java.lang.NumberFormatException

public boolean next(Text key,
Text value)
throws java.io.IOException

public Text createKey()

public Text createValue()

public float getProgress()
throws java.io.IOException

public long getPos()
throws java.io.IOException

public void close()
throws java.io.IOException