file statistics :: Earth Volumetric Studio Help

file statistics

The file statistics module reads an analytical, lithology, or geologic file (.apdv, .aidv, .pgf, .geo, .gmf, .lpdv, .lsdv, .dg) and reports summary statistics and a frequency distribution for a selected data component. During execution, the module reads the file, displays an error message if the file contains errors in format or numeric values, and then displays the statistical results in the EVS Information window. The sample data and a glyphed renderable can be passed downstream from the output ports for visualization.

Ports

Direction	Name	Type	Description
Input	Input Filename	String	File used to display data.
Output	Output Filename	String	File used to display data.
Output	Sample Data	Field	A field containing the sample point data.
Output	Sample Object	Renderable	A renderable object displaying the sample data.
Output	Numeric Output 1	Number	A numeric value computed from the Numeric Expression 1 property.
Output	Numeric Output 2	Number	A numeric value computed from the Numeric Expression 2 property.
Output	String Output	String	A string value computed from the String Expression property.

Properties

Property	Type	Description
Allow Run	Boolean	This toggle can prevent the module from running, allowing the user to make changes to large data sets without waiting for updates.
Execute	Button	This button will force the module to run even if the Allow Run toggle has been turned off. This allows the user to make a number of changes before updating.
Filename	String	The file name to process for display.
Use Application Origin	Boolean	When true, the module applies the Application Origin. When false, data is left in internal model space. Turn this off when loading data intended for use as a glyph or similar.

Data Processing

Property	Type	Description
Component Or Layer	Integer	Selects which data component (or layer) in the file to process for display.
Data Processing	Choice: Linear Processing, Log Processing	Data Processing allows the module to be run in either linear or log space.
Z Scale	Double	The Z Scale is the vertical exaggeration to be applied to the output object.
Log Post Processing Clip Min	Double	After data processing in log space, any sample property value less than the specified number is replaced with that number.
Linear Post Processing Clip Min	Double	After data processing in linear space, any sample property value less than the specified number is replaced with that number.
Detection Limit	Double	The Detection Limit value affects any file values set with the ‘ND’ or other non-detect flags (see the help for the APDV file format for a list of these flags). When the module encounters one of these flags in the file, it inserts a value equal to (Detection Limit * Less Than Multiplier).
Less Than Multiplier	Double	The Less Than Multiplier is the value applied to any sample with the ‘<’ less than flag.
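
The substitution described for Detection Limit and Less Than Multiplier can be sketched in Python as follows. This is only an illustration, not the module's code: the function name resolve_sample_value is hypothetical, only the ‘ND’ flag is handled, and it is an assumption that a ‘<’ entry is replaced by the number following the flag multiplied by the Less Than Multiplier.

```python
def resolve_sample_value(raw, detection_limit, lt_multiplier):
    """Turn one raw file entry into a number (illustrative sketch only)."""
    text = raw.strip()
    if text.upper() == "ND":
        # Non-detect flag: Detection Limit * Less Than Multiplier (per the table above)
        return detection_limit * lt_multiplier
    if text.startswith("<"):
        # Less-than flag: ASSUMED to scale the reported number by the multiplier
        return float(text[1:]) * lt_multiplier
    return float(text)


values = [
    resolve_sample_value(entry, detection_limit=0.5, lt_multiplier=0.5)
    for entry in ["12.5", "ND", "<2.0"]
]
print(values)  # [12.5, 0.25, 1.0]
```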

Statistic Settings

Property	Type	Description
Minimum Data Level	Double	The Minimum Data Level is used to set the lower limit on the data bins for statistical analysis. The default value is the minimum in the selected data component. If the statistical distribution should focus on only a portion of the data, this value can be changed to reflect only that desired range of data.
Maximum Data Level	Double	The Maximum Data Level is used to set the upper limit on the data bins for statistical analysis. The default value is the maximum in the selected data component. If the statistical distribution should focus on only a portion of the data, this value can be changed to reflect only that desired range of data.
Number Of Bins	Integer	The Number of Bins is used to set the number of distribution bins to be used in the analysis. The default is 10 and the range is from 2 to 255.
Summary Statistics	Statistics	A compound control that opens the full statistics view (histogram, percentiles, summary table) and exposes the computed values to the read-only fields below.
Data Min	Double	The minimum value in the selected data component. Read-only display field.
Data Max	Double	The maximum value in the selected data component. Read-only display field.
Processing	String	The data processing mode (Linear or Log). Read-only display field.
Mean	Double	The mean of the selected data component. Read-only display field.
Median	Double	The median of the selected data component. Read-only display field.
Std Dev	Double	The standard deviation of the selected data component. Read-only display field.
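
As a rough illustration of how Minimum Data Level, Maximum Data Level, and Number Of Bins interact, the sketch below builds a frequency distribution in Python. It assumes equal-width bins and assumes that values outside the chosen range are left out of the counts; the module's actual binning rules are not described here, and the function name is hypothetical.

```python
import statistics


def frequency_distribution(values, data_min, data_max, number_of_bins=10):
    """Count values in equal-width bins between data_min and data_max."""
    if not 2 <= number_of_bins <= 255:
        raise ValueError("Number Of Bins must be between 2 and 255")
    if data_max <= data_min:
        raise ValueError("Maximum Data Level must exceed Minimum Data Level")
    width = (data_max - data_min) / number_of_bins
    counts = [0] * number_of_bins
    for v in values:
        if v < data_min or v > data_max:
            continue  # ASSUMPTION: out-of-range values are ignored
        # The upper limit itself falls into the last bin
        index = min(int((v - data_min) / width), number_of_bins - 1)
        counts[index] += 1
    return counts


samples = [1.0, 2.0, 2.5, 4.0, 7.5, 9.0, 10.0]
counts = frequency_distribution(samples, min(samples), max(samples), number_of_bins=3)
print(counts)                      # [3, 1, 3]
print(statistics.median(samples))  # 4.0
```

Narrowing data_min and data_max in this sketch concentrates all bins on a sub-range, which mirrors the purpose given for the two data level properties.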

Time Settings

Property	Type	Description
Chem File Is Time Domain	Boolean	The Chem File Is Time Domain toggle turns on date interpolation for time-domain analyte (e.g. chemistry) files.
Specify Date By Component	Boolean	The Specify Date By Component toggle causes the Date field to be ignored and the date to be selected using the Data Component.
Date For Interpolation	Date	The Date For Interpolation field is the date being interpolated to. For example, if there is an analyte value of 2 on 1/01/05 and a value of 4 on 1/03/05 and the date is set to 1/02/05 with Direct Interpolation, the value would be 3. The Date can be either set here or passed in via the Date port.
Analyte Name	String	The Analyte Name field is used for AIDV and APDV time files, where the dates take up the spots in these files usually reserved for analyte names.
Interpolation Type	Choice: Direct Interpolation Only, Interpolate Only, Interpolate and Extrapolate Beyond, Interpolate and Extrapolate	The interpolation method defines how to interpolate analyte values to the chosen date. See Interpolation Methods below for a complete description of each option.
Use Nearest Measured Data	Boolean	The Use Nearest Measured Data toggle causes the sample at the interpolated date to take the value of the nearest measured date instead of an interpolated value.
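
The worked example given for Date For Interpolation can be reproduced with a short linear interpolation in Python. This is a sketch, assuming the dates are written month/day/year and that interpolation is linear in elapsed days; the function name is hypothetical.

```python
from datetime import date


def interpolate_to_date(d0, v0, d1, v1, target):
    """Linearly interpolate between (d0, v0) and (d1, v1) at the target date."""
    span_days = (d1 - d0).days
    if span_days == 0:
        return v0
    fraction = (target - d0).days / span_days
    return v0 + fraction * (v1 - v0)


value = interpolate_to_date(
    date(2005, 1, 1), 2.0,   # analyte value of 2 on 1/01/05
    date(2005, 1, 3), 4.0,   # analyte value of 4 on 1/03/05
    date(2005, 1, 2),        # Date For Interpolation
)
print(value)  # 3.0
```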

Interpolation Methods

Each interpolation method defines how to interpolate when given Missing values in a file. Non-Detect values are equal to either the Detection Limit or the Pre Clip Min. If the Date is set to the same time as a Non-Detect in the file, the sample will be a Non-Detect and not an interpolated value.

Direct Interpolation Only: The most basic interpolation method, and the most accurate in terms of representing the data as it has been entered. Looks at the two dates surrounding the input Date. If either date, or both dates, have Missing as values, the value for that sample will be Missing and no interpolation will occur.
Interpolate Only: Looks through the date columns both before and after the set Date for values that are not Missing, then interpolates between those values. If it fails to find a non-Missing value both before and after the set Date, it sets the data to Missing. Useful for files with a small number of Missing values.
Interpolate and Extrapolate Beyond: Looks through the date columns both before and after the set Date for the first instance of a value that is not Missing. If it does not find a valid non-Missing date after the input Date, it extrapolates beyond the last usable date to the input Date.
Interpolate and Extrapolate: Looks through the date columns both before and after the set Date for the first instance of a value that is not Missing. If it fails to find one, it extrapolates the first value backwards to the input Date. If the date after the input Date is Missing, it looks forward through the time columns until it finds a date that is not Missing. It also extrapolates beyond the last valid date in the file.
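
The difference between the first two methods can be sketched in Python as follows. This is an illustration under assumptions, not the module's implementation: Missing is represented by None, each sample is a list of (date, value) pairs sorted by date, interpolation is linear in elapsed days, and a Date that matches a measured date returns that measurement. The two extrapolating methods are left out because the form of the extrapolation is not specified above.

```python
from datetime import date

MISSING = None  # stand-in for a Missing entry in the file


def _between(p0, p1, target):
    """Linear interpolation between two (date, value) pairs."""
    (d0, v0), (d1, v1) = p0, p1
    if d0 == d1:
        return v0  # target coincides with a measured date
    return v0 + (target - d0).days / (d1 - d0).days * (v1 - v0)


def direct_interpolation_only(series, target):
    """Use only the two dates surrounding the target; Missing if either is Missing."""
    before = [p for p in series if p[0] <= target]
    after = [p for p in series if p[0] >= target]
    if not before or not after:
        return MISSING
    p0, p1 = before[-1], after[0]
    if p0[1] is MISSING or p1[1] is MISSING:
        return MISSING
    return _between(p0, p1, target)


def interpolate_only(series, target):
    """Skip Missing entries and use the nearest valid date on each side."""
    before = [p for p in series if p[0] <= target and p[1] is not MISSING]
    after = [p for p in series if p[0] >= target and p[1] is not MISSING]
    if not before or not after:
        return MISSING
    return _between(before[-1], after[0], target)


series = [
    (date(2005, 1, 1), 2.0),
    (date(2005, 1, 3), MISSING),
    (date(2005, 1, 5), 6.0),
]
target = date(2005, 1, 2)
print(direct_interpolation_only(series, target))  # None
print(interpolate_only(series, target))           # 3.0
```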

Glyph Settings

Property	Type	Description
Points As Glyphs	Boolean	The Points As Glyphs toggle causes the points to be displayed as a user-selected glyph.
Point Width	Integer	The Point Width sets the size of the rendered pixels. The default is 0, which is equivalent to 1.
Glyph Size	Double	The Glyph Size value is used to scale the glyphs in all directions. The default is automatically computed based on the input data.
Priority	Choice: Maximum, Minimum	The Priority sets which sample values are drawn with the largest glyphs. Choosing Minimum reverses the scaling so that the smallest sample values have the largest size.
Minimum Scale Factor	Double	The Minimum Scale Factor scales the sample values with the least Priority.
Maximum Scale Factor	Double	The Maximum Scale Factor scales the sample values with the greatest Priority.
Use Log Data	Boolean	The Use Log Data toggle forces the size of the glyph to be based on the log10 of the selected data.
Generated Glyph	Choice: Sphere, Cube, Cone, Cylinder, Polygon, Disk	The Generated Glyph choice selects the type of glyph that is automatically generated.
Sphere Subdivisions	Integer	The Sphere Subdivisions value defines how finely the sample spheres are rendered. Higher values mean smoother spheres but at a higher memory cost.
Glyph Resolution	Integer	The resolution for generated cone, polygon, cylinder, and disk glyphs.
Primary Axis Factor	Double	The scale factor for the primary axis of the glyph.
Secondary Axis Factor	Double	The scale factor for the secondary axis of the glyph.
Heading Dip	Heading/Dip	The Heading and Dip values are used to align the glyphs to a constant orientation.
Roll	Double	The roll of the glyph along its primary axis.

Output Port Settings

Property	Type	Description
Numeric Expression 1	String	Python expression for the first numeric output port. Use variable names like Mean, Median, DataMin, DataMax, StdDev, Variance, FirstQuartile, ThirdQuartile, IQR, NumberOfPoints, NumberOfCells to reference statistics values.
Numeric Expression 2	String	Python expression for the second numeric output port.
String Expression	String	Python f-string expression for the string output port. Use {VariableName} placeholders such as {Processing}, {Units}, {Report} to reference statistics values.
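
The sketch below shows what the three expressions might look like and what they would produce for one set of statistics. It only simulates the evaluation in plain Python with eval and str.format; how the module itself evaluates the expressions is not documented here, and all statistic values are made up for the example.

```python
# Made-up statistics, keyed by the variable names listed above
stats = {
    "Mean": 12.0,
    "Median": 10.0,
    "DataMin": 1.0,
    "DataMax": 40.0,
    "StdDev": 4.0,
    "Processing": "Linear",
    "Units": "mg/kg",
}

# Example text for the three expression properties
numeric_expression_1 = "DataMax - DataMin"
numeric_expression_2 = "Mean + 2 * StdDev"
string_expression = "{Processing} mean: {Mean} {Units}"

# Simulated evaluation (ASSUMPTION: the module does something comparable)
numeric_output_1 = eval(numeric_expression_1, {"__builtins__": {}}, stats)
numeric_output_2 = eval(numeric_expression_2, {"__builtins__": {}}, stats)
string_output = string_expression.format(**stats)

print(numeric_output_1)  # 39.0
print(numeric_output_2)  # 20.0
print(string_output)     # Linear mean: 12.0 mg/kg
```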