Unique


Unique(a, I, position, caseInsensitive, resultIndex, mapToUnique, condition)

(In Analytica 5.0 or later) Without the index parameter «I», returns a list containing all the unique atoms in «a». The resulting list has any duplicates removed. «a» can be a multidimensional array, and the result contains the unique atoms (cells) across all dimensions.

When an index «I» is specified, it returns a maximal subset of index «I» such that each indicated slice of array «a» along «I» is unique. Use this form to remove duplicate slices from an array, or to identify a single member of each equivalence class.

Note: Prior to Analytica 5.0, the «I» parameter was required.

Example

Let:

Variable DataSet :=
Field ▶
PersonNum ▼   LastName   FirstName   Company
1             Smith      Bob         Acme
2             Jones      John        Acme
3             Johnson    Bob         Floorworks
4             Smith      Bob         Acme

Then:

Unique(DataSet) → ['Smith', 'Jones', 'Johnson', 'Bob', 'John', 'Acme', 'Floorworks' ]    {Requires Analytica 5.0}
Unique(DataSet, PersonNum) → [1, 2, 3]
Unique(DataSet[Field = 'Company'], PersonNum) → [1, 3]
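
The two modes can also be modeled outside Analytica. The following Python sketch only illustrates the semantics described above; it is not Analytica code. The function names unique_atoms and unique_slices are hypothetical, and the column-by-column scan order is an assumption chosen to reproduce the ordering shown in the example.

```python
# Rows keyed by PersonNum; columns are LastName, FirstName, Company.
data_set = {
    1: ("Smith", "Bob", "Acme"),
    2: ("Jones", "John", "Acme"),
    3: ("Johnson", "Bob", "Floorworks"),
    4: ("Smith", "Bob", "Acme"),
}

def unique_atoms(table):
    """Model of Unique(a): every distinct cell, keeping the first occurrence."""
    seen, result = set(), []
    n_cols = len(next(iter(table.values())))
    for col in range(n_cols):          # assumed scan order: column by column
        for row in table.values():
            if row[col] not in seen:
                seen.add(row[col])
                result.append(row[col])
    return result

def unique_slices(table, columns=None):
    """Model of Unique(a, I): row keys whose whole slice has not been seen yet."""
    seen, result = set(), []
    for key, row in table.items():
        slice_ = row if columns is None else tuple(row[c] for c in columns)
        if slice_ not in seen:
            seen.add(slice_)
            result.append(key)
    return result

print(unique_atoms(data_set))
print(unique_slices(data_set))               # [1, 2, 3]
print(unique_slices(data_set, columns=[2]))  # [1, 3], Company column only
```

Row 4 repeats row 1 in every field, so it is dropped in the slice mode; restricting the comparison to Company also merges row 2 into row 1.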

Optional parameters

I

The index parameter «I» is optional as of Analytica 5.0, but is most often specified. When specified, it finds the unique slices along «I». When omitted, it finds the unique atomic values among all cells of «a».

Position

By default, Unique returns the elements of the index. Setting the optional parameter «position» to true (position: true) returns the positions of the elements in «I», rather than the elements themselves (see Associative vs. Positional Indexing).

This parameter is not used when the «I» index parameter is omitted.

CaseInsensitive

By default, Unique compares text values in a case-sensitive fashion. For example, "Apple" and "apple" are considered distinct elements.

Specifying caseInsensitive: true ignores differences in upper and lower case in text values when determining if values are unique.
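
As a rough illustration in Python (not Analytica code), the comparison can be modeled by deduplicating on a case-folded key. The helper name unique_text is hypothetical. Keeping the first spelling encountered is an assumption; the text above does not say which spelling Unique returns.

```python
def unique_text(values, case_insensitive=False):
    """Drop duplicates, optionally ignoring upper/lower case differences."""
    seen, result = set(), []
    for v in values:
        key = v.casefold() if case_insensitive else v
        if key not in seen:
            seen.add(key)
            result.append(v)   # assumption: the first spelling seen is kept
    return result

print(unique_text(["Apple", "apple", "Pear"]))
print(unique_text(["Apple", "apple", "Pear"], case_insensitive=True))
```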

ResultIndex

If you provide an index, say Result, for parameter «resultIndex», the resulting unique values are returned in an array indexed by Result. If Result has fewer elements than there are unique items, only the first n unique values are returned, where n is the size of the index. If Result has more elements than there are unique items, the extra cells are filled with Null.
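
The truncate-or-pad rule can be sketched in Python as follows. This is only a model of the behavior described above, with None standing in for Null; the helper name fit_to_index is hypothetical.

```python
def fit_to_index(unique_values, index_size):
    """Fit a list of unique values onto an index of fixed size."""
    trimmed = unique_values[:index_size]                 # drop values that do not fit
    padding = [None] * (index_size - len(trimmed))       # None models Null
    return trimmed + padding

print(fit_to_index(["x", "y", "z"], 2))  # index shorter than the unique list
print(fit_to_index(["x", "y"], 4))       # index longer than the unique list
```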

«ResultIndex» is useful when you want to array abstract. For example, in a 2-D array A, you may want to identify the unique items along I separately for each item in index J:

For jj := J Do Unique(a[J = jj], I, resultIndex: I)

Without the «resultIndex» parameter, each iteration would return a list, and the For loop would then need to combine lists with incompatible implicit indexes, which would give an error. By ensuring that each result has an explicit index -- I in this example -- the results can be successfully combined.

This For loop example is not equivalent to:

Unique(a, I, resultIndex: I)

The reason is that Unique(a, I) compares entire slices -- it isn't operating over each slice of the exogenous dimensions separately as most other array functions do.
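
The difference can be seen in a small Python model (an illustration only, not Analytica code). The element labels i1..i3 and j1..j2, the data values, and the helper unique_along_i are all hypothetical, and padding to the result index is left out for brevity.

```python
I = ["i1", "i2", "i3"]
J = ["j1", "j2"]
a = {
    "i1": {"j1": 5, "j2": 7},
    "i2": {"j1": 5, "j2": 8},
    "i3": {"j1": 6, "j2": 7},
}

def unique_along_i(values_by_i):
    """Return the I elements whose value has not appeared earlier along I."""
    seen, result = set(), []
    for i, value in values_by_i.items():
        if value not in seen:
            seen.add(value)
            result.append(i)
    return result

# Like the For loop: examine each column of J separately.
per_j = {j: unique_along_i({i: a[i][j] for i in I}) for j in J}

# Like Unique(a, I): compare whole slices across all of J at once.
whole = unique_along_i({i: tuple(a[i][j] for j in J) for i in I})

print(per_j)
print(whole)
```

Each column contains a repeated value, so the per-column results drop an element, while no two whole rows are identical and the whole-slice comparison keeps all three.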

mapToUnique

(new to Analytica 5.0)

Unique(A, I, mapToUnique: true) returns an array indexed by «I» that maps each element of «I» back to the first element of «I» at which A has that same value. The first element with a given value maps to itself, and is the element that would have been returned if «mapToUnique» were not specified. For example:

Index I := 'a'..'e'
Variable A := Table(I)(4,2,4,3,2)
Unique(A, I, mapToUnique:true) → Array(I,['a','b','a','d','b'])
Unique(A, I, position:true, mapToUnique:true) → Array(I,[1,2,1,4,2])

One situation where this is useful is when you use Unique(A, I) to find the unique slices so that you can compute an expensive function only on those slices. You then need to figure out which result to use for each of the other slices, which is what «mapToUnique» gives you.

Because «mapToUnique» returns elements (or positions) of «I», you must specify the index «I» when using «mapToUnique».
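
The compute-once pattern can be sketched in Python as follows. This is a model of the idea rather than Analytica code; the labels, values, and the functions map_to_unique and expensive are hypothetical.

```python
def map_to_unique(labels, values):
    """Map each label to the first label that carries the same value."""
    first_seen = {}
    mapping = []
    for label, value in zip(labels, values):
        first_seen.setdefault(value, label)   # remember the first label per value
        mapping.append(first_seen[value])
    return mapping

calls = []

def expensive(x):
    """Stand-in for a costly computation; records each call."""
    calls.append(x)
    return x * x

labels = ["p", "q", "r", "s", "t"]
values = [10, 10, 20, 10, 30]

mapping = map_to_unique(labels, values)
# Labels that map to themselves are the unique representatives.
unique_labels = [lab for lab, m in zip(labels, mapping) if lab == m]
# Evaluate the expensive function once per representative.
results_on_unique = {
    lab: expensive(val) for lab, val in zip(labels, values) if lab in unique_labels
}
# Spread the results back over every label through the mapping.
full = [results_on_unique[m] for m in mapping]

print(mapping)
print(full)
print(len(calls))
```

The expensive function runs three times for five labels, once for each distinct value.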

condition

(new to Analytica 5.3)

When you specify «condition», it finds the unique values from only those items that match the «condition». For example, if you don't want to include Null values, you can use:

Unique(A, condition:A<>Null)

or to find only unique text values (when there might be other data types present such as numbers):

Unique(A, condition:IsText(A))

When «mapToUnique» is true, Null is returned for unmatched items along «I». When «condition» has an index not present in «A», the item of «A» is included if «condition» is true anywhere along that extra index.
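
The filtering effect of «condition» on a flat list can be modeled in Python like this. It is a sketch of the behavior described above, not Analytica code: None stands in for Null, and the helper name unique_where is hypothetical.

```python
def unique_where(values, condition):
    """Unique values among only those items for which condition(item) holds."""
    seen, result = set(), []
    for v in values:
        if condition(v) and v not in seen:
            seen.add(v)
            result.append(v)
    return result

mixed = ["x", None, 3, "x", None, "y", 3]

print(unique_where(mixed, lambda v: v is not None))        # skip Null-like items
print(unique_where(mixed, lambda v: isinstance(v, str)))   # text values only
```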

Notes

The Set Functions such as SetDifference or SetUnion ensure that no duplicates exist in the final result, and hence can also be used to find the unique elements. For example, when L is a 1-D array or list:

#SetDifference(\L)

and

Unique(L)

do the same thing. Since no set is being subtracted, SetDifference returns the set \L after duplicates are removed. In some instances, SetDifference has advantages over Unique. For example, if you also want to ignore certain values, such as Null, you could compute the unique elements and then call SetDifference to remove the unwanted values. In that case you can skip the call to Unique entirely, since SetDifference already removes duplicates.

The ordering of elements in the result follows the ordering of the elements in «a», which often feels arbitrary. Hence, it is common to wrap the call to Unique in a call to SortIndex, such as

SortIndex(Unique(a))

History

The «I» parameter was first made optional in Analytica 5.0. Prior to that, the Unique(a) usage was not available. In releases where the index cannot be omitted, use a[I = Unique(a, I)] or #SetDifference(\a) to define an index.

The «mapToUnique» parameter was introduced in Analytica 5.0.

The «resultIndex» parameter is present in Analytica 4.3 and later, but was hidden (it did not show up in Expression Assist, etc.) until Analytica 5.0. You can still use it in those earlier releases even though it is hidden.
