Reference Length Expression

New in v2

The ReferenceLengthExpression class is new in VRS v2. It was designed to compactly encode large, ambiguous sequence states that result from VOCA normalization.

Reference length expressions express the state of Alleles where normalization results in a state other than an unambiguous indel or a complete deletion (where length = 0). They represent the sequence compactly as a reference subsequence that is expanded or contracted to the designated length to yield the sequence state. See Allele Normalization for more details.
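As a rough illustration of the expand-or-contract idea, the sketch below rebuilds a literal sequence from a reference subsequence. It assumes that length is a plain integer (not a Range) and that the reference subsequence is periodic with a period of repeatSubunitLength residues. The function name expand_rle and its parameters are hypothetical and are not part of VRS or any library.

```python
def expand_rle(reference_subsequence: str, length: int, repeat_subunit_length: int) -> str:
    """Expand or contract a reference subsequence to `length` residues.

    Assumption: the reference subsequence is periodic, so its first
    `repeat_subunit_length` residues can be cycled to any target length.
    """
    if length < 0 or repeat_subunit_length < 1:
        raise ValueError("length must be >= 0 and repeat_subunit_length >= 1")
    if length == 0:
        return ""  # complete deletion: nothing left to express
    subunit = reference_subsequence[:repeat_subunit_length]
    if len(subunit) < repeat_subunit_length:
        raise ValueError("reference subsequence is shorter than the repeat subunit")
    repeats = -(-length // repeat_subunit_length)  # ceiling division
    # Repeat the subunit enough times, then trim to the exact target length.
    return (subunit * repeats)[:length]
```

Under these assumptions, expand_rle("CTCCTCCT", 11, 3) returns "CTCCTCCTCCT" (an expansion of the 8-residue reference), and expand_rle("CTCCTCCT", 5, 3) returns "CTCCT" (a contraction).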

Reference length expressions can also optionally carry the literal sequence derived from the reference, in cases where it is convenient to do so.

Definition and Information Model

Note

This data class is at a trial use maturity level and may change in future releases. Maturity levels are described in the GKS Maturity Model.

Computational Definition

An expression of a sequence that is derived from repeating a subsequence of an associated Sequence Location.

GA4GH Digest

| Prefix | Inherent |
|---|---|
| None | ['length', 'repeatSubunitLength', 'type'] |
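The inherent attributes are the ones that contribute to digest computation; the absence of a prefix presumably means a ReferenceLengthExpression is not assigned an identifier of its own and is instead serialized as part of an enclosing object. The sketch below keeps only the inherent attributes and emits compact JSON with sorted keys. That serialization form is an assumption modeled on GA4GH-style canonical JSON, the helper name serialize_inherent is hypothetical, and length is assumed to be a plain integer; the VRS computed identifier rules remain the authority.

```python
import json

# Inherent attributes as listed in the digest table above.
INHERENT_KEYS = ("length", "repeatSubunitLength", "type")


def serialize_inherent(rle: dict) -> bytes:
    """Serialize only the inherent attributes of a ReferenceLengthExpression.

    Assumption: canonical form is UTF-8 JSON with sorted keys and no
    insignificant whitespace. Non-inherent fields such as `sequence`
    or `id` are dropped.
    """
    inherent = {key: rle[key] for key in INHERENT_KEYS}
    return json.dumps(inherent, sort_keys=True, separators=(",", ":")).encode("utf-8")
```

Note that the optional sequence field does not appear in the output, so two expressions that differ only in whether they include the literal sequence serialize identically in this sketch.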

Information Model

Some ReferenceLengthExpression attributes are inherited from Sequence Expression.

| Field | Flags | Type | Limits | Description |
|---|---|---|---|---|
| id | | string | 0..1 | The ‘logical’ identifier of the Entity in the system of record, e.g. a UUID. This ‘id’ is unique within a given system, but may or may not be globally unique outside the system. It is used within a system to reference an object from another. |
| name | | string | 0..1 | A primary name for the entity. |
| description | | string | 0..1 | A free-text description of the Entity. |
| aliases | | string | 0..m | Alternative name(s) for the Entity. |
| extensions | | Extension | 0..m | A list of extensions to the Entity that allow for capture of information not directly supported by elements defined in the model. |
| type | | string | 1..1 | MUST be “ReferenceLengthExpression” |
| length | | integer \| Range | 1..1 | The number of residues in the expressed sequence. |
| sequence | | sequenceString | 0..1 | The literal Sequence encoded by the Reference Length Expression. |
| repeatSubunitLength | | integer | 1..1 | The number of residues in the repeat subunit. |

Example

{
    "type": "ReferenceLengthExpression",
    "length": 11,
    "repeatSubunitLength": 3,
    "sequence": "CTCCTCCTCCT"
}
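
When the optional sequence is present, it can be checked against the other fields. The sketch below is a minimal consistency check, not a schema validator: it assumes length is a plain integer and, as a further assumption not stated in the field definitions, that the literal sequence repeats with a period of repeatSubunitLength residues. The function name check_rle is hypothetical.

```python
def check_rle(rle: dict) -> list[str]:
    """Return a list of human-readable problems found in an RLE-like dict."""
    problems = []
    if rle.get("type") != "ReferenceLengthExpression":
        problems.append("type must be 'ReferenceLengthExpression'")

    length = rle.get("length")
    unit = rle.get("repeatSubunitLength")
    if not isinstance(unit, int):
        problems.append("repeatSubunitLength must be an integer")

    sequence = rle.get("sequence")
    if sequence is not None and isinstance(length, int):
        if len(sequence) != length:
            problems.append("sequence has a different number of residues than length")
        elif isinstance(unit, int) and unit >= 1:
            # Assumed property: residue i equals residue i - unit (periodicity).
            if any(sequence[i] != sequence[i - unit] for i in range(unit, len(sequence))):
                problems.append("sequence is not periodic with repeatSubunitLength")
    return problems
```

For the example above, check_rle returns an empty list: the sequence has 11 residues and repeats the 3-residue subunit CTC, with the final repeat truncated.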