Sequence Reference
New in v2
In VRS v1.x, sequence references were limited to the refget sequence accession within Sequence Location objects. This made it difficult to indicate in a message that the referenced sequence was, for example, “GRCh38 chr11”. The SequenceReference class was created to enable the addition of such metadata.
The SequenceReference class is used to refer to a sequence by its refget accession. The class also allows implementations to optionally specify extra characteristics of the sequence, such as the alphabet used (nucleic acid or amino acid), whether the sequence represents a circular molecule, and labels used to describe the sequence.
Definition and Information Model
Note
This data class is at a trial use maturity level and may change in future releases. Maturity levels are described in the GKS Maturity Model.
Computational Definition
A sequence of nucleic or amino acid character codes.
GA4GH Digest
Prefix | Inherent |
---|---|
None | ['refgetAccession', 'type'] |
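The inherent attributes listed above are the only ones that contribute when a SequenceReference is serialized for digest computation within an enclosing object. The following is a minimal Python sketch of that filtering, not a reference implementation. The helper names `inherent_view` and `canonical_json` are hypothetical, and the canonical form used here (sorted keys, no insignificant whitespace, UTF-8) is an assumption about GA4GH digest serialization that this page does not itself specify. Because no prefix is listed for this class, the sketch stops at serialization and does not mint an identifier.

```python
import json

# Inherent attributes for SequenceReference, taken from the table above.
INHERENT_KEYS = ("refgetAccession", "type")


def inherent_view(obj: dict) -> dict:
    """Return only the inherent attributes of a SequenceReference dict."""
    return {key: obj[key] for key in INHERENT_KEYS if key in obj}


def canonical_json(obj: dict) -> bytes:
    """Serialize with sorted keys and compact separators (assumed canonical form)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Two references to the same sequence that differ only in descriptive metadata.
a = {
    "type": "SequenceReference",
    "refgetAccession": "SQ.F-LrLMe1SRpfUZHkQmvkVKFEGaoDeHul",
    "name": "NC_000007.14",
}
b = {
    "type": "SequenceReference",
    "refgetAccession": "SQ.F-LrLMe1SRpfUZHkQmvkVKFEGaoDeHul",
    "circular": False,
}

# Descriptive metadata does not change the inherent serialization.
same = canonical_json(inherent_view(a)) == canonical_json(inherent_view(b))
print(same)
```

Under these assumptions, fields such as name or circular can be added or changed without affecting any digest computed over an object that contains the reference.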
Information Model
Some SequenceReference attributes are inherited from Entity.
Field | Flags | Type | Limits | Description |
---|---|---|---|---|
id | | string | 0..1 | The ‘logical’ identifier of the Entity in the system of record, e.g. a UUID. This ‘id’ is unique within a given system, but may or may not be globally unique outside the system. It is used within a system to reference an object from another. |
name | | string | 0..1 | A primary name for the entity. |
description | | string | 0..1 | A free-text description of the Entity. |
aliases | ⋮ | string | 0..m | Alternative name(s) for the Entity. |
extensions | ⋮ | | 0..m | A list of extensions to the Entity, that allow for capture of information not directly supported by elements defined in the model. |
type | | string | 1..1 | MUST be “SequenceReference” |
refgetAccession | | string | 1..1 | A GA4GH RefGet identifier for the referenced sequence, using the sha512t24u digest. |
residueAlphabet | | string | 0..1 | The interpretation of the character codes referred to by the refget accession, where “aa” specifies an amino acid character set, and “na” specifies a nucleic acid character set. |
sequence | | | 0..1 | A sequenceString that is a literal representation of the referenced sequence. |
moleculeType | | string | 0..1 | Molecule types as defined by RefSeq (see Table 1). MUST be one of “genomic”, “RNA”, “mRNA”, or “protein”. |
circular | | boolean | 0..1 | A boolean indicating whether the molecule represented by the sequence is circular (true) or linear (false). |
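The constraints in the table above can be sketched as a small Python data class. This is an illustrative sketch rather than the official VRS implementation: it omits aliases, extensions, and sequence for brevity, and the accession pattern (`SQ.` followed by 32 base64url characters) is an assumption about the refget accession format that is not stated in the table.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed accession shape: "SQ." followed by a 32-character base64url digest.
REFGET_PATTERN = re.compile(r"^SQ\.[0-9A-Za-z_\-]{32}$")
# Allowed values taken from the field descriptions in the table.
RESIDUE_ALPHABETS = {"aa", "na"}
MOLECULE_TYPES = {"genomic", "RNA", "mRNA", "protein"}


@dataclass
class SequenceReference:
    refgetAccession: str                    # 1..1
    type: str = "SequenceReference"         # 1..1, fixed value
    id: Optional[str] = None                # 0..1
    name: Optional[str] = None              # 0..1
    description: Optional[str] = None       # 0..1
    residueAlphabet: Optional[str] = None   # 0..1, "aa" or "na"
    moleculeType: Optional[str] = None      # 0..1
    circular: Optional[bool] = None         # 0..1

    def __post_init__(self) -> None:
        # Enforce the MUST constraints and the assumed accession format.
        if self.type != "SequenceReference":
            raise ValueError('type MUST be "SequenceReference"')
        if not REFGET_PATTERN.match(self.refgetAccession):
            raise ValueError("refgetAccession does not look like a refget accession")
        if self.residueAlphabet is not None and self.residueAlphabet not in RESIDUE_ALPHABETS:
            raise ValueError('residueAlphabet must be "aa" or "na"')
        if self.moleculeType is not None and self.moleculeType not in MOLECULE_TYPES:
            raise ValueError("moleculeType must be one of genomic, RNA, mRNA, protein")


ref = SequenceReference(
    refgetAccession="SQ.F-LrLMe1SRpfUZHkQmvkVKFEGaoDeHul",
    name="NC_000007.14",
    residueAlphabet="na",
    moleculeType="genomic",
    circular=False,
)
print(ref.type, ref.refgetAccession)
```

Validating in `__post_init__` keeps invalid objects from being constructed at all; an implementation that instead validates against the published JSON Schema would be equally reasonable.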
Example
{
  "type": "SequenceReference",
  "refgetAccession": "SQ.F-LrLMe1SRpfUZHkQmvkVKFEGaoDeHul",
  "name": "NC_000007.14"
}
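The refgetAccession value is described above as using the sha512t24u digest. As a rough illustration of how such an accession could be derived from a literal sequence, the sketch below computes the truncated digest (SHA-512, first 24 bytes, base64url-encoded) in Python. The function name `refget_accession`, the uppercasing step, and the `SQ.` prefixing are assumptions based on common refget conventions rather than statements from this page, and the input `"ACGT"` is a toy sequence, not the sequence identified in the example.

```python
import base64
import hashlib


def sha512t24u(blob: bytes) -> str:
    """Truncated digest: SHA-512, first 24 bytes, base64url-encoded."""
    digest = hashlib.sha512(blob).digest()
    # 24 bytes encode to exactly 32 base64url characters, with no padding.
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def refget_accession(sequence: str) -> str:
    """Build an "SQ."-prefixed accession from a literal sequence.

    Uppercasing before hashing is an assumption about refget normalization.
    """
    return "SQ." + sha512t24u(sequence.upper().encode("ascii"))


# Hypothetical reference built from a toy nucleic acid sequence.
seq_ref = {
    "type": "SequenceReference",
    "refgetAccession": refget_accession("ACGT"),
    "residueAlphabet": "na",
}
print(seq_ref["refgetAccession"])
```

In practice the accession is usually obtained from a refget-compatible sequence repository rather than recomputed, since hashing a full chromosome requires access to the complete sequence.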