Adjacency

New in v2

The Adjacency class was added in v2 to describe structural variation.

The Adjacency class is a core concept for structural variation, representing the junction point of two adjoined molecules. It can be used on its own (e.g. for junctions of chimeric transcript fusions) or in higher-order structures such as Derivative Molecule, which represent molecules derived from multiple adjacencies (e.g. for translocations).

Definition and Information Model

Note

This data class is at a trial use maturity level and may change in future releases. Maturity levels are described in the GKS Maturity Model.

Computational Definition

The Adjacency class represents the adjoining of the end of a sequence with the beginning of an adjacent sequence, potentially with an intervening linker sequence.

GA4GH Digest

Prefix | Inherent
AJ | ['adjoinedSequences', 'linker', 'type']
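The inherent attributes listed above are the ones that contribute to the digest. As a rough illustration in Python, the sketch below shows two pieces under stated assumptions: filtering an Adjacency object down to its inherent attributes, and the sha512t24u truncated digest (SHA-512, first 24 bytes, base64url-encoded). The helper names are hypothetical, and the full VRS Computed Identifier algorithm (canonical serialization and the handling of nested objects) is deliberately not reproduced here.

```python
import base64
import hashlib

# Inherent attributes for Adjacency, as listed in the GA4GH Digest table.
ADJACENCY_INHERENT = ("adjoinedSequences", "linker", "type")


def inherent_subset(obj: dict) -> dict:
    """Keep only the inherent attributes that are present on the object.

    Hypothetical helper: fields such as id, name, or extensions are dropped.
    """
    return {key: obj[key] for key in ADJACENCY_INHERENT if key in obj}


def sha512t24u(blob: bytes) -> str:
    """SHA-512 digest truncated to 24 bytes, then base64url-encoded."""
    digest = hashlib.sha512(blob).digest()[:24]
    return base64.urlsafe_b64encode(digest).decode("ascii")
```

Because 24 bytes encode to exactly 32 base64url characters, the result never carries padding. Turning the filtered object into the byte string that is actually hashed is the job of the VRS Computed Identifier algorithm and is out of scope for this sketch.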

Information Model

Some Adjacency attributes are inherited from Variation.

Field | Flags | Type | Limits | Description
id | | string | 0..1 | The ‘logical’ identifier of the Entity in the system of record, e.g. a UUID. This ‘id’ is unique within a given system, but may or may not be globally unique outside the system. It is used within a system to reference an object from another.
name | | string | 0..1 | A primary name for the entity.
description | | string | 0..1 | A free-text description of the Entity.
aliases | | string | 0..m | Alternative name(s) for the Entity.
extensions | | Extension | 0..m | A list of extensions to the Entity, that allow for capture of information not directly supported by elements defined in the model.
digest | | string | 0..1 | A sha512t24u digest created using the VRS Computed Identifier algorithm.
expressions | | Expression | 0..m |
type | | string | 1..1 | MUST be “Adjacency”.
adjoinedSequences | | iriReference | Location | 2..2 | The terminal sequence or pair of adjoined sequences that defines the adjacency.
linker | | Sequence Expression | 0..1 | The sequence found between adjoined sequences.
homology | D | boolean | 0..1 | A flag indicating if coordinate ambiguity in the adjoined sequences is from sequence homology (true) or other uncertainty, such as instrument ambiguity (false).

Example

{
   "id": "ga4gh:AJ.O0IbSYyhnBAtUsR51bpdoqeSo4YaDMFo",
   "type": "Adjacency",
   "adjoinedSequences": [
     {
       "type": "SequenceLocation",
       "sequenceReference": {
           "type": "SequenceReference",
           "refgetAccession": "SQ.9KdcA9ZpY1Cpvxvg8bMSLYDUpsX6GDLO",
           "residueAlphabet": "na",
           "id": "NC_000002.11"
       },
       "start": 456
     },
     {
       "type": "SequenceLocation",
       "sequenceReference": {
           "type": "SequenceReference",
           "refgetAccession": "SQ.S_KjnFVz-FE7M0W6yoaUDgYxLPc1jyWU",
           "residueAlphabet": "na",
           "id": "NC_000001.10"
       },
       "end": 123
     }
   ]
}
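Some of the limits in the Information Model can be checked mechanically. The following Python sketch is a hypothetical helper, not an official validator: it assumes the Adjacency has already been parsed into a dict and checks only the type value, the 2..2 limit on adjoinedSequences, and the boolean type of homology.

```python
def check_adjacency(obj: dict) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []

    # type has limits 1..1 and a fixed value.
    if obj.get("type") != "Adjacency":
        problems.append('type MUST be "Adjacency"')

    # adjoinedSequences has limits 2..2.
    adjoined = obj.get("adjoinedSequences")
    if not isinstance(adjoined, list) or len(adjoined) != 2:
        problems.append("adjoinedSequences must contain exactly 2 entries")
    else:
        for i, member in enumerate(adjoined):
            # Each entry is an iriReference (a string) or a Location (an object).
            if not isinstance(member, (str, dict)):
                problems.append(
                    f"adjoinedSequences[{i}] must be an IRI string or a Location"
                )

    # homology is optional, but must be a boolean when present.
    if "homology" in obj and not isinstance(obj["homology"], bool):
        problems.append("homology must be a boolean")

    return problems
```

Applied to the example above, this check would be expected to return an empty list. It does not validate the nested Location objects, which would need their own checks.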

Implementation Guidance

Sequence Locations and Directionality

Structural variants on double-stranded nucleic acids may have an adjoined partner that is a reverse complement of the provided Sequence Reference. These types of adjacencies are common in structural variation, and can be found, for example, on either end of a chromosomal inversion.

To represent this, the Sequence Location used by each partner of the adjacency is defined using only one of the start or end attributes. Defining the location by start means that the sequence content extends right (increases) on the Sequence Reference, and defining the location by end means that the sequence content extends left (decreases) on the Sequence Reference.
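The start/end rule above can be sketched as a small Python function. This is an illustrative helper with a hypothetical name, assuming each adjacency partner is a Sequence Location parsed into a dict.

```python
def extension_direction(location: dict) -> str:
    """Return the direction in which sequence content extends from the junction.

    A location defined only by start extends right (increasing coordinates);
    a location defined only by end extends left (decreasing coordinates).
    """
    has_start = location.get("start") is not None
    has_end = location.get("end") is not None
    if has_start == has_end:
        # Both or neither defined: not a valid adjacency partner under this rule.
        raise ValueError("an adjacency partner must define exactly one of start or end")
    return "right" if has_start else "left"
```

For instance, a location with only "start": 456 gives "right", and a location with only "end": 123 gives "left".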

../../_images/ex_simple_breakpoint.png

An example simple Adjacency. The chromosome 1 sequence extends left from position 1:123 and so is defined by the location end. The chromosome 2 sequence extends right from position 2:456 and so is defined by the location start.

../../_images/ex_revcomp_breakpoint.png

An example Adjacency with a reverse complement partner. The chromosome 1 sequence extends left from position 1:87337011 and so is defined by the location end. The chromosome 10 sequence also extends left from position 10:36119127 and so is also defined by the location end. Reading left-to-right along this adjacency, one would expect reference sequence up to the adjacency and reverse complement sequence following it.

Normalization

Conventions for ordering sequences and handling ambiguous sequence Adjacencies are described in Adjacency Normalization.

Linker Sequences

Intervening sequences between adjoined sequences in an adjacency are called linker sequences and may be specified with a Sequence Expression.
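As a minimal sketch in Python, a linker could be attached to an Adjacency held as a dict. The helper name is hypothetical, and the sketch assumes the linker is given as a LiteralSequenceExpression, which is one kind of Sequence Expression in VRS but is not described in this section; the sequence "ACGT" is made up for illustration.

```python
import copy


def with_linker(adjacency: dict, sequence: str) -> dict:
    """Return a copy of the adjacency with a literal linker sequence set.

    Assumption: the linker is expressed as a LiteralSequenceExpression.
    """
    result = copy.deepcopy(adjacency)  # leave the caller's object untouched
    result["linker"] = {
        "type": "LiteralSequenceExpression",
        "sequence": sequence,
    }
    return result


# Adjoined locations abbreviated from the Example; only the shape matters here.
adjacency = {
    "type": "Adjacency",
    "adjoinedSequences": [
        {"type": "SequenceLocation", "start": 456},
        {"type": "SequenceLocation", "end": 123},
    ],
}
linked = with_linker(adjacency, "ACGT")  # hypothetical linker sequence
```

Because linker is one of the inherent attributes, adding or changing it would be expected to change the computed digest, so any existing id or digest on the object should be recomputed rather than carried over.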